Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments have been considered but are moot because the new ground of rejection does not rely on the same combination of references applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al (US 2017/0132516) in view of Crockett et al (US 2012/0310539), Ray 
For claim 1, Zhang teaches a computer-implemented method (see [0045]) for lead scoring (estimating likelihood that a product will be purchased, see Abstract and [0017]), comprising: 
performing random sampling of the data set (via 160 of Figure 1, see [0031] and [0043]) to generate training and test data (see [0040]); 
building a model based on the training and test data (320 of Figure 3, see [0051], as explained below); 
It is noted that Zhang teaches in claim 1 that “the combined sample list [comprises] a random sample of the imbalanced large scale data set” and further teaches “feeding the combined sample list into a prediction model enabling the prediction model to provide predictive capabilities with negligible variance for the imbalanced large scale data set” but is silent as to the details of the prediction model.
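For illustration only, the random sampling step that generates training and test data can be sketched as follows. The sketch is not taken from Zhang; the function name `split_train_test`, the 30% test fraction and the fixed seed are assumptions chosen for the example.

```python
import random


def split_train_test(records, test_fraction=0.3, seed=0):
    """Randomly partition records into (training, test) lists.

    Hypothetical helper: the name, the default fraction and the seed
    are illustrative assumptions, not details from any cited reference.
    """
    rng = random.Random(seed)        # fixed seed so the split is repeatable
    shuffled = list(records)         # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = split_train_test(range(10))
    print(len(train), len(test))
```

Every record lands in exactly one of the two lists, so the test data stays unseen while the model is built.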
Zhang does not distinctly disclose:
preparing a data set by assigning a median value to variables having missing values wherein the variables with missing values are within a threshold level;
refining the model by using a true positive rate and a true negative rate; and
validating the model by simultaneously optimizing a difference between the TPR and a validation TPR and a difference between the TNR and a validation TNR.
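However, Crockett teaches using k-fold cross validation to determine classifier performance ([0056]) wherein “the original dataset is partitioned into k samples and of the k samples, k-1 subsamples are used as training data for training the classifier. The cross validation is repeated k times, during which each of the k samples used once.  The resulting k outcomes are averaged to produce a single estimation. In one embodiment, the weighted average from a three-fold cross validation of sensitivity (i.e., k=3, true positive rate), specificity (true negative rate), and positive predictive value (precision) may be calculated for each classifier algorithm… The performance of the classifier (i.e., predictive positive value) may be measured as a function of the sensitivity and specificity.”
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to optimize the performance of Zhang’s predictive model as a function of TPR and TNR via k-fold cross-validation since the particular known technique was recognized as part of the ordinary capabilities of one skilled in the art, as evidenced by Crockett.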
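As a non-limiting illustration of k-fold cross validation measured by sensitivity (TPR) and specificity (TNR), a minimal sketch follows. It is not code from Crockett: the toy single-feature threshold classifier, the contiguous folds and all function names are assumptions, and a plain mean is used where Crockett mentions a weighted average.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def rates(actual, predicted):
    """Return (TPR, TNR) for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return tpr, tnr


def fit_threshold(xs, ys):
    """Toy classifier (assumption): cutoff halfway between the class means.

    Requires both classes to be present in the training split.
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def cross_validate(xs, ys, k=3):
    """Average the held-out TPR and TNR over k folds (plain mean)."""
    tprs, tnrs = [], []
    for fold in k_fold_indices(len(xs), k):
        held = set(fold)
        train = [i for i in range(len(xs)) if i not in held]
        cut = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        preds = [1 if xs[i] > cut else 0 for i in fold]
        tpr, tnr = rates([ys[i] for i in fold], preds)
        tprs.append(tpr)
        tnrs.append(tnr)
    return sum(tprs) / k, sum(tnrs) / k
```

Each sample is held out exactly once, so the averaged rates estimate how the classifier behaves on data it was not trained on.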
Further, the combination of Zhang and Crockett does not distinctly disclose:
preparing a data set by assigning a median value to variables having missing values wherein the variables with missing values are within a threshold level.
However, Ray teaches that median imputation is a method to fill in missing values with estimated ones by “replacing the missing data for a given attribute by the mean or median (quantitative attribute) …of all known values of that variable” (¶2 of “Methods to treat Missing Values”).  Therefore, Ray teaches preparing a data set by using a median value for one or more missing values of variables.
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to assign a median value (i.e., median imputation) to variables having missing values from the data set of the combination of Zhang and Crockett before performing random sampling in order to estimate missing values by known methods, as evidenced by Ray.
The combination of Zhang, Crockett and Ray does not distinctly disclose 
a number of missing values of the variables is within a threshold level.
However, Kala teaches in [0036] that “the data preprocessing engine 201 may automatically use imputation methods to fill the missing values on the attributes with less than the threshold missing values. The data preprocessing engine 201 may prompt the user to upload cleaner data for the attributes where the missing values are more than the threshold value.”
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to use median imputation on attributes with less than a threshold number of missing values in the combination of Zhang, Crockett and Ray as defined above since the particular known technique was recognized as part of the ordinary capabilities of one skilled in the art, as evidenced by Kala.
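The combined teaching relied upon above (median imputation applied only where missing values stay within a threshold) can be sketched as follows. This is an illustrative reading, not code from Ray or Kala: the function name, the row-of-dicts layout, the use of `None` for a missing value, the 20% default and the choice to treat "within" as "at or below" are all assumptions.

```python
from statistics import median


def impute_median(rows, max_missing_fraction=0.2):
    """Fill None entries with the column median when the column's missing
    fraction is at or below max_missing_fraction.

    Returns (new_rows, skipped_columns). Hypothetical helper for
    illustration; names and defaults are assumptions.
    """
    filled = [dict(r) for r in rows]      # work on copies
    skipped = []
    for col in rows[0].keys():
        known = [r[col] for r in rows if r[col] is not None]
        missing = len(rows) - len(known)
        if missing == 0:
            continue
        if not known or missing / len(rows) > max_missing_fraction:
            skipped.append(col)           # too sparse: left for cleaner data
            continue
        m = median(known)
        for r in filled:
            if r[col] is None:
                r[col] = m
    return filled, skipped
```

Columns above the threshold are reported rather than imputed, which parallels Kala's prompt for cleaner data on such attributes.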
For claim 10, Zhang teaches a system to score lead (estimating likelihood that a product will be purchased, see Abstract and [0017]) comprising:
perform random sampling of the data set (via 160 of Figure 1, see [0031] and [0043]) to generate training and test data (see [0040]); 
build a model based on the training and test data (320 of Figure 3, see [0051], as explained below); 
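It is noted that Zhang teaches in claim 1 that “the combined sample list [comprises] a random sample of the imbalanced large scale data set” and further teaches “feeding the combined sample list into a prediction model enabling the prediction model to provide predictive capabilities with negligible variance for the imbalanced large scale data set” but is silent as to the details of the prediction model.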
Zhang does not distinctly disclose:
preparation of a data set by assigning a median value to variables having missing values wherein the variables with missing values are within a threshold level;
refine the model by using a true positive rate and a true negative rate; and
validate the model by simultaneously optimizing a difference between the TPR and a validation TPR and a difference between the TNR and a validation TNR.
However, Crockett teaches using k-fold cross validation to determine classifier performance ([0056]) wherein “the original dataset is partitioned into k samples and of the k samples, k-1 subsamples are used as training data for training the classifier. The cross validation is repeated k times, during which each of the k samples used once.  The resulting k outcomes are averaged to produce a single estimation. In one embodiment, the weighted average from a three-fold cross validation of sensitivity (i.e., k=3, true positive rate), specificity (true negative rate), and positive predictive value (precision) may be calculated for each classifier algorithm… The performance of the classifier (i.e., predictive positive value) may be measured as a function of the sensitivity and specificity.”
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to optimize the performance of Zhang’s predictive model as a function of TPR and TNR via k-fold cross-validation since the particular known technique was recognized as part of the ordinary capabilities of one skilled in the art, as evidenced by Crockett.
Further, the combination of Zhang and Crockett does not distinctly disclose:
preparation of a data set by assigning a median value to variables having missing values wherein the variables with missing values are within a threshold level.
However, Ray teaches that median imputation is a method to fill in missing values with estimated ones by “replacing the missing data for a given attribute by the mean or median (quantitative attribute) …of all known values of that variable” (¶2 of “Methods to treat Missing Values”).  Therefore, Ray teaches preparing a data set by using a median value for one or more missing values of variables.
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to assign a median value (i.e., median imputation) to variables having missing values from the data set of the combination of Zhang and Crockett before performing random sampling in order to estimate missing values by known methods, as evidenced by Ray.
The combination of Zhang, Crockett and Ray does not distinctly disclose 
a number of missing values of the variables is within a threshold level.
However, Kala teaches in [0036] that “the data preprocessing engine 201 may automatically use imputation methods to fill the missing values on the attributes with less than the threshold missing values. The data preprocessing engine 201 may prompt the user to upload cleaner data for the attributes where the missing values are more than the threshold value.”
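Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to use median imputation on attributes with less than a threshold number of missing values in the combination of Zhang, Crockett and Ray as defined above since the particular known technique was recognized as part of the ordinary capabilities of one skilled in the art, as evidenced by Kala.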
For claim 19, Zhang as modified by Crockett, Ray and Kala teaches all of the limitations of claim 1 and Crockett further teaches:
the model is validated by decreasing the difference between the TPR and the validated TPR and decreasing the difference between the TNR and the validated TNR iteratively (as understood from performing k-fold validation, [0056]).
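One possible reading of decreasing the difference between the training and validation rates is sketched below for illustration. It is an assumption, not a method stated in Crockett or in the claims: the sketch scans candidate score cutoffs and keeps the one with the smallest summed gap, and the names `tpr_tnr` and `best_cutoff` are hypothetical.

```python
def tpr_tnr(pairs, cutoff):
    """pairs: (score, label) tuples; predict positive when score >= cutoff."""
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)
    pos = sum(1 for _, y in pairs if y == 1)
    neg = len(pairs) - pos
    return (tp / pos if pos else 0.0, tn / neg if neg else 0.0)


def best_cutoff(train, valid, candidates):
    """Pick the candidate with the smallest combined train/validation gap.

    The gap is |TPR_train - TPR_valid| + |TNR_train - TNR_valid|; summing
    the two differences is an illustrative assumption.
    """
    def gap(c):
        tpr_t, tnr_t = tpr_tnr(train, c)
        tpr_v, tnr_v = tpr_tnr(valid, c)
        return abs(tpr_t - tpr_v) + abs(tnr_t - tnr_v)

    return min(candidates, key=gap)
```

The sketch minimizes only the gap between the two data sets; a practical procedure would presumably also require the rates themselves to be acceptably high.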
Claims 2, 3, 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang, Crockett, Ray, Kala and Yahalom et al (US 2012/0202226).
For claims 2 and 11, Zhang as modified by Crockett, Ray and Kala teaches all of the limitations of claims 1 and 10 as cited above, but does not distinctly disclose:
$$\mathrm{TPR} = \frac{\text{Number of Enquiries Predicted as Potential Orders (i.e., True Positives)}}{\text{Number of Enquiries Actually Converted to Orders (i.e., True Positives + False Negatives)}}$$
However, Yahalom teaches in [0129]-[0130] that sensitivity (i.e., true positive rate) is defined as:
$$\mathrm{TPR} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}}$$
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to calculate TPR in the combination of Zhang, Crockett, Ray and Kala as taught by Yahalom since it was recognized as part of the ordinary capabilities of one skilled in the art, as evidenced by Yahalom.
For claims 3 and 12, Zhang as modified by Crockett, Ray and Kala teaches all of the limitations of claims 1 and 10 as cited above, but does not distinctly disclose:
$$\mathrm{TNR} = \frac{\text{Number of Enquiries Predicted as Potential Drops (i.e., True Negatives)}}{\text{Number of Enquiries Actually Converted to Drops (i.e., True Negatives + False Positives)}}$$
However, Yahalom teaches in [0129]-[0130] that specificity (i.e., true negative rate) is defined as:
$$\mathrm{TNR} = \frac{\text{True Negative}}{\text{True Negative} + \text{False Positive}}$$
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to calculate TNR in the combination of Zhang, Crockett, Ray and Kala as taught by Yahalom since it was recognized as part of the ordinary capabilities of one skilled in the art, as evidenced by Yahalom.
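For illustration only, the sensitivity and specificity definitions attributed to Yahalom can be applied to a hypothetical set of 100 enquiries. The counts below are assumed for the example and appear in none of the cited references.

```latex
\begin{align*}
\text{TP} &= 40, \quad \text{FN} = 10, \quad \text{TN} = 30, \quad \text{FP} = 20 \\
\mathrm{TPR} &= \frac{\text{TP}}{\text{TP} + \text{FN}} = \frac{40}{40 + 10} = 0.80 \\
\mathrm{TNR} &= \frac{\text{TN}}{\text{TN} + \text{FP}} = \frac{30}{30 + 20} = 0.60
\end{align*}
```

Under these assumed counts, 80% of the enquiries that actually converted to orders were predicted as potential orders, and 60% of the enquiries that actually dropped were predicted as potential drops.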
Claims 4, 5, 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang, Crockett, Ray, Kala and Beniwal et al (NPL: “Classification and Feature Selection Techniques in Data Mining”).
For claims 4 and 13, Zhang as modified by Crockett, Ray and Kala teaches all of the limitations of claims 1 and 10 as cited above, but does not distinctly disclose the data set is further filtered to remove variables based on their usefulness.
However, Beniwal teaches data preprocessing and feature selection (see “3. Feature Selection”) which comprises removing irrelevant attributes via filter approach, wherein “a feature relevance score is calculated, and low-scoring features are removed. The subset of features left after feature removal is presented as input to the classification algorithm” (2nd ¶ of “3. Feature Selection”).
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to use Beniwal’s filter approach to remove variables in the data set of the combination of Zhang, Crockett, Ray and Kala in order to avoid overfitting, improve model performance and to provide faster and more cost-effective models (see 1st ¶ of “3. Feature Selection”).
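A filter approach of the kind quoted from Beniwal (score each feature, remove the low-scoring ones) can be sketched as follows. The sketch is illustrative only: using the absolute Pearson correlation with the target as the relevance score, the 0.3 cutoff and the function names are assumptions, not details from Beniwal.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def filter_features(columns, target, min_score=0.3):
    """columns: dict mapping feature name -> list of numeric values.

    Keeps features whose relevance score (here |r| with the target, an
    assumed choice of score) is at least min_score.
    Returns (kept_names, scores).
    """
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    kept = [name for name, s in scores.items() if s >= min_score]
    return kept, scores
```

Only the kept features would then be presented as input to the classification algorithm.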
For claims 5 and 14, Zhang as modified by Crockett, Ray and Kala teaches all of the limitations of claims 1 and 10 as cited above, but does not distinctly disclose identifying independent variables and dependent variables.
However, Beniwal teaches data preprocessing and feature selection (see “3. Feature Selection”) using a wrapper approach which requires identifying independent variables and dependent variables (“advantages of wrapper approaches include…the ability to take into account feature dependencies”, see “3. Feature Selection”).
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to use Beniwal’s wrapper approach to remove variables in the data set of the combination of Zhang, Crockett, Ray and Kala in order to avoid overfitting, improve model performance and to provide faster and more cost-effective models (see 1st ¶ of “3. Feature Selection”).
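For comparison with the filter approach, a wrapper approach evaluates candidate feature subsets by running a classifier on each subset, which is how it can take feature dependencies into account. The sketch below is illustrative only: greedy forward selection, the nearest-centroid classifier, leave-one-out accuracy and the function names are all assumptions rather than details from Beniwal.

```python
def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    rows: list of dicts of numeric values; features: names to use.
    """
    correct = 0
    for i, row in enumerate(rows):
        centroids = {}
        for cls in set(labels):
            members = [rows[j] for j in range(len(rows))
                       if j != i and labels[j] == cls]
            if members:
                centroids[cls] = [sum(m[f] for m in members) / len(members)
                                  for f in features]
        point = [row[f] for f in features]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == labels[i]
    return correct / len(rows)


def forward_select(rows, labels, candidates):
    """Greedily add the feature that most improves accuracy; stop when
    no remaining feature improves it. Returns (chosen, best_accuracy)."""
    chosen, best = [], 0.0
    while True:
        pick = None
        for f in candidates:
            if f in chosen:
                continue
            score = loo_accuracy(rows, labels, chosen + [f])
            if score > best:
                best, pick = score, f
        if pick is None:
            return chosen, best
        chosen.append(pick)
```

Here the label list plays the role of the dependent variable and the candidate features play the role of independent variables.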
Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang, Crockett, Ray, Kala, Beniwal and Chen (NPL: “Advances in Knowledge Discovery and Data Mining”).
For claims 6 and 15, Zhang as modified by Crockett, Ray and Kala teaches all of the limitations of claim 1 as cited above and Zhang as modified by Crockett, Ray, Kala and Beniwal teaches all of the limitations of claim 14 above, but do not distinctly disclose the limitations of claims 6 and 15, respectively.
However, Beniwal teaches:
creating new variables (by correcting errors, see Data Cleaning of “2. Data Preprocessing”) and dummy variables; and
grouping the variables based on their conversion levels (see Discretization of “2. Data Preprocessing”).
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to use Beniwal’s data preprocessing on the data set of the combination of Zhang, Crockett, Ray and Kala in order to clean the data (see Data Cleaning of “2. Data Preprocessing”) and improve comprehensibility (see Discretization of “2. Data Preprocessing”).
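The two preprocessing steps relied upon (creating dummy variables and grouping values into levels by discretization) can be sketched as follows. This is a minimal illustration, not code from Beniwal: the function names, the `is_` prefix, and the bin edges and labels in the usage note are assumptions.

```python
from bisect import bisect_right


def dummy_encode(values):
    """One-hot encode a categorical column into 0/1 dummy variables."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


def discretize(values, edges, labels):
    """Group numeric values into labelled levels.

    edges must be sorted, and labels must have len(edges) + 1 entries.
    A value equal to an edge falls into the higher bin.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("labels must have one more entry than edges")
    return [labels[bisect_right(edges, v)] for v in values]
```

For example, `discretize([0.05, 0.2, 0.5], [0.1, 0.3], ["low", "medium", "high"])` groups three assumed conversion rates into three levels.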
The combination of Zhang, Crockett, Ray, Kala and Beniwal does not distinctly disclose:
ignoring variables with missing values up to a threshold percentage level.
However, Chen teaches selecting variables which includes: ignoring variables with missing values up to a threshold percentage level (see 1st ¶ of “I. Wisconsin Breast Cancer Diagnostic Data Set”, page 98).
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to ignore variables with missing values up to a threshold percentage level within the data set of the combination of Zhang, Crockett, Ray, Kala and Beniwal in order to increase performance and/or accuracy as evidenced by Chen (see “I. Wisconsin Breast Cancer Diagnostic Data Set”, pages 98-100).
Claims 7-9 and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang, Crockett, Ray, Kala and Pinto et al (US 2005/0234753).
For claims 7 and 16
However, Pinto teaches that the validation of the model further comprises: 
determining a model from a plurality of models using a lift chart (see [0003]); and 
performing a concordance test to validate the model (see [0198]).
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to incorporate a graphical user interface which displays a lift chart and concordance information to the combination of Zhang, Crockett, Ray and Kala as defined above to allow an “analyst [to use] his experience and skill to create a custom model using available model building software applied to currently available data” (see [0002]).
For claims 8 and 17, Zhang as modified by Crockett, Ray, Kala and Pinto teaches all of the limitations of claims 7 and 16 as cited above and Pinto further teaches:
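For illustration, the quantities behind a lift chart and a concordance test can be computed as sketched below. The definitions used are common ones and are assumptions here, not taken from Pinto: lift is each bucket's response rate divided by the overall rate, and concordance is the share of (positive, negative) pairs in which the positive record has the higher score, with ties counted as half. The function names are hypothetical.

```python
def lift_by_bucket(scored, buckets=10):
    """scored: list of (score, label) with label 1 or 0.

    Returns the lift of each bucket, highest-scoring bucket first.
    Assumes at least `buckets` records and at least one positive label.
    """
    ranked = sorted(scored, key=lambda p: p[0], reverse=True)
    base_rate = sum(y for _, y in ranked) / len(ranked)
    size = len(ranked) / buckets
    lifts = []
    for b in range(buckets):
        chunk = ranked[int(b * size):int((b + 1) * size)]
        rate = sum(y for _, y in chunk) / len(chunk)
        lifts.append(rate / base_rate)
    return lifts


def concordance(scored):
    """Share of (positive, negative) pairs ranked correctly by score.

    Assumes at least one positive and one negative record.
    """
    pos = [s for s, y in scored if y == 1]
    neg = [s for s, y in scored if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A model whose top buckets show lift well above 1 and whose concordance is well above 0.5 ranks likely conversions ahead of likely drops, which is the comparison a lift chart lets an analyst make across candidate models.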
succeeding the validation of the built model:
prediction of lead scores for the data set (via “Select Validation Dataset” of Fig. 25A, see [0205]); and 
tracking the predicted lead scores against actual data (see Figures 25A-C and [0032]).
Before the effective filing date of the invention it would have been obvious to one of ordinary skill in the art to predict lead scores and track lead scores in the combination of Zhang, Crockett, Ray, and Kala as taught by Pinto in order to allow a user to generate, update, change, review and deploy models at a low cost with better results (see [0035]).
For claims 9 and 18
building a new model (via Validate model, Figure 25A) or updating the built model (via Reconsider Model, Figure 25A) based on a determination if the predicted lead scores are below a threshold (as determined by Figures 25B-C).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL CALRISSIAN PUENTES whose telephone number is (571)270-5070.  The examiner can normally be reached on M-F 9-6:30 (flex).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bob Pascal, can be reached at 571-272-1769.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 






/DANIEL C PUENTES/Primary Examiner, Art Unit 2849