DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 09/14/2018 was filed.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 8-14, and 16-22 are rejected under 35 U.S.C. 103 as being unpatentable over Drissi (US PAT: 6728689, Filed Date: Nov. 14, 2000, hereinafter “Drissi”) in view of Chu (US PGPUB: 20130226842, Filed Date: Apr. 12, 2012, hereinafter “Chu”) and in view of Dirac (US PGPUB: 20150379430, Filed Date: Dec. 12, 2014, hereinafter “Dirac”).
Regarding independent claim 1, Drissi teaches:
a non-transitory computer readable storage medium storing thereon a dataset having data from which a machine learning model is buildable; (Drissi – [Col. ll. 12-26] FIG. 3 illustrates an exemplary table from the domain dataset 300 that includes training examples, each labeled with a specific class. As previously indicated, the domain dataset 300 contains a record for each object and indicates the class associated with each object. The domain dataset 300 maintains a plurality of records, such as records 305 through 320, each associated with a different object. For each object, the domain dataset 300 indicates a number of features in fields 350 through 365, describing each object in the dataset. The last field 370 corresponds to the class assigned to each object.)
processing resources including at least one hardware processor operably coupled to a memory, the processing resources being configured to execute instructions to perform functionality comprising:  (Drissi – [Col. 4 ll. 48-65] FIG. 1 is a schematic block diagram showing the architecture of an illustrative data classification system 100 in accordance with the present invention. The data classification system 100 may be embodied as a general purpose computing system, such as the general purpose computing system shown in FIG. 1. The data classification system 100 includes a processor 110 and related memory, such as a data storage device 120, which may be distributed or local. The processor 110 may be embodied as a single processor, or a number of local or distributed processors operating in parallel. The data storage device 120 and/or a read only memory (ROM) are operable to store one or more instructions, which the processor 110 is operable to retrieve, interpret and execute. As shown in FIG. 1, the data classification system 100 optionally includes a connection to a computer network (not shown))
accessing at least a portion of the dataset; (Drissi – [Col. 5 ll. 35-41] FIG. 2 provides a global view of the data classification system 100. A domain dataset 300, discussed below in conjunction with FIG. 3, serves as input to the system 100. The domain dataset 300 is applied to a self-adaptive learning process 900. That is, system 100 receives/accesses the domain dataset 300.)
for each of a plurality of independent variables in the accessed portion of the dataset: generating meta-features for the respective independent variable; (Drissi – [Col. 5 ll. 55-60] The meta-feature generation process 600 executed during step 240 represents the domain dataset 300 as a set of meta-features. [Col. 7 ll. 28-47] FIG. 6 is a flow chart describing the meta-feature generation process 600. As previously indicated, the meta-feature generation process 600 processes each set of domain data to represent the domain as a set of meta-features. As shown in FIG. 6, the meta-feature generation process 600 initially processes the domain dataset 300 during step 610 to store the information in a table. Thereafter, the meta-feature generation process 600 extracts statistics from the dataset 300 during step 620 that are then used to generate meta-features during step 630. In Drissi, the independent variables are the feature columns (355, 360, 365, … n) in the dataset that are used for generating meta-features.)
providing, as input to at least first and second pre-trained classification models that are different from one another, the generated meta-features for the respective independent variable; (Drissi – [Col. 6 ll. 30-46] FIG. 4 illustrates an exemplary table from the performance dataset 400. As previously indicated, the performance dataset 400 indicates the performance for each model on a domain. The performance dataset 400 maintains a plurality of records, such as records 405 through 415, each associated with a different model. For each model, the performance dataset 400 identifies the domain on which the model was utilized in field 450, as well as the underlying bias embodied in the model in field 455 and the performance assessment in field 460. The meta-features domain associated with different model. Each domain can be identified in field 450, for example, using a vector of meta-features characterizing each domain (as produced by the meta-feature generation process 600). As previously indicated, for each self-adaptive learning algorithm 900-1 and 900-2, there will be a corresponding performance dataset 400-N.)
receiving, as output from the first pre-trained classification model, (Drissi – [Col. 9 ll. 51-56] As shown in FIG. 10, the process begins during step 1005 by characterizing an input domain dataset 300 according to a set of meta-features, using the meta-feature generation process 600 (FIG. 6). The self-adaptive learning algorithms 900-1 and 900-2 select a corresponding model during step 1015, in the manner described above in conjunction with FIG. 9. Model A is the first model.)
and receiving, as output from the second pre-trained classification model, (Drissi – [Col. 9 ll. 51-56] As shown in FIG. 10, the process begins during step 1005 by characterizing an input domain dataset 300 according to a set of meta-features, using the meta-feature generation process 600 (FIG. 6). The self-adaptive learning algorithms 900-1 and 900-2 select a corresponding model during step 1015, in the manner described above in conjunction with FIG. 9. Model B is the second model.)
Drissi does not explicitly teach: an indication of one or more missing value imputation operations appropriate for the respective independent variable; an indication of one or more other preprocessing data cleansing related operations appropriate for the respective independent variable;
However, Chu teaches: receiving, as output from the first pre-trained classification model, an indication of one or more missing value imputation operations appropriate for the respective independent variable; (Chu − [0037-0038] The imputation strategy may be defined according to how the completed data set (for all possible predictor variables) would be generated. One possible strategy is to impute the missing value for each of the one or more predictor variables by the mean of K predicted values for the continuous predictor variable from the final ensemble model and by the mode of K predicted values for the categorical predictor variable. Another possible strategy is to impute the missing value for each of the one or more predictor variables by the predicted value from a randomly selected imputation model out of K models for the predictor variable to be imputed. FIG. 4 illustrates, in a flow diagram, further details of missing value imputation for large and distributed data sources in accordance with certain embodiments. [0046] FIG. 5 illustrates, in a flow diagram, processing to build one or more imputation models in accordance with certain embodiments. FIG. 5 is formed by FIGS. 5A and 5B. [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). Chu teaches building models by filling in the missing data of continuous variables.)
and receiving, as output from the second pre-trained classification model, an indication of one or more other preprocessing data cleansing related operations appropriate for the respective independent variable; (Chu − [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). [0070] In block 512, the missing value imputation system 110 determines whether the measured level of the target variable Y is continuous. If so, processing continues to block 514, otherwise, processing continues to block 518. [0071-0072] In block 514, the missing value imputation system 110 collects statistics, for each category of the predictor variable X, including: (1) a mean of the target variable Y and (2) a variance of the target variable Y. In block 516, the missing value imputation system 110 builds one or more minimum z-score category imputation models using the statistics collected in block 514. For a categorical predictor variable with a continuous target variable, the mean and variance of a target variable in each category of the predictor variable are collected to represent the target variable's distribution in the corresponding category. Chu teaches building models by filling in the missing data of categorical variables.)
transforming the data in the dataset by selectively applying to the data the one or more missing value imputation operations and the one or more other preprocessing data cleansing-related operations, in accordance with the independent variables associated with the data; (Chu − [0049] In block 506, the missing value imputation system 110 builds one or more piecewise linear regression imputation models using the statistics collected in block 504. In block 516, the missing value imputation system 110 builds one or more minimum z-score category imputation models using the statistics collected in block 514.)
building the machine learning model based on the transformed data; (Chu − [0007] Multiple imputation builds imputation models for a variable that has missing values on other variables. The imputation model is a linear or logistic regression model for a continuous or categorical variable that has missing values, respectively. Multiple imputation imputes multiple, complete data sets by its imputation process. [0049] In block 506, the missing value imputation system 110 builds one or more piecewise linear regression imputation models using the statistics collected in block 504. In block 516, the missing value imputation system 110 builds one or more minimum z-score category imputation models using the statistics collected in block 514.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi and Chu, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Chu provides Drissi with the tuning of a model having missing value imputation in a dataset, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
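Drissi does not explicitly teach: an electronic computer-mediated interface configured to receive a query processable in connection with a machine learning model; and enabling queries received over the electronic interface to be processed using the built machine learning model.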
However, Dirac teaches: an electronic computer-mediated interface configured to receive a query processable in connection with a machine learning model; (Dirac − [0086] According to some embodiments, a number of different types of entities related to machine learning tasks may be generated, modified, read, executed, and/or queried/searched via MLS programmatic interfaces.)
and enabling queries received over the electronic interface to be processed using the built machine learning model. (Dirac − [0069] FIG. 63 illustrates an example view of results of an evaluation run of a binary classification model that may be provided via an interactive graphical interface, according to at least some embodiments.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
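For illustration only, the two imputation strategies quoted above from Chu [0037-0038] can be sketched as follows. This is a minimal sketch under stated assumptions, not code from any cited reference: the function name `impute_from_ensemble`, its parameters, and the strategy labels are hypothetical, and the K predicted values are assumed to have already been produced by K imputation models.

```python
import random
from statistics import mean, mode


def impute_from_ensemble(predictions, continuous, strategy="aggregate", rng=None):
    """Pick one imputed value from K model predictions for a single missing entry.

    predictions: list of K predicted values, one per imputation model (assumed given).
    continuous:  True for a continuous predictor, False for a categorical one.
    strategy:    "aggregate" -> mean (continuous) or mode (categorical) of the K values;
                 "random"    -> value from one randomly selected model out of the K.
    """
    if not predictions:
        raise ValueError("need at least one predicted value")
    if strategy == "random":
        rng = rng or random.Random()
        return rng.choice(predictions)
    if strategy != "aggregate":
        raise ValueError(f"unknown strategy: {strategy!r}")
    return mean(predictions) if continuous else mode(predictions)


if __name__ == "__main__":
    print(impute_from_ensemble([1.0, 2.0, 3.0], continuous=True))
    print(impute_from_ensemble(["red", "blue", "red"], continuous=False))
```

The aggregate branch corresponds to the first quoted strategy and the random branch to the second; how the K models themselves are built is outside this sketch.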
Regarding dependent claim 2, the combination of Drissi, Chu, and Dirac discloses all the features with respect to claim 1 as outlined above.
Drissi teaches: wherein the dataset is a database and the data thereof is stored in a tabular structure of the database. (Drissi − The domain dataset 300 maintains a plurality of records, such as records 305 through 320, each associated with a different object. For each object, the domain dataset 300 indicates a number of features in fields 350 through 365, describing each object in the dataset.)
Regarding dependent claim 3, the combination of Drissi, Chu, and Dirac discloses all the features with respect to claim 2 as outlined above.
Drissi teaches: wherein the independent N variables correspond to different columns in the database. (Drissi − [Col. 6 ll. 12-30] The domain dataset 300 maintains a plurality of records, such as records 305 through 320, each associated with a different object. For each object, the domain dataset 300 indicates a number of features in fields 350 through 365, describing each object in the dataset.)
Regarding dependent claim 4, the combination of Drissi, Chu, and Dirac discloses all the features with respect to claim 3 as outlined above.
Drissi teaches: wherein all columns in the database are treated as independent variables, except for a column including data of a type on which predictions are to be made (Drissi – [Col. 6 ll. 12-30] The domain dataset 300 maintains a plurality of records, such as records 305 through 320, each associated with a different object. For each object, the domain dataset 300 indicates a number of features in fields 350 through 365, describing each object in the dataset. The last field 370 corresponds to the class assigned to each object.)
Drissi does not explicitly teach: in response to queries received over the electronic interface.
However, Dirac teaches: in response to queries received over the electronic interface. (Dirac − [0086] According to some embodiments, a number of different types of entities related to machine learning tasks may be generated, modified, read, executed, and/or queried/searched via MLS programmatic interfaces.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
Regarding dependent claim 5, the combination of Drissi, Chu, and Dirac discloses all the features with respect to claim 1 as outlined above.
Drissi teaches: wherein the generated meta-features for a given independent variable include basic statistics for the data associated with that independent variable. (Drissi – [Col. 7 ll. 28-40] As shown in FIG. 6, the meta-feature generation process 600 initially processes the domain dataset 300 during step 610 to store the information in a table. Thereafter, the meta-feature generation process 600 extracts statistics from the dataset 300 during step 620 that are then used to generate meta-features during step 630.)
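For illustration only, per-variable meta-features built from basic statistics, as recited in claim 5 and as described at a high level in the quoted Drissi passage (statistics extracted from the dataset and then used to generate meta-features), might look like the sketch below. The specific statistics chosen, the dictionary keys, and the function names `column_meta_features` and `dataset_meta_features` are assumptions made for this example and are not taken from Drissi or from the application.

```python
from statistics import mean, pstdev


def column_meta_features(values):
    """Compute basic statistics for one column (one independent variable).

    Missing entries are assumed to be represented as None.
    """
    present = [v for v in values if v is not None]
    numeric = [v for v in present
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    features = {
        "count": len(values),
        "missing_fraction": (len(values) - len(present)) / len(values) if values else 0.0,
        "distinct": len(set(present)),
        "is_numeric": bool(present) and len(numeric) == len(present),
    }
    if features["is_numeric"]:
        features["mean"] = mean(numeric)
        features["std"] = pstdev(numeric)
        features["min"] = min(numeric)
        features["max"] = max(numeric)
    return features


def dataset_meta_features(rows, target):
    """Treat every column except the target (class) column as an independent variable."""
    columns = [c for c in rows[0] if c != target]
    return {c: column_meta_features([r.get(c) for r in rows]) for c in columns}


if __name__ == "__main__":
    rows = [
        {"age": 30, "city": "A", "label": 1},
        {"age": None, "city": "B", "label": 0},
        {"age": 50, "city": "A", "label": 1},
    ]
    for name, feats in dataset_meta_features(rows, target="label").items():
        print(name, feats)
```

Excluding the target column mirrors the treatment in claim 4, where all columns other than the predicted column are treated as independent variables.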
Regarding dependent claim 6, the combination of Drissi, Chu, and Dirac discloses all the features with respect to claim 1 as outlined above.
Drissi does not explicitly teach: wherein the generated meta-features for a given independent variable include an indication as to whether a seeming numerical variable likely is a categorical variable.
However, Chu teaches: wherein the generated meta-features for a given independent variable include an indication as to whether a seeming numerical variable likely is a categorical variable. (Chu − [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). The continuous (numerical) variable path is shown in the flowchart of FIG. 5A, and the categorical variable path is shown in the flowchart of FIG. 5B.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
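For illustration only, one way a meta-feature could indicate that a seemingly numerical variable is likely categorical (the limitation of claim 6) is a simple cardinality heuristic. The sketch below is an assumption made for this example; neither Chu nor Drissi is quoted as describing this heuristic, and the function name `likely_categorical` and the thresholds `max_distinct` and `max_ratio` are hypothetical.

```python
def likely_categorical(values, max_distinct=10, max_ratio=0.05):
    """Heuristic flag: a numeric-looking column is likely categorical when all
    values are whole numbers and there are few distinct values relative to rows.

    Missing entries are assumed to be represented as None.
    """
    present = [v for v in values if v is not None]
    if not present:
        return False
    # Only numeric-looking columns are candidates for this flag.
    if not all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return False
    # Fractional values suggest a genuinely continuous measurement.
    if not all(float(v).is_integer() for v in present):
        return False
    distinct = len(set(present))
    return distinct <= max_distinct and distinct / len(present) <= max_ratio


if __name__ == "__main__":
    print(likely_categorical([1, 2, 3] * 50))      # coded categories
    print(likely_categorical(list(range(200))))    # identifier-like or continuous
```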
Regarding dependent claim 8, the combination of Drissi, Chu, and Dirac discloses all the features with respect to claim 1 as outlined above.
Drissi teaches: wherein the first and/or second pre-trained classification models is/are able to generate output indicating that no operations are appropriate for a given independent variable. (Drissi – [Col. 9 ll. 51-56] As shown in FIG. 10, the process begins during step 1005 by characterizing an input domain dataset 300 according to a set of meta-features, using the meta-feature generation process 600 (FIG. 6). The self-adaptive learning algorithms 900-1 and 900-2 select a corresponding model during step 1015, in the manner described above in conjunction with FIG. 9. Model A is the first model and Model B is the second model.)
Regarding dependent claim 9, the combination of Drissi, Chu, and Dirac discloses all the features with respect to claim 1 as outlined above.
Drissi teaches: wherein the first and second pre-trained classification models are generated independently from one another but are based on a common set of meta-features generated from at least one training dataset. (Drissi – [Col. 6 ll. 30-46] FIG. 4 illustrates an exemplary table from the performance dataset 400. As previously indicated, the performance dataset 400 indicates the performance for each model on a domain. The performance dataset 400 maintains a plurality of records, such as records 405 through 415, each associated with a different model. For each model, the performance dataset 400 identifies the domain on which the model was utilized in field 450, as well as the underlying bias embodied in the model in field 455 and the performance assessment in field 460. The meta-features domain associated with different model. Each domain can be identified in field 450, for example, using a vector of meta-features characterizing each domain (as produced by the meta-feature generation process 600). As previously indicated, for each self-adaptive learning algorithm 900-1 and 900-2, there will be a corresponding performance dataset 400-N.)
Regarding dependent claim 10, the combination of Drissi, Chu, and Dirac discloses all the features with respect to claim 9 as outlined above.
Drissi teaches: wherein the at least one training dataset is different from the dataset stored on the non-transitory computer readable storage medium. (Drissi – [Col. 6 ll. 30-46] FIG. 4 illustrates an exemplary table from the performance dataset 400. As previously indicated, the performance dataset 400 indicates the performance for each model on a domain. The performance dataset 400 maintains a plurality of records, such as records 405 through 415, each associated with a different model. For each model, the performance dataset 400 identifies the domain on which the model was utilized in field 450, as well as the underlying bias embodied in the model in field 455 and the performance assessment in field 460. The meta-features domain associated with different model. Each domain can be identified in field 450, for example, using a vector of meta-features characterizing each domain (as produced by the meta-feature generation process 600). As previously indicated, for each self-adaptive learning algorithm 900-1 and 900-2, there will be a corresponding performance dataset 400-N.)
Regarding dependent claim 11, the combination of Drissi, Chu, and Dirac discloses all the features with respect to claim 9 as outlined above.
Drissi does not explicitly teach: wherein independent variables in the at least one training dataset have one or more missing value imputation operations and one or more other preprocessing data cleansing-related operations, manually assigned thereto.
However, Chu teaches: wherein independent variables in the at least one training dataset have one or more missing value imputation operations and one or more other preprocessing data cleansing-related operations, manually assigned thereto. (Chu − [0037-0038] The imputation strategy may be defined according to how the completed data set (for all possible predictor variables) would be generated. One possible strategy is to impute the missing value for each of the one or more predictor variables by the mean of K predicted values for the continuous predictor variable from the final ensemble model and by the mode of K predicted values for the categorical predictor variable. Another possible strategy is to impute the missing value for each of the one or more predictor variables by the predicted value from a randomly selected imputation model out of K models for the predictor variable to be imputed. FIG. 4 illustrates, in a flow diagram, further details of missing value imputation for large and distributed data sources in accordance with certain embodiments. [0046] FIG. 5 illustrates, in a flow diagram, processing to build one or more imputation models in accordance with certain embodiments. FIG. 5 is formed by FIGS. 5A and 5B. [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). Chu teaches building models by filling in the missing data of continuous variables.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
Regarding independent claim 12, Drissi teaches: A method of configuring a machine learning system, the method comprising: 
accessing at least a portion of a dataset having data from which a machine learning model is buildable; (Drissi – [Col. ll. 12-26] FIG. 3 illustrates an exemplary table from the domain dataset 300 that includes training examples, each labeled with a specific class. As previously indicated, the domain dataset 300 contains a record for each object and indicates the class associated with each object. The domain dataset 300 maintains a plurality of records, such as records 305 through 320, each associated with a different object. For each object, the domain dataset 300 indicates a number of features in fields 350 through 365, describing each object in the dataset. The last field 370 corresponds to the class assigned to each object. [Col. 5 ll. 35-41] FIG. 2 provides a global view of the data classification system 100. A domain dataset 300, discussed below in conjunction with FIG. 3, serves as input to the system 100. The domain dataset 300 is applied to a self-adaptive learning process 900. That is, system 100 receives/accesses the domain dataset 300.)
for each of a plurality of independent variables in the accessed portion of the dataset, and using at least one processor: generating meta-features for the respective independent variable; (Drissi – [Col. 5 ll. 55-60] The meta-feature generation process 600 executed during step 240 represents the domain dataset 300 as a set of meta-features. [Col. 7 ll. 28-47] FIG. 6 is a flow chart describing the meta-feature generation process 600. As previously indicated, the meta-feature generation process 600 processes each set of domain data to represent the domain as a set of meta-features. As shown in FIG. 6, the meta-feature generation process 600 initially processes the domain dataset 300 during step 610 to store the information in a table. Thereafter, the meta-feature generation process 600 extracts statistics from the dataset 300 during step 620 that are then used to generate meta-features during step 630. In Drissi, the independent variables are the feature columns (355, 360, 365, … n) in the dataset that are used for generating meta-features.)
providing, as input to at least first and second pre-trained classification models that are different from one another, the generated meta-features for the respective independent variable; (Drissi – [Col. 6 ll. 30-46] FIG. 4 illustrates an exemplary table from the performance dataset 400. As previously indicated, the performance dataset 400 indicates the performance for each model on a domain. The performance dataset 400 maintains a plurality of records, such as records 405 through 415, each associated with a different model. For each model, the performance dataset 400 identifies the domain on which the model was utilized in field 450, as well as the underlying bias embodied in the model in field 455 and the performance assessment in field 460. The meta-features domain associated with different model. Each domain can be identified in field 450, for example, using a vector of meta-features characterizing each domain (as produced by the meta-feature generation process 600). As previously indicated, for each self-adaptive learning algorithm 900-1 and 900-2, there will be a corresponding performance dataset 400-N.)
receiving, as output from the first pre-trained classification model, (Drissi – [Col. 9 ll. 51-56] As shown in FIG. 10, the process begins during step 1005 by characterizing an input domain dataset 300 according to a set of meta-features, using the meta-feature generation process 600 (FIG. 6). The self-adaptive learning algorithms 900-1 and 900-2 select a corresponding model during step 1015, in the manner described above in conjunction with FIG. 9. Model A is the first model.)
and receiving, as output from the second pre-trained classification model, (Drissi – [Col. 9 ll. 51-56] As shown in FIG. 10, the process begins during step 1005 by characterizing an input domain dataset 300 according to a set of meta-features, using the meta-feature generation process 600 (FIG. 6). The self-adaptive learning algorithms 900-1 and 900-2 select a corresponding model during step 1015, in the manner described above in conjunction with FIG. 9. Model B is the second model.)
Drissi does not explicitly teach: an indication of one or more missing value imputation operations appropriate for the respective independent variable; an indication of one or more other preprocessing data cleansing related operations appropriate for the respective independent variable;
However, Chu teaches: receiving, as output from the first pre-trained classification model, an indication of one or more missing value imputation operations appropriate for the respective independent variable; (Chu − [0037-0038] The imputation strategy may be defined according to how the completed data set (for all possible predictor variables) would be generated. One possible strategy is to impute the missing value for each of the one or more predictor variables by the mean of K predicted values for the continuous predictor variable from the final ensemble model and by the mode of K predicted values for the categorical predictor variable. Another possible strategy is to impute the missing value for each of the one or more predictor variables by the predicted value from a randomly selected imputation model out of K models for the predictor variable to be imputed. FIG. 4 illustrates, in a flow diagram, further details of missing value imputation for large and distributed data sources in accordance with certain embodiments. [0046] FIG. 5 illustrates, in a flow diagram, processing to build one or more imputation models in accordance with certain embodiments. FIG. 5 is formed by FIGS. 5A and 5B. [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). Chu teaches building models by filling in the missing data of continuous variables.)
and receiving, as output from the second pre-trained classification model, an indication of one or more other preprocessing data cleansing related operations appropriate for the respective independent variable; (Chu − [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). [0070] In block 512, the missing value imputation system 110 determines whether the measured level of the target variable Y is continuous. If so, processing continues to block 514, otherwise, processing continues to block 518. [0071-0072] In block 514, the missing value imputation system 110 collects statistics, for each category of the predictor variable X, including: (1) a mean of the target variable Y and (2) a variance of the target variable Y. In block 516, the missing value imputation system 110 builds one or more minimum z-score category imputation models using the statistics collected in block 514. For a categorical predictor variable with a continuous target variable, the mean and variance of a target variable in each category of the predictor variable are collected to represent the target variable's distribution in the corresponding category. Chu teaches building models by filling in the missing data of categorical variables.
transforming the data in the dataset by selectively applying to the data the one or more missing value imputation operations and the one or more other preprocessing data cleansing- related operations, in accordance with the independent variables associated with the data; (Chu − [0049] In block 506, the missing value imputation system 110 builds one or more piecewise linear regression imputation models using the statistics collected in block 504. In block 516, the missing value imputation system 110 builds one or more minimum z-score category imputation models using the statistics collected in block 514.)
building the machine learning model based on the transformed data; (Chu − [0007] Multiple imputation builds imputation models for a variable that has missing values on other variables. The imputation model is a linear or logistic regression model for a continuous or categorical variable that has missing values, respectively. Multiple imputation imputes multiple, complete data sets by its imputation process. [0049] In block 506, the missing value imputation system 110 builds one or more piecewise linear regression imputation models using the statistics collected in block 504. In block 516, the missing value imputation system 110 builds one or more minimum z-score category imputation models using the statistics collected in block 514.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi and Chu, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Chu provides Drissi with the tuning of a model having missing value imputation in a dataset, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
Drissi does not explicitly teach: and enabling queries received over a computer-mediated interface to be processed using the built machine learning model.
However, Dirac teaches: and enabling queries received over a computer-mediated interface to be processed using the built machine learning model. (Dirac − [0069] FIG. 63 illustrates an example view of results of an evaluation run of a binary classification model that may be provided via an interactive graphical interface, according to at least some embodiments.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
Regarding dependent claim 13, Drissi in view of Chu and Dirac discloses all the features with respect to claim 12 as outlined above.
Drissi teaches: wherein the dataset is a database and the data thereof is stored in a tabular structure of the database, and wherein the independent variables correspond to different columns in the database. (Drissi − The domain dataset 300 maintains a plurality of records, such as records 305 through 320, each associated with a different object. For each object, the domain dataset 300 indicates a number of features in fields 350 through 365, describing each object in the dataset.)
Regarding dependent claim 14, Drissi in view of Chu and Dirac discloses all the features with respect to claim 12 as outlined above.
Drissi teaches: wherein the generated meta-features for a given independent variable include (a) basic statistics computed for the data associated with that independent variable, (Drissi – [Col. 7 ll. 28-40] As shown in FIG. 6, the meta-feature generation process 600 initially processes the domain dataset 300 during step 610 to store the information in a table. Thereafter, the meta-feature generation process 600 extracts statistics from the dataset 300 during step 620 that are then used to generate meta-features during step 630.)
Drissi does not explicitly teach: and (b) an indication as to whether a seeming numerical variable likely is a categorical variable. 
However, Chu teaches: and (b) an indication as to whether a seeming numerical variable likely is a categorical variable. (Chu − [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). The continuous (numerical) variable branch is shown in the flowchart of FIG. 5A; the categorical variable branch is shown in the flowchart of FIG. 5B.)
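Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.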
Regarding dependent claim 16, Drissi in view of Chu and Dirac discloses all the features with respect to claim 12 as outlined above.
Drissi teaches: wherein the first and second pre-trained classification models are generated independently from one another but are based on a common set of meta-features generated from at least one training dataset. (Drissi – [Col. 6 ll. 30-46] FIG. 4 illustrates an exemplary table from the performance dataset 400. As previously indicated, the performance dataset 400 indicates the performance for each model on a domain. The performance dataset 400 maintains a plurality of records, such as records 405 through 415, each associated with a different model. For each model, the performance dataset 400 identifies the domain on which the model was utilized in field 450, as well as the underlying bias embodied in the model in field 455 and the performance assessment in field 460. The meta-features characterize the domain associated with each different model. Each domain can be identified in field 450, for example, using a vector of meta-features characterizing each domain (as produced by the meta-feature generation process 600). As previously indicated, for each self-adaptive learning algorithm 900-1 and 900-2, there will be a corresponding performance dataset 400-N.)
Regarding dependent claim 17
Drissi does not explicitly teach: wherein independent variables in the at least one training dataset have one or more missing value imputation operations and one or more other preprocessing data cleansing-related operations, manually assigned thereto.
However, Chu teaches: wherein independent variables in the at least one training dataset have one or more missing value imputation operations and one or more other preprocessing data cleansing-related operations, manually assigned thereto. (Chu − [0037-0038] The imputation strategy may be defined according to how the completed data set (for all possible predictor variables) would be generated. One possible strategy is to impute the missing value for each of the one or more predictor variables by the mean of K predicted values for the continuous predictor variable from the final ensemble model and by the mode of K predicted values for the categorical predictor variable. Another possible strategy is to impute the missing value for each of the one or more predictor variables by the predicted value from a randomly selected imputation model out of K models for the predictor variable to be imputed. FIG. 4 illustrates, in a flow diagram, further details of missing value imputation for large and distributed data sources in accordance with certain embodiments. [0046] FIG. 5 illustrates, in a flow diagram, processing to build one or more imputation models in accordance with certain embodiments. FIG. 5 is formed by FIGS. 5A and 5B. [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). Chu teaches building models by filling in the missing data of continuous variables.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
Regarding independent claim 18, Drissi teaches: A non-transitory computer-readable storage medium tangibly storing a program, when executed by a processor of a computing system, performs instructions comprising: 
accessing at least a portion of a dataset having data from which a machine learning model is buildable; (Drissi – [Col. ll. 12-26] FIG. 3 illustrates an exemplary table from the domain dataset 300 that includes training examples, each labeled with a specific class. As previously indicated, the domain dataset 300 contains a record for each object and indicates the class associated with each object. The domain dataset 300 maintains a plurality of records, such as records 305 through 320, each associated with a different object. For each object, the domain dataset 300 indicates a number of features in fields 350 through 365, describing each object in the dataset. The last field 370 corresponds to the class assigned to each object. [Col. 5 ll. 35-41] FIG. 2 provides a global view of the data classification system 100. A domain dataset 300, discussed below in conjunction with FIG. 3, serves as input to the system 100. The domain dataset 300 is applied to a self-adaptive learning process 900. System 100 is receiving/accessing the domain dataset 300.)
for each of a plurality of independent variables in the accessed portion of the dataset, and using at least one processor: generating meta-features for the respective independent variable; (Drissi – [Col. 5 ll. 55-60] The meta-feature generation process 600 executed during step 240 represents the domain dataset 300 as a set of meta-features. [Col. 7 ll. 28-47] FIG. 6 is a flow chart describing the meta-feature generation process 600. As previously indicated, the meta-feature generation process 600 processes each set of domain data to represent the domain as a set of meta-features. As shown in FIG. 6, the meta-feature generation process 600 initially processes the domain dataset 300 during step 610 to store the information in a table. Thereafter, the meta-feature generation process 600 extracts statistics from the dataset 300 during step 620 that are then used to generate meta-features during step 630. Drissi's independent variables are the feature columns (355, 360, 365, … n) in the dataset that are used for generating meta-features.)
providing, as input to at least first and second pre-trained classification models that are different from one another, the generated meta-features for the respective independent variable; (Drissi – [Col. 6 ll. 30-46] FIG. 4 illustrates an exemplary table from the performance dataset 400. As previously indicated, the performance dataset 400 indicates the performance for each model on a domain. The performance dataset 400 maintains a plurality of records, such as records 405 through 415, each associated with a different model. For each model, the performance dataset 400 identifies the domain on which the model was utilized in field 450, as well as the underlying bias embodied in the model in field 455 and the performance assessment in field 460. The meta-features characterize the domain associated with each different model. Each domain can be identified in field 450, for example, using a vector of meta-features characterizing each domain (as produced by the meta-feature generation process 600). As previously indicated, for each self-adaptive learning algorithm 900-1 and 900-2, there will be a corresponding performance dataset 400-N.)
receiving, as output from the first pre-trained classification model, (Drissi – [Col. 9 ll. 51-56] As shown in FIG. 10, the process begins during step 1005 by characterizing an input domain dataset 300 according to a set of meta-features, using the meta-feature generation process 600 (FIG. 6). The self-adaptive learning algorithms 900-1 and 900-2 select a corresponding model during step 1015, in the manner described above in conjunction with FIG. 9. Model A is the first model.)
and receiving, as output from the second pre-trained classification model, (Drissi – [Col. 9 ll. 51-56] As shown in FIG. 10, the process begins during step 1005 by characterizing an input domain dataset 300 according to a set of meta-features, using the meta-feature generation process 600 (FIG. 6). The self-adaptive learning algorithms 900-1 and 900-2 select a corresponding model during step 1015, in the manner described above in conjunction with FIG. 9. Model B is the second model.)
Drissi does not explicitly teach: an indication of one or more missing value imputation operations appropriate for the respective independent variable; an indication of one or more other preprocessing data cleansing related operations appropriate for the respective independent variable;
However, Chu teaches: receiving, as output from the first pre-trained classification model, an indication of one or more missing value imputation operations appropriate for the respective independent variable; (Chu − [0037-0038] The imputation strategy may be defined according to how the completed data set (for all possible predictor variables) would be generated. One possible strategy is to impute the missing value for each of the one or more predictor variables by the mean of K predicted values for the continuous predictor variable from the final ensemble model and by the mode of K predicted values for the categorical predictor variable. Another possible strategy is to impute the missing value for each of the one or more predictor variables by the predicted value from a randomly selected imputation model out of K models for the predictor variable to be imputed. FIG. 4 illustrates, in a flow diagram, further details of missing value imputation for large and distributed data sources in accordance with certain embodiments. [0046] FIG. 5 illustrates, in a flow diagram, processing to build one or more imputation models in accordance with certain embodiments. FIG. 5 is formed by FIGS. 5A and 5B. [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). Chu teaches building models by filling in the missing data of continuous variables.)
and receiving, as output from the second pre-trained classification model, an indication of one or more other preprocessing data cleansing related operations appropriate for the respective independent variable; (Chu − [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). [0070] In block 512, the missing value imputation system 110 determines whether the measured level of the target variable Y is continuous. If so, processing continues to block 514, otherwise, processing continues to block 518. [0071-0072] In block 514, the missing value imputation system 110 collects statistics, for each category of the predictor variable X, including: (1) a mean of the target variable Y and (2) a variance of the target variable Y. In block 516, the missing value imputation system 110 builds one or more minimum z-score category imputation models using the statistics collected in block 514. For a categorical predictor variable with a continuous target variable, the mean and variance of a target variable in each category of the predictor variable are collected to represent the target variable's distribution in the corresponding category. Chu teaches building models by filling in the missing data of categorical variables.)
transforming the data in the dataset by selectively applying to the data the one or more missing value imputation operations and the one or more other preprocessing data cleansing-related operations, in accordance with the independent variables associated with the data; (Chu − [0049] In block 506, the missing value imputation system 110 builds one or more piecewise linear regression imputation models using the statistics collected in block 504. In block 516, the missing value imputation system 110 builds one or more minimum z-score category imputation models using the statistics collected in block 514.)
building the machine learning model based on the transformed data; (Chu − [0007] Multiple imputation builds imputation models for a variable that has missing values on other variables. The imputation model is a linear or logistic regression model for a continuous or categorical variable that has missing values, respectively. Multiple imputation imputes multiple, complete data sets by its imputation process. [0049] In block 506, the missing value imputation system 110 builds one or more piecewise linear regression imputation models using the statistics collected in block 504. In block 516, the missing value imputation system 110 builds one or more minimum z-score category imputation models using the statistics collected in block 514.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi and Chu, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Chu provides Drissi with the tuning of a model having missing value imputation in a dataset, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
Drissi does not explicitly teach: and enabling queries received over a computer-mediated interface to be processed using the built machine learning model.
However, Dirac teaches: and enabling queries received over a computer-mediated interface to be processed using the built machine learning model. (Dirac − [0069] FIG. 63 illustrates an example view of results of an evaluation run of a binary classification model that may be provided via an interactive graphical interface, according to at least some embodiments.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
Regarding dependent claim 19, Drissi in view of Chu and Dirac discloses all the features with respect to claim 18 as outlined above.
Drissi teaches: wherein the generated meta-features for a given independent variable include (a) basic statistics computed for the data associated with that independent variable, (Drissi – [Col. 7 ll. 28-40] As shown in FIG. 6, the meta-feature generation process 600 initially processes the domain dataset 300 during step 610 to store the information in a table. Thereafter, the meta-feature generation process 600 extracts statistics from the dataset 300 during step 620 that are then used to generate meta-features during step 630.)
Drissi does not explicitly teach: and (b) an indication as to whether a seeming numerical variable likely is a categorical variable. 
However, Chu teaches: and (b) an indication as to whether a seeming numerical variable likely is a categorical variable. (Chu − [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). The continuous (numerical) variable branch is shown in the flowchart of FIG. 5A; the categorical variable branch is shown in the flowchart of FIG. 5B.)
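Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.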
Regarding dependent claim 20, Drissi in view of Chu and Dirac discloses all the features with respect to claim 18 as outlined above.
Drissi teaches: wherein the first and second pre-trained classification models are generated independently from one another but are based on a common set of meta-features generated from at least one training dataset. (Drissi – [Col. 6 ll. 30-46] FIG. 4 illustrates an exemplary table from the performance dataset 400. As previously indicated, the performance dataset 400 indicates the performance for each model on a domain. The performance dataset 400 maintains a plurality of records, such as records 405 through 415, each associated with a different model. For each model, the performance dataset 400 identifies the domain on which the model was utilized in field 450, as well as the underlying bias embodied in the model in field 455 and the performance assessment in field 460. The meta-features characterize the domain associated with each different model. Each domain can be identified in field 450, for example, using a vector of meta-features characterizing each domain (as produced by the meta-feature generation process 600). As previously indicated, for each self-adaptive learning algorithm 900-1 and 900-2, there will be a corresponding performance dataset 400-N.)
Regarding dependent claim 21
Drissi does not explicitly teach: wherein independent variables in the at least one training dataset have one or more missing value imputation operations and one or more other preprocessing data cleansing-related operations, manually assigned thereto.
However, Chu teaches: wherein independent variables in the at least one training dataset have one or more missing value imputation operations and one or more other preprocessing data cleansing-related operations, manually assigned thereto. (Chu − [0037-0038] The imputation strategy may be defined according to how the completed data set (for all possible predictor variables) would be generated. One possible strategy is to impute the missing value for each of the one or more predictor variables by the mean of K predicted values for the continuous predictor variable from the final ensemble model and by the mode of K predicted values for the categorical predictor variable. Another possible strategy is to impute the missing value for each of the one or more predictor variables by the predicted value from a randomly selected imputation model out of K models for the predictor variable to be imputed. FIG. 4 illustrates, in a flow diagram, further details of missing value imputation for large and distributed data sources in accordance with certain embodiments. [0046] FIG. 5 illustrates, in a flow diagram, processing to build one or more imputation models in accordance with certain embodiments. FIG. 5 is formed by FIGS. 5A and 5B. [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). Chu teaches building models by filling in the missing data of continuous variables.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
Regarding dependent claim 22, Drissi in view of Chu and Dirac discloses all the features with respect to claim 20 as outlined above.
Drissi does not explicitly teach: wherein missing value imputation operations are performed on the at least one training dataset prior to generation of the common set of meta-features.
However, Chu teaches: wherein missing value imputation operations are performed on the at least one training dataset prior to generation of the common set of meta-features. (Chu − [0037-0038] The imputation strategy may be defined according to how the completed data set (for all possible predictor variables) would be generated. One possible strategy is to impute the missing value for each of the one or more predictor variables by the mean of K predicted values for the continuous predictor variable from the final ensemble model and by the mode of K predicted values for the categorical predictor variable. Another possible strategy is to impute the missing value for each of the one or more predictor variables by the predicted value from a randomly selected imputation model out of K models for the predictor variable to be imputed. FIG. 4 illustrates, in a flow diagram, further details of missing value imputation for large and distributed data sources in accordance with certain embodiments. [0046] FIG. 5 illustrates, in a flow diagram, processing to build one or more imputation models in accordance with certain embodiments. FIG. 5 is formed by FIGS. 5A and 5B. [0047] Control begins at block 500 with the missing value imputation system 110 determining whether the measured level of the predictor variable X is continuous. If so, processing continues to block 502, otherwise, processing continues to block 512 (FIG. 5B). Chu teaches building models by filling in the missing data of continuous variables.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, and Dirac, as each invention relates to using a learning algorithm for training a model with machine learning techniques. Adding the teaching of Dirac provides Drissi and Chu with an interface for reviewing the output of a newly trained model, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
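For illustration only, the "transforming the data in the dataset by selectively applying" limitation discussed above can be sketched as a per-variable dispatch, in which each independent variable (column) receives the imputation operation indicated for it, for example the mean for a continuous variable and the mode for a categorical variable, as in the strategy quoted from Chu [0037-0038]. The function names, the operation labels "mean" and "mode", and the sample data are hypothetical; this sketch is not taken from the application or from any cited reference.

```python
from statistics import mean, mode

def impute_column(values, operation):
    """Fill None entries of one column using the named operation (hypothetical labels)."""
    observed = [v for v in values if v is not None]
    if operation == "mean":        # assumed choice for a continuous variable
        fill = mean(observed)
    elif operation == "mode":      # assumed choice for a categorical variable
        fill = mode(observed)
    else:
        raise ValueError(f"unknown operation: {operation}")
    return [fill if v is None else v for v in values]

def transform(dataset, operations):
    """Apply the indicated operation to each column; leave other columns unchanged."""
    return {
        name: impute_column(column, operations[name]) if name in operations else list(column)
        for name, column in dataset.items()
    }

# Hypothetical dataset: one continuous and one categorical independent variable.
dataset = {"age": [30, None, 50], "color": ["red", "red", None, "blue"]}
operations = {"age": "mean", "color": "mode"}
transformed = transform(dataset, operations)
```

In this sketch the operations mapping stands in for the indications output by the pre-trained classification models; how those indications are produced is outside the scope of the illustration.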

Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Drissi in view of Chu in view of Dirac, as applied to claims 1-6, 8-14, and 16-22 above, and further in view of Nerurkar (US PGPUB: 20170126694, Filed Date: Jul. 7, 2016, hereinafter “Nerurkar”).
Regarding dependent claim 7, Drissi in view of Chu and Dirac discloses all the features with respect to claim 6 as outlined above.
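Drissi does not explicitly teach: wherein, for a given independent variable, the indication as to whether a seeming numerical variable likely is a categorical variable is based on a determination as to whether a count of the unique data entries thereof divided by the total number of data entries is less than a threshold value.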
However, Nerurkar teaches: wherein, for a given independent variable, the indication as to whether a seeming numerical variable likely is a categorical variable is based on a determination as to whether a count of the unique data entries thereof divided by the total number of data entries is less than a threshold value. (Nerurkar − [0105] the estimate of the category for an entry is determined by applying a cutoff threshold to a numerical. The entry may be classified into category "0" if the associated output is below a cutoff threshold of 0.5, 0.4, or 0.3.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, Dirac, and Nerurkar, as each invention relates to using a learning algorithm for training a model with machine learning techniques, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
Regarding dependent claim 15, Drissi in view of Chu and Dirac discloses all the features with respect to claim 14 as outlined above.
Drissi does not explicitly teach: wherein, for a given independent variable, the indication as to whether a seeming numerical variable likely is a categorical variable is based on a determination as to whether a count of the unique data entries thereof divided by the total number of data entries is less than a threshold value. 
However, Nerurkar teaches: wherein, for a given independent variable, the indication as to whether a seeming numerical variable likely is a categorical variable is based on a determination as to whether a count of the unique data entries thereof divided by the total number of data entries is less than a threshold value. (Nerurkar − [0105] the estimate of the category for an entry is determined by applying a cutoff threshold to a numerical. The entry may be classified into category "0" if the associated output is below a cutoff threshold of 0.5, 0.4, or 0.3.)
Accordingly, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have combined the teachings of Drissi, Chu, Dirac, and Nerurkar, as each invention relates to using a learning algorithm for training a model with machine learning techniques, thereby providing the benefit of improving the accuracy of predictive models that have missing values in the dataset.
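For illustration only, the determination recited in claims 7 and 15 (a count of the unique data entries divided by the total number of data entries, compared against a threshold value) can be sketched as follows. The function name, the default threshold of 0.05, and the sample data are assumptions made for this sketch and are not drawn from the application or from Nerurkar.

```python
from typing import Hashable, Sequence

def looks_categorical(values: Sequence[Hashable], threshold: float = 0.05) -> bool:
    """Return True when (unique entries / total entries) is less than the threshold."""
    if not values:
        return False  # no data: make no claim about the variable
    unique_ratio = len(set(values)) / len(values)
    return unique_ratio < threshold

# Hypothetical columns: numeric codes that repeat heavily vs. distinct measurements.
status_codes = [1, 2, 3] * 40                  # 120 entries, 3 unique, ratio 0.025
readings = [i * 0.5 for i in range(120)]       # 120 entries, all unique, ratio 1.0

codes_flag = looks_categorical(status_codes)
readings_flag = looks_categorical(readings)
```

Under these assumed values, the repeated numeric codes fall below the threshold and are flagged as likely categorical, while the all-distinct readings are not.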

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CARL E BARNES JR whose telephone number is (571)270-3395. The examiner can normally be reached Monday-Friday 9am-3pm, 6pm-9pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Cesar Paula, can be reached on 571-272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

/CARL E BARNES JR/Examiner, Art Unit 2177      

/CESAR B PAULA/Supervisory Patent Examiner, Art Unit 2177