DETAILED ACTION
This action is in response to Applicant’s amendment regarding application number 15/788,795, filed October 19, 2017.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments
The amendment filed February 10, 2022 has been entered. Examiner acknowledges receipt of Amendments to Application 15/788,795, which include: Amendments to the Claims, and Remarks containing Applicant’s amendments. 
Regarding Applicant’s Remarks and Amendments to the Claims, Examiner acknowledges Claims 1, 8, and 15 have been amended, with Claims 3, 5, 10, 12, and 17-18 previously canceled. Claims 1-2, 4, 6-9, 11, 13-16, and 19-26 remain pending in the application. 

Response to Arguments
Examiner acknowledges receipt of Arguments to Application 15/788,795, which include: Remarks containing Applicant’s arguments. 
Regarding Applicant's arguments for Claims 1, 6, 8, 13, and 15 under 35 U.S.C. 103 as being unpatentable over Bradley et al., U.S. PGPUB 2009/0193039, published 7/30/2009 [hereafter referred as Bradley] in view of Hyong, “Doing Data Science and AI with SQL Server”, July 2017 [hereafter referred as Hyong]; for Claims 2, 4, 7, 9, 11, 14, 16, 19-21, 23, and 25-26 under 35 U.S.C. 103 as being unpatentable over Bradley in view of Hyong, in further view of Natekin & Knoll “Gradient boosting machines, a tutorial”, December 4, 2013 [hereafter referred as Natekin], Examiner acknowledges Applicant’s arguments, has considered them, and has found them to be not persuasive. Examiner notes that the Applicant has made considerable amendments to the independent claims, where these amendments necessitate further examination and re-evaluation of the amended and original claims. The updated claim mappings according to the Applicant’s amended claims are provided in the sections indicated below. Examiner further notes that the majority of the Applicant’s prior art arguments are directed to the newly added claim limitations recited in the respective independent claims. However, Examiner has noted that Applicant’s arguments contain a sub-argument regarding certain terms in the independent claims, which will be addressed in the following paragraphs.
Regarding Applicant’s Remarks:
“Hyong is an article that shows how one can use data science and artificial intelligence with an SQL Server. Hyong, page 3. Hyong teaches that "SQL Server 2017 CTP2 added Python support. You now have the best of multiple worlds: the ability to write code in R or Python, leverage the rich set of R and Python libraries for machine learning and deep learning, and consume the predictive models in any application." Hyong, pages 4-5. Hyong further teaches steps to get started with the SQL Server, data science, and artificial intelligence. The steps include "Install SQL Server 2016 or SQL Server 2017 CTP2. When installing SQL Server 2017 CTP2, you select the type of in-database machine learning services that you want to install. You can choose to use R, Python or both. Once SQL Server completes setup, you'll be able to start using R or Python code as part of stored procedures." Hyong, page 6. 
Thus, Hyong teaches that SQL Server 2017 is updated to have support for R and Python code. Hyong explains that the machine learning and deep learning models are written in R and Python code. Since the machine learning models and deep learning models are written in R and Python code, which is supported by SQL Server 2017, there is no conversion of the model’s code from a non-compatible format for SQL into a compatible format for SQL.”
Examiner has considered this argument, and finds the argument to be not persuasive. Examiner notes that the Applicant’s sub-argument is directed to the original claim limitation found in the independent claims: “convert the trained machine learning model in an SQL query”, where the Applicant is arguing that the change from Python or R machine code to SQL query taught in Hyong is not the same as the conversion in the claimed invention, as the conversion taught in Hyong represents a “compatible” conversion. Applicant is reminded that the claims must be given their broadest reasonable interpretation in light of the specification. See MPEP 2111. According to the Merriam-Webster dictionary, the term “compatible” broadly defines something that is designed to work or operate without modification, and the term “convert” broadly defines something that is changed from one form or function to another. Given the above definitions and the context of Applicant’s sub-argument, Applicant appears to argue that the Python/R-to-SQL conversion taught in Hyong represents a conversion that requires no modification (i.e., a native support), which the Examiner finds to be not persuasive. 
Examiner notes that Applicant acknowledges that the Hyong reference teaches that the machine learning models are written in either R or Python code, which represent programming languages that contain support for machine learning/data science applications. Examiner further notes that these programming languages are different from SQL, which is a standardized programming language specifically designed for performing data operations with relational databases. A person having ordinary skill in the art would understand that aside from the different names, the disclosed programming languages (R, Python, and SQL) are normally applied in different programming domains (e.g., Python is a general purpose programming language with machine learning packages/extension to support data science applications; R is typically used for data mining/data science applications; and SQL is used for performing database queries), and thus also exhibit language differences such as different data types and programming syntax, such that a program written in one source programming language (i.e., R or Python) cannot be directly executed on a system that only understands and supports the target programming language (i.e., SQL) without some degree of modification of the target system. The Hyong reference (Hyong p.7 2nd-5th paragraphs; and p.8 1st-2nd paragraphs) further teaches that the SQL Server 2017 requires explicit enabling of SQL external script support (by re-running a “sp_configure” configuration script and restarting the SQL server service) as well as explicit installation of the relevant R or Python libraries to support the new in-database machine learning services, where the explicit re-configuration and installation actions indicate that the SQL Server 2017 does not natively support processing and execution of R and Python machine language code. 
Hence, Applicant’s argument that the conversion from machine learning model code (written in R or Python) into SQL taught in Hyong represents a “compatible” conversion is not persuasive (as this conversion requires considerable re-configurations and installations to modify and update the SQL server to support the R and Python programming languages), and the existing prior art rejection is maintained.
As indicated earlier, Examiner notes that the Applicant has made considerable amendments to the independent claims, where these amendments necessitate further examination and re-evaluation of the amended and original claims, and the remainder of the Applicant’s prior art arguments are directed to the newly added claim limitations recited in the respective independent claims. The updated claim mappings according to the Applicant’s amended claims are provided in the sections indicated below.

Claim Interpretation
Examiner notes that the term “non-compatible” provided in the following claim limitation in amended independent claims 1, 8, and 15 (“convert the trained machine learning model from a computer-readable code that is non-compatible with structured query language (SQL) into an SQL query such that the trained machine learning model can automatically retrieve data from a relational database configured for processing and responding to SQL queries”) does not have an explicit definition in the Applicant’s specification. Applicant’s specification paragraphs [0027]-[0028] only state an exemplary usage of a python library implementation to convert a GBT and/or Random Forest model to SQL code, while also broadly indicating that the query and/or translation of the model may come in other forms: “To add the model into production … an SQL query is introduced which to enables the use of the machine learning model by translating the model into SQL query so that the model may be used in production for calculating and making predictions. In one embodiment, a python library is implemented to convert the machine learning model to legible SQL. For example, in order to run arbitrary Python code on Teradata, a python library may be implemented to convert a GBT and/or Random Forest model to legible SQL code. Therefore, process 400 continues to operation 412 where the model is translated for use with SQL. Process 400 then concludes with the run of the now integrated ML model in production at operation 414. … Note that although process 400 specifies the use of an SQL query and/or SQL code, the query and/or translation of the ML model may come in other forms. …”. 
Hence, for the purposes of examination, the term “non-compatible” will be interpreted as broadly identifying modifications that transform/translate the trained machine learning model code into an SQL query, where the trained machine learning model code can be in a programming language or some other file format that requires one or more transformations/translations into SQL programming language.
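For illustration only, the kind of translation contemplated by this interpretation can be sketched as follows. The sketch is not drawn from Applicant’s specification or from any cited reference; the tree structure, the names `TREES`, `tree_to_sql`, `ensemble_to_sql`, and `score_in_python`, the table `users`, and its columns are all hypothetical. It assumes a small additive tree ensemble held as Python data, renders each tree as a nested SQL CASE expression, sums the expressions in a single SELECT, and runs the resulting query in an in-memory SQLite database.

```python
import sqlite3

# Hypothetical toy ensemble (not taken from the record). Each tree is a nested
# dict: internal nodes carry a column, a threshold, and two children; leaves
# carry a numeric contribution to the final score.
TREES = [
    {"col": "age", "thr": 30,
     "le": {"value": 0.1},
     "gt": {"col": "balance", "thr": 1000,
            "le": {"value": 0.3},
            "gt": {"value": 0.6}}},
    {"col": "balance", "thr": 500,
     "le": {"value": -0.05},
     "gt": {"value": 0.2}},
]


def tree_to_sql(node):
    """Render one tree as a (possibly nested) SQL CASE expression."""
    if "value" in node:
        return repr(float(node["value"]))
    return "CASE WHEN {} <= {} THEN {} ELSE {} END".format(
        node["col"], node["thr"],
        tree_to_sql(node["le"]), tree_to_sql(node["gt"]))


def ensemble_to_sql(trees, table):
    """Sum the per-tree expressions into one SELECT over the given table."""
    total = " + ".join("({})".format(tree_to_sql(t)) for t in trees)
    return "SELECT id, {} AS score FROM {} ORDER BY id".format(total, table)


def score_in_python(trees, row):
    """Reference evaluation of the same ensemble directly in Python."""
    total = 0.0
    for node in trees:
        while "value" not in node:
            node = node["le"] if row[node["col"]] <= node["thr"] else node["gt"]
        total += node["value"]
    return total


def run_demo():
    """Load three hypothetical rows and score them with the generated query."""
    rows = [(1, 25, 2000.0), (2, 40, 800.0), (3, 50, 300.0)]
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, age INTEGER, balance REAL)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    result = conn.execute(ensemble_to_sql(TREES, "users")).fetchall()
    conn.close()
    return rows, result


if __name__ == "__main__":
    print(ensemble_to_sql(TREES, "users"))
    print(run_demo()[1])
```

In this sketch the model begins as Python data that a database cannot execute, and ends as a query that the database evaluates against its own rows, which is the sense of “non-compatible” adopted above.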

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 6-8, 13-15, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Bhattacharjee et al., U.S. PGPUB 2014/0229491, published 8/14/2014 [hereafter referred as Bhattacharjee] in view of Brownlee, Jason, Ensemble Machine Learning Algorithms in Python with scikit-learn, June 3 2016 [hereafter referred as Brownlee], in further view of StackOverflow: Export python scikit learn models into pmml, answered September 26, 2016 at 19:31, with associated Github README.md for SkLearnPMML and JPMML-SkLearn, both dated September 13, 2016 [hereafter referred as StackOverflow].
Regarding amended Claim 1, 
Bhattacharjee teaches
(Currently Amended) A system comprising: 
a non-transitory memory storing instructions (Examiner’s note: Bhattacharjee teaches a computer system with non-transitory computer readable storage medium storing computer instructions to perform the described method and process steps (Bhattacharjee Figure 10, elements 1010, 1015, and 1055; and [0042]-[0043]).); and 
a processor coupled to the non-transitory memory and configured to execute the instructions to cause the system (Examiner’s note: Bhattacharjee teaches a computer system with a processor connected to the non-transitory memory via a computer bus, reading instructions from the memory to perform the coded actions (Bhattacharjee Figure 10, element 1005; and [0042]-[0043]).) to: 
retrieve a training data set associated with historical user account information (Examiner’s note: Bhattacharjee teaches a data source containing historical data records, where the historical data is read from the data source and prepared for analysis to train a model to perform predictions of sales and costs and business trends based on transactional data, which includes predicting a probability that a customer may purchase goods from a supermarket, if a catalog with a list of goods on promotion is regularly sent to a mailbox. A person having ordinary skill in the art would understand that the prepared historical data used to train and store a model performing such a prediction will contain customer-related records containing customer transactional data, and hence represents training data based on user account information (Bhattacharjee [0001]-[0002]: “… large quantities of data can be explored and analyzed … Examples of predictions … forecasting future performance, sales, and costs … trend determination in a business field … Organizations can gain business value by exploring transactional data typically generated within the enterprise or from unstructured data created by external sources (… historical records) …”; Figure 1A and [0019]-[0020]: “… the data model 115 is created by training an algorithm over analyzed data, for example, historical data 105. … the historical data 105 may be read from a data source and be prepared for analysis …”; and [0024]: “… the model 222 may be considered as a reusable component by training an algorithm using historical data and saving the instance. Typically, models may be created to share computed business rules that can be applied to similar data … a model may be created to predict the probability that a customer may purchase goods from a supermarket, if a catalog with a list of goods on promotion is regularly sent to the mailbox.”).); 
preprocess the training data set, the preprocessing including formatting the training data set that enables the use of the formatted training data set in a machine learning model (Examiner’s note: Bhattacharjee teaches preparing the historical data with a data preparation component which checks the data for accuracy and missing fields, filters/extracts the data based on range values or inconsistent or unsuitable data before the historical data is analyzed and the data model is generated. A person having ordinary skill in the art would understand that this data preparation step represents a preprocessing step for the historical data, where these checks and filtering steps represent formatting the historical data records such that they can be applied for analysis and training of a machine learning model (Bhattacharjee Figure 1, element 112; and [0019]-[0020]: “the data model 115 is created by training an algorithm over analyzed data, for example, historical data 105 … a data mining algorithm may be a set of heuristics and calculations that creates a data model … from data. The data mining application 110 may invoke the historical data 105 and apply statistical operations as part of a chosen algorithm and generate the data model 115 … For accurate results, data may need to be prepared and processed before analysis. … the data mining application may include a data preparation component 112 responsible for applying the data preparation steps over the read historical data 105 … data preparation involves checking data for accuracy and missing fields, filtering data based on range values, filtering data to extract inconsistent or unsuitable data, sample the data to investigate a subset of data, manipulating data, etc. …”).) … 
train, using the formatted training data set, the machine learning model (Examiner’s note: As indicated earlier, Bhattacharjee teaches preparing the historical data through a series of checks and filtering steps (representing formatting the historical data records), before using the data to train and generate a data model, where the generated data model can be a decision tree model or classification and regression tree (with each representing a machine learning model) (Bhattacharjee [0019]-[0020]; [0021]: “… Based on the result of such analysis and the defined optimal parameters, the data (mining) model is created. An example of a data model may be a decision tree that predicts an outcome and describes how different criteria affect that outcome …”; and [0023]: “… the algorithm that may be applied over the data may be a Classification aNd Regression (CNR) tree algorithm … An R-CNR Tree 210 model may be generated …”).) … 
convert the trained machine learning model from a computer-readable code that is non-compatible with structured query language (SQL) into an SQL query (Examiner’s note: As indicated earlier, for the purposes of examination, the term “non-compatible” recited in this limitation will be interpreted as broadly identifying modifications that transform/translate the trained machine learning model code into an SQL query, where the trained machine learning model code can be in a programming language or some other file format that requires one or more transformations/translations into SQL programming language. Bhattacharjee teaches a converting module that performs a conversion from a data model into an in-database analysis model, where the trained data model (initially written in an imperative programming language that is not a native database language, such as R, Java, C++, etc.) is first exported into a standardized structured format file (e.g., PMML) and provided to the converting module to generate an object model instance, where this object model instance is defined as a data representation in a programming language that instantiates objects from the predefined object model. Bhattacharjee further teaches this object model instance is further converted by a database converting module, where this database converting module converts the object model instance (representing the predictive analysis data model) into a series of SQL statements (inserted into a database stored procedure) that access each object element (accessed by CE_PROJECTIONS) to retrieve data and execute the object elements directly within database memory to return a prediction result (Bhattacharjee Figure 3A, Figure 3B, elements 325, 330, 333, 335, [0025] and [0028]-[0029]; Figure 6, elements 620, 630, 635, 640, Table 3, and [0034]-[0037]: “… The data model 620 may be transformed in a standard format … and a data model in a standard structural format 630 may be generated. 
The system 600 includes a converting module 635 that receives and converts the data model in the standard structural format 630 into an object model instance 640. … database converting module 645, part of the system 600, may convert the object model predictive analysis application that applies data models to score data sets stored on the database server 650 with the use of in-memory processing. … the database server 650 includes stored procedures (such as the stored procedure 655) to apply analyzes over data by using a defined data model, such as the data model 620. … the in-database analysis model may be defined as a stored procedure written in a database native language. … The body of the procedure may include a sequence of statements that specify a transformation of some data (by means of relational operations such as selection, projection) and binds the result to an output variable of the procedure. … objects from the instantiated object model may be written as a SELECT statement that may return output values defined in the instantiated object model. … Table 3 presents the in-database analysis model which is converted from the data model defined in PMML format in Table 1. … The outputs of all the projections are put into a union, which gives the final result.”).) …
… such that the trained machine learning model can automatically retrieve data from a relational database configured for processing and responding to SQL queries (Examiner’s note: This claim language recites an intended use of the converted trained machine learning model (i.e., for performing retrieval of data from a relational database). As indicated earlier, Bhattacharjee teaches the conversion of a trained machine learning model from a structured format file into a set of SQL statements in a stored procedure, where the SQL statements in the stored procedure are used to perform predictions (scoring) using new data directly within the memory of the database server, without explicitly extracting data from the database system (Bhattacharjee [0027]: “… The generated in-database model 310 may contain the logic for scoring a set of data with a model in the same manner as the data model 305 without extracting the data from the database system. The logic incorporated in the in-database model 310 may be executed on a database level. … Hence, database servers may provide predictive analysis capabilities to score data through dynamically created in-database models converted from pre-existing models. … Scoring new data may be done without the need of historical data for replicating the logic in the data model 305. … processing new data according to the in-database model 310 can be achieved on a database server, as part of a database system.”; Figure 3A, Figure 3B, elements 325, 330, 333, 335, [0025] and [0028]-[0029]; Figure 6, elements 620, 630, 635, 640, Table 3, and [0034]-[0037]).); 
determine that user data associated with a user and stored in the relational database is available (Examiner’s note: This claim limitation broadly recites steps involving performing predictions (scoring) on new data. Bhattacharjee teaches a development environment used for scoring available data directly in a database system, with a graphical interface that allows a user to select (and hence invoke) the corresponding available stored procedures that are present in the database system (Bhattacharjee [0027]; and Figure 9 and [0040]: “… FIG.9 is an exemplary screenshot, depicting an embodiment of a development environment 900 of a client application that may score data with an in-database analysis model within a database system. … the client application may have a modeler perspective comprising a “Navigator” 935 area that lists instances of databases systems, which may be called and available data and functionality may be consumed … such database systems may comprise stored procedure, such as Procedures 940 … The SQL statement 930 … is a statement that calls the execution of the stored procedure … The result is one result set that holds the information about a table and contains the result of a particular table’s output variable …”).);
in response to determining that the user data is available, automatically execute the SQL query in the relational database (Examiner’s note: This claim limitation broadly recites steps involving performing predictions (scoring) on new data. As indicated earlier, Bhattacharjee teaches a development environment used for scoring available data directly in a database system, with a graphical interface that allows a user to select (and hence invoke) the corresponding available stored procedures that are present in the database system. A person having ordinary skill in the art would understand that invoking the available stored procedures present in the database system will trigger the SQL query within the stored procedure to execute and perform new predictions and scoring results on the available data within the corresponding database (Bhattacharjee [0027]; and Figure 9 and [0040]).); and 
generate a prediction on the user data retrieved from the relational database based on an input of the user data to the trained machine learning model (Examiner’s note: This claim limitation broadly recites steps involving performing predictions (scoring) on new data. As indicated earlier, Bhattacharjee teaches a development environment used for scoring available data directly in a database system, with a graphical interface that allows a user to select (and hence invoke) the corresponding available stored procedures that are present in the database system. A person having ordinary skill in the art would understand that invoking the available stored procedures present in the database system will trigger the SQL query within the stored procedure to execute and perform new predictions and scoring results on the available data within the corresponding database (Bhattacharjee [0027]; and Figure 9 and [0040]).).
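For illustration only, the in-database scoring arrangement described in the above mapping (each model object written as a SELECT statement, with the outputs of the projections put into a union) can be pictured with the following sketch. It is not Bhattacharjee’s Table 3 and does not use CE_PROJECTION or any stored procedure syntax; it assumes plain SQLite, and the tree, the names `TREE`, `leaf_paths`, and `tree_to_union_sql`, the table `customers`, and its columns are hypothetical.

```python
import sqlite3

# Hypothetical single decision tree (not taken from any cited reference).
TREE = {"col": "visits", "thr": 3,
        "le": {"label": "no_purchase"},
        "gt": {"col": "spend", "thr": 50,
               "le": {"label": "no_purchase"},
               "gt": {"label": "purchase"}}}


def leaf_paths(node, conds=()):
    """Yield (list_of_conditions, label) for every root-to-leaf path."""
    if "label" in node:
        yield list(conds), node["label"]
        return
    yield from leaf_paths(
        node["le"], conds + ("{} <= {}".format(node["col"], node["thr"]),))
    yield from leaf_paths(
        node["gt"], conds + ("{} > {}".format(node["col"], node["thr"]),))


def tree_to_union_sql(tree, table):
    """Write each leaf as its own SELECT and combine them with UNION ALL."""
    selects = []
    for conds, label in leaf_paths(tree):
        where = " AND ".join(conds) if conds else "1=1"
        selects.append(
            "SELECT id, '{}' AS prediction FROM {} WHERE {}".format(
                label, table, where))
    return " UNION ALL ".join(selects) + " ORDER BY id"


def run_demo():
    """Score three hypothetical customer rows inside the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, visits INTEGER, spend REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                     [(1, 2, 100.0), (2, 5, 20.0), (3, 7, 80.0)])
    result = conn.execute(tree_to_union_sql(TREE, "customers")).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(tree_to_union_sql(TREE, "customers"))
    print(run_demo())
```

Because the leaf conditions of a tree are mutually exclusive, each row with non-NULL column values appears in exactly one branch of the union, so the query returns one prediction per row without the data leaving the database.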
While Bhattacharjee teaches using and training a decision tree algorithm as a machine learning model, Bhattacharjee does not explicitly teach
… that uses a decision tree ensemble …
… train … that uses the decision tree ensemble …
Brownlee teaches
… that uses a decision tree ensemble (Examiner’s note: Brownlee teaches training bagging/gradient boosting and random forest classifiers using datasets written in a programming language that is not a native database programming language (in this case, Python) (Brownlee pp.2-3 Random Forest and pp.3-4 Stochastic Gradient Boosting).) …
… train … that uses the decision tree ensemble (Examiner’s note: As indicated earlier, Brownlee teaches training ensemble models (including gradient boosting and bagged decision trees/random forest classifiers) using datasets written in a programming language that is not a database programming language (in this case, Python) (Brownlee p.2 About the Recipes: “…A standard classification problem from the UCI Machine Learning Repository is used to demonstrate each ensemble algorithm … Each ensemble algorithm is demonstrated using 10 fold cross validation …”; p.2 1.Bagged Decision Trees: “… Bagging performs best with algorithms that have high variance. A popular example are decision trees, often constructed without pruning. … A total of 100 trees are created.”; pp.2-3 2.Random Forest: “Random forest is an extension of bagged decision trees. Samples of the training dataset are taken with replacement, but the trees are constructed in a way that reduces the correlation between individual classifiers. … You can construct a Random Forest Model for classification using the RandomForestClassifier class …”; and pp.3-4 Stochastic Gradient Boosting: “… Stochastic Gradient Boosting (also called Gradient Boosting Machines) are one of the most sophisticated ensemble techniques … You can construct a Gradient Boosting model for classification using the GradientBoostingClassifier class …”).) …
Both Bhattacharjee and Brownlee are analogous art since they both teach training machine learning models, where the machine learning models are written in imperative programming languages that are not native database programming languages.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the decision tree model taught in Bhattacharjee and replace it with a decision tree ensemble (represented by either Bagged Decision Trees/Random Forest Classifier or a Gradient Boosting Classifier) taught in Brownlee as a way to perform pre-processing of historical data and training of a decision tree ensemble. The motivation to combine is taught in Brownlee, where the prediction accuracy is improved through training of these ensemble models on a dataset (e.g., for Bagged Decision Trees/Random Forest, the multiple bagged decision trees improve prediction accuracy on data that exhibits high variance; for Gradient Boosting, the multiple decision trees fix the prediction errors of a prior decision tree model in the chain), thus making a system that uses these ensemble models to perform predictions more accurate (Brownlee p.1: “Ensembles can give you a boost in accuracy on your dataset. …”; p.1 Combine Model Predictions Into Ensemble Predictions: “… Boosting: building multiple models (typically of the same type) each of which learns to fix the prediction errors of a prior model in the chain. …”; p.2 1.Bagged Decision Trees: “… Bagging performs best with algorithms that have high variance. …”; and p.3 Boosting Algorithms).
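For illustration only, the boosting behavior relied on in the above motivation (each model in the chain learning to fix the prediction errors of the prior models) can be sketched as follows. This is not Brownlee’s scikit-learn code; it is a minimal pure-Python sketch that assumes squared-error loss, one-dimensional inputs, and single-split regression stumps. The data `XS` and `YS`, the learning rate, and all function names are hypothetical.

```python
# Hypothetical one-dimensional training data (not taken from the record).
XS = [1, 2, 3, 4, 5, 6, 7, 8]
YS = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]


def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    # Excluding the largest x keeps the right-hand side of every split non-empty.
    for thr in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    _, thr, lm, rm = best
    return thr, lm, rm


def stump_predict(stump, x):
    thr, lm, rm = stump
    return lm if x <= thr else rm


def fit_boosted(xs, ys, n_stages=20, lr=0.5):
    """Fit each new stump to the residual errors left by the earlier stages."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_stages):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, lr, stumps


def predict(model, x):
    base, lr, stumps = model
    return base + lr * sum(stump_predict(s, x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared training error of a fitted model."""
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    for stages in (0, 1, 5, 20):
        print(stages, mse(fit_boosted(XS, YS, n_stages=stages), XS, YS))
```

With zero stages the model predicts only the mean of the targets; each added stump is fitted to what the earlier stages still get wrong, so the training error falls as the chain grows.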
While Bhattacharjee in view of Brownlee teaches training machine learning models, where the machine learning models are written in imperative programming languages that are not native database programming languages, Bhattacharjee in view of Brownlee does not explicitly teach
… convert … [the decision tree ensemble] …
StackOverflow teaches
… convert … [the decision tree ensemble] (Examiner’s note: StackOverflow teaches a Python library wrapper sklearn2pmml that converts a trained machine learning model originally written in Python code into a PMML structured format file, where the supported Scikit-Learn Estimator and Transformer types (representing the object associated with the trained machine learning model) are listed in the associated Github README.md JPMML-SkLearn dated September 13, 2016, which includes ensemble models such as BaggingClassifier, GradientBoostingClassifier, and RandomForestClassifier (StackOverflow pp.1-4 1st answer dated September 26, 2016 at 19:31: “SkLearn2PMML is … a thin wrapper around the JPMML-SkLearn command-line application. For a list of supported Scikit-Learn Estimator and Transformer types, please refer to the documentation of the JPMML-SkLearn project. … There are two parts to an SkLearn2PMML conversion, an estimator … and a mapper (for preprocessing steps such as discretization or PCA). … sklearn2pmml(estimator=clf, mapper=default_mapper, pmml=“D:/workspace/IrisClassificationTree.pmml” … Let’s look at the .pmml file …”; Github README.md SkLearnPMML dated September 13, 2016; and Github README.md JPMML-SkLearn dated September 13, 2016). When combined with the teachings of  Bhattacharjee in view of Brownlee, the PMML file for a trained decision tree ensemble (generated by the sklearn2pmml Python library module) can be further converted into an object model instance that is provided to a database converting module to form SQL statements in a stored procedure which can be executed within a database server as an SQL query, with the conversion of the decision tree ensemble model corresponding to “… convert … [the decision tree ensemble] …”.) …
Both Bhattacharjee in view of Brownlee and StackOverflow are analogous art since they both teach using database-supported stored procedures containing trained machine learning models (written in imperative programming languages) to directly perform SQL queries within a database to retrieve data for generating model predictions.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to use the converting and database converting modules for converting a machine learning model (represented by a decision tree model) stored in a PMML file taught in Bhattacharjee in view of Brownlee and apply it to the PMML file representing a decision tree ensemble (where the PMML file was generated by the sklearn2pmml Python library module) taught in StackOverflow as a way to perform conversion of a decision tree ensemble model into object model instances that can be used in a database stored procedure to execute a SQL query and perform predictions directly in database memory. The motivation to combine is taught in StackOverflow, since providing support to convert existing machine learning code into a standardized structured PMML file format allows different systems to further use and apply the machine learning code. A person having ordinary skill in the art would recognize the versatility of having a standardized PMML file that can be applied to certain machine learning systems that may not readily support the original machine learning programming code, in order to continue using and integrating the trained machine learning model in different workflow systems to generate the same predictable results, thus improving the reuse, flexibility, and portability of the existing machine learning code to a variety of systems and applications (StackOverflow pp.1-4 1st answer dated September 26, 2016 at 19:31; Github README.md SkLearnPMML dated September 13, 2016; and Github README.md JPMML-SkLearn dated September 13, 2016).
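For illustration only, the combined path described above (a model exported to a PMML file, then converted into SQL that executes within the database) can be pictured with the following sketch. It is neither Bhattacharjee’s converting module nor actual sklearn2pmml output; it assumes a simplified, namespace-free PMML-like tree fragment, and the fragment `PMML_LIKE`, the names `predicate_sql`, `node_sql`, and `pmml_to_sql`, the table `flowers`, and its columns are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Simplified, hypothetical PMML-like fragment: no namespace, no data
# dictionary or mining schema, and only SimplePredicate tests.
PMML_LIKE = """
<TreeModel>
  <Node>
    <True/>
    <Node score="setosa">
      <SimplePredicate field="petal_length" operator="lessOrEqual" value="2.45"/>
    </Node>
    <Node>
      <SimplePredicate field="petal_length" operator="greaterThan" value="2.45"/>
      <Node score="versicolor">
        <SimplePredicate field="petal_width" operator="lessOrEqual" value="1.75"/>
      </Node>
      <Node score="virginica">
        <SimplePredicate field="petal_width" operator="greaterThan" value="1.75"/>
      </Node>
    </Node>
  </Node>
</TreeModel>
"""

# Mapping from predicate operator names to SQL comparison operators.
OPS = {"lessOrEqual": "<=", "greaterThan": ">", "lessThan": "<",
       "greaterOrEqual": ">=", "equal": "="}


def predicate_sql(node):
    """Translate a node's SimplePredicate into a SQL boolean expression."""
    pred = node.find("SimplePredicate")
    if pred is None:
        return "1=1"  # no predicate on this node: always true
    return "{} {} {}".format(
        pred.get("field"), OPS[pred.get("operator")], float(pred.get("value")))


def node_sql(node):
    """Leaves become string literals; internal nodes become CASE expressions."""
    children = node.findall("Node")
    if not children:
        return "'{}'".format(node.get("score"))
    branches = " ".join(
        "WHEN {} THEN {}".format(predicate_sql(c), node_sql(c))
        for c in children)
    return "CASE {} END".format(branches)


def pmml_to_sql(xml_text, table):
    """Build one SELECT that scores every row of the table."""
    root = ET.fromstring(xml_text.strip()).find("Node")
    return "SELECT id, {} AS species FROM {} ORDER BY id".format(
        node_sql(root), table)


def run_demo():
    """Score three hypothetical rows with the generated query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flowers (id INTEGER, petal_length REAL, petal_width REAL)")
    conn.executemany("INSERT INTO flowers VALUES (?, ?, ?)",
                     [(1, 1.4, 0.2), (2, 4.5, 1.4), (3, 5.8, 2.2)])
    result = conn.execute(pmml_to_sql(PMML_LIKE, "flowers")).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(pmml_to_sql(PMML_LIKE, "flowers"))
    print(run_demo())
```

The sketch reflects the portability rationale stated above: once the model is held in a standardized structured file, the converter needs no knowledge of the language in which the model was originally trained.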
Regarding previously presented Claim 6, 
Bhattacharjee in view of Brownlee, in further view of StackOverflow teaches
(Previously Presented) The system of claim 1, wherein the trained machine learning model is converted to the SQL query using a python library (Examiner’s note: As indicated earlier, StackOverflow teaches a Python library wrapper sklearn2pmml that converts a trained machine learning model originally written in Python code into a PMML structured format file, where the supported Scikit-Learn Estimator and Transformer types (representing the object associated with the trained machine learning model) are listed in the associated Github README.md JPMML-SkLearn dated September 13, 2016, which includes ensemble models such as BaggingClassifier, GradientBoostingClassifier, and RandomForestClassifier (StackOverflow pp.1-4 1st answer dated September 26, 2016 at 19:31: “SkLearn2PMML is … a thin wrapper around the JPMML-SkLearn command-line application. For a list of supported Scikit-Learn Estimator and Transformer types, please refer to the documentation of the JPMML-SkLearn project. … There are two parts to an SkLearn2PMML conversion, an estimator … and a mapper (for preprocessing steps such as discretization or PCA). … sklearn2pmml(estimator=clf, mapper=default_mapper, pmml=“D:/workspace/IrisClassificationTree.pmml” … Let’s look at the .pmml file …”; Github README.md SkLearnPMML dated September 13, 2016; and Github README.md JPMML-SkLearn dated September 13, 2016). When combined with the teachings of Bhattacharjee in view of Brownlee, the PMML file for a trained decision tree ensemble (generated by the sklearn2pmml Python library module) is further converted into an object model instance used in a stored procedure which can be executed within a database server as an SQL query, with the conversion of the decision tree ensemble model using the sklearn2pmml Python library module corresponding to “… converted to the SQL query using a Python library”.).  
Regarding previously presented Claim 7, 
Bhattacharjee in view of Brownlee, in further view of StackOverflow teaches
(Previously Presented) The system of claim 1, wherein the machine learning model is one of a gradient boosting model or a random forest model (Examiner’s note: As indicated earlier, Brownlee teaches training ensemble models, where these ensemble models include bagging/gradient boosting classifiers, and random forest classifiers (Brownlee p.2 1.Bagged Decision Trees; pp.2-3 2.Random Forest; and pp.3-4 Stochastic Gradient Boosting).).  
Regarding amended Claim 8, 
Claim 8 recites a method comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 1, and hence is rejected under similar rationale and motivations provided by Bhattacharjee, Brownlee, and StackOverflow as indicated in Claim 1.
Regarding previously presented Claim 13, 
Claim 13 recites the method of claim 8 comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 6, and hence is rejected under similar rationale provided by Bhattacharjee in view of Brownlee, in further view of StackOverflow as indicated in Claim 6, in view of the rejections from Claim 8.
Regarding previously presented Claim 14, 
Claim 14 recites the method of claim 8 comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 7, and hence is rejected under similar rationale provided by Bhattacharjee in view of Brownlee, in further view of StackOverflow as indicated in Claim 7, in view of the rejections from Claim 8.
Regarding amended Claim 15, 
Claim 15 recites a non-transitory machine-readable medium having stored thereon machine-readable instructions executable to cause a machine to perform operations comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 1, and hence is rejected under similar rationale and motivations provided by Bhattacharjee, Brownlee, and StackOverflow as indicated in Claim 1.
Regarding previously presented Claim 19, 
Claim 19 recites the non-transitory machine-readable medium of claim 15, further comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 6, and hence is rejected under similar rationale provided by Bhattacharjee in view of Brownlee, in further view of StackOverflow as indicated in Claim 6, in view of the rejections from Claim 15.
Regarding previously presented Claim 20, 
Claim 20 recites the non-transitory machine-readable medium of claim 15, further comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 7, and hence is rejected under similar rationale provided by Bhattacharjee in view of Brownlee, in further view of StackOverflow as indicated in Claim 7, in view of the rejections from Claim 15.
Claims 2, 4, 9, 11, 16, 21, and 23-26 are rejected under 35 U.S.C. 103 as being unpatentable over 
Bhattacharjee et al., U.S. PGPUB 2014/0229491, published 8/14/2014 [hereafter referred as Bhattacharjee] in view of Brownlee, Jason, Ensemble Machine Learning Algorithms in Python with scikit-learn, June 3 2016 [hereafter referred as Brownlee], in further view of StackOverflow: Export python scikit learn models into pmml, answered September 26, 2016 at 19:31, with associated Github README.md for SkLearnPMML and JPMML-SkLearn, both dated September 13, 2016 [hereafter referred as StackOverflow] as applied to Claims 1, 8, and 15; in even further view of Natekin & Knoll, Gradient boosting machines, a tutorial, December 4, 2013 [hereafter referred as Natekin].  
Regarding previously presented Claim 2, 
Bhattacharjee in view of Brownlee, in further view of StackOverflow as applied to Claim 1 teaches 
(Previously Presented) The system of claim 1 …
wherein executing the instructions further causes the system to perform at least one iteration of: 
… generating the decision tree ensemble to train the machine learning model (Examiner’s note: As indicated earlier, Brownlee teaches training bagged/gradient boosting and random forest classifiers using datasets written in a programming language that is not a database programming language (in this case, Python) (Brownlee p.2 About the Recipes; p.2 1.Bagged Decision Trees; pp.2-3 2.Random Forest; and pp.3-4 Stochastic Gradient Boosting).).  
While Bhattacharjee in view of Brownlee, in further view of StackOverflow teaches training a boosting decision tree ensemble model for fixing prediction errors of a prior model in the chain, Bhattacharjee in view of Brownlee, in further view of StackOverflow does not explicitly teach
wherein executing the instructions further causes the system to perform at least one iteration of: 
… training a first decision tree to make a prediction using the formatted training data set … 
… determining whether an error exists in the first decision tree…
… in response to the determining that the error exists, training a second decision tree using the formatted training data set and the error … 
Natekin teaches
wherein executing the instructions further causes the system to perform at least one iteration of: 
… training a first decision tree to make a prediction using the formatted training data set (Examiner’s note: Natekin teaches ensemble models containing a plurality of weak simple models, where these weak simple models are added to the ensemble sequentially, with each model trained at each particular iteration with respect to the error of the whole ensemble, such that the combined ensemble obtains a stronger ensemble prediction. Natekin further teaches processing data to normalize and identify new features before feeding the data into a gradient boosted machine (GBM) for training, where this processed data represents formatted training data (Natekin p.1 col.1 2nd paragraph-col.2 1st paragraph (Section 1. Introduction): “… A different approach would be to build a bucket, or an ensemble of models for some particular learning task. … the ensemble approach relies on combining a large number of relatively weak simple models to obtain a stronger ensemble prediction. … The main idea of boosting is to add new models to the ensemble sequentially. At each particular iteration, a new weak, base-learner model is trained with respect to the error of the whole ensemble learnt so far. …”; p.6 col.1 Section 3.2 Specifying the Base Learners 2nd paragraph: “The commonly used base learner models can be classified into a three distinct categories: linear models, smooth models and decision trees.”; pp.7-8 Section 3.2.2 Decision tree base-learners; and p.11 col.2 Section 6.1.2 Data processing: “To proceed with the analysis, data has to be properly processed … we train the models on one half the available data and validate it on the other part. The train/test separation is organized sequentially: the first 100 points are used for training …”; and p.13 Figure 7F, where the figure shows the prediction from decision tree-based GBMs, where the decision tree models are first trained and then used for determining predictions.).) … 
… determining whether an error exists in the first decision tree (Examiner’s note: As indicated earlier, Natekin teaches training each weak simple model sequentially being added to the ensemble at each particular iteration with respect to the error of the whole ensemble, where this error is used as part of the loss function to fit the model to the data (Natekin p.1 col.1 2nd paragraph-col.2 1st paragraph (Section 1. Introduction); p.1 col.2 3rd paragraph (Section 1. Introduction): “In gradient boosting machines … the learning procedure consecutively fits new models to provide a more accurate estimate of the response variable. The principle idea behind this algorithm is to construct the new base-learners to be maximally correlated with the negative gradient of the loss function … if the error function is the classic squared-error loss, the learning procedure would result in consecutive error-fitting …”; and p.13 Figure 7D, which shows the squared-error loss for decision tree based GBMs over a number of iterations.).) …
… in response to the determining that the error exists, training a second decision tree using the formatted training data set and the error (Examiner’s note: As indicated earlier, Natekin teaches training each weak simple model sequentially being added to the ensemble at each particular iteration with respect to the error of the whole ensemble, where this error is used as part of the loss function to fit the model to the data, and hence this training process is being repeated for each weak simple model in the ensemble, thus representing a determination of training additional decision tree models based on the existing error to fit the training data (Natekin p.1 col.1 2nd paragraph-col.2 1st paragraph (Section 1. Introduction); p.1 col.2 3rd paragraph (Section 1. Introduction): “In gradient boosting machines … the learning procedure consecutively fits new models to provide a more accurate estimate of the response variable. The principle idea behind this algorithm is to construct the new base-learners to be maximally correlated with the negative gradient of the loss function … if the error function is the classic squared-error loss, the learning procedure would result in consecutive error-fitting …”; and p.13 Figure 7D, which shows the squared-error loss for decision tree based GBMs over a number of iterations.).) … 
Both Bhattacharjee in view of Brownlee, in further view of StackOverflow and Natekin are analogous art since they both teach training decision tree ensemble models.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the single decision tree training method taught in Bhattacharjee in view of Brownlee, in further view of StackOverflow and use the decision tree ensemble model training method taught in Natekin as a way to train each weak simple model within the ensemble model to produce improved accuracy in the model. The motivation to combine is taught in Natekin, as ensemble models provide higher accuracy results compared to single machine learning models, as well as providing faster convergence, thus providing both improved accuracy and improved performance in a system (Natekin p.2 col.1 1st paragraph: “… ensemble models are a useful practical tool for different predictive tasks, as they can consistently provide higher accuracy results compared to conventional single strong machine learning models … in boosted ensembles the base-learners play the role of the memory medium and are forming the captured patterns sequ[e]ntially, gradually increasing the level of pattern detail …”; p.12 col.2 last paragraph-p.13 col.1 3rd paragraph (Section 6.1.4): “… From the convergence plots on Figures 7C,D … the tree-based model with higher interaction depth is considerably more accurate than any of the GBMs models built … due to the increased model complexity, the convergence was achieved much faster … We can see that the stump-based GBM not only achieves nearly the same accuracy and convergence rates, but also predicts values very similar to the ones predicted by the spline-based GBM. … we will apply other popular machine learning techniques and compare their obtained performances … The algorithm accuracy comparisons are given in Table 1. …”; and p.13 Table 1 and Figure 7F.).
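For illustration only, the sequential error-fitting described above (train a first tree, measure its remaining error, then train a second tree on that error) can be sketched with depth-1 trees under a squared-error loss. This is a minimal sketch and not code from Natekin or any other cited reference. The data values, the learning rate, and the helper names fit_stump and mse are assumptions chosen for the example.

```python
# Minimal sketch of squared-error boosting with depth-1 trees (stumps).
# Data, learning rate, and helper names are illustrative assumptions.


def fit_stump(xs, targets):
    """Find the single threshold split on xs that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep both sides non-empty
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def mse(ys, preds):
    """Mean squared error between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 2.9, 3.1, 5.0, 5.2]

# First tree: fit the targets directly and measure the remaining error.
first_tree = fit_stump(xs, ys)
preds = [first_tree(x) for x in xs]
error_after_first = mse(ys, preds)

# An error remains, so fit a second tree to the residuals of the first tree
# and add its (scaled) output to the ensemble prediction.
residuals = [y - p for y, p in zip(ys, preds)]
second_tree = fit_stump(xs, residuals)
learning_rate = 1.0
preds = [p + learning_rate * second_tree(x) for p, x in zip(preds, xs)]
error_after_second = mse(ys, preds)

print(error_after_first, error_after_second)
```

With a squared-error loss the residuals are the negative gradient of the loss, which is why fitting the second tree to them corresponds to the consecutive error-fitting quoted from Natekin above.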
Regarding previously presented Claim 4, 
Bhattacharjee in view of Brownlee, in further view of StackOverflow, in even further view of Natekin teaches
(Previously Presented) The system of claim 2, wherein the performing the at least one iteration continues until the error converges (Examiner’s note: Natekin teaches determining convergence speed by explaining the relationship between number of iterations and shrinkage parameter value to reach a similar convergence based on an empirical loss minimum, and performing training with training sets and measuring the error curves over a number of iterations, where minimizing the empirical loss over a number of iterations is interpreted as performing iterations until reaching a point where the error does not significantly change, such that the point where the number of iterations does not improve the error is interpreted as a convergence point, as shown in the error curves and convergence plots in Natekin p.9 Figure 5 and p.12 Figure 7 (Natekin p.9 Figure 5 and p.9 col.1 2nd paragraph: “… the cost of improving the generalization properties is the convergence speed. Choosing a stronger value of 𝛌 will increase the number of iterations M, required for convergence to a similar empirical loss minimum. … let us now consider the fitting experiment with both training and validation sets … The learning error curves for GBMs with different 𝛌 parameters are presented on Figure 5 … we can see that the training set error is substantially falling … each subsequent training set error is being reduced …”; and p.12 col.2 1st paragraph-p.13 col.1 1st paragraph: “… The corresponding convergence plots with the bootstrap estimates of the number of iterations are presented on Figures 7A,B. … From these convergence plots several implications can be deduced … The corresponding convergence plots for the tree-based GBMs are presented on Figures 7C,D. … due to the increased model complexity, the convergence was achieved much faster, which means that the optimal number of iterations M for the tree-based GBM is approximately 650 instead of a 1000.”).).  
Regarding previously presented Claim 9, 
Claim 9 recites the method of claim 8 comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 2, and hence is rejected under similar rationale and motivations provided by Bhattacharjee in view of Brownlee, in further view of StackOverflow and Natekin as indicated in Claim 2, in view of the rejections from Claim 8.
Regarding previously presented Claim 11, 
Claim 11 recites the method of claim 9 comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 4, and hence is rejected under similar rationale provided by Bhattacharjee in view of Brownlee, in further view of StackOverflow, in even further view of Natekin as indicated in Claim 4, in view of the rejections from Claim 9.
Regarding previously presented Claim 16, 
Claim 16 recites the non-transitory machine-readable medium of claim 15, with operations further comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 2, and hence is rejected under similar rationale and motivations provided by Bhattacharjee in view of Brownlee, in further view of StackOverflow and Natekin as indicated in Claim 2, in view of the rejections from Claim 15.
Regarding previously presented Claim 21,
Bhattacharjee in view of Brownlee, in further view of StackOverflow as applied to Claim 1 teaches 
(Previously Presented) The system of claim 1.
However, Bhattacharjee in view of Brownlee, in further view of StackOverflow does not teach
wherein the machine learning model is configured such that the decision tree ensemble runs either in series or in parallel during the training the machine learning model.  
Natekin teaches
wherein the machine learning model is configured such that the decision tree ensemble runs either in series or in parallel during the training the machine learning model (Examiner’s note: As indicated earlier, Natekin teaches training each weak simple model sequentially being added to the ensemble at each particular iteration with respect to the error of the whole ensemble, where this sequential training represents training the ensemble in a serial order. Natekin further teaches that each of the boosting iterations can be parallelized as part of the learning process, thus representing training the ensemble in a parallel order (Natekin p.1 col.1 2nd paragraph-col.2 1st paragraph (Section 1. Introduction); p.1 col.2 3rd paragraph (Section 1. Introduction); and p.20 col.1 2nd-3rd paragraphs: “… one can take full advantage of parallelization to obtain the predictions… A different approach to parallelization of the GBMs would be to parallelize each of the boosting iterations, which can still bring improvement in the evaluation speed.”).).  
Both Bhattacharjee in view of Brownlee, in further view of StackOverflow and Natekin are analogous art since they both teach training decision tree ensemble models.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the single decision tree training method taught in Bhattacharjee in view of Brownlee, in further view of StackOverflow and use the decision tree ensemble model training method taught in Natekin as a way to train each weak simple model within the ensemble model to produce improved accuracy in the model. The motivation to combine is taught in Natekin, as provided in the prior art claim mapping indicated in Claim 2.
Regarding previously presented Claim 23, 
Claim 23 recites the method of claim 8 comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 21, and hence is rejected under similar rationale and motivations provided by Bhattacharjee in view of Brownlee, in further view of StackOverflow and Natekin as indicated in Claim 21, in view of the rejections from Claim 8.
Regarding previously presented Claim 24, 
Bhattacharjee in view of Brownlee, in further view of StackOverflow, in even further view of Natekin teaches
(Previously Presented) The method of claim 9, wherein the performing the at least one iteration continues until a threshold number of iterations are performed (Examiner’s note: Natekin teaches specifying a maximum number of iterations $M_{max}$ to perform early stopping and to estimate an optimal number of iterations to prevent overfitting, where this maximum number of iterations represents performing until a threshold number of iterations (Natekin p.9 col.2-p.10 col.1 Section 4.3 Early Stopping: “… On[[c]]e important practical consideration that can be derived from Figure 5 is that one can greatly benefit from early stopping … the shrinkage parameter 𝛌, the maximum number of iterations $M_{max}$ and the cross-validation parameter k, corresponding to the number of validation folds, are specified. …”).).  
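For illustration only, the two stopping rules discussed for Claims 4 and 24 (iterate until the error converges, or until a maximum number of iterations is reached) can be sketched in a single loop. This is a minimal sketch and not code from Natekin. It boosts a constant base learner instead of a decision tree to keep the example short, and the function name boost_constant, the tolerance tol, the learning rate, and the data values are assumptions.

```python
# Minimal sketch of two stopping rules for a boosting loop: stop when the
# error improvement falls below a tolerance, or after m_max iterations.
# A constant base learner is used instead of a tree to keep the sketch short.


def boost_constant(ys, learning_rate, m_max, tol):
    """Boost a constant model; return (prediction, iterations, stop reason)."""
    pred = 0.0
    prev_error = sum((y - pred) ** 2 for y in ys) / len(ys)
    for iteration in range(1, m_max + 1):
        # Fit the base learner to the current residuals (here, their mean)
        # and add a shrunken copy of it to the ensemble prediction.
        residual_mean = sum(y - pred for y in ys) / len(ys)
        pred += learning_rate * residual_mean
        error = sum((y - pred) ** 2 for y in ys) / len(ys)
        if prev_error - error < tol:  # the error has stopped improving
            return pred, iteration, "converged"
        prev_error = error
    return pred, m_max, "reached m_max"  # iteration threshold hit first


ys = [2.0, 4.0, 6.0]

# Generous iteration budget: the convergence test ends the loop.
pred_a, iters_a, reason_a = boost_constant(ys, 0.5, m_max=1000, tol=1e-6)

# Tight iteration budget: the maximum number of iterations ends the loop.
pred_b, iters_b, reason_b = boost_constant(ys, 0.5, m_max=3, tol=1e-6)

print(iters_a, reason_a)
print(iters_b, reason_b)
```

A smaller learning rate (shrinkage) makes each step smaller, so more iterations are needed before the convergence test is met, which matches the trade-off between shrinkage and number of iterations quoted from Natekin.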
Regarding previously presented Claim 25, 
Claim 25 recites the non-transitory machine-readable medium of claim 15, further comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 21, and hence is rejected under similar rationale and motivations provided by Bhattacharjee in view of Brownlee, in further view of StackOverflow and Natekin as indicated in Claim 21, in view of the rejections from Claim 15.
Regarding previously presented Claim 26, 
Claim 26 recites the non-transitory machine-readable medium of claim 16, with operations further comprising claim limitations that are similar in scope to corresponding claim limitations in Claim 4, and hence is rejected under similar rationale provided by Bhattacharjee in view of Brownlee, in further view of StackOverflow, in even further view of Natekin as indicated in Claim 4, in view of the rejections from Claim 16.
Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over 
Bhattacharjee et al., U.S. PGPUB 2014/0229491, published 8/14/2014 [hereafter referred as Bhattacharjee] in view of Brownlee, Jason, Ensemble Machine Learning Algorithms in Python with scikit-learn, June 3 2016 [hereafter referred as Brownlee], in further view of StackOverflow: Export python scikit learn models into pmml, answered September 26, 2016 at 19:31, with associated Github README.md for SkLearnPMML and JPMML-SkLearn, both dated September 13, 2016 [hereafter referred as StackOverflow], in even further view of Natekin & Knoll, Gradient boosting machines, a tutorial, December 4, 2013 [hereafter referred as Natekin] as applied to Claim 2; in even further view of Baranauskas, Jose Augusto, How Many Trees in a Random Forest?, July 2012 [hereafter referred as Baranauskas].  
Regarding previously presented Claim 22,
Bhattacharjee in view of Brownlee, in further view of StackOverflow, in even further view of Natekin as applied to Claim 2 teaches 
(Previously Presented) The system of claim 2.
However, Bhattacharjee in view of Brownlee, in further view of StackOverflow, in even further view of Natekin does not teach
wherein the performing the at least one iteration continues until a number of decision trees in the decision tree ensemble equals a threshold number of decision trees.
Baranauskas teaches
wherein the performing the at least one iteration continues until a number of decision trees in the decision tree ensemble equals a threshold number of decision trees (Examiner’s note: Baranauskas teaches constructing and training decision trees in a random forest, where the addition of new trees into the random forest is doubled at every iteration until there is no further performance gain detected, where this determination of no additional performance gain detected (based on AUC values) represents a threshold number of trees (Baranauskas p.154 Abstract: “… The research reported here analyzes whether there is an optimal number of trees within a Random Forest, i.e., a threshold from which increasing the number of trees would bring no significant performance gain, and would only increase the computational cost. Our main conclusions are: as the number of trees grows, it does not always mean the performance of the forest is significantly better than previous forests (fewer trees) … It is also possible to state there is a threshold beyond which there is no significant gain …”; p.155 5th paragraph (Section 1. Introduction): “… we have analyzed the performance of Random Forests as the number of trees grows (from 2 to 4096 trees, and doubling the number of trees at every iteration), aiming to seek out for a number (or a range of numbers) of trees from which there is no more significant performance gain, unless huge computational resources are available for large datasets.”; p.163 2nd-3rd paragraphs (Section 7. Results and Discussion); and p.166 Section 8 Conclusion 1st paragraph: “… it is possible to suggest, based on the experiments, a range between 64 and 128 trees in a forest. With these numbers of trees it is possible to obtain good balance between AUC, processing time, and memory usage. …”).).  
Both Bhattacharjee in view of Brownlee, in further view of StackOverflow, in even further view of Natekin and Baranauskas are analogous art since they both teach constructing and training decision tree ensemble models.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to take the single decision tree training method taught in Bhattacharjee in view of Brownlee, in further view of StackOverflow, in even further view of Natekin and apply the determined threshold number of decision trees taught in Baranauskas as a way to optimize the computational cost of generating decision tree ensemble models. The motivation to combine is taught in Baranauskas, since the determination and application of a threshold number of decision trees in which no further performance gain is detected allows a system to be more computationally and memory efficient (by re-allocating its computational power to other tasks) without sacrificing overall accuracy of the model (Baranauskas p.154 Abstract and p.166 Section 8 Conclusion).
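For illustration only, the doubling procedure described above (grow the forest from 2 trees upward, doubling each time, until the performance gain is no longer significant) can be sketched as a simple search. This is a minimal sketch and not code or data from Baranauskas. The function name find_tree_threshold, the min_gain cutoff, and the synthetic score curve are assumptions. In practice the evaluate callback would train a forest of the given size and return a measured score such as AUC.

```python
# Minimal sketch: double the number of trees until the score gain from the
# next doubling falls below min_gain. The score function is supplied by the
# caller; the one used below is a made-up curve, not measured data.


def find_tree_threshold(evaluate, start=2, limit=4096, min_gain=0.005):
    """Return the smallest doubled tree count after which gains are negligible."""
    n_trees = start
    score = evaluate(n_trees)
    while n_trees * 2 <= limit:
        next_score = evaluate(n_trees * 2)
        if next_score - score < min_gain:  # no significant gain from doubling
            break
        n_trees, score = n_trees * 2, next_score
    return n_trees


def synthetic_score(n):
    """Hypothetical saturating score curve used only to exercise the search."""
    return 0.9 - 0.4 / n


threshold = find_tree_threshold(synthetic_score)
print(threshold)
```

The returned count can then serve as the threshold number of decision trees at which ensemble construction stops.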

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Warren et al., U.S. PGPUB 2007/0027905, published 2/1/2007, where Warren teaches a source query representation (implemented as a semantic tree created by a high level programming language) that is converted by a translation module into a target query representation, which is represented in a target language such as SQL to be used by a relational database (Figures 1 and 4; [0028]-[0033]).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM WAI YIN KWAN whose telephone number is 303-297-4332. The examiner can normally be reached Monday-Friday 8:00am - 4:30pm PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/WILLIAM WAI YIN KWAN/Examiner, Art Unit 2121                                                                                                                                                                                                        


/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121