DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. EP17306308.2, filed on 09/29/2017.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 


Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
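An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.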

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
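Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.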
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) are:
“model selector” in claims 1, 6, 10-12, 17 and 20.

Descriptions of “model selector” are found at paragraphs [0124] and [0129], which state:
[0124] “The processing illustrated in FIG. 15 may be used to determine an optimized model selection strategy that may be a function of input data, scoring metrics (e.g., metrics to determine how well a strategy is performing), and historical states (e. g., prior strategies). The processing depicted in FIG. 15 may be performed by a machine learning platform described above. The processing depicted in FIG. 15 may be implemented in software (e.g., code, instructions, program) executed by one or more processing units (e.g., processors, cores) of the respective systems, hardware, or combinations thereof.”

[0129] “At 1550, the computer system may select the model group and a model selector for the model group. The model group may include one or more machine learning models, where each ML model in the model group may be configured to perform a same function, such as classifying end user intents.”

However, the above-cited portions of the specification do not provide structural support for “model selector.”
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.



Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA  the applicant regards as the invention.
Claim limitation “model selector” in claims 1, 6, 10-12, 17 and 20 invokes 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. The limitation in the above claims is devoid of any structure that performs the claimed function, and the specification does not provide sufficient structure to perform the function as recited, as stated above under 35 U.S.C. 112(f). Therefore, the claims are indefinite and are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.
The term “most” in claim 13 is a relative term which renders the claim indefinite. The term “most” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Dependent claims 2-16 are rejected based on their dependency from independent claim 1.
Dependent claims 18-19 are rejected based on their dependency from independent claim 17.
 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
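3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.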

Claims 1-3, 5-6, 10, 12-13, 17 and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Ahmadi et al. (“Fuzzy Modification of Mixture of Experts”) in view of Chaoji et al. (US Pat No. 10380498 B1) and further in view of Britto et al. (“Dynamic selection of classifiers-A comprehensive review”). 
Regarding claim 1
Ahmadi teaches a method comprising, by a computing system: selecting a model group and a model selector for the model group, (pg. 487 section 5 “Using a mixture-of-experts classifier as the second or third boosting classifier can solve two problems: The difficult patterns may be more easily partitioned to subgroups, while the second and third boosting classifiers usually handle a more difficult problem from the original one. This approach incorporates classifier selection and classifier combination.”)
wherein: the model group includes one or more machine learning (ML) models, each ML model in the model group configured to perform a same function; (abstract “In the present paper, we use an ensemble of NNs which are trained using different subsets of entire training data set. Then a fuzzy inference unit is used to process the outputs of NNs. A criterion is introduced to modify the topologies of NNs and in addition, fuzzy rules are generated simultaneously and automatically. Also a method is presented to divide the feature space into Regions of Competence (ROC). Each classifier in the ensemble will be an expert for a ROC”)
(pg. 103 second paragraph “Now we design three classifiers with lower complexities which are expert for each region. The classifiers selected to classify signature classes in each region are 3-layer [25 25 7], [25 25 8] and [25 25 10] NNs. The amount of classification accuracy rate for each expert is 98%, 91.5% and 87.5% respectively. As shown in table (4), with a crisp processing of the three experts,”)
based on a set of rules or a trainable selection model, at least one ML model from the model group for data analysis; (pg. 90 third paragraph “We expect that the complexity of the competent classifiers is less than the single best classifier. In the second step of the training phase, we generate fuzzy rules. This may be performed using another set of training vectors. Generated fuzzy rules can be used as a criterion to investigate if the topologies of competent classifiers are sufficient to perform an accurate classification task (section 3). After fuzzy rules and competent classifiers are fixed, our classification system will be designed.”)
analyzing input data using the model group and the model selector; (pg. 91 section 2 “Figure (1) shows the general block diagram of the proposed system. The unit “Competent Classifiers” of the system, processes input patterns using different decision functions. Each classifier in the ensemble has different discriminant function and is responsible in different regions of the d-dimensional space of the feature vectors.”)
Ahmadi does not teach: the model selector is configured to dynamically select; or determining, during the analyzing, a score for the model group and the model selector based on the analyzing and a set of scoring metrics.

Chaoji teaches determining, during the analyzing, a score for the model group and the model selector based on the analyzing and a set of scoring metrics; (Col 12 lines 51-61 “At 514, the Model Factory 102 determines whether the score distribution is within a threshold range. The threshold range may be determined from the user directive, or by comparing score distributions over time. Score distributions falling outside of the determined threshold range are retrained at 516. Retraining the ML model 516 may also include refreshing, rebuilding, or rescoring the ML model. Scores reflecting an acceptable level of distribution (i.e., within the threshold range) result in continued ML performance monitoring at 502. Data sampling rates may be determined comparing score distribution over time.”)
and updating, during the analyzing, the model selector or the model group based upon determining that the score is below a threshold value. (Col 12 lines 51-61 “At 514, the Model Factory 102 determines whether the score distribution is within a threshold range. The threshold range may be determined from the user directive, or by comparing score distributions over time. Score distributions falling outside of the determined threshold range are retrained at 516. Retraining the ML model 516 may also include refreshing, rebuilding, or rescoring the ML model. Scores reflecting an acceptable level of distribution (i.e., within the threshold range) result in continued ML performance monitoring at 502. Data sampling rates may be determined comparing score distribution over time.”)
Ahmadi and Chaoji are analogous art because they are both directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi to incorporate the teaching of Chaoji to include a method or system for automated generation using a machine learning algorithm to enable platform services for an end-to-end sequence of modeling steps.
One of ordinary skill in the art would have been motivated to make this modification in order to improve “model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics” as disclosed by Chaoji (abstract “The system further identifies common requirements between the user directive and one or more prior user directives and associates characteristics of the prior user directive, or model generated therefrom, with the user directive. The system further associates performance values generated by continuous monitoring of deployed ML models to individual characteristics of the user directive used to generate each of the deployed ML models. The system continuously improves model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics.”). 
Ahmadi in view of Chaoji does not teach that the model selector is configured to dynamically select.
Britto teaches that the model selector is configured to dynamically select. (Pg. 3666 “The focus of this paper is on the second phase of an MCS, particularly, the approaches based on dynamic selection (DS) of classifiers or ensembles of such classifiers.”)
Ahmadi, Chaoji and Britto are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Chaoji to incorporate the teaching of Britto to include a method or system for automated generation using a machine learning algorithm to enable platform services for an end-to-end sequence of modeling steps.
One of ordinary skill in the art would have been motivated to make this modification in order to improve “model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics” as disclosed by Britto (abstract “The system further identifies common requirements between the user directive and one or more prior user directives and associates characteristics of the prior user directive, or model generated therefrom, with the user directive. The system further associates performance values generated by continuous monitoring of deployed ML models to individual characteristics of the user directive used to generate each of the deployed ML models. The system continuously improves model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics.”). 
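
For illustration only, the combination of limitations recited in claim 1 (a model group, a rule-based model selector that selects per input, a score determined during the analyzing, and an update when the score falls below a threshold) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation or the method of any cited reference. The names ModelSelector, analyze and route_unpunctuated, the running-accuracy scoring metric, the 0.8 threshold, and the toy classifiers are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]                 # maps an input item to a label
Rule = Tuple[Callable[[str], bool], str]     # (predicate on the input, model name)


class ModelSelector:
    """Hypothetical rule-based selector: first matching rule wins, else a default."""

    def __init__(self, rules: List[Rule], default: str) -> None:
        self.rules = list(rules)
        self.default = default

    def select(self, item: str) -> str:
        for predicate, name in self.rules:
            if predicate(item):
                return name
        return self.default


def analyze(items, labels, models: Dict[str, Model], selector: ModelSelector,
            extra_rule: Rule, threshold: float = 0.8):
    """Analyze items one by one and score during the analysis.

    The scoring metric (running accuracy) and the update (adding one rule)
    are assumptions chosen for illustration.
    """
    correct, trace = 0, []
    for seen, (item, label) in enumerate(zip(items, labels), start=1):
        name = selector.select(item)             # dynamic selection per input
        correct += int(models[name](item) == label)
        score = correct / seen
        if score < threshold and extra_rule not in selector.rules:
            selector.rules.append(extra_rule)    # update the selector mid-analysis
        trace.append((name, round(score, 2)))
    return trace


# Toy model group: both models perform the same function (intent classification).
models = {
    "a": lambda s: "question" if s.endswith("?") else "statement",
    "b": lambda s: "question" if s.split()[0].lower() in {"what", "how", "why"} else "statement",
}
selector = ModelSelector(rules=[], default="a")
route_unpunctuated = (lambda s: "?" not in s, "b")   # hypothetical new rule

items = ["How are you?", "What time is it", "It rains.", "Why not"]
labels = ["question", "question", "statement", "question"]
trace = analyze(items, labels, models, selector, route_unpunctuated)
print(trace)
```

In this toy run the selector starts with no rules, the score drops below the threshold on the second input, and the added rule routes later unpunctuated inputs to model "b".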

Regarding claim 17
Claim 17 is directed to a system comprising: one or more processors; and a memory coupled to the one or more processors, the memory storing instructions which, when executed by the one or more processors, cause the system to perform the method of claim 1. Therefore, the rejection of claim 1 is applied to claim 17.
In addition, Chaoji teaches processors and memory; see FIG. 2 and col 7 lines 20-23 (“The Model Factory 102 may comprise one or more processors 202 capable of executing one or more modules stored on a computer-readable storage media 204.”)


Regarding claim 20
Claim 20 is directed to a system comprising: one or more processors; and a memory coupled to the one or more processors, the memory storing instructions which, when executed by the one or more processors, cause the system to perform the method of claim 1. Therefore, the rejection of claim 1 is applied to claim 20.
In addition, Chaoji teaches processors and memory; see FIG. 2 and col 7 lines 20-23 (“The Model Factory 102 may comprise one or more processors 202 capable of executing one or more modules stored on a computer-readable storage media 204.”)

Regarding claim 2
Ahmadi in view of Chaoji with Britto teaches the method of claim 1.
Ahmadi further teaches wherein the one or more ML models in the model group have a common ML model schema. (Pg. 90 second-third paragraph “In this paper we also propose a method in which classifiers of the ensemble are locally trained and the outputs of all classifiers are fused to classify using a fuzzy system. Experimental results demonstrate the advantages of the proposed method (section 3). The proposed system contains two phase: training phase and operating phase. Training phase contains two steps. In the first step of the training phase, we first employ some classifiers over entire the training data set. The topology of these classifiers may be chosen arbitrarily.”)
Regarding claim 3
Ahmadi in view of Chaoji with Britto teaches the method of claim 1.
Chaoji further teaches wherein the one or more ML models in the model group include different versions of a machine learning model. (Col 8 lines 44-48 “The model selection module 214 selects a ML algorithm from an ML algorithm database 220. The ML algorithm database 220 may contain one or more ML modeling algorithms supported by the Model Factory 102 (e.g., linear models, RandomForest, boosted trees, etc.).”)
Ahmadi, Britto and Chaoji are analogous art because they are all directed to machine learning. 
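It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Britto to incorporate the teaching of Chaoji to include a method or system for automated generation using a machine learning algorithm to enable platform services for an end-to-end sequence of modeling steps.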
One of ordinary skill in the art would have been motivated to make this modification in order to improve “model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics” as disclosed by Chaoji (abstract “The system further identifies common requirements between the user directive and one or more prior user directives and associates characteristics of the prior user directive, or model generated therefrom, with the user directive. The system further associates performance values generated by continuous monitoring of deployed ML models to individual characteristics of the user directive used to generate each of the deployed ML models. The system continuously improves model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics.”). 

Regarding claim 5
Ahmadi in view of Chaoji with Britto teaches the method of claim 1.
Ahmadi further teaches wherein the set of rules includes a rule for selecting the at least one model based on attributes of the input data. (Pg. 90 third paragraph “In the second step of the training phase, we generate fuzzy rules. This may be performed using another set of training vectors. Generated fuzzy rules can be used as a criterion to investigate if the topologies of competent classifiers are sufficient to perform an accurate classification task (section 3). After fuzzy rules and competent classifiers are fixed, our classification system will be designed.”)


Regarding claim 6
Ahmadi in view of Chaoji with Britto teaches the method of claim 1.
Ahmadi further teaches wherein updating the model selector comprises: adding a new rule to the set of rules; (pg. 98 “Looking at these matrices, the conclusion we reach is that the third rule in RM{3}, not only is a weak rule but also has been repeated in RM{2} which is a strong rule. Thus, for a better classification we will eliminate the third rule in rule matrix {3}. With new topology of NNs, we have generated a new rule which will improve the classification task.”) revising a rule in the set of rules; or revising the trainable selection model.

Regarding claim 10
Ahmadi in view of Chaoji with Britto teaches the method of claim 1.
Chaoji further teaches wherein analyzing the input data using the model group and the model selector comprises: analyzing a first portion of the input data using a first ML model in the model group; (col 9 lines 37-43 “At 302, the Model Factory 102 receives a current user directive 104(n) which includes one or more model requirements. For instance, the model requirements may include at least a target feature. Additionally, the requirements may include one or more raw data sources, or portions thereof, and one or more process steps to be conducted by the Model Factory 102. The user then executes the user directive.”)
and analyzing a second portion of the input data using a second ML model in the model group. (Col 10 lines 18-25 “At 308, the Model Factory 102 associates with the current user directive features associated with the prior user directive and the features are selected for model development of the current model. In addition, or alternatively, the Model Factory 102 may prune features prior to model generation using a combination of correlation and/or model based feature selection techniques to arrive at a smaller feature set having optimized predictive capability.”)
Ahmadi, Britto and Chaoji are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Britto to incorporate the teaching of Chaoji to include a method or system for automated generation using a machine learning algorithm to enable platform services for an end-to-end sequence of modeling steps.
One of ordinary skill in the art would have been motivated to make this modification in order to improve “model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics” as disclosed by Chaoji (abstract “The system further identifies common requirements between the user directive and one or more prior user directives and associates characteristics of the prior user directive, or model generated therefrom, with the user directive. The system further associates performance values generated by continuous monitoring of deployed ML models to individual characteristics of the user directive used to generate each of the deployed ML models. The system continuously improves model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics.”). 

Regarding claim 12
Ahmadi in view of Chaoji with Britto teaches the method of claim 1.
Chaoji further teaches wherein the model selector is further configured to determine a scheme for using the selected at least one ML model to analyze the input data. (Col 12 lines 23-34 “At 506, the Model Factory 102 determines a distribution of the model input data 116. Furthermore, the Model Factory 102 may additionally or alternatively determine a distribution of the ML model output data at 508. The distribution measurements may be a measure of statistical variance, for example a measure relative to a general tendency of the data such as mean, median, mode; and/or some measure of dispersion of the data such as the average deviation from the mean, mean square deviation, or root mean square deviation. The determined distribution of each of the data points in the input features and output scores is associated with a distribution score at 510.”)
Ahmadi, Britto and Chaoji are analogous art because they are all directed to machine learning. 
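It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Britto to incorporate the teaching of Chaoji to include a method or system for automated generation using a machine learning algorithm to enable platform services for an end-to-end sequence of modeling steps.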
One of ordinary skill in the art would have been motivated to make this modification in order to improve “model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics” as disclosed by Chaoji (abstract “The system further identifies common requirements between the user directive and one or more prior user directives and associates characteristics of the prior user directive, or model generated therefrom, with the user directive. The system further associates performance values generated by continuous monitoring of deployed ML models to individual characteristics of the user directive used to generate each of the deployed ML models. The system continuously improves model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics.”). 
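
For illustration only, the distribution monitoring quoted from Chaoji (col 12 lines 23-34 and 51-61) can be sketched as a dispersion measure compared against a threshold range. This is a hedged sketch: Chaoji lists several possible measures without fixing one, so the choice of root mean square deviation, the function names distribution_score and outside_threshold_range, and the numeric range are assumptions made here.

```python
from statistics import fmean, pstdev


def distribution_score(values):
    """Return (mean, root mean square deviation from the mean) of the values."""
    # pstdev is the population standard deviation, i.e. the RMS deviation
    # from the mean; it is one of the dispersion measures the quote lists.
    return fmean(values), pstdev(values)


def outside_threshold_range(values, low, high):
    """True when the dispersion falls outside [low, high] (assumed retrain trigger)."""
    _, rms_dev = distribution_score(values)
    return not (low <= rms_dev <= high)


scores = [0.2, 0.4, 0.6, 0.8]                       # hypothetical model output scores
print(distribution_score(scores))
print(outside_threshold_range(scores, 0.0, 0.1))    # range too tight for this spread
print(outside_threshold_range(scores, 0.0, 0.3))    # spread is within this range
```

A True result corresponds to the branch in which the quoted passage says the model is retrained, and False to continued monitoring.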

Regarding claim 13
Ahmadi in view of Chaoji with Britto teaches the method of claim 12.
Chaoji further teaches wherein the scheme for using the selected at least one ML model to analyze the input data comprises: analyzing a same portion of the input (col 12 lines 15-22 “At 504, the Model Factory 102 receives model output data 118 from the deployed ML model. The model output data 118 may be a copy of the output data or a reference indicating a location of the data in a data store. Still further, the reference may be a copy of the ML model and a reference to one or more raw data sources 106, or portions thereof, the raw data sources being identical to the model input data raw data source(s).”)
and selecting, from results of analyzing the same portion of the input data by the selected at least one ML model, (col 10 lines 18-25 “At 308, the Model Factory 102 associates with the current user directive features associated with the prior user directive and the features are selected for model development of the current model. In addition, or alternatively, the Model Factory 102 may prune features prior to model generation using a combination of correlation and/or model based feature selection techniques to arrive at a smaller feature set having optimized predictive capability.”)
a most common result as a result for the portion of the input data. (Col 12 lines 15-22 “At 504, the Model Factory 102 receives model output data 118 from the deployed ML model. The model output data 118 may be a copy of the output data or a reference indicating a location of the data in a data store. Still further, the reference may be a copy of the ML model and a reference to one or more raw data sources 106, or portions thereof, the raw data sources being identical to the model input data raw data source(s).”)
Ahmadi, Britto and Chaoji are analogous art because they are all directed to machine learning.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Britto to incorporate the teaching of Chaoji to include a method or system for automated generation using a machine learning algorithm to enable platform services for an end-to-end sequence of modeling steps.
One of ordinary skill in the art would have been motivated to make this modification in order to improve “model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics” as disclosed by Chaoji (abstract “The system further identifies common requirements between the user directive and one or more prior user directives and associates characteristics of the prior user directive, or model generated therefrom, with the user directive. The system further associates performance values generated by continuous monitoring of deployed ML models to individual characteristics of the user directive used to generate each of the deployed ML models. The system continuously improves model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics.”). 
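
For illustration only, the scheme recited in claim 13 (each selected model analyzes the same portion of the input data, and the most common result is taken as the result for that portion) can be sketched as a majority vote. This is a minimal sketch under assumptions: the function name most_common_result and the toy models are hypothetical, and the tie-breaking behavior is a property of Python's Counter, not something stated in the claim or the cited references.

```python
from collections import Counter


def most_common_result(models, portion):
    """Run every selected model on the same portion and return the modal result."""
    results = [model(portion) for model in models]
    # most_common(1) returns the highest count; on a tie, the result that
    # was encountered first is returned.
    return Counter(results).most_common(1)[0][0]


# Three hypothetical classifiers applied to the same input portion.
selected_models = [
    lambda text: "greeting",
    lambda text: "question",
    lambda text: "greeting",
]
result = most_common_result(selected_models, "hello there")
print(result)  # "greeting": two of the three models agree
```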

Claims 4 and 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Ahmadi et al. (“Fuzzy Modification of Mixture of Experts”) in view of Chaoji et al. (US Pat No. 10380498 B1) and Britto et al. (“Dynamic selection of classifiers-A comprehensive review”), and further in view of Baughman et al. (US Pat No. 10318552 B2).
Regarding claim 4
Ahmadi in view of Chaoji with Britto teaches the method of claim 1.
Ahmadi in view of Chaoji with Britto does not teach wherein the set of scoring metrics comprises a business goal.
Baughman teaches wherein the set of scoring metrics comprises a business goal. (Examiner notes that the scoring data is normalized into a numerical representation and is used to produce a prediction in locating natural resources, as evidenced by col 2 lines 19-23; see also col 17 lines 24-31 “The threshold that limits combining clusters is determined by scoring the favorable cluster space based on a metric, such as the overall accuracy of an experiment or a RAND index, for example. A RAND index or linear discriminate analysis may be used to tightly couple clusters or spread clusters out. A rand index is a measure of the similarity between two data clusters, and linear discriminate analysis finds a linear combination of features that can be used to separate or characterize two data clusters.”)
Ahmadi, Chaoji, Britto and Baughman are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Chaoji with Britto to incorporate the teaching of Baughman to include a method or system for data analytics and probability mapping of natural resource location.
One of ordinary skill in the art would have been motivated to make this modification in order to improve the accuracy of locating natural resources effectively using machine learning model and improved decision making of natural resource location as disclosed by Baughman (col 1 lines 20-26 “Techniques and processes may rely on the experience of previous success, or make use of scientific discoveries or relationships. In some instances real-time data from instruments may be used to determine locations most likely to include quantities of sought-after natural resources. Large quantities of information and data may exist, which may be pertinent to discovering or improving the accuracy of locating natural resources; however, effective and meaningful ways of using multiple sources and vast quantities of data that may be available for improved decision making of natural resource location has been elusive.”). 
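
For illustration only, the Rand index named in the Baughman passage quoted for claim 4 (a measure of the similarity between two data clusterings) can be sketched as follows. The function name `rand_index` and the sample labels are hypothetical and do not appear in Baughman; this is the standard pairwise-agreement definition, not a reproduction of Baughman's implementation.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    A pair "agrees" when both clusterings put the two items in the same
    cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # zero or one item: trivially identical clusterings
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical labels: the same grouping under different cluster names.
same_grouping = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
```

A value near 1 indicates that the two clusterings group the items almost identically, which is the sense in which such a metric could score a "favorable cluster space" against a threshold.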

Regarding claim 8
Ahmadi in view of Chaoji with Britto teaches the method of claim 1. 
Ahmadi in view of Chaoji with Britto does not teach wherein the input data includes real-time input data from a production environment.  
Baughman teaches wherein the input data includes real-time input data from a production environment. (Col 15 lines 6-10 “Resource locator program 300 also receives real-time data input (step 310). The real-time data may include, for example, data from sensors measuring current temperature at one or more locations, wind speed, humidity, weather conditions, or tidal conditions. Real-time data may also include social media communications related to aspects of one or more natural resources.”)
Ahmadi, Chaoji, Britto and Baughman are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Chaoji with Britto to incorporate the teaching of Baughman to include a method or system for data analytics and probability mapping of natural resource location.
One of ordinary skill in the art would have been motivated to make this modification in order to improve the accuracy of locating natural resources effectively using machine learning model and improved decision making of natural resource location as disclosed by Baughman (col 1 lines 20-26 “Techniques and processes may rely on the experience of previous success, or make use of scientific discoveries or relationships. In some instances real-time data from instruments may be used to determine locations most likely to include quantities of sought-after natural resources. Large quantities of information and data may exist, which may be pertinent to discovering or improving the accuracy of locating natural resources; however, effective and meaningful ways of using multiple sources and vast quantities of data that may be available for improved decision making of natural resource location has been elusive.”). 

Regarding claim 9
Ahmadi in view of Chaoji with Britto teaches the method of claim 8. 
Ahmadi in view of Chaoji with Britto does not teach wherein the input data includes contextual data of the production environment. 
Baughman teaches wherein the input data includes contextual data of the production environment. (Pg. 303 “The data going into a lake contain logs and sensor data (e.g., from the Internet of Things), low level customer behavior (e.g., Website click streams), social media, document collections of (e.g., email and customer files), geo-location trails, images, video and audio and another data useful for integrated analysis. The data lake governance includes application framework to capture and contextualize data by cataloging and indexing and further advanced metadata management.”)
Ahmadi, Chaoji, Britto and Baughman are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Chaoji with Britto to incorporate the teaching of Baughman to include a method or system for data analytics and probability mapping of natural resource location.
One of ordinary skill in the art would have been motivated to make this modification in order to improve the accuracy of locating natural resources effectively using machine learning model and improved decision making of natural resource location as disclosed by Baughman (col 1 lines 20-26 “Techniques and processes may rely on the experience of previous success, or make use of scientific discoveries or relationships. In some instances real-time data from instruments may be used to determine locations most likely to include quantities of sought-after natural resources. Large quantities of information and data may exist, which may be pertinent to discovering or improving the accuracy of locating natural resources; however, effective and meaningful ways of using multiple sources and vast quantities of data that may be available for improved decision making of natural resource location has been elusive.”). 

Claims 7, 11, 14-16 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Ahmadi et al. (“Fuzzy Modification of Mixture of Experts”) in view of Chaoji et al. (US Pat No. 10380498 B1) in view of Britto et al. and further in view of Breckenridge et al. (US Pat No. 8595154 B2). 
Regarding claim 7
Ahmadi in view of Chaoji with Britto teaches the method of claim 1. 
Chaoji further teaches wherein updating the model group comprises: retraining a first ML model in the model group based on the analyzing and the score; (Col 12 lines 51-61 “At 514, the Model Factory 102 determines whether the score distribution is within a threshold range. The threshold range may be determined from the user directive, or by comparing score distributions over time. Score distributions falling outside of the determined threshold range are retrained at 516. Retraining the ML model 516 may also include refreshing, rebuilding, or rescoring the ML model. Scores reflecting an acceptable level of distribution (i.e., within the threshold range) result in continued ML performance monitoring at 502. Data sampling rates may be determined comparing score distribution over time.”)
Ahmadi in view of Chaoji with Britto does not teach and adding the retrained first ML model to the model group. 
Breckenridge teaches and adding the retrained first ML model to the model group. (Col 9 lines 56-62 “Each trained predictive model can be associated with its respective effectiveness score. One or more of the trained predictive models in the repository 215 are updateable predictive models. In some implementations, the predictive models stored in the repository 215 are trained using the entire initial training data, i.e., all K partitions and not just K-1 partitions.”)
Ahmadi, Chaoji, Britto and Breckenridge are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Chaoji with Britto to incorporate the teaching of Breckenridge to include a method or system for a dynamic predictive modeling platform for training and retraining predictive models. 
One of ordinary skill in the art would have been motivated to make this modification in order to reduce “the amount of training data that may be required to train a predictive model can be large, e.g., in the order of gigabytes or terabytes” and obtain a desired predictive output as disclosed by Breckenridge (col 1 lines 15-28 “Various types of predictive models can be used to analyze data and generate predictive outputs. Typically, a predictive model is trained with training data that includes input data and output data that mirror the form of input data that will be entered into the predictive model and the desired predictive output, respectively. The amount of training data that may be required to train a predictive model can be large, e.g., in the order of gigabytes or terabytes. The number of different types of predictive models available is extensive, and different models behave differently depending on the type of input data. Additionally, a particular type of predictive model can be made to behave differently, for example, by adjusting the hyper-parameters or via feature induction or selection.”). 
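
For illustration only, the combination relied on for claim 7 (Chaoji's retraining when the score distribution falls outside a threshold range, with the retrained model then kept alongside the others as in Breckenridge's repository) can be sketched as follows. Summarizing the score distribution by its mean is an assumption, since the quoted passage does not say how the distribution is compared to the range; the names `needs_retraining`, `update_model_group`, and the `retrain` callback are hypothetical.

```python
from statistics import mean


def needs_retraining(scores, low, high):
    """Return True when the mean score lies outside [low, high].

    Assumption: the "score distribution" is summarized by its mean.
    """
    return not (low <= mean(scores) <= high)


def update_model_group(model_group, model_name, scores, low, high, retrain):
    """Retrain the named model if its scores drift out of range, then
    add the retrained model to the group (hypothetical representation:
    the group is a list of model names)."""
    updated = list(model_group)  # leave the caller's list untouched
    if needs_retraining(scores, low, high):
        updated.append(retrain(model_name))
    return updated


# Hypothetical usage: scores for "m1" have fallen below the range.
group = update_model_group(
    ["m1", "m2"], "m1", [0.2, 0.3], 0.7, 1.0,
    retrain=lambda name: name + "-retrained",
)
```

When the scores stay within the range, the group is returned unchanged, which corresponds to the "continued ML performance monitoring" branch in the quoted Chaoji passage.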

Regarding claim 11
Ahmadi in view of Chaoji with Britto teaches the method of claim 10. 
Ahmadi in view of Chaoji with Britto does not teach wherein analyzing the input data using the model group and the model selector further comprises analyzing a third portion of the input data using a third ML model in the model group.  
Breckenridge teaches wherein analyzing the input data using the model group and the model selector further comprises analyzing a third portion of the input data using a third ML model in the model group. (Examiner notes that FIGS. 6-7 show a condition-satisfied check; when the condition is not satisfied, the process loops back to 602 or 702 to continue training the machine learning model, and the third time it loops back it uses a third machine learning model to train on the input data; see col 17 lines 23-32 “The process 700 begins with providing access to a first trained predictive model (e.g., trained predictive model 218) (Box 702). That is, for example, operations such as those described above in reference to boxes 602–612 of FIG. 6 can have already occurred such that the first trained predictive model has been selected (e.g., based on effectiveness) and access to the first trained predictive model has been provided, e.g., to the client computing system 202. In another example, the first trained predictive model can be a trained predictive model that was trained using the initial training data.”)
Ahmadi, Chaoji, Britto and Breckenridge are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Chaoji with Britto to incorporate the teaching of Breckenridge to include a method or system for a dynamic predictive modeling platform for training and retraining predictive models. 
One of ordinary skill in the art would have been motivated to make this modification in order to reduce “the amount of training data that may be required to train a predictive model can be large, e.g., in the order of gigabytes or terabytes” and obtain a desired predictive output as disclosed by Breckenridge (col 1 lines 15-28 “Various types of predictive models can be used to analyze data and generate predictive outputs. Typically, a predictive model is trained with training data that includes input data and output data that mirror the form of input data that will be entered into the predictive model and the desired predictive output, respectively. The amount of training data that may be required to train a predictive model can be large, e.g., in the order of gigabytes or terabytes. The number of different types of predictive models available is extensive, and different models behave differently depending on the type of input data. Additionally, a particular type of predictive model can be made to behave differently, for example, by adjusting the hyper-parameters or via feature induction or selection.”). 

Regarding claim 14
Ahmadi in view of Chaoji with Britto teaches the method of claim 1. 
Ahmadi in view of Chaoji with Britto does not teach the method further comprising: reporting usage of the one or more ML models in the model group for the analyzing. 
Breckenridge teaches the method further comprising: reporting usage of the one or more ML models in the model group for the analyzing. (Col 9 lines 4-8 “In other implementations, techniques other than, or in addition to, cross-validation can be used to estimate the effectiveness. In one example, the resource usage costs for using the trained model can be estimated and can be used as a factor to estimate the effectiveness of the trained model.”) 
Ahmadi, Chaoji, Britto and Breckenridge are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Chaoji with Britto to incorporate the teaching of Breckenridge to include a method or system for a dynamic predictive modeling platform for training and retraining predictive models. 
One of ordinary skill in the art would have been motivated to make this modification in order to reduce “the amount of training data that may be required to train a predictive model can be large, e.g., in the order of gigabytes or terabytes” and obtain a desired predictive output as disclosed by Breckenridge (col 1 lines 15-28 “Various types of predictive models can be used to analyze data and generate predictive outputs. Typically, a predictive model is trained with training data that includes input data and output data that mirror the form of input data that will be entered into the predictive model and the desired predictive output, respectively. The amount of training data that may be required to train a predictive model can be large, e.g., in the order of gigabytes or terabytes. The number of different types of predictive models available is extensive, and different models behave differently depending on the type of input data. Additionally, a particular type of predictive model can be made to behave differently, for example, by adjusting the hyper-parameters or via feature induction or selection.”). 

Regarding claim 15
Ahmadi in view of Chaoji with Britto teaches the method of claim 1. 
Ahmadi in view of Chaoji with Britto does not teach the method further comprising: receiving a plurality of ML models; selecting the one or more ML models from the plurality of ML models; determining a common schema for the one or more ML models; converting a first ML model having a schema different from the common schema based on the common schema; and adding the converted first ML model to the model group.  
Breckenridge teaches the method further comprising: receiving a plurality of ML models; (FIG. 6 step 610 shows receiving a plurality of machine learning models; col 16 lines 22-33 “That is, retrained predictive models are generated (Box 608) using: the training data queue 213; the updateable trained predictive models obtained from the repository 215; and the corresponding training functions that were initially used to train the updateable trained predictive models, which training functions are obtained from the training function repository 216. The effectiveness of each of the generated retrained predictive models is estimated (Box 610). The effectiveness can be estimated, for example, in the manner described above in reference to FIG. 5 and an effectiveness score for each retrained predictive model can be generated.”)
selecting the one or more ML models from the plurality of ML models; (col 16 lines 34-42 “A trained predictive model is selected from the multiple trained predictive models based on their respective effectiveness scores. That is, the effectiveness scores of the retrained predictive models and the effectiveness scores of the trained predictive models already stored in the repository 215 can be compared and the most effective model, i.e., a first trained predictive model, selected. Access is provided to the first trained predictive model to the client computing system 202 (Box 612).”) 
determining a common schema for the one or more ML models; (col 16 lines 42-47 “As was discussed above, in some implementations, the effectiveness of each retrained predictive model can be compared to the effectiveness of the updateable trained predictive model from which it was derived, and the most effective of the two models stored in the repository 215 and the other discarded.”)
converting a first ML model having a schema different from the common schema based on the common schema; (Examiner interprets converting as retraining/updating the model; see col 14 lines 8-20 “In some implementations, the effective score of a retrained predictive model is determined by tallying the results from the initial cross-validation (i.e., done for the updateable predictive model from which the retrained predictive was generated) and adding in the retrained predictive models score on each new piece of training data. By way of illustrative example, consider Model A that was trained with a batch of 100 training samples and has an estimated 67% accuracy as determined from cross-validation. Model A then is updated (i.e., retrained) with 10 new training samples, and the retrained Model A gets 5 predictive outputs correct and 5 predictive outputs incorrect.”) 
and adding the converted first ML model to the model group. (Col 16 lines 20-28 “The updateable trained predictive models that are stored in the repository 215 are “updated” with the training data stored in the training data queue 213. That is, retrained predictive models are generated (Box 608) using: the training data queue 213; the updateable trained predictive models obtained from the repository 215; and the corresponding training functions that were initially used to train the updateable trained predictive models, which training functions are obtained from the training function repository 216.”) 
Ahmadi, Chaoji, Britto and Breckenridge are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Chaoji with Britto to incorporate the teaching of Breckenridge to include a method or system for a dynamic predictive modeling platform for training and retraining predictive models. 
One of ordinary skill in the art would have been motivated to make this modification in order to reduce “the amount of training data that may be required to train a predictive model can be large, e.g., in the order of gigabytes or terabytes” and obtain a desired predictive output as disclosed by Breckenridge (col 1 lines 15-28 “Various types of predictive models can be used to analyze data and generate predictive outputs. Typically, a predictive model is trained with training data that includes input data and output data that mirror the form of input data that will be entered into the predictive model and the desired predictive output, respectively. The amount of training data that may be required to train a predictive model can be large, e.g., in the order of gigabytes or terabytes. The number of different types of predictive models available is extensive, and different models behave differently depending on the type of input data. Additionally, a particular type of predictive model can be made to behave differently, for example, by adjusting the hyper-parameters or via feature induction or selection.”). 
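
For illustration only, the Breckenridge passages quoted for claim 15 (tallying an effectiveness score for a retrained model, then selecting the most effective model) can be sketched as follows. The tally formula is an assumption: it is one plausible reading of "tallying the results from the initial cross-validation and adding in the retrained predictive model's score on each new piece of training data", and the quoted excerpt does not state the resulting figure. The function names are hypothetical.

```python
def tallied_effectiveness(prior_accuracy, prior_count, new_correct, new_count):
    """Combine the cross-validation accuracy with results on new samples.

    Assumed formula: (prior correct + new correct) / (all samples seen).
    """
    prior_correct = prior_accuracy * prior_count
    return (prior_correct + new_correct) / (prior_count + new_count)


def select_most_effective(effectiveness_scores):
    """Return the name of the model with the highest effectiveness score."""
    return max(effectiveness_scores, key=effectiveness_scores.get)


# Figures from the quoted Model A example: 100 samples at 67% accuracy,
# then 10 new samples with 5 correct predictive outputs.
retrained_score = tallied_effectiveness(0.67, 100, 5, 10)

chosen = select_most_effective(
    {"Model A": 0.67, "Model A (retrained)": retrained_score}
)
```

Under this assumed formula the retrained Model A scores 72/110, roughly 0.65, which is slightly below the original 0.67, so the comparison step would keep the original model.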

Regarding claim 18
Claim 18 recites analogous limitations to claim 15 and therefore is rejected on the same ground as claim 15. 

Regarding claim 16
Ahmadi in view of Chaoji with Britto and Breckenridge teaches the method of claim 15. 
Ahmadi further teaches wherein determining the common schema for the one or more ML models comprises: determining the common schema based on a union of schemas for the one or more ML models; (Each expert [corresponding to a machine learning model] is trained using a three-layer neural network, which corresponds to sharing a common schema; see section 2.1 “In the proposed system we use three layer NNs and look for the best number of neurons in the hidden layer based on the criterion discussed in section 2.3.”)
Breckenridge further teaches … adding one of two congruent features in two respective schemas for two ML models to the common schema; (col 18 lines 1-8 “such that if adding training data from the training data queue 213 will cause the maximum volume to be exceeded, then some of the training data is deleted. The particular training data that is to be deleted can be selected based on the date of receipt (e.g., the oldest data is deleted first), selected randomly, selected sequentially if the training data is ordered in some fashion, based on a property of the training data itself, or otherwise selected.”)
or dropping a feature in a schema for a second ML model based on determining that the feature has an importance level below a second threshold value. (Col 18 lines 15-21 “A set of feature vectors can be deleted that includes a larger proportion of easily classified feature vectors. That is, based on an estimation of how hard the classification is, the feature vectors included in the stored training data can be pruned to satisfy either a threshold volume of data or another constraint used to control what is retained in the training data repository 214.”; see also col 14 lines 21-30 “In some implementations, the effectiveness score of the retrained predictive model is compared to the effectiveness score of the trained predictive model from which the retrained predictive model was derived. If the retrained predictive model is more effective, then the retrained predictive model can replace the initially trained predictive model in the predictive model repository 215. If the retrained predictive model is less effective, then it can be discarded. In other implementations, both predictive models are stored in the repository, which therefore grows in size.”) 
Ahmadi, Chaoji, Britto and Breckenridge are analogous art because they are all directed to machine learning. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ahmadi in view of Chaoji with Britto to incorporate the teaching of Breckenridge to include a method or system for a dynamic predictive modeling platform for training and retraining predictive models. 
One of ordinary skill in the art would have been motivated to make this modification in order to reduce “the amount of training data that may be required to train a predictive model can be large, e.g., in the order of gigabytes or terabytes” and obtain a desired predictive output as disclosed by Breckenridge (col 1 lines 15-28 “Various types of predictive models can be used to analyze data and generate predictive outputs. Typically, a predictive model is trained with training data that includes input data and output data that mirror the form of input data that will be entered into the predictive model and the desired predictive output, respectively. The amount of training data that may be required to train a predictive model can be large, e.g., in the order of gigabytes or terabytes. The number of different types of predictive models available is extensive, and different models behave differently depending on the type of input data. Additionally, a particular type of predictive model can be made to behave differently, for example, by adjusting the hyper-parameters or via feature induction or selection.”). 
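
For illustration only, the claim 16 limitations discussed above (a common schema formed from a union of schemas, with congruent features added once and low-importance features dropped) can be sketched as follows. Everything in the sketch is hypothetical: neither the claims as quoted nor the cited references specify a data representation, so each schema is assumed to be a mapping from feature name to importance level, and congruent features are assumed to share a name.

```python
def common_schema(schemas, importance_threshold):
    """Build a common schema from per-model schemas.

    Assumptions: each schema maps feature name -> importance level;
    features with the same name are treated as congruent and added once,
    keeping the highest importance seen; features whose importance is
    below the threshold are dropped.
    """
    merged = {}
    for schema in schemas:
        for feature, importance in schema.items():
            # Union step: a congruent feature is recorded only once.
            merged[feature] = max(importance, merged.get(feature, importance))
    return sorted(
        feature
        for feature, importance in merged.items()
        if importance >= importance_threshold
    )


# Hypothetical schemas for two ML models.
schema_first = {"age": 0.9, "zip_code": 0.1}
schema_second = {"age": 0.8, "income": 0.5}
shared = common_schema([schema_first, schema_second], importance_threshold=0.2)
```

Here `age` appears in both schemas but is added once, and `zip_code` is dropped because its importance level falls below the threshold.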

Regarding claim 19
Claim 19 recites analogous limitations to claim 16 and therefore is rejected on the same ground as claim 16. 

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Avnimelech et al. (“Boosted Mixture of Experts: An Ensemble Learning Scheme”) teaches a new supervised learning procedure for ensemble machines, in which outputs of predictors, trained on different distributions, are combined by a dynamic classifier combination mode.
Cruz et al. (“Dynamic classifier selection: Recent advances and perspectives”) teaches an updated taxonomy based on the main characteristics found in a dynamic selection system, including the methodology used to define a local region for the estimation of the local competence of the base classifiers. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VAN C MANG whose telephone number is (571)270-7598. The examiner can normally be reached Mon - Fri 8:00-5:00pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached on (571)272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/V.M./Examiner, Art Unit 2126 
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126