DETAILED ACTION
1.	This action is in response to the application 15/919628 filed on March 13, 2018. Claims 1-20 are pending and have been examined.
Notice of Pre-AIA or AIA Status
2.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
3.	35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

4.	Claims 1 and 4 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 analysis:
In the instant case, claims 1 and 4 are directed to a method. Thus, the claims fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2 analysis:
Step 2A: Prong 1 analysis:
The claim(s) recite(s):
Claims 1 and 4:
Determining accuracy score… (mental process); 
Updating a model selector… (mental process); 
Providing inference requests… (mental process);

Step 2A: Prong 2 analysis:
	This judicial exception is not integrated into a practical application because the additional element in claims 1 and 4, “obtaining… inference results…”, is mere data gathering that adds insignificant extra-solution activity to the judicial exception, as discussed in MPEP 2106.05(g). Also, in claims 1 and 4 the additional elements “machine learning models” generally link the use of the judicial exception to a particular technological environment or field of use (machine learning technology), as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Please see MPEP 2106.04(a)(2)(III)(C). The claims are directed to an abstract idea.
Step 2B analysis:
	The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of claims 1 and 4 merely add insignificant extra-solution activity to the judicial exception and generally link the use of the judicial exception to a particular technological environment or field of use. The obtaining step is insignificant extra-solution activity that is a well-understood, routine, and conventional function supported under the Berkheimer analysis in MPEP 2106.05(d)(I). There is no inventive concept in the claims. The claims are not patent eligible.
5.	Claim(s) 2 and 3 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more, and the rejection of claim 1 is 
6.	Claim(s) 5-9 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more, and the rejection of claim 4 is incorporated into claims 5-9. Claims 5-9 recite more specifics to the judicial exceptions identified in the rejection of claim 4. They recite the following limitations: an updated distribution indicating that an ML model is to be selected at a higher or lower likelihood, using a common input data, comparing the first plurality of inference results, comparing inference results with ground truth confirmations, and analysis of explicit or implied user feedback. These limitations are abstract ideas of the “mental process” grouping, which can be performed in one’s mind with the aid of pencil and paper. Claims 5-9 do not recite any additional elements, other than those recited in claim 4, which integrate the judicial exception into a practical application or amount to significantly more. The claims are not patent eligible.
7.	Claim(s) 10 and 11 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more, and the rejection of claim 4 is incorporated into claims 10-11. They recite more specifics to the judicial exceptions identified 
8.	Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more, and the rejection of claim 4 is incorporated into claim 12. Claim 12 recites more specifics to the judicial exceptions identified in the rejection of claim 4. It recites containers, computer devices, and a provider network. The additional elements are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using a generic computer component. Claim 12 does not recite any other additional elements which integrate the judicial exception into a practical application or amount to significantly more. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Please see MPEP 2106.05(b). The claim is not patent eligible.
9.	Claim(s) 13-15 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more, and the rejection of claim 4 is incorporated into claims 13-15. Claims 13-15 recite more specifics to the judicial exceptions identified in the rejection of claim 4. Claims 13-15 recite the following limitations: providing an inference request, generating a result, receiving a message, providing an inference request, 
10.	Claim 16 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 analysis:
In the instant case, claim 16 is directed to a system. Thus, the claim falls within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2 analysis:
Step 2A: Prong 1 analysis:
The claim(s) recite(s):
Claim 16:
Select one or more machine learning models… (mental process);
Determining accuracy score… (mental process); 
Updating a model selector… (mental process); 
Accordingly, the claim recites an abstract idea, which is one of the judicial exceptions.
Step 2A: Prong 2 analysis:
	This judicial exception is not integrated into a practical application because the additional element in claim 16, “obtain… inference results…” is mere data gathering and adding 
Step 2B analysis:
	The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of claim 16 merely add insignificant extra-solution activity to the judicial exception and generally link the use of the judicial exception to a particular technological environment or field of use. The obtaining step is insignificant extra-solution activity that is a well-understood, routine, and conventional function supported under the Berkheimer analysis in MPEP 2106.05(d)(I). There is no inventive concept in the claim. The claim is not patent eligible.
11.	Claim(s) 17-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more, and the rejection of claim 16 is incorporated into claims 17-20.  Claims 17-20 recite more specifics to the judicial exceptions 
Claim Rejections - 35 USC § 103
12.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

13.	Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 20150332169 A1 to Bivens et al. (hereinafter, “Bivens”), in view of US 10904360 B1 to Govan et al. (hereinafter, “Govan”).
Regarding claim 1, Bivens teaches obtaining a first plurality of inference results generated by a first machine learning (ML) model and a second plurality of inference results generated by a second machine learning (ML) model, wherein the first ML model and the second ML model are part of a group of ML models associated with one or more users and that generate a common type of inference (Bivens para. [0023] discloses, “The trained models 
	determining, based at least in part on the first plurality of inference results and the second plurality of inference results, a first accuracy score corresponding to the first ML model and a second accuracy score corresponding to the second ML model, wherein each accuracy score indicates an amount of correctness of the inferences generated by the corresponding ML model, wherein the first accuracy score indicates that the first ML model is of higher quality than the second ML model (Bivens fig. 6 and para. [0045] disclose a method of calculating accuracy. Bivens gives an example where “In run 1 of one trained model, examples X and Y are correctly predicted, and example Z is incorrectly predicted. In run 2 of another trained model, examples X and Z are correctly predicted, but example Y is mispredicted. The accuracy using traditional method is the same for both runs, which is 0.67. With the proposed evaluation method, run 1 is penalized since the mispredicated example Z has higher trustworthiness score than example Y, which is mispredicated by run 2. Therefore run 1 has a lower accuracy compared to run 2.” Bivens teaches an example of calculating accuracy, corresponding to trained models, using a weight proportional to a trustworthiness measure. Accuracy measurement using weighted trustworthiness results in the model in run 2 having a higher accuracy score than the model in run 1).
Bivens fails to explicitly teach updating a model selector to cause the model selector to select between the first ML model and the second ML model for generating inferences for inference requests according to an updated distribution, the updated distribution indicating that a comparatively larger amount of inference requests are to be provided to the first ML model than as indicated by a previous distribution and providing, by the model selector, a plurality of inference requests to the first ML model and the second ML model according to the updated distribution. However, Govan teaches:
	updating a model selector to cause the model selector to select between the first ML model and the second ML model for generating inferences for inference requests according to an updated distribution, the updated distribution indicating that a comparatively larger amount of inference requests are to be provided to the first ML model than as indicated by a previous distribution (Govan para. [51] – [53] discloses dynamic selection mechanism for choosing from ensemble of models. Govan discloses, “There are many other possible ensembles that could be chosen from. The particular ensemble (which may be a single recommender or a hybrid recommendation ensemble) may be chosen and parameterized at request time (in real-time) by a dynamic selection mechanism. The selection criteria may be chosen and continually updated by an online learning and optimization system. One particular embodiment of this kind of online learning scheme is a multi-armed bandit where each “arm” is used to select a set of parameters and/or a recommendation strategy. Another embodiment of this is a reinforcement learning system which learns through repeated trial-and-error to select the optimal mixture and/or individual recommendation strategy.” Govan teaches that selection criteria can be chosen and updated and that several types of weighting can be applied across 
	providing, by the model selector, a plurality of inference requests to the first ML model and the second ML model according to the updated distribution (Govan para. [51] – [52] discloses recommendation strategies that can change based on request time for a system that automatically adjusts based on context. Govan also discloses that weighting can be applied to models and for tuning of features within models to provide results. Models generating inference results implies that there is a plurality of requests to the models for generating these inference results according to the update).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Bivens and incorporate the teachings of Govan with a motivation to update the model selector to select between machine learning models, assign more weight to one of the machine learning models, and provide a plurality of inference requests to the models. One would be motivated to use the combination to determine which strategy (ML model) would be best for a customer at a particular time and to update the recommendation strategy with a system that automatically adjusts based on context or by the weighting applied to individual models, in order to manage hundreds of customer requests and gain a better overall result (Govan para. [51] – [52]).
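For illustration only, the claimed behavior of updating a model selector so that a comparatively larger share of inference requests goes to the higher-scoring model can be sketched as follows. This is a minimal sketch under stated assumptions: the class name `ModelSelector`, the model names, the accuracy values, and the accuracy-proportional update rule are all hypothetical. It is not the multi-armed bandit or reinforcement learning scheme described in Govan, and it is not taken from the application.

```python
import random


class ModelSelector:
    """Hypothetical selector that routes requests according to a distribution."""

    def __init__(self, model_names):
        # Start from a uniform distribution over the models.
        n = len(model_names)
        self.distribution = {name: 1.0 / n for name in model_names}

    def update(self, accuracy_scores):
        # Assumed update rule: weight each model in proportion to its accuracy score.
        total = sum(accuracy_scores.values())
        self.distribution = {
            name: score / total for name, score in accuracy_scores.items()
        }

    def select(self, rng=random):
        # Draw one model name according to the current distribution.
        names = list(self.distribution)
        weights = [self.distribution[name] for name in names]
        return rng.choices(names, weights=weights, k=1)[0]


selector = ModelSelector(["model_a", "model_b"])
previous = dict(selector.distribution)

# Hypothetical accuracy scores; model_a is assumed to be the more accurate model.
selector.update({"model_a": 0.9, "model_b": 0.6})

# Route a batch of requests with a seeded generator for repeatability.
rng = random.Random(0)
routed = [selector.select(rng) for _ in range(1000)]
```

Under these assumed scores, the updated distribution gives `model_a` a larger share than the previous uniform distribution did, so most of the routed requests go to `model_a`.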
	As per claim 2, the combination of Bivens and Govan as shown above teaches the method of claim 1, wherein 
determining the first accuracy score and the second accuracy score is based at least in part on comparing the first plurality of inference results and the second plurality of inference results with a corresponding plurality of ground truth confirmations obtained using input data that was used by the first ML model and the second ML model to generate the first plurality of inference results and the second plurality of inference results (Bivens para. [0025] discloses using trustworthiness to weigh the selection of labeled data for training in supervised machine learning. Weighing labeled data for training in supervised learning implies using ground truth confirmations. Bivens in para. [0045] gives an example of calculating accuracy using a weight proportional to a trustworthiness measure corresponding to each trained machine learning model).
	As per claim 3, the combination of Bivens and Govan as shown above teaches the method of claim 1, wherein 
	determining the first accuracy score and the second accuracy score is based at least in part on an analysis of explicit or implied user feedback provided by the one or more users (Bivens para. [0037] discloses using trustworthiness scores to train models. The trustworthiness scores may be computed from the degree of a user’s knowledge and/or experience. Bivens in para. [0045] teaches calculating an accuracy score for different runs using a weight proportional to the trustworthiness score for given examples).
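For illustration only, the trustworthiness-weighted accuracy calculation that the Office action attributes to Bivens para. [0045] can be sketched as follows. The correct and incorrect predictions for examples X, Y, and Z follow the quoted example. The trustworthiness weights, the function name `weighted_accuracy`, and the normalization by total weight are assumptions made here, since the Office action does not reproduce the formula or the weight values.

```python
def weighted_accuracy(correct, trust):
    """Accuracy in which each example counts in proportion to its trust weight."""
    total_weight = sum(trust.values())
    correct_weight = sum(trust[name] for name, ok in correct.items() if ok)
    return correct_weight / total_weight


# Hypothetical trustworthiness weights; Z is assumed more trustworthy than Y.
trust = {"X": 1.0, "Y": 0.5, "Z": 1.5}

# Per the quoted example: run 1 mispredicts Z, run 2 mispredicts Y.
run1 = {"X": True, "Y": True, "Z": False}
run2 = {"X": True, "Y": False, "Z": True}

# Traditional (unweighted) accuracy is identical for both runs.
plain_run1 = sum(run1.values()) / len(run1)
plain_run2 = sum(run2.values()) / len(run2)

# Weighted accuracy penalizes run 1 for missing the more trustworthy example.
acc_run1 = weighted_accuracy(run1, trust)
acc_run2 = weighted_accuracy(run2, trust)
```

With these assumed weights, both runs have an unweighted accuracy of about 0.67, while the weighted accuracy of run 1 is lower than that of run 2, consistent with the outcome described in the quoted passage.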
As per claim 4, Bivens teaches
	obtaining a plurality of inference results generated by a plurality of machine learning (ML) models of a group that generate a common type of inference (Bivens para. [0023] 
	determining, based at least in part on the plurality of inference results, a plurality of accuracy scores corresponding to the plurality of ML models (Bivens fig. 6 and para. [0045] disclose a method of calculating accuracy. Bivens gives an example where “In run 1 of one trained model, examples X and Y are correctly predicted, and example Z is incorrectly predicted. In run 2 of another trained model, examples X and Z are correctly predicted, but example Y is mispredicted. The accuracy using traditional method is the same for both runs, which is 0.67. With the proposed evaluation method, run 1 is penalized since the mispredicated example Z has higher trustworthiness score than example Y, which is mispredicated by run 2. Therefore run 1 has a lower accuracy compared to run 2.” Bivens teaches an example of calculating accuracy corresponding to a plurality of trained models).
	Bivens fails to explicitly teach updating a model selector, based on the plurality of accuracy scores, to cause the model selector to select ones of the plurality of ML models to generate inferences for inference requests according to an updated distribution and providing, by the model selector, a plurality of inference requests to the plurality of ML models according to the updated distribution. However, Govan teaches:
updating a model selector, based on the plurality of accuracy scores, to cause the model selector to select ones of the plurality of ML models to generate inferences for inference requests according to an updated distribution (Govan para. [51] – [53] discloses dynamic selection mechanism for choosing from ensemble of models. Govan discloses, “There are many other possible ensembles that could be chosen from. The particular ensemble (which may be a single recommender or a hybrid recommendation ensemble) may be chosen and parameterized at request time (in real-time) by a dynamic selection mechanism. The selection criteria may be chosen and continually updated by an online learning and optimization system. One particular embodiment of this kind of online learning scheme is a multi-armed bandit where each “arm” is used to select a set of parameters and/or a recommendation strategy. Another embodiment of this is a reinforcement learning system which learns through repeated trial-and-error to select the optimal mixture and/or individual recommendation strategy.” Govan teaches selection criteria for a model that can be chosen and updated with several types of weighting that can be applied across and within models).
providing, by the model selector, a plurality of inference requests to the plurality of ML models according to the updated distribution (Govan para. [51] – [52] discloses recommendation strategies that can change based on request time for a system that automatically adjusts based on context. Govan also discloses that weighting can be applied to models and for tuning of features within models to provide results. Models generating inference results implies that there is a plurality of requests to the models for generating these inference results according to the update).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Bivens and incorporate the teachings of Govan with a motivation to update the model selector to select between the 
As per claim 5, the combination of Bivens and Govan as shown above teaches the method of claim 4, wherein:
the updated distribution indicates that a first ML model of the plurality of ML models is to be selected to generate inferences at a higher likelihood compared to a corresponding likelihood of a previous distribution utilized by the model selector (Govan para. [51] – [53] teaches selection criteria for a model that can be chosen and updated with several types of weighting that can be applied across and within models. Bivens fig. 6 and para. [0045] disclose a method of calculating accuracy where an ML model using weight with higher trustworthiness is more accurate than when using weight with lower trustworthiness). 
the updated distribution indicates that a second ML model of the plurality of ML models is to be selected to generate inferences at a lower likelihood compared to a corresponding likelihood of the previous distribution utilized by the model selector (Govan para. [51] – [53] teaches selection criteria for a model that can be chosen and updated with several types of weighting that can be applied across and within models. Bivens fig. 6 and para. [0045] discloses method of calculating accuracy where a ML model using weight with lower trustworthiness is less accurate than when using weight with higher trustworthiness).
As per claim 6, the combination of Bivens and Govan as shown above teaches the method of claim 4, wherein 
the plurality of inference results includes a first plurality of inference results generated by the plurality of ML models using a common input data (Bivens para. [0023] discloses an instance where search results of a search engine are input to trained machine learning models to generate the search results in ranking order). 
As per claim 7, the combination of Bivens and Govan as shown above teaches the method of claim 6, wherein 
determining the plurality of accuracy scores is based at least in part on comparing the first plurality of inference results (Bivens para. [0045] discloses comparing outputs (results) across different parameter settings or compared with a baseline method to measure the accuracy).
As per claim 8, the combination of Bivens and Govan as shown above teaches the method of claim 4, wherein 
determining the plurality of accuracy scores is based at least in part on comparing the plurality of inference results with a corresponding plurality of ground truth confirmations obtained using input data that was used by the plurality of ML models to generate the plurality of inference results (Bivens para. [0025] discloses using trustworthiness to weigh the selection of labeled data for training in supervised machine learning. Weighing labeled data for training in supervised learning implies using ground truth confirmations. Bivens in para. [0045] gives an example of calculating accuracy using a weight proportional to a trustworthiness measure corresponding to each trained machine learning model).
As per claim 9, the combination of Bivens and Govan as shown above teaches the method of claim 4, wherein 
determining the plurality of accuracy scores is based at least in part on an analysis of explicit or implied user feedback provided by one or more users that caused inference requests to be issued that resulted in the plurality of inference results being generated by the plurality of ML models (Bivens para. [0044] discloses an example of using implicit feedback from user actions for search engine result rankings. Implicit feedback obtained from user actions, like clicking, will train a machine learning model to rank documents which the user deems to be more accurate or relevant. Bivens in para. [0045] also teaches an example of calculating accuracy corresponding to a plurality of trained models).
As per claim 10, the combination of Bivens and Govan as shown above teaches the method of claim 4, further comprising: 
receiving a request to perform an inference using an input data (Bivens para. [0021] gives an example of inputs to a machine learning algorithm to train a model).
selecting, by the model selector based on an analysis of the input data, a first ML model from a second plurality of ML models to be used to perform the inference (Govan para. [51] – [53] discloses dynamic selection mechanism for choosing from ensemble of models. Govan teaches selection criteria for a model that can be chosen from ensemble of models with several types of weighting that can be applied across and within models).
providing the input data to the first ML model (Bivens para. [0022] discloses plurality of input sample and plurality of trained models and using one input sample corresponding to one trained model).
As per claim 11, the combination of Bivens and Govan as shown above teaches the method of claim 10, wherein: 
the selecting the first ML model comprises using the input data or other data generated based on the input data as input to a second ML model (Govan para. [51] – [52] discloses a dynamic selection mechanism for selecting a model from an ensemble. Several types of weightings can be applied as input across and within models. Selecting a model can involve applying weighting to mix individual model outputs to provide a better overall result than each individual model). 
the second ML model generates a result identifying the first ML model (Bivens para. [0045] gives an example where accuracy score of model from run 1 is compared to the result of model from run 2 and model from run 2 is determined to be more accurate).
As per claim 12, the combination of Bivens and Govan as shown above teaches the method of claim 4, wherein 
the plurality of ML models are executed by a corresponding plurality of containers that are executed by one or more computing devices within a provider network (Bivens para. [0026] teaches “A memory device may be connected to the one or more processors and store the plurality of samples, the plurality of trained models, and other data used by the one or more processors.” Bivens para. [0005] discloses, “The one or more processors may be further operable to ensemble outputs from the plurality of trained models by computing a weighted average of the outputs of the plurality of trained models.” Bivens teaches sampling plurality of training data examples and executing plurality of trained models by using weighted average of the outputs of the plurality of trained models).
As per claim 13, the combination of Bivens and Govan as shown above teaches the method of claim 4, further comprising: 
providing, by the model selector, an inference request to each of the plurality of ML models (Govan para. [51] – [52] discloses recommendation strategies that can change based on request time for a system that automatically adjusts based on context. Govan also discloses that weighting can be applied to models and for tuning of features within models to provide results. Models generating inference results implies that there is a request to the models for generating these inference results according to the update).
generating a result based on a plurality of inference results generated by the plurality of ML models (Bivens para. [0027] – [0029] discloses plurality of machine learning models 210n. Each of the trained models generate outputs with respect to the test data and the outputs can be ensemble to produce ensemble output or result related to the test data).
As per claim 14, the combination of Bivens and Govan as shown above teaches the method of claim 4, further comprising: 
receiving a message indicating that a second ML model is to be tested alongside a first ML model (Govan para. [58] describes ranking and optimizing two models using supervised ML mechanisms. The effectiveness of the models is evaluated during continued usage, and a model that displays a higher level of effectiveness is selected. Govan para. [81] discloses that the described embodiments could include presentation of content via a standalone PC application or a smartphone or tablet computer application).
providing, by the model selector, an inference request to the first ML model and the second ML model (Govan para. [51] – [52] discloses recommendation strategies that can 
sending a response to the inference request including a first inference result generated by the first ML model but not a second inference result generated by the second ML model (Bivens fig. 4B and para. [0044] discloses training data example generated by machine learning. For query 1, the pairs of documents associated with the query and their relevance along with user trustworthiness represents a training data example. This training data example for query 1 is used to train the first machine learning model).
determining a first accuracy score for the first ML model based at least in part on the first inference result and a second accuracy score for the second ML model based on a second inference result generated by the second ML model (Bivens fig. 4B and para. [0044] – [0045] teach a training data example generated by machine learning, where a pair of documents associated with a query, together with their relevance, represents a training data example. A plurality of such pairs and the user trustworthiness associated with them are used to train an ML model for each query. Query 1 trains the first ML model and query 2 trains the second ML model. Bivens discloses a method of calculating accuracy using a weight proportional to a trustworthiness measure).
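For illustration only, the limitations of claim 14 (providing one inference request to both models, responding with only the first model's result, and scoring both models) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the toy models, the log structure, and the availability of ground truth labels are all hypothetical, and nothing here is drawn from Bivens, Govan, or the application.

```python
def handle_request(input_data, first_model, second_model, log):
    """Send one request to both models but respond with the first model only."""
    first_result = first_model(input_data)
    second_result = second_model(input_data)
    # Record both results so each model can be scored later.
    log.append((input_data, first_result, second_result))
    # The response excludes the second (tested) model's result.
    return first_result


def accuracy_scores(log, ground_truth):
    """Fraction of logged results that match the (assumed available) ground truth."""
    n = len(log)
    first = sum(r1 == ground_truth[x] for x, r1, _ in log) / n
    second = sum(r2 == ground_truth[x] for x, _, r2 in log) / n
    return first, second


# Hypothetical toy models that classify whether an integer is even.
def first_model(x):
    return x % 2 == 0


def second_model(x):
    return x % 3 == 0  # deliberately weaker stand-in


inputs = list(range(6))
ground_truth = {x: x % 2 == 0 for x in inputs}

log = []
responses = [handle_request(x, first_model, second_model, log) for x in inputs]
first_score, second_score = accuracy_scores(log, ground_truth)
```

In this toy setup the responses contain only the first model's results, while the log allows an accuracy score to be computed for each model.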
As per claim 15, the combination of Bivens and Govan as shown above teaches the method of claim 4, further comprising 
determining an unbiased estimate of accuracy for each of the plurality of ML models that indicates how the corresponding ML model would have performed if it had processed the plurality of inference requests despite not having actually processed the plurality of inference requests 
15.	Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Bivens in view of Govan. As per claim 16, Bivens teaches a system comprising: 
a dynamic router implemented by a first one or more electronic devices, the dynamic router including first instructions that upon execution cause the dynamic router to implement a model selector to select one or more of a plurality of machine learning (ML) models according to a distribution to perform inferences for inference requests, and cause the inference requests to be provided to the selected ML models (Bivens para. [0056] discloses routers, switches, gateway computers, computer readable program instructions etc. that can be an embodiment of a dynamic router. Govan para. [51] – [53] discloses dynamic selection mechanism for choosing from ensemble of models. Govan discloses, “There are many other possible ensembles that could be chosen from. The particular ensemble (which may be a single recommender or a hybrid recommendation ensemble) may be chosen and parameterized at request time (in real-time) by a dynamic selection mechanism. The selection criteria may be chosen and continually updated by an online learning and optimization system. One particular embodiment of this kind of online learning scheme is a multi-armed bandit where each “arm” is used to select a set of parameters and/or a recommendation strategy. Another embodiment of this is a reinforcement learning system which learns through repeated trial-and-error to select the optimal mixture and/or individual recommendation strategy.” Govan teaches selection criteria for a model that can be chosen and updated with several types of weighting that can be applied across and within models).
an analytics engine implemented by a second one or more electronic devices (Bivens para. [0046] and fig. 5 disclose an example computer or processing system that uses machine learning to train data examples. The system may include personal computer systems, multiprocessor systems, programmable consumer electronics, and hand held devices), the analytics engine including second instructions that upon execution cause the analytics engine to: obtain a plurality of inference results generated by the plurality of ML models of a group that generate a common type of inference (Bivens para. [0023] discloses, “The trained models output results … Each trained model thus may produce a set of rankings.” Bivens para. [0027] – [0029] discloses machine learning (ML) models 210a, 210b from a group of 210n ML models. Each of the models can produce output with respect to test data and the output from the trained models can be ensembled to produce an ensemble output. The sampling of the set of training data examples to input into the models is picked from trusted and non-trusted users to produce an ensemble output).
determine, based at least in part on the plurality of inference results, a plurality of accuracy scores corresponding to the plurality of ML models (Bivens fig. 6 and para. [0045] discloses method of calculating accuracy. Bivens gives an example where “In run 1 of one trained model, examples X and Y are correctly predicted, and example Z is incorrectly predicted. In run 2 of another trained model, examples X and Z are correctly predicted, but example Y is mispredicted. The accuracy using traditional method is the same for both runs, which is 0.67. With the proposed evaluation method, run 1 is penalized since the mispredicated example Z has higher trustworthiness score than example Y, which is mispredicated by run 2. Therefore 
Bivens fails to explicitly teach causing a model selector of the dynamic router to be updated, based on the plurality of accuracy scores, to use an updated distribution to select ones of the plurality of ML models to generate inferences for inference requests. However, Govan teaches:
cause a model selector of the dynamic router to be updated, based on the plurality of accuracy scores, to use an updated distribution to select ones of the plurality of ML models to generate inferences for inference requests (Govan para. [51] – [53] disclose a dynamic selection mechanism for choosing from an ensemble of models. Govan discloses, “There are many other possible ensembles that could be chosen from. The particular ensemble (which may be a single recommender or a hybrid recommendation ensemble) may be chosen and parameterized at request time (in real-time) by a dynamic selection mechanism. The selection criteria may be chosen and continually updated by an online learning and optimization system. One particular embodiment of this kind of online learning scheme is a multi-armed bandit where each “arm” is used to select a set of parameters and/or a recommendation strategy. Another embodiment of this is a reinforcement learning system which learns through repeated trial-and-error to select the optimal mixture and/or individual recommendation strategy.” Govan teaches selection criteria for a model, from a plurality of models, that can be chosen and updated with several types of weighting that can be applied across and within models).
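For illustration only, the limitation mapped above (a model selector updated to use a new distribution derived from accuracy scores) can be sketched as follows. This is a minimal sketch under stated assumptions and is neither the method disclosed by Govan nor the applicant's claimed implementation. The function names, the model names, and the `floor` exploration parameter are hypothetical.

```python
import random


def updated_distribution(accuracy_scores, floor=0.05):
    """Convert per-model accuracy scores into selection probabilities.

    `floor` (hypothetical) reserves a small share of traffic that is
    spread evenly over all models, so that no model is starved of requests.
    """
    n = len(accuracy_scores)
    total = sum(accuracy_scores.values())
    if total == 0:
        # No usable scores: fall back to a uniform distribution.
        return {name: 1.0 / n for name in accuracy_scores}
    return {
        name: floor / n + (1.0 - floor) * score / total
        for name, score in accuracy_scores.items()
    }


def select_model(distribution, rng=random):
    """Pick one model name at random according to the distribution."""
    names = list(distribution)
    weights = [distribution[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical accuracy scores reported by an analytics engine.
scores = {"model_a": 0.9, "model_b": 0.6}
distribution = updated_distribution(scores)
chosen = select_model(distribution, random.Random(0))
```

In this sketch the model with the higher accuracy score receives proportionally more inference requests, while the lower-scoring model still receives some requests.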
Therefore, it would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify the teachings of Bivens and incorporate the teachings 
As per claim 17, the combination of Bivens and Govan as shown above teaches the system of claim 16, wherein
the plurality of inference results includes a first plurality of inference results generated by the plurality of ML models using a common input data (Bivens para. [0023] discloses an instance where search results of a search engine are input to trained machine learning models to generate the search results in ranking order).
As per claim 18, the combination of Bivens and Govan as shown above teaches the system of claim 17, wherein
the second instructions upon execution further cause the analytics engine to determine the plurality of accuracy scores based at least in part on comparing the first plurality of inference results (Bivens para. [0045] discloses comparing outputs (results) across different parameter settings, or comparing them with a baseline method, to measure the accuracy. Bivens teaches examples of measuring the accuracy of models across different runs using a weight proportional to the trustworthiness measure of each example, compared against a baseline measure).
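For illustration only, the X, Y, Z example quoted from Bivens para. [0045] can be worked through as follows. This is a minimal sketch: the trustworthiness values are hypothetical (Bivens, as cited, states only that example Z has a higher trustworthiness score than example Y), and the weighting formula is an assumed simple form, not necessarily the exact computation of Bivens.

```python
def plain_accuracy(correct):
    """Fraction of examples predicted correctly (traditional accuracy)."""
    return sum(correct.values()) / len(correct)


def weighted_accuracy(correct, trust):
    """Accuracy in which each example counts in proportion to its trust score."""
    total = sum(trust[example] for example in correct)
    hit = sum(trust[example] for example, ok in correct.items() if ok)
    return hit / total


# Hypothetical trustworthiness scores; only Z > Y comes from the cited text.
trust = {"X": 0.5, "Y": 0.2, "Z": 0.9}

run1 = {"X": True, "Y": True, "Z": False}   # run 1 mispredicts Z
run2 = {"X": True, "Y": False, "Z": True}   # run 2 mispredicts Y

print(round(plain_accuracy(run1), 2), round(plain_accuracy(run2), 2))
print(weighted_accuracy(run1, trust), weighted_accuracy(run2, trust))
```

Both runs have the same traditional accuracy of 0.67, while the weighted score of run 1 is lower than that of run 2 because run 1 mispredicts the more trustworthy example Z.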
As per claim 19, the combination of Bivens and Govan as shown above teaches the system of claim 16, wherein
the second instructions upon execution further cause the analytics engine to determine the plurality of accuracy scores based at least in part on a comparison of the plurality of inference results with a corresponding plurality of ground truth confirmations obtained using input data that was used by the plurality of ML models to generate the plurality of inference results (Bivens para. [0025] discloses using trustworthiness to weigh the selection of labeled data for training in supervised machine learning. Weighing labeled data for training in supervised learning implies using ground truth confirmations. Bivens in para. [0045] gives an example of calculating accuracy using a weight proportional to the trustworthiness measure corresponding to each trained machine learning model).
As per claim 20, the combination of Bivens and Govan as shown above teaches the system of claim 16, wherein
the second instructions upon execution further cause the analytics engine to determine the plurality of accuracy scores based at least in part on an analysis of explicit or implied user feedback provided by one or more users that caused inference requests to be issued that resulted in the plurality of inference results being generated by the plurality of ML models (Bivens para. [0037] discloses using trustworthiness scores to train models. The trustworthiness scores may be computed from the degree of a user’s knowledge and/or experience (user feedback). Bivens in para. [0045] teaches calculating an accuracy score for different runs of trained machine learning models using a weight proportional to the trustworthiness score).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAHUL GURUNG whose telephone number is (571) 272-8406. The examiner can normally be reached from 7:30 am to 4:00 pm, Monday through Thursday.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at telephone number (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://portal.uspto.gov/external/portal. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

/RAHUL GURUNG/Examiner, Art Unit 2122
/ERIC NILSSON/Primary Examiner, Art Unit 2122