DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Notice to Applicant
The following is a Final Office action. In response to Examiner’s Non-Final Rejection of 05/11/2022, Applicant, on 07/13/2022, amended claims 1-3, 12-14, and 16-19 and canceled claims 4, 7, 15, and 20.
Allowable Subject Matter
Claim 9 is allowable over the prior art; however, this claim remains rejected under 35 USC 103 because it depends from a rejected claim, and thus would be allowable only if rewritten in independent form to include all the limitations of the base claim and any intervening claims.
Response to Arguments
Applicant's arguments filed 07/13/2022 have been fully considered, but they are not fully persuasive. In response to Applicant’s amendments, the 35 USC § 101 and § 102 rejections have been withdrawn. With regard to the § 101 rejection, determining a performance of a machine learning algorithm and applying the algorithm performance to future configurations is an improvement to the machine learning field. However, an updated 35 USC § 103 rejection of the claims is applied in light of Applicant's amendments.
Applicant’s arguments with respect to the rejection of the claims under 35 U.S.C. 103 have been considered but are moot because the arguments do not apply to the current combination of references being used in the current rejection. In light of Applicant's amendments and arguments, the Examiner updated the search and provided new art to reject the claim limitations.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
 
Claims 1-3, 5-6, 12-14, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. PGPub 20180046926 to Achin et al. (hereinafter “Achin”), in view of U.S. PGPub 20180343171 to Jensen et al. (hereinafter “Jensen”).
A method of selecting a model of machine learning executed by a processor, the method comprising:
configuring, by the processor, a configuration space for machine learning based on a data-set, the configuration space including a plurality of different configurations; extracting, by the processor, a meta-feature comprising quantitative information about the data-set from the data-set; 
Achin 0005: “ Machine-learning techniques (e.g., supervised statistical-learning techniques) may be used to generate a predictive model from a dataset that includes previously recorded observations of at least two variables…The observations are generally partitioned into at least one “training” dataset and at least one “test” dataset. A data analyst then selects a statistical-learning procedure and executes that procedure on the training dataset to generate a predictive model. The analyst then tests the generated model on the test dataset to determine how well the model predicts the value(s) of the target(s), relative to actual observations of the target(s)…0103-0111: Achin 0103-0105: “These tools may provide insight into a prediction problem's dataset (e.g., by highlighting problematic variables in the dataset, identifying relationships between variables in the dataset, etc.), and/or insight into the results of the search. In some embodiments, data analysts may use the interface to guide the search, e.g., by specifying the metrics to be used to evaluate and compare modeling solutions, by specifying the criteria for recognizing a suitable modeling solution, etc. Thus, the user interface may be used by analysts to improve their own productivity, and/or to improve the performance of the exploration engine 110. In some embodiments, user interface 120 presents the results of the search in real-time, and permits users to guide the search (e.g., to adjust the scope of the search or the allocation of resources among the evaluations of different modeling solutions) in real-time. In some embodiments, user interface 120 provides tools for coordinating the efforts of multiple data analysts working on the same prediction problem and/or related prediction problems…In some embodiments, the model deployment engine also provides tools for monitoring and/or updating predictive models. 
System users may use the deployment engine 140 to deploy predictive models generated by exploration engine 110, to monitor the performance of such predictive models, and to update such models (e.g., based on new data or advancements in predictive modeling techniques). In some embodiments, exploration engine 110 may use data collected and/or generated by deployment engine 140 (e.g., based on results of monitoring the performance of deployed predictive models) to guide the exploration of a search space for a prediction problem (e.g., to re-fit or tune a predictive model in response to changes in the underlying dataset for the prediction problem)….a machine-executable template includes metadata describing attributes of the predictive modeling technique encoded by the template. The metadata may indicate one or more data processing techniques that the template can perform as part of a predictive modeling solution (e.g., in a pre-processing step, in a post-processing step, or in a step of predictive modeling algorithm)… metadata may indicate how well the corresponding modeling technique is expected to perform on datasets having particular characteristics, including, without limitation, wide datasets, tall datasets, sparse datasets, dense datasets, datasets that do or do not include text, datasets that include variables of various data types (e.g., numerical, ordinal, categorical, interpreted (e.g., date, time, text), etc.), datasets that include variables with various statistical properties (e.g., statistical properties relating to the variable's missing values, cardinality, distribution, etc.)
calculating, by the processor, a performance of each configuration, wherein the calculating includes the processor selecting a machine learning algorithm for the configuration and executing the machine learning algorithm on the data-set to generate the performance; for each configuration, the processor executing meta-learning on the configuration, the meta-feature, and the performance of the configuration to…; Achin 0103-0105: “These tools may provide insight into a prediction problem's dataset (e.g., by highlighting problematic variables in the dataset, identifying relationships between variables in the dataset, etc.), and/or insight into the results of the search. In some embodiments, data analysts may use the interface to guide the search, e.g., by specifying the metrics to be used to evaluate and compare modeling solutions, by specifying the criteria for recognizing a suitable modeling solution, etc. Thus, the user interface may be used by analysts to improve their own productivity, and/or to improve the performance of the exploration engine 110. In some embodiments, user interface 120 presents the results of the search in real-time, and permits users to guide the search (e.g., to adjust the scope of the search or the allocation of resources among the evaluations of different modeling solutions) in real-time. In some embodiments, user interface 120 provides tools for coordinating the efforts of multiple data analysts working on the same prediction problem and/or related prediction problems…In some embodiments, the model deployment engine also provides tools for monitoring and/or updating predictive models. System users may use the deployment engine 140 to deploy predictive models generated by exploration engine 110, to monitor the performance of such predictive models, and to update such models (e.g., based on new data or advancements in predictive modeling techniques). 
In some embodiments, exploration engine 110 may use data collected and/or generated by deployment engine 140 (e.g., based on results of monitoring the performance of deployed predictive models) to guide the exploration of a search space for a prediction problem (e.g., to re-fit or tune a predictive model in response to changes in the underlying dataset for the prediction problem)… 0147: “exploration engine 110 determines the suitability of a predictive modeling procedure for a prediction problem based, at least in part, on the output of a “meta” machine-learning model, which may be trained to determine the suitability of a modeling procedure for a prediction problem based on the results of various modeling procedures (e.g., modeling procedures similar to the modeling procedure at issue) for other prediction problems (e.g., prediction problems similar to the prediction problem at issue). The machine-learning model for estimating the suitability of a predictive modeling procedure for a prediction problem may be referred to as a “meta” machine-learning model because it applies machine learning recursively to predict which techniques are most likely to succeed for the prediction problem at issue. Exploration engine 110 may therefore produce meta-predictions of the suitability of a modeling technique for a prediction problem by using a meta-machine-learning algorithm trained on the results from solving other prediction problems.”
Exam. Note: The art teaches the ability to utilize meta machine learning (determining suitability of models) to select and perform the best predictive modeling, thus teaching the execution and configuration of meta learning in a space.
Achin does not explicitly teach the following. However, Jensen teaches:
derive an expected improvement value and an expected worsening value; excluding, by the processor, one of the configurations based on the expected improvement value and the expected worsening value of each of the configurations; and Jensen 0092-0108: “the system 100 does not select any lines from those which have been identified as highly likely to fail if reclassified as a trial line…the accuracy threshold used by the second algorithm is greater than that used in the first algorithm. The value used may be determined by the network operator based on the following considerations: [0100] if set too high, a relatively small number of lines will be excluded from the trial selection and the chances of improving the first algorithm is only increased by a relatively small amount; [0101] if set too low, a relatively large number of lines will be excluded from the trial which will reduce the number of failures. However, it is less likely for lines to be reclassified as option a) when they would actually have restarted without failure, and so the interesting cases which could lead to a modification to the first algorithm are not readily identified.”
selecting, by the processor, the machine learning algorithm of one of the remaining configurations for execution; Jensen 0092-0108: “As the lines selected for the trial in the second timestep were more likely to result in a success than a purely random selection (as those which were highly likely to fail were excluded from the selection process) then any new modifications to the first algorithm (created by the machine-learning process) are more likely to correctly predict whether those marginal cases should be classified as first actions or second actions. Furthermore, by excluding the cases in which it is highly likely that a re-classification would result in a failure, the negative impact on customer experience is reduced… in this embodiment, the second algorithm is also being developed by selecting a second subset of trial lines (i.e. distinct from the set selected as trial lines in the second timestep) to further enrich the data set for the purpose of developing the second algorithm…claim 2: selecting a first subset of terminals to re-classify as being associated with the second action based on the predicted likelihood of success values.”
Exam. Note: The art teaches the ability to analyze data/datasets by the machine learning algorithm to select and utilize the optimal configurations based on improvement/worsening values and not selecting/excluding certain data.  
Achin and Jensen are deemed to be analogous references as they are reasonably pertinent to each other and directed towards measuring, collecting, and analyzing information with a series of inputs to solve similar problems in similar environments. Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to have modified Achin with the aforementioned teachings from Jensen with a reasonable expectation of success, by adding steps that allow the software to analyze and update configurations with the motivation to more efficiently and accurately select algorithmic data [Jensen 0099].
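For illustration only, the excluding and selecting limitations discussed above can be sketched as follows. This is a minimal sketch and not the method of the application or of either reference: the Candidate class, the thresholds min_improvement and max_worsening, the example values, and the "largest net gain" selection rule are all hypothetical, since neither the claim nor the cited art specifies them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A configuration with values assumed to come from a meta-learner (hypothetical)."""
    name: str
    expected_improvement: float
    expected_worsening: float


def prune(candidates, min_improvement, max_worsening):
    """Exclude candidates with too little expected improvement or too much expected worsening."""
    return [
        c for c in candidates
        if c.expected_improvement >= min_improvement
        and c.expected_worsening <= max_worsening
    ]


def select(candidates):
    """Pick the remaining candidate with the largest net expected gain (assumed rule)."""
    return max(candidates, key=lambda c: c.expected_improvement - c.expected_worsening)


# Illustrative values only; they are not taken from the record.
candidates = [
    Candidate("svm_rbf", 0.04, 0.01),
    Candidate("random_forest", 0.02, 0.02),
    Candidate("knn", 0.001, 0.10),
]
remaining = prune(candidates, min_improvement=0.01, max_worsening=0.05)
best = select(remaining)
print([c.name for c in remaining], best.name)
```

The thresholds play a role loosely comparable to the accuracy threshold described in Jensen: a stricter setting excludes more candidates, while a looser one keeps more marginal cases available for selection.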
 As per claim 2, Achin and Jensen teach all the limitations of claim 1. 
 In addition, Achin teaches:
wherein each configuration comprises at least one of preprocessing correlation information about the at least one data-set, machine learning algorithm correlation information, hyper-parameter correlation information about the preprocessing, and hyper-parameter correlation information about the machine learning algorithm; Achin 0021-0022: “at least a group of the observations of the time-series data include respective values of a first variable, and the actions of the method further include, prior to fitting the predictive model to the training data and testing the fitted model on the testing data: determining that the values of the first variable include time values; for each observation in the group, generating a respective value of a second variable, wherein the value of the second variable includes an offset between the time value of the first variable and a reference time value; and adding the values of the second variable to the respective observations in the group… determining that changes in the values of the first and second variables are correlated, with a temporal lag between the changes in the value of the first variable and the correlated changes in the value of the second variable; and displaying, via a graphical user interface, graphical content indicating a duration of the temporal lag between the changes in the value of the first variable and the correlated changes in the value of the second variable…0042: predictive modeling procedure includes a plurality of tasks including at least one pre-processing task and at least one model-fitting task; and at least one processor configured to execute the machine-executable module, wherein executing the machine-executable module causes the apparatus to perform the predictive modeling procedure. Performing the predictive modeling procedure may include performing the pre-processing task, including: (a) obtaining time-series data including one or more data sets,”
Exam. Note: Matching hyperparameter with time/time series from the art.
 As per claim 3, Achin and Jensen teach all the limitations of claim 2. 
 In addition, Achin teaches:
wherein calculating of the performance comprises: executing preprocessing on the to generate preprocessed data; Achin 0042, 0084-0108:  “ A template may encode, for machine execution, pre-processing steps, model-fitting steps, and/or post-processing steps suitable for use with the template's predictive modeling algorithm(s). Examples of pre-processing steps include, without limitation, imputing missing values, feature engineering (e.g., one-hot encoding, splines, text mining, etc.), feature selection (e.g., dropping uninformative features, dropping highly correlated features, replacing original features by top principal components, etc.). Examples of model-fitting steps include, without limitation, algorithm selection, parameter estimation, hyper-parameter tuning, scoring, diagnostics, etc. Examples of post-processing steps include, without limitation, calibration of predictions, censoring, blending, etc.,”
selecting a certain algorithm based on the machine learning algorithm correlation information; selecting a hyper-parameter based on the hyper-parameter correlation information about the preprocessing and the hyper-parameter correlation information about the certain algorithm; Achin 0178: “selection of modeling techniques based on application of deductive rules, selection of modeling techniques based on the performance of similar modeling techniques on similar prediction problems, selection of modeling techniques based on the output of a meta machine-learning model, any combination of the foregoing modeling techniques, or other suitable modeling techniques …0255: “Select model structures, generate derived features, select model tuning parameters, fit models, and evaluate: In some embodiments, the predictive modeling system 100 can fit many different model types, including, without limitation, decision trees, neural networks, support vector machine models, regression models, boosted trees, random forests, deep learning neural networks, etc… Based on the response variable type and the fitting metric selected, the predictive modeling system may offer a set of predictive models, including traditional regression models, neural networks, and other machine learning models.”  
and deriving the calculated performance by executing the selected certain algorithm on the preprocessed data the selected hyper-parameter;Achin 0178, 0255: “The available modeling methodologies may include, without limitation, selection of modeling techniques based on application of deductive rules, selection of modeling techniques based on the performance of similar modeling techniques on similar prediction problems, selection of modeling techniques based on the output of a meta machine-learning model, any combination of the foregoing modeling techniques, or other suitable modeling techniques… Select model structures, generate derived features, select model tuning parameters, fit models, and evaluate: In some embodiments, the predictive modeling system 100 can fit many different model types, including, without limitation, decision trees, neural networks, support vector machine models, regression models, boosted trees, random forests, deep learning neural networks, etc. The predictive modeling system 100 may provide the option of automatically constructing ensembles from those component models that exhibit the best individual performance…  The predictive modeling system 100 may fit and evaluate the different model structures considered as part of this automated process, ranking the results in terms of validation set performance.” 
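For illustration only, the sequence recited in claim 3 (preprocess, select an algorithm, select a hyper-parameter, execute to derive a performance) can be sketched as below. This is a minimal sketch under stated assumptions: the config dictionary layout, the standardize and knn_predict helpers, the toy data, and the use of held-out accuracy as the "performance" are hypothetical and are not drawn from the application or from Achin.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Preprocessing step: scale each column to zero mean and unit deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def knn_predict(train_x, train_y, x, k):
    """Algorithm step: k-nearest-neighbour majority vote (k is the hyper-parameter)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)


def evaluate(config, x, y, n_train):
    """Run one configuration and return its accuracy on the held-out rows."""
    data = config["preprocess"](x)
    train_x, test_x = data[:n_train], data[n_train:]
    train_y, test_y = y[:n_train], y[n_train:]
    hits = sum(
        config["algorithm"](train_x, train_y, row, **config["hyper"]) == label
        for row, label in zip(test_x, test_y)
    )
    return hits / len(test_y)


# Toy data: two well-separated clusters; the last two rows are held out.
x = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [0.5, 0.5], [10.5, 10.5]]
y = [0, 0, 0, 1, 1, 1, 0, 1]
config = {"preprocess": standardize, "algorithm": knn_predict, "hyper": {"k": 3}}
performance = evaluate(config, x, y, n_train=6)
print(performance)
```

Treating each configuration as a bundle of preprocessing, algorithm, and hyper-parameter lets the same evaluate routine score every entry of a configuration space in a loop.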
  As per claim 5, Achin and Jensen teach all the limitations of claim 1. 
 In addition, Achin teaches:
further comprises storing the meta-feature, the plurality of configurations, and the calculated performance into a meta-database; Achin 0179: “At step 402 of method 400, the exploration engine 110 prompts the user to select the dataset for the predictive modeling problem to be solved. The user can chose from previously loaded datasets or create a new dataset, either from a file or instructions for retrieving data from other information systems. In the case of files, the exploration engine 110 may support one or more formats including, without limitation, comma separated values, tab-delimited, eXtensible Markup Language (XML), JavaScript Object Notation, native database files, etc. In the case of instructions, the user may specify the types of information systems, their network addresses, access credentials, references to the subsets of data within each system, and the rules for mapping the target data schemas into the desired dataset schema. Such information systems may include, without limitation, databases, data warehouses, data integration services, distributed applications, Web services, etc.” 
 As per claim 6, Achin and Jensen teach all the limitations of claim 5. 
 In addition, Achin teaches:
wherein the executing of the meta-learning comprises executing the meta-learning based on the meta-feature, the plurality of configurations, and the calculated performance that are stored in the meta-database; Achin 0147: “exploration engine 110 determines the suitability of a predictive modeling procedure for a prediction problem based, at least in part, on the output of a “meta” machine-learning model, which may be trained to determine the suitability of a modeling procedure for a prediction problem based on the results of various modeling procedures (e.g., modeling procedures similar to the modeling procedure at issue) for other prediction problems (e.g., prediction problems similar to the prediction problem at issue). The machine-learning model for estimating the suitability of a predictive modeling procedure for a prediction problem may be referred to as a “meta” machine-learning model because it applies machine learning recursively to predict which techniques are most likely to succeed for the prediction problem at issue. Exploration engine 110 may therefore produce meta-predictions of the suitability of a modeling technique for a prediction problem by using a meta-machine-learning algorithm trained on the results from solving other prediction problems…0035: selecting one or more predictive modeling procedures from the plurality of predictive modeling procedures based on the determined suitabilities of the selected modeling procedures for the prediction problem; and performing the one or more predictive modeling procedures.”
Exam. Note: The art teaches the ability to utilize meta machine learning (determining suitability of models) to select and perform the best predictive modeling, thus teaching the execution and configuration of meta learning in a space. Additionally, see paragraph 0179 which teaches the utilization of databases/data warehouses/storage units to leverage storage capabilities.
Claim 12 is directed to the method for performing a very similar method of claims 1 and 4 with overlapping claim limitations.  It is already established that the repeating claim limitations in claim 12 are taught in claims 1 and 4 (see claims 1 and 4 rejection above). The additional limitations of extracting, from the data-set, a meta-feature comprising quantitative information about at least one of information about correlation associated with the data-set, information about linearity, information about smoothness, and information about a distribution density in claim 12 are also taught by Achin in the following Achin 0147: “exploration engine 110 determines the suitability of a predictive modeling procedure for a prediction problem based, at least in part, on the output of a “meta” machine-learning model, which may be trained to determine the suitability of a modeling procedure for a prediction problem based on the results of various modeling procedures (e.g., modeling procedures similar to the modeling procedure at issue) for other prediction problems (e.g., prediction problems similar to the prediction problem at issue). The machine-learning model for estimating the suitability of a predictive modeling procedure for a prediction problem may be referred to as a “meta” machine-learning model because it applies machine learning recursively to predict which techniques are most likely to succeed for the prediction problem at issue. 
Exploration engine 110 may therefore produce meta-predictions of the suitability of a modeling technique for a prediction problem by using a meta-machine-learning algorithm trained on the results from solving other prediction problems…0035: selecting one or more predictive modeling procedures from the plurality of predictive modeling procedures based on the determined suitabilities of the selected modeling procedures for the prediction problem; and performing the one or more predictive modeling procedures… Two or more models may be blended by combining the outputs of the constituent models. In some embodiments, the blended model may comprise a weighted, linear combination of the outputs of the constituent models…  the method 1000 includes an additional step (not shown) of selecting the predictive modeling procedures that are performed in step 1010. The system 100 may select the modeling procedures, for example, from the library 130 of predictive modeling techniques. In some embodiments, the system 100 selects two or more modeling procedures from two or more different predictive modeling families. Examples of predictive modeling families can include linear regression techniques (e.g., generalized additive models), tree-based techniques (e.g., random forests), support vector machines, neural networks (e.g., multilayer perceptron), etc. For example, the system 100 may select a modeling procedure from the tree family (e.g., a random forest modeling procedure), another modeling procedure from the linear regression family (e.g., a generalized additive model), and a support vector machine modeling procedure…  the data indicative of the characteristics of prediction problems includes data indicative of characteristics of datasets representing the prediction problem. Characteristics of a dataset may include, without limitation, the dataset's width, height, sparseness, or density.” Thus, all the limitation of claim 12 are taught by Achin. 
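For illustration only, quantitative dataset characteristics of the kind mentioned above (width, height, sparseness, and correlation information) can be computed as in the following sketch. The function names, the returned dictionary keys, and the choice of mean absolute Pearson correlation as the correlation summary are hypothetical; neither claim 12 nor Achin prescribes these particular formulas.

```python
from itertools import combinations
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length columns; 0.0 if either is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 0.0 if va == 0 or vb == 0 else cov / (va * vb) ** 0.5


def extract_meta_features(rows):
    """Summarize a numeric table (list of rows) as a small dictionary of meta-features."""
    cols = list(zip(*rows))
    zeros = sum(v == 0 for row in rows for v in row)
    pairs = list(combinations(cols, 2))
    return {
        "height": len(rows),                          # number of observations
        "width": len(cols),                           # number of variables
        "sparseness": zeros / (len(rows) * len(cols)),  # fraction of zero cells
        "mean_abs_correlation": (
            mean(abs(pearson(a, b)) for a, b in pairs) if pairs else 0.0
        ),
    }


# Illustrative table only; it is not data from the record.
table = [[1, 2, 0], [2, 4, 0], [3, 6, 1], [4, 8, 0]]
features = extract_meta_features(table)
print(features)
```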
Claims 13 and 14 are directed to a method very similar to the method of claims 3 and 6 above. Since Achin and Jensen teach the concepts recited in the limitations, the same art and rationale apply.
Claim 17 is directed to the CRM for performing the method of claim 1 above. Since Achin and Jensen teach the CRM, the same art and rationale apply.
Claim 18 is directed to the CRM for performing the method of claims 3 and 6 above. Since Achin and Jensen teach the CRM, the same art and rationale apply.
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over U.S. PGPub 20180046926 to Achin et al. (hereinafter “Achin”), in view of U.S. PGPub 20180343171 to Jensen et al. (hereinafter “Jensen”), in further view of U.S. PGPub 20020161677 to Zumbach et al. (hereinafter “Zumbach”).
 As per claim 8, Achin and Jensen teach all the limitations of claim 1. 
 Achin and Jensen do not explicitly teach the following. However, Zumbach teaches:
wherein the executing of the meta-learning comprises: deriving an empirical mean and a variance of the calculated performance according to the plurality of configurations; and deriving the expected improvement value and the expected worsening value for each of the plurality of configurations based on the empirical mean and the variance of the calculated performance; Zumbach 0100-0104: “Moving Norm, Variance and Standard Deviation [0103] With the efficient moving average operator, we define the moving norm, moving variance, and moving standard deviation operators, respectively: MNorm[τ, p; z] = MA[τ; |z|^p]^(1/p), MVar[τ, p; z] = MA[τ; |z − MA[τ; z]|^p], MSD[τ, p; z] = MA[τ; |z − MA[τ; z]|^p]^(1/p). (32) The norm and standard deviation are homogeneous of degree one with respect to z. The p-moment μ_p is related to the norm by μ_p = MA[|z|^p] = MNorm[z]^p. Usually, p=2 is taken. Lower values for p provide a more robust estimate (see Section 3.6), and p=1 is another common choice. Even lower values can be used, for example p=1/2. In the formulae for MVar and MSD, there are two MA operators with the same range τ and the same kernel. This choice is in line with standard practice: empirical means and variances are computed for the same sample. Other choices can be interesting--for example, the sample mean can be estimated with a longer time range.”
Achin, Jensen, and Zumbach are deemed to be analogous references as they are reasonably pertinent to each other and directed towards measuring, collecting, and analyzing information with a series of inputs to solve similar problems in similar environments. Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to have modified Achin and Jensen with the aforementioned teachings from Zumbach with a reasonable expectation of success, by adding steps that allow the software to perform mathematical operations with the motivation to more efficiently and accurately organize and analyze data [Zumbach 0104].
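For illustration only, one way an expected improvement value and an expected worsening value could be derived from an empirical mean and variance is sketched below. The record does not state how these values are computed; the sketch assumes, as a labeled assumption, that a configuration's performance is normally distributed and compares it against a hypothetical reference score called incumbent. Under that assumption the closed forms are EI = (μ − b)Φ(z) + σφ(z) and EW = (b − μ)Φ(−z) + σφ(z), with z = (μ − b)/σ, so that EI − EW = μ − b.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def expected_improvement_and_worsening(scores, incumbent):
    """Return (EI, EW) of a configuration relative to an incumbent score.

    scores: at least two observed performances of the configuration.
    incumbent: reference performance to beat (hypothetical parameter).
    Assumes performance ~ Normal(empirical mean, empirical variance).
    """
    mu = mean(scores)
    sigma = sqrt(variance(scores))  # sample variance, needs >= 2 scores
    if sigma == 0.0:
        # Degenerate case: no spread, so the outcome is deterministic.
        return max(mu - incumbent, 0.0), max(incumbent - mu, 0.0)
    z = (mu - incumbent) / sigma
    std = NormalDist()  # standard normal: mean 0, standard deviation 1
    spread = sigma * std.pdf(z)
    ei = (mu - incumbent) * std.cdf(z) + spread   # E[max(X - incumbent, 0)]
    ew = (incumbent - mu) * std.cdf(-z) + spread  # E[max(incumbent - X, 0)]
    return ei, ew


# Illustrative scores only; they are not taken from the record.
ei, ew = expected_improvement_and_worsening([0.80, 0.82, 0.84], incumbent=0.82)
print(ei, ew)
```

When the empirical mean equals the incumbent, the two values coincide, which reflects that the configuration is as likely to help as to hurt under the stated assumption.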
Claims 10-11, 16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. PGPub 20180046926 to Achin et al. (hereinafter “Achin”), in view of U.S. PGPub 20180343171 to Jensen et al. (hereinafter “Jensen”), in further view of U.S. PGPub 20180018565 to Kurokawa et al. (hereinafter “Kurokawa”).
 As per claim 10, Achin and Jensen teach all the limitations of claim 1. 
 Achin and Jensen do not explicitly teach the following. However, Kurokawa teaches:
data set includes semiconductor correlation information, and the extracting of the meta-feature comprises extracting the meta-feature associated with the semiconductor correlation information; Kurokawa 0099: “Mainly described here is an operation example of the prediction circuit 112 performing learning and prediction using a neural network. Note that operation from Step S11 to Step S14 in FIG. 5 corresponds to learning by a neural network of the prediction circuit 112 (hereinafter also referred to as learning operation), and operation from Step S21 to Step S50 corresponds to prediction together with the learning by a neural network of the prediction circuit 112 (hereinafter also referred to as predicting operation). Note that the prediction is made by an inference (recognition) by the neural network. …0152: “Note that a convolutional neural network (CNN) in which the above-described neural network is used as a feature extraction filter of convolution or a fully connected arithmetic circuit can be used for the prediction circuit 112. Weight coefficients of the feature extraction filter can be determined using random numbers. Owing to this, even when a waveform pattern matching the signal Sco or the signal Sto is not easily expected, features can be extracted and the learning can be performed efficiently. …0188: “When the neural network described in this embodiment is used for the prediction circuit 112 in Embodiment 1, a semiconductor device capable of predicting the necessity of power supply can be provided.”
Achin, Jensen, and Kurokawa are deemed to be analogous references as they are reasonably pertinent to each other and directed towards measuring, collecting, and analyzing information with a series of inputs to solve similar problems in similar environments. Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to have modified Achin with the aforementioned teachings from Kurokawa with a reasonable expectation of success, by adding steps that allow the software to utilize system information with the motivation to more efficiently and accurately analyze data systems [Kurokawa 0099].
  As per claim 11, Achin, Jensen, and Kurokawa teach all the limitations of claim 10. 
 In addition, Kurokawa teaches:
wherein the extracting of the meta-feature associated with the semiconductor correlation information comprises extracting at least one of a meta-feature corresponding to smoothness associated with the semiconductor correlation information in the configuration space, a meta-feature corresponding to distribution density associated with the semiconductor correlation information in the configuration space, and a meta-feature corresponding to statistics associated with the semiconductor correlation information; Kurokawa 0099: “Mainly described here is an operation example of the prediction circuit 112 performing learning and prediction using a neural network. Note that operation from Step S11 to Step S14 in FIG. 5 corresponds to learning by a neural network of the prediction circuit 112 (hereinafter also referred to as learning operation), and operation from Step S21 to Step S50 corresponds to prediction together with the learning by a neural network of the prediction circuit 112 (hereinafter also referred to as predicting operation). Note that the prediction is made by an inference (recognition) by the neural network.” … 0152: “Note that a convolutional neural network (CNN) in which the above-described neural network is used as a feature extraction filter of convolution or a fully connected arithmetic circuit can be used for the prediction circuit 112. Weight coefficients of the feature extraction filter can be determined using random numbers. Owing to this, even when a waveform pattern matching the signal Sco or the signal Sto is not easily expected, features can be extracted and the learning can be performed efficiently.” … 0188: “When the neural network described in this embodiment is used for the prediction circuit 112 in Embodiment 1, a semiconductor device capable of predicting the necessity of power supply can be provided.”
Achin, Jensen, and Kurokawa are deemed to be analogous references as they are reasonably pertinent to each other and directed towards measuring, collecting, and analyzing information with a series of inputs to solve similar problems in similar environments. Before the effective filing date of the claimed invention, it would have been obvious for one of ordinary skill in the art to have modified Achin and Jensen with the aforementioned teachings from Kurokawa with a reasonable expectation of success, by adding steps that allow the software to utilize system information with the motivation to more efficiently and accurately analyze data systems [Kurokawa 0099].
Claim 16 is directed to a method very similar to the method of claims 10-11 above. Since Achin and Kurokawa teach the concepts recited in the limitations, the same art and rationale apply.
Claim 19 is directed to the CRM for performing the method of claims 10-11 above. Since Achin, Jensen, and Kurokawa teach the CRM, the same art and rationale apply.
Conclusion
 The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Yu; HongSuresh. Method For Meta-Level Continual Learning. U.S. PGPub 20190034798: Deep neural networks have shown great success in several application domains when a large amount of labeled data is available for training. However, the availability of such large training data has generally been a prerequisite in a majority of learning tasks. Furthermore, the standard deep neural networks lack the ability to continuously learn or incrementally learn new concepts on the fly, without forgetting or corrupting previously learned patterns. In contrast, humans can rapidly learn and generalize from a few examples of the same concept. Humans are also very good at incremental (i.e. continuous) learning. These abilities have been mostly explained by the meta learning (i.e. learning to learn) process in the brain (Harlow, 1949).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
 A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.  
 Any inquiry concerning this communication or earlier communications from the examiner should be directed to Arif Ullah, whose telephone number is (571) 270-0161.  The examiner can normally be reached from Monday to Friday between 9 AM and 5:30 PM.
 If any attempt to reach the examiner by telephone is unsuccessful, the examiner’s supervisor, Peter H. Choi, can be reached at (571) 272-6971.  The fax telephone numbers for this group are either (571) 273-8300 or (703) 872-9326 (for official communications including After Final communications labeled “Box AF”).
 Another resource that is available to applicants is the Patent Application Information Retrieval (PAIR) system. Information regarding the status of an application can be obtained from the PAIR system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, please feel free to contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Arif Ullah/
Primary Examiner, Art Unit 3683