DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 2019-12-09 and 2021-07-30 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.
Claim Status
Claims 1-20 are pending in the application.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with 
“retrospect learning module having logic instructions configured to” in Claim 1
“dataset splitting module configured to” in Claim 6
“model building module configured to” in Claim 6
“result testing module configured to” in Claim 6
“automatic hyperparameter enhancement module configured to” in Claim 6
“tuning module configured to” in Claim 6
“forgetting score calculating module for” in Claim 7
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.  
In this case, each “module” is interpreted as comprising processing hardware components as specified in Instant Specification [0036]:  “Those skilled in the pertinent art will appreciate that various embodiments may be described in terms of logical blocks, modules, circuits, algorithms, steps, and sequences of actions, which may be performed or otherwise controlled with a general purpose processor, a DSP, an application specific integrated circuit (ASIC), a field programmable gate array, programmable logic devices, discrete gates, transistor logic, discrete hardware components, elements associated with a computing device, or any suitable combination thereof designed to perform or otherwise control the functions described herein”.  

If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-7 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim limitations 
“retrospect learning module having logic instructions configured to” in Claim 1
“dataset splitting module configured to” in Claim 6
“model building module configured to” in Claim 6
“result testing module configured to” in Claim 6
“automatic hyperparameter enhancement module configured to” in Claim 6
“tuning module configured to” in Claim 6
invoke 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function.  
In this case, each “module” is interpreted as comprising processing hardware components as specified in Instant Specification [0036]:  “Those skilled in the pertinent art will appreciate that various embodiments may be described in terms of logical blocks, modules, circuits, algorithms, steps, and sequences of actions, which may be performed or otherwise controlled with a general purpose processor, a DSP, an application specific integrated circuit (ASIC), a field programmable gate array, programmable logic devices, discrete gates, transistor logic, discrete hardware components, elements associated with a computing device, or any suitable combination thereof designed to perform or otherwise control the functions described herein”.  
However, MPEP 2181(II)(b) states: “For a computer-implemented 35 U.S.C. 112(f)  claim limitation, the specification must disclose an algorithm for performing the claimed specific computer function, or else the claim is indefinite under 35 U.S.C. 112(b)”.  Unlike the 
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-3, 5-6, 8-10, 13, and 15-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Jomaa et al. (“Hyp-RL : Hyperparameter Optimization by Reinforcement Learning”; hereinafter “Jomaa”).
As per Claim 1, Jomaa teaches a Machine Learning (ML) system comprising: a processing device; and a memory device (Jomaa, Pg 1 Abstract, discloses machine learning:  “Hyperparameter tuning is an omnipresent problem in machine learning as it is an integral aspect of obtaining the state-of-the-art performance for any model.”  Jomaa, Section 6.2 on the top paragraph of Page 12, discloses:  “The policy network has 115369 parameters and required 24 GPU hours to train for 10 million frames.”  Here, Jomaa discloses the use of a GPU, and thus implies that Jomaa is using a computing system, which includes a processing device (“GPU”) and also inherently must include a memory device in order to run the code.)
(Jomaa, Pg 1 Abstract Para 2, discloses:  “In this paper we model the hyperparameter optimization problem as a sequential decision problem, which hyperparameter to test next, and address it with reinforcement learning”.  Here, Jomaa discloses using reinforcement learning to learn hyperparameters.  The code that performs this may be called a “retrospect learning module”, as Jomaa has above disclosed a computer system (GPU), and the Instant Specification [0015] states: “The retrospect learning module includes logic instructions configured to cause the processing device to use Reinforcement Learning (RL) to tune hyperparameters of one or more ML techniques and to cause the processing device to train a ML model using the one or more ML techniques in which the respective hyperparameters were tuned in the RL.”)
having logic instructions configured to cause the processing device to
use Reinforcement Learning (RL) to tune hyperparameters of one or more ML techniques (Jomaa, Pg 4 Section 4, discloses:  “In this section, we formulate the sequential decision-making task of hyperparameter tuning as a Markov Decision Process (MDP) and describe the model architecture used.”, and in Section 4.1 continues:  “A standard reinforcement learning setting is based on an MDP”)
and train a ML model using the one or more ML techniques in which the respective hyperparameters were tuned with the RL (Jomaa, Pg 5 Section 4.1 above Eq 5, discloses:  “The response can be any meaningful performance metric, so without loss of generality, we consider it to be the validation loss of the model M trained with hyperparameters”.  Here, Jomaa discloses training a ML model as part of an iterative RL process, that is, during the hyperparameter tuning process, and thus the hyperparameters were tuned with RL.  It is also inherent that one would continue to train once the best hyperparameters were found. This is acknowledged in the Related Work in Pg 3 Section 2 Para 4: “Hyperparameter optimization is also addressed within the scope of reinforcement learning, specifically for architectural network design. In [41], an RNN-based policy network is proposed which iteratively generates new architectures based on the gradient information from its child networks. At the end of the training process, the model that resulted in the best validation error is selected for further training.”)
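For illustration only, the sequential decision framing quoted above (choose the next hyperparameter configuration, observe a response, update) can be sketched as follows. This is a deliberately simplified epsilon-greedy agent over a discrete set of configurations, not the policy network described in Jomaa; the response function `validation_loss`, the candidate learning rates, and all parameter values are hypothetical.

```python
import random


def validation_loss(learning_rate: float) -> float:
    """Hypothetical response function: the loss is lowest at learning_rate = 0.1."""
    return (learning_rate - 0.1) ** 2


def tune(configs, episodes=200, epsilon=0.2, seed=0):
    """Epsilon-greedy search over a discrete set of hyperparameter configurations."""
    rng = random.Random(seed)
    value = {c: 0.0 for c in configs}  # running mean reward per action
    count = {c: 0 for c in configs}
    history = []  # evaluated configurations and their responses (the "state")
    for _ in range(episodes):
        if not history or rng.random() < epsilon:
            action = rng.choice(configs)          # explore
        else:
            action = max(configs, key=value.get)  # exploit the best estimate
        reward = -validation_loss(action)         # lower loss means higher reward
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
        history.append((action, reward))
    best_config = max(history, key=lambda item: item[1])[0]
    return best_config, history


best, history = tune([0.001, 0.01, 0.1, 1.0])
```

The configuration returned by `tune` would then be used to train the final model, which corresponds to the "train a ML model" limitation discussed above.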

As per Claim 2, Jomaa teaches the system of claim 1.  Jomaa teaches wherein the logic instructions further cause the processing device to store information from one or more previous iterations of ML model-building processes (Jomaa, Pg 5 Section 4.1 below Eq 7, discloses:  “The state of the environment is defined as the data set metafeatures D = R^w, with w = dim(D) as the number of metafeatures, described in Section 5, plus the history of evaluated hyperparameter configurations and their corresponding response”.  Here, Jomaa discloses “the history of evaluated hyperparameter configurations and their corresponding response”, which is information from previous iterations of the ML model-building process.  Furthermore, Section 4.2 on Page 7 discloses:  “A simple LSTM cell includes a memory cell, c_t that accumulates information across time steps”, and at the end of Section 4.2 on Pg 8 concludes:  “Through this formulation, the agent is able to start navigating the hyperparameter response surface intelligently from the very start”.  Thus, again Jomaa discloses storing information from previous iterations during the process.)
(Jomaa, Pg 6 Section 4.2, discloses: “The state representation is decomposed into two parts: the static data set features s_static = d, and the sequence of selected hyperparameter configurations and their corresponding rewards, s_dynamic ∈ (Λ × R), which we model as a dynamic multivariate time series distribution at time T as s_dynamic ∈ (Λ × R)^T. Each channel hence in s_dynamic represents a hyperparameter value, with the final channel including the reward.”  Here, Jomaa discloses “sequence of selected hyperparameter configurations and their corresponding rewards”, and thus the stored information is utilized as a reward within the RL.)

As per Claim 3, Jomaa teaches the ML system of claim 1.  Jomaa teaches wherein the stored information includes metrics of one or more intermediate ML models obtained during the one or more previous iterations, wherein the metrics include one or more of accuracy, precision, recall, a training time, an inference time, and a forgetting score, and wherein the forgetting score is used to evaluate how well the ML model-building processes can learn new patterns while retaining knowledge of previously learned patterns.  (Jomaa, Pg 5 Section 4.1 below Eq 5, discloses:  “The response can be any meaningful performance metric, so without loss of generality, we consider it to be the validation loss of the model M trained with hyperparameters λ”.  Here, Jomaa discloses “validation loss”, which is a measure of accuracy.)

As per Claim 5, Jomaa teaches the ML system of claim 1.  Jomaa teaches wherein the logic instructions further cause the processing device to receive an input dataset with respect to an environment for which the ML model is to be modeled (Jomaa, Pg 4 Section 3, discloses:  “The objective of a machine learning algorithm A : D × Λ → M is to estimate a model Mλ ∈ M, from the space of all models M with hyperparameters λ ∈ Λ, that optimizes an objective function, for example a loss function L, over a data set distribution D”.  Here, Jomaa discloses an input dataset (“data set distribution”)).
(Jomaa, Pg 4 Section 3 before Eq 2, discloses:  “However, the trained model Mλ might suffer from a generalization error if λ is not carefully optimized. The task of hyperparameter optimization is to identify an optimal hyperparameter configuration λ ∈ Λ that results in a model Mλ, such that the generalization error on the validation set is minimized.” Here, Jomaa discloses a “trained model”, thus inherently disclosing a “training set”.  This model is then iterated through intermediate models, as Jomaa describes an iterative process that “results in a model”, and thus there are intermediate models before the final result.)
Jomaa suggests but does not explicitly teach split the input dataset into at least a training dataset and a testing dataset; use the training dataset to build an intermediate ML model; use the testing dataset to obtain metrics about the intermediate ML model. (Jomaa, Pg 4 Section 3 before Eq 2, discloses:  “However, the trained model Mλ might suffer from a generalization error if λ is not carefully optimized. The task of hyperparameter optimization is to identify an optimal hyperparameter configuration λ ∈ Λ that results in a model Mλ, such that the generalization error on the validation set is minimized.” Here, Jomaa discloses a “validation set” and a “trained” model. This suggests that the input data set has been split into a “training set” and a “validation set”.  This model is then iterated through intermediate models, as Jomaa describes an iterative process that “results in a model”, and thus there are intermediate models before the final result, and the training dataset has been used in this process to build the intermediate ML model.  Here, the “validation set” of Jomaa is functioning as a “testing set” which is used to obtain metrics about the intermediate ML model (“generalization error on the validation set is minimized”), the “generalization error” being a metric.)
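As a non-authoritative illustration of the three limitations discussed above (split the input dataset, build an intermediate model from the training portion, obtain metrics from the held-out portion), a minimal sketch might look like the following. The one-dimensional threshold classifier, the toy dataset, and the 75/25 split are assumptions made for the example and do not come from Jomaa or the instant application.

```python
import random


def split_dataset(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split into training and testing datasets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def build_model(train_rows):
    """Toy intermediate model: a threshold midway between the two class means."""
    pos = [x for x, y in train_rows if y == 1]
    neg = [x for x, y in train_rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the prediction x > threshold."""
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)


# Hypothetical input dataset: label is 1 when the feature is at least 10.
dataset = [(float(i), int(i >= 10)) for i in range(20)]
train, test = split_dataset(dataset)
model = build_model(train)      # built from the training dataset only
metric = accuracy(model, test)  # metric obtained from the testing dataset
```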

As per Claim 6, Jomaa teaches the ML system of claim 1.  Jomaa teaches wherein the retrospect learning module comprises a dataset splitting module configured to split an input dataset from an environment in which a ML model is intended to operate (Jomaa, Pg 4 Section 3 before Eq 2, discloses:  “However, the trained model Mλ might suffer from a generalization error if λ is not carefully optimized. The task of hyperparameter optimization is to identify an optimal hyperparameter configuration λ ∈ Λ that results in a model Mλ, such that the generalization error on the validation set is minimized.” Here, Jomaa discloses a “validation set” and a “trained” model. This suggests that the input data set has been split into a “training set” and a “validation set”, thus suggesting a “dataset splitting module”)
a model building module configured to build ML models in multiple iterations (Jomaa, Pg 5 Section 4.1 below Eq 7, discloses:  “The state of the environment is defined as the data set metafeatures D = R^w, with w = dim(D) as the number of metafeatures, described in Section 5, plus the history of evaluated hyperparameter configurations and their corresponding response”.  Here, Jomaa discloses “the history of evaluated hyperparameter configurations and their corresponding response”, which is information from previous iterations of the ML model-building process.  Each hyperparameter configuration represents an intermediate ML model.  Thus, Jomaa suggests a model building module configured to build ML models in multiple iterations.)
a result testing module configured to obtain metrics regarding each iteration (Jomaa, Pg 4 Section 3 before Eq 2, discloses:  “However, the trained model Mλ might suffer from a generalization error if λ is not carefully optimized. The task of hyperparameter optimization is to identify an optimal hyperparameter configuration λ ∈ Λ that results in a model Mλ, such that the generalization error on the validation set is minimized.” Here, Jomaa discloses obtaining metrics about the intermediate ML model (“generalization error on the validation set is minimized”), the “generalization error” being a metric.)
an automatic hyperparameter enhancement module configured to automatically tune the hyperparameters of ML techniques of the ML model (Jomaa, Pg 1 Intro Para 1 concludes: “To achieve an automatic hyperparameter tuning, one must also take into account how well certain configurations performed on other data sets and carry this knowledge over to new data sets. A well trained policy would then be able to navigate the response surface of a learning algorithm based on previous experiences, in order to converge rapidly to a global optimum.”  Here, Jomaa discloses “automatic hyperparameter tuning”).
(Jomaa, Pg 2 Intro Para 2, also concludes: “In this paper, we learn a controller that can tune the hyperparameters of a fixed topology by defining a state representation, set of actions, and a transition function which allow an agent to navigate the response surface and maximize its reward.” Here, Jomaa discloses a tuning module (“a controller that can tune the hyperparameters”)).

Claim 8 is a method claim corresponding to system Claim 1, and is rejected for the same reasons.

Claim 9 is a method claim corresponding to system Claim 2, and is rejected for the same reasons.

Claim 10 is a method claim corresponding to system Claim 3, and is rejected for the same reasons.

Claim 13 is a method claim corresponding to system Claim 5, and is rejected for the same reasons.

As per Claim 15, Jomaa teaches the ML system of claim 1.  Jomaa teaches wherein the RL-based system includes: states defined as one or more of performance metrics, parameters of previously-training ML models, information provided by a human expert, information (Jomaa, Pg 5 Section 4.1 below Eq 7, discloses:  “The state of the environment is defined as the data set metafeatures D = R^w, with w = dim(D) as the number of metafeatures, described in Section 5, plus the history of evaluated hyperparameter configurations and their corresponding response”.  Here, Jomaa discloses states which comprise parameters of previously-training ML models (“history of evaluated hyperparameter configurations”) and performance metrics (“and their corresponding response”))
actions defined as a tuning of the hyperparameters (Jomaa, Pg 5 Section 4.1 below Eq 6, discloses:  “The agent navigates the hyperparameter response space through a series of actions, which are simply the next hyperparameter configurations to be evaluated.”)
rewards defined as one or more of maximizing accuracy, precision, and recall; minimizing amount of data required; minimizing computation time; minimize human labelling; minimizing cost associated with large hyperparameter changes; maximizing transfer efficiency; minimizing forgetting score; and a configurable weighted combination of a plurality of these rewards. (Jomaa, Pg 5 Section 4.1 below Eq 5, discloses:  “The response can be any meaningful performance metric, so without loss of generality, we consider it to be the validation loss of the model M trained with hyperparameters λ”.  Here, Jomaa discloses that the reward (“response”, see Section 2 last paragraph:  “observes the reward provided by the response function”) is “any meaningful performance metric”, and discloses “validation loss” for which minimizing is a measure of “maximizing accuracy”).
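The "configurable weighted combination" of rewards recited in the last limitation can be pictured, purely as an illustrative sketch, as a weighted sum in which quantities to be minimized carry negative weights. The metric names, values, and weights below are hypothetical and are not taken from Jomaa or from the instant specification.

```python
def combined_reward(metrics, weights):
    """Weighted sum of reward terms; terms to be minimized use negative weights."""
    return sum(weights[name] * metrics[name] for name in weights)


# Hypothetical measurements for one intermediate model.
metrics = {"accuracy": 0.9, "training_time_s": 120.0, "forgetting_score": 0.2}
# Hypothetical configuration: reward accuracy, penalize time and forgetting.
weights = {"accuracy": 1.0, "training_time_s": -0.001, "forgetting_score": -0.5}

reward = combined_reward(metrics, weights)
```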

Claim 16 is a non-transitory computer readable medium claim corresponding to system Claim 1.  The difference is that it recites a non-transitory computer readable medium. Jomaa, Section 6.2 on the top paragraph of Page 12, discloses:  “The policy network has 115369 parameters and required 24 GPU hours to train for 10 million frames.”  Here, Jomaa discloses the use of a GPU, and thus implies that Jomaa is using a computing system, which suggests a non-transitory computer readable medium.  Claim 16 is rejected for the same reasons as Claim 1.

Claim 17 is a non-transitory computer readable medium claim corresponding to system Claim 2, and is rejected for the same reasons.

Claim 18 is a non-transitory computer readable medium claim corresponding to system Claim 3, and is rejected for the same reasons.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 4, 12, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Jomaa in view of Kemker et al. (“Measuring Catastrophic Forgetting in Neural Networks”; hereinafter “Kemker”) and Zhong et al. (US 2021/0073665 A1; hereinafter “Zhong”).
As per Claim 4, Jomaa teaches the ML system of claim 3.  However, Jomaa does not teach wherein the logic instructions further cause the processing device to calculate the forgetting score by using a first dataset to train a first model; determining a first accuracy of the first model when applied to the first dataset; using a second dataset to tune the first model to achieve a second model; determining a second accuracy of the second model when applied to the second dataset; determining a third accuracy of the second model when applied to the first dataset; and calculating a ratio between the second accuracy and the third accuracy.
Kemker teaches wherein the logic instructions further cause the processing device to calculate the forgetting score by (Kemker, Pg 3391 Intro Para 3, discloses:  “We establish new benchmarks with novel metrics for measuring catastrophic forgetting”)

For the following limitations, refer to Kemker pg 3394 “Evaluation Metrics”: “We propose three new metrics to evaluate a model’s ability to retain prior sessions while still learning new knowledge,

[Image media_image1.png: equations defining Ωbase, Ωnew, and Ωall]

where T is the total number of sessions, αnew,i is the test accuracy for session i immediately after it is learned, αbase,i is the test accuracy on the first session (base set) after i new sessions have been learned, αall,i is the test accuracy of all of the test data for the classes seen to this point, and αideal is the offline MLP accuracy on the base set, which we assume is the ideal performance. Ωbase, Ωnew, and Ωall are normalized area under the curve metrics. Ωbase measures a model’s retention of the first session, after learning in later study sessions. Ωnew measures the model’s ability to immediately recall new tasks. Ωall computes how well a model both retains prior knowledge and acquires new information. By normalizing Ωbase and Ωall by αideal, the results will be easier to compare between datasets. Unless a model exceeds αideal, results will be between [0,1], which enables comparison between datasets.”
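The equations themselves appear only in the referenced image. As a hedged sketch consistent with the quoted description (normalized area under the curve, with Ωbase and Ωall normalized by αideal), the three metrics could be computed as below. The assumption that each metric is a simple mean over sessions 2 through T, as well as the function name and the sample accuracies, are illustrative only.

```python
def omega_metrics(alpha_base, alpha_new, alpha_all, alpha_ideal):
    """Compute (omega_base, omega_new, omega_all).

    Each list holds one accuracy per session for sessions 2..T (length T - 1).
    Assumption: each metric is the mean over those sessions, and omega_base and
    omega_all are normalized by alpha_ideal as the quoted passage describes.
    """
    n = len(alpha_base)
    if n == 0 or len(alpha_new) != n or len(alpha_all) != n:
        raise ValueError("accuracy lists must be non-empty and of equal length")
    if alpha_ideal <= 0:
        raise ValueError("alpha_ideal must be positive")
    omega_base = sum(a / alpha_ideal for a in alpha_base) / n
    omega_new = sum(alpha_new) / n
    omega_all = sum(a / alpha_ideal for a in alpha_all) / n
    return omega_base, omega_new, omega_all


# Hypothetical accuracies for T = 3 sessions (two incremental sessions).
omega_base, omega_new, omega_all = omega_metrics(
    alpha_base=[0.8, 0.6], alpha_new=[0.9, 0.7], alpha_all=[0.8, 0.4], alpha_ideal=0.8
)
```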

using a first dataset to train a first model (Kemker, Pg 3391 Intro, “Problem Formulation”, discloses:  “In this paper, we study catastrophic forgetting in MLP-based neural networks that are incrementally trained for classification tasks.”  Here, Kemker discloses “incrementally trained”, which implies the use of a first dataset to train a first model.)
determining a first accuracy of the first model when applied to the first dataset (Kemker, Pg 3394 “Evaluation Metrics”, discloses:  “αideal is the offline MLP accuracy on the base set”)
using a second dataset to tune the first model to achieve a second model (Kemker, Pg 3391 Intro, “Problem Formulation”, discloses:  “In this paper, we study catastrophic forgetting in MLP-based neural networks that are incrementally trained for classification tasks.”  Here, Kemker discloses “incrementally trained”, which implies the use of a second dataset to train a second model.)
determining a second accuracy of the second model when applied to the second dataset (Kemker, Pg 3394 “Evaluation Metrics”, discloses:  “αnew,i is the test accuracy for session i immediately after it is learned”.  When i = 2, then this accuracy is for a second model on a second dataset.)
determining a third accuracy of the second model when applied to the first dataset (Kemker, Pg 3394 “Evaluation Metrics”, discloses:  “αbase,i is the test accuracy on the first session (base set) after i new sessions have been learned”.  When i = 2, then this accuracy is for a second model on the first dataset.)
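The five steps mapped above (train a first model, measure a first accuracy, tune to a second model, measure second and third accuracies) can be sketched end to end as follows. This is an illustration under stated assumptions: the threshold classifier, the `blend` tuning rule, and the toy datasets are hypothetical, and because the claim does not fix the direction of the ratio, the sketch arbitrarily divides the third accuracy by the second.

```python
def midpoint_threshold(rows):
    """Toy model: a threshold midway between the two class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the prediction x > threshold."""
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)


def tune(threshold, rows, blend=0.75):
    """Hypothetical tuning rule: move the old threshold toward the new data."""
    return (1 - blend) * threshold + blend * midpoint_threshold(rows)


def forgetting_score(first_dataset, second_dataset):
    first_model = midpoint_threshold(first_dataset)
    first_accuracy = accuracy(first_model, first_dataset)
    second_model = tune(first_model, second_dataset)
    second_accuracy = accuracy(second_model, second_dataset)
    third_accuracy = accuracy(second_model, first_dataset)
    if second_accuracy == 0:
        raise ValueError("second accuracy is zero; ratio undefined")
    # Assumed direction: retained accuracy relative to newly learned accuracy.
    ratio = third_accuracy / second_accuracy
    return first_accuracy, second_accuracy, third_accuracy, ratio


first = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]
second = [(3.0, 0), (4.0, 0), (5.0, 1), (6.0, 1)]
acc1, acc2, acc3, score = forgetting_score(first, second)
```

In this toy run the tuned model fits the second dataset but misclassifies half of the first dataset, so the ratio falls below 1.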
Jomaa and Kemker are analogous art because they are both in the field of endeavor of machine learning.
It would have been obvious before the effective filing date of the claimed invention to combine the RL hyperparameter tuning of Jomaa with the forgetting score of Kemker.  Jomaa’s 
However, while Kemker teaches calculating a ratio between the first accuracy and the third accuracy, Kemker does not explicitly teach calculating a ratio between the second accuracy and the third accuracy.
Zhong teaches calculating a ratio between the second accuracy and the third accuracy.  (Recall above that Kemker established that the second and third accuracies are accuracies of the same model (“second model”) on the second and first datasets, respectively, which may be considered “target” and “source” datasets.  Zhong, [0030] and [0034-0037], discloses:

[Images media_image2.png and media_image3.png: Zhong, paragraphs [0030] and [0034]-[0037], including Eq. 4]

Here, Zhong discloses in [0030] that “acctarget is an estimation accuracy of the model f with respect to the target dataset” and in [0035] that “accsource is the accuracy of the model f with respect to the source dataset”.  Zhong, [0037], discloses:  “It can be seen that the robustness of the model can be determined based on a ratio of a smaller accuracy in the first accuracy accsource and the estimation accuracy acctarget of the model with respect to the source dataset to the first accuracy.”  This is shown by Eq. 4, where in a case where acctarget < accsource, Zhong discloses that robustness can be measured by acctarget / accsource, or a ratio between the second accuracy and the third accuracy.)
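Eq. 4 of Zhong is shown only in the referenced image. Based solely on the sentence quoted from [0037] (the ratio of the smaller of accsource and acctarget to accsource), a minimal sketch of such a robustness measure could read as follows; the function name and sample values are hypothetical.

```python
def robustness(acc_source: float, acc_target: float) -> float:
    """Ratio of the smaller of the two accuracies to the source accuracy."""
    if acc_source <= 0:
        raise ValueError("acc_source must be positive")
    return min(acc_source, acc_target) / acc_source


drop = robustness(acc_source=0.9, acc_target=0.6)     # target accuracy is lower
no_drop = robustness(acc_source=0.8, acc_target=0.9)  # capped at 1.0
```

When acctarget is below accsource the value reduces to acctarget / accsource, which is the case relied upon in the mapping above.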
Jomaa, Kemker, and Zhong are analogous art because they are all in the field of endeavor of machine learning.
It would have been obvious before the effective filing date of the claimed invention to combine the RL hyperparameter tuning of Jomaa with the forgetting score of Kemker, also with the robustness measure comprising an accuracy ratio of Zhong.  Jomaa’s method comprises multiple iterations of training, and Kemker provides a way of measuring the forgetting of the training on previous datasets. Zhong provides another ratio that provides another measure of robustness of a model over different datasets, and when applied to the transfer-learning/domain-adaptation problem of Jomaa and Kemker, and in particular Kemker’s measurement of forgetting, provides another way of measuring catastrophic forgetting.  One of ordinary skill in the art would be motivated to do so in order to be able to measure robustness in order to make sure the model can be consistently accurate across multiple datasets (Zhong, end of [0011]:  “estimate the robustness of the model according to the accuracy of the model with respect to the first dataset and the estimated accuracy of the model with respect to the second dataset.”)

Claim 12 is a method claim corresponding to system Claim 4, and is rejected for the same reasons.

Claim 20 is a non-transitory computer readable medium claim corresponding to system Claim 4.  The difference is that it recites a non-transitory computer readable medium. Jomaa, Section 6.2 on the top paragraph of Page 12, discloses:  “The policy network has 115369 parameters and required 24 GPU hours to train for 10 million frames.”  Here, Jomaa discloses the use of a GPU, and thus implies that Jomaa is using a computing system, which suggests a non-transitory computer readable medium.  Claim 20 is rejected for the same reasons as Claim 4.

Claims 7, 11, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Jomaa in view of Kemker.
As per Claim 7, Jomaa teaches the ML system of claim 6 and a retrospect learning module (see Rejection to Claim 1).  However, Jomaa does not teach wherein the retrospect learning module further comprises a forgetting score calculating module for calculating a forgetting score used to evaluate how well the ML model can learn new patterns while retaining information about previously learned patterns.
Kemker teaches wherein [the retrospect learning module further comprises] a forgetting score calculating module for calculating a forgetting score used to evaluate how well the ML model can learn new patterns while retaining information about previously learned patterns. (Kemker, Pg 3391 Intro Para 3, discloses:  “We establish new benchmarks with novel metrics for measuring catastrophic forgetting”.  Kemker, pg 3394 “Evaluation Metrics”, discloses:  “We propose three new metrics to evaluate a model’s ability to retain prior sessions while still learning new knowledge”).

It would have been obvious before the effective filing date of the claimed invention to combine the RL hyperparameter tuning of Jomaa with the forgetting score of Kemker.  Jomaa’s method comprises multiple iterations of training, and Kemker provides a way of measuring the forgetting of the training on previous datasets.  One of ordinary skill in the art would be motivated to do so in order to be able to mitigate catastrophic forgetting (Kemker, Abstract:  “In this paper, we introduce new metrics and benchmarks for directly comparing five different mechanisms designed to mitigate catastrophic forgetting in neural networks: regularization, ensembling, rehearsal, dual-memory, and sparse-coding. Our experiments on real-world images and sounds show that the mechanism(s) that are critical for optimal performance vary based on the incremental training paradigm and type of data being used, but they all demonstrate that the catastrophic forgetting problem is not yet solved.”)

As per Claim 11, Jomaa teaches the method of claim 10.  However, Jomaa does not teach wherein the metrics include at least the forgetting score, and wherein the forgetting score is used to evaluate how well the ML model-building processes can learn new patterns while retaining knowledge of previously learned patterns.
Kemker teaches wherein the metrics include at least the forgetting score, and wherein the forgetting score is used to evaluate how well the ML model-building processes can learn new patterns while retaining knowledge of previously learned patterns (Kemker, Pg 3391 Intro Para 3, discloses:  “We establish new benchmarks with novel metrics for measuring catastrophic forgetting”.  Kemker, Pg 3394 “Evaluation Metrics”, discloses:  “We propose three new metrics to evaluate a model’s ability to retain prior sessions while still learning new knowledge”).
Jomaa and Kemker are analogous art because they are both in the field of endeavor of machine learning.
It would have been obvious before the effective filing date of the claimed invention to combine the RL hyperparameter tuning of Jomaa with the forgetting score of Kemker.  Jomaa’s method comprises multiple iterations of training, and Kemker provides a way of measuring how much of the training on previous datasets is forgotten.  One of ordinary skill in the art would have been motivated to do so in order to mitigate catastrophic forgetting (Kemker, Abstract:  “In this paper, we introduce new metrics and benchmarks for directly comparing five different mechanisms designed to mitigate catastrophic forgetting in neural networks: regularization, ensembling, rehearsal, dual-memory, and sparse-coding. Our experiments on real-world images and sounds show that the mechanism(s) that are critical for optimal performance vary based on the incremental training paradigm and type of data being used, but they all demonstrate that the catastrophic forgetting problem is not yet solved.”).

Claim 19 is a non-transitory computer readable medium claim corresponding to method Claim 11, and differs only in that it recites a non-transitory computer readable medium.  Jomaa, Section 6.2, top paragraph of Page 12, discloses:  “The policy network has 115369 parameters and required 24 GPU hours to train for 10 million frames.”  Here, Jomaa discloses the use of a GPU, which implies that Jomaa uses a computing system and thus suggests a non-transitory computer readable medium.  Claim 19 is rejected for the same reasons as Claim 11.

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Jomaa in view of He et al. (WO 2018/212710 A1; hereinafter “He”).
	As per Claim 14, Jomaa teaches the method of claim 13.  Jomaa teaches wherein the method further comprises the step of utilizing the validation dataset to perform cross-validation multiple times to evaluate the intermediate ML model during multiple iterations.  (Jomaa, Page 9 above Section 6, discloses:  “The values of the hyperparameters are summarized in Table 2. The experimental results represent the average of a 5-fold based cross-validation. The hyperparameter grid results in 2916 distinct experiments per data set, and 13 dimensions per configuration.”  Here, Jomaa discloses “5-fold cross-validation”, and thus discloses performing cross-validation multiple times (5 times) to evaluate the intermediate ML model during multiple iterations (5 iterations).  Cross-validation inherently involves splitting a data set into a training and validation set.  Jomaa also discloses this earlier in Pg 8 Section 5:  “The hyperparameter response is obtained by evaluating on 20% of the data after training on the remaining 80%”).
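For illustration only, the 5-fold procedure quoted above, in which each fold trains on 80% of the data and evaluates on the remaining 20%, can be sketched as follows.  The sketch is not Jomaa’s code; the names k_fold_indices and cross_validate and the evaluate callback are assumptions made for this example.

```python
import random
from statistics import mean


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, validation


def cross_validate(n, evaluate, k=5, seed=0):
    """Average a caller-supplied score over the k train/validation splits.

    evaluate(train, validation) is a hypothetical callback that trains on the
    train indices and returns a score measured on the validation indices.
    """
    return mean(evaluate(tr, va) for tr, va in k_fold_indices(n, k, seed))
```

With k=5, every sample is used for validation exactly once and for training four times.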
	However, Jomaa does not explicitly teach wherein the step of the splitting the input dataset further includes the step of the splitting the input dataset into the training dataset, the testing dataset, and a validation dataset.
He teaches wherein the step of the splitting the input dataset further includes the step of the splitting the input dataset into the training dataset, the testing dataset, and a validation dataset (He, Section “Evaluation Protocol”, discloses:  “We randomly split each dataset into three portions: 70% for training, 20% for validation, and 10% for testing. The validation set is only used for tuning hyper-parameters, and the performance comparison is done on the test set. To evaluate the performance, we adopt root mean square error (RMSE), a widely used measure for regression task. A lower score indicates a better performance.”  Here, He discloses splitting the input dataset into a training dataset, a validation dataset, and a test dataset, and the test dataset is used to obtain metrics (“performance comparison is done on the test set”)).
Jomaa and He are analogous art because they are both in the field of endeavor of machine learning.
	It would have been obvious before the effective filing date of the claimed invention to combine the RL hyperparameter tuning of Jomaa with the 3-way dataset split of He.  One of ordinary skill in the art would have been motivated to do so in order to properly evaluate the performance of the machine learning model and thereby obtain a better-performing model after the hyperparameter tuning (He:  “The validation set is only used for tuning hyper-parameters, and the performance comparison is done on the test set. To evaluate the performance, we adopt root mean square error (RMSE), a widely used measure for regression task. A lower score indicates a better performance”).
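For illustration only, the 70%/20%/10% split and the RMSE measure quoted from He can be sketched as follows.  The sketch is not He’s code; the function names three_way_split and rmse, the seed parameter, and the use of rounding to size the portions are assumptions made for this example.

```python
import math
import random


def three_way_split(items, train_frac=0.7, val_frac=0.2, seed=0):
    """Randomly split items into training, validation, and test portions.

    The test portion receives whatever remains after the first two
    portions (10% with the default fractions).
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, validation, test


def rmse(y_true, y_pred):
    """Root mean square error; a lower value indicates better performance."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))
```

Under this arrangement the validation portion would be used only to choose hyperparameters, and the test portion would be held back for the final performance comparison.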

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Neary (“Automatic Hyperparameter Tuning in Deep Convolutional Neural Networks Using Asynchronous Reinforcement Learning”) discloses using RL to perform automatic hyperparameter tuning.
Wu et al. (“RPR-BP: A Deep Reinforcement Learning Method for Automatic Hyperparameter Optimization”) discloses using RL to perform automatic hyperparameter tuning.
Hsu et al. (“MONAS: Multi-Objective Neural Architecture Search using Reinforcement Learning”) discloses using RL to perform automatic hyperparameter tuning.
Dong et al. (“Dynamical Hyperparameter Optimization via Deep Reinforcement Learning in Tracking”) discloses using RL to perform automatic hyperparameter tuning.
Zoph and Le (“Neural Architecture Search with Reinforcement Learning”) discloses using RL to perform automatic hyperparameter tuning.
Chen et al. (“Deep Reinforcement Learning with Model-Based Acceleration for Hyperparameter Optimization”) discloses using RL to perform automatic hyperparameter tuning.
Balaprakash et al. (“Scalable reinforcement-learning-based neural architecture search for cancer deep learning research”) discloses using RL to perform automatic hyperparameter tuning.
Pfulb et al. (“Catastrophic forgetting: still a problem for DNNs”) discloses measuring forgetting for transfer learning; see Figure 1, showing testing results of the old model on the old dataset and of the new model on both the old and new datasets.
Sodhani et al. (“Towards Training Recurrent Neural Networks for Lifelong Learning”) discloses measuring forgetting for transfer learning; see Figure 1, showing accuracy measures of the new model on the previous and new tasks.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEONARD A SIEGER whose telephone number is (571)272-9710. The examiner can normally be reached M-F 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo can be reached on (571) 272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic 





/L.A.S./Examiner, Art Unit 2126         
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126