DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are presented for examination. 
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Lee et al [WO 2016/004075 A1].
	As to claims 1, 8, and 15, Lee et al teach a method, comprising: 
training a machine learning model using a given training dataset [e.g., Data set chunks 2844 in fig. 29; “A training set 3302 comprising a plurality of observation records (ORs) such as OR 3304A, OR 3304B and OR 3304C is to be used for training a model to predict the value of a dependent variable DV” in paragraph 0208; “In at least some embodiments, a model generator component of the MLS may require that input variables to be used for generating features (that can then be used for training a linear model) meet certain data-type constraints” in paragraph 0260]; and 
caching at least one parameter of the machine learning model from the training with the given training dataset, wherein the cached at least one parameter of the machine learning model is used for a subsequent training of the machine learning model [e.g., “In another approach to attaining consistent splits, respective mechanisms (e.g., APIs) may be implemented to (a) save a current state of a PRNG and (b) to re-set a PRNG to a saved state in one embodiment. Consider a scenario in which an API ‘save_state(PRNG)’ can be invoked to save the internal state of a PRNG to an object ‘state_AfterTraining’ after the training set of a TEI has been generated, and a different API ‘set_state(PRNG, state_AfterTraining)’ can be invoked to reset the state of the PRNG (or a different PRNG) to the saved state just before starting the selection of the test set of the TEI” in paragraph 0195; “The plan generator may determine a set of consistency metadata 3152, e.g., metadata that may be shared among related jobs that are inserted in the MLS job queue for the requested split iterations” in paragraph 0200; “During each learning iteration 5020, one or more prepared ORs 5015 may be examined by the model generator (which may also be referred to as a model trainer). Based on the examination of the input variables in the prepared OR, and/or the accuracy of a prediction for the dependent variables of the prepared OR by the model in its current state, respective parameters or weights may be identified for a new set of one or more processed variables. In at least some implementations, the previously-stored parameters or weights may be updated if needed in one or more learning iterations, e.g., using a stochastic gradient descent technique or some similar optimization approach” in paragraph 0262], 
wherein the method is performed by at least one processing device comprising a processor coupled to a memory [e.g., “At least in some implementations, a significant portion or all of the learning iterations of a particular model may be intended to be performed on a single MLS server such as server 5160 (e.g., using one or more threads of execution at such a server). In some such implementations, the parameter vector for the model may be required to fit in the main memory 5170 of the MLS server 5160” in paragraph 0267; fig. 76].
As to claims 2, 9, and 16, Lee et al teach wherein the caching is performed after each of a plurality of iterations of the training of the machine learning model [e.g., “Initially, in at least some implementations, an empty parameter vector 5025 may be created. The parameter vector 5025 may be used to store parameters (e.g., real numbers that represent respective weights) assigned to a collection of features or processed variable values, where the features are derived from the observation record contents using one or more feature processing transformations (FPTs) of the types described earlier” in paragraph 0261; Parameter vector 5025 in fig. 50].
As to claims 3, 10, and 17, Lee et al teach wherein a given cached iteration of the training of the machine learning model is identified using a key based at least in part on one or more of: (i) a hash of the given training dataset, (ii) a hash of the at least one parameter of the machine learning model following the given cached iteration, and (iii) one or more hyperparameters of the machine learning model following the given cached iteration [e.g., “When making a prediction of a dependent variable value for a given observation record, a linear model may compute the weighted sum of the features whose weights are included in the parameter vector in some implementations. In at least some embodiments, a key-value structure such as a hash map may be used for the parameter vector 5025, with feature identifiers (assigned by the model generator) as keys, and the parameters as respective values stored for each key” in paragraph 0261; Parameter vector 5025 in fig. 50].
As to claims 4, 11, and 18, Lee et al teach wherein the key for the given cached iteration of the training of the machine learning model is evaluated to determine if the given cached iteration is in a cache memory [e.g., “The term ‘feature vector’ may refer to a set of pairs or tuples of (feature identifiers, feature values), which may, for example, be stored in a key-value structure (such as a hash map) or a compressed vector. The term ‘feature parameter’ or ‘parameter’ may refer to a value of a parameter corresponding to a property indexed by the feature identifier. A real number representing a weight is one example of a parameter that may be used in some embodiments, although for some types of machine learning techniques more complex parameters (e.g., parameters that comprise multiple numerical values or probability distributions) may be used” in paragraph 0258; “Initially, in at least some implementations, an empty parameter vector 5025 may be created. The parameter vector 5025 may be used to store parameters (e.g., real numbers that represent respective weights) assigned to a collection of features or processed variable values, where the features are derived from the observation record contents using one or more feature processing transformations (FPTs) of the types described earlier” in paragraph 0261].
As to claims 5, 12, and 19, Lee et al teach wherein a given cached iteration of the training of the machine learning model comprises the trained machine learning model following the given cached iteration, checkpoints of the given cached iteration, and a response time of the given cached iteration [e.g., “At time t1, a training job J1 of a training-and-evaluation iteration TEI1 for a model M1 is begun. Job J1 is scheduled at a set of servers SS1 of the MLS, and may include the selection of a training set, e.g., either at the chunk-level, at the observation record level, or at both levels. A pseudo-random number source PRNS 3002 (such as a function or method that returns a sequence of PRNs, or a list of pre-generated PRNs) may be used to generate the training set for Job J1. At time t2, a training job J2 may be scheduled at a server set SS2, for a training-and-evaluation iteration TEI2 for a different model M2. The training set for job J2 may be obtained using pseudo-random numbers obtained from a different PRNS 3002B” in paragraph 0197; “At time t3, a test job J3 for the evaluation phase of TEI1 is scheduled, more than two hours later than job J1. The scheduling of J3 may be delayed until J1 completes, for example, and the size of the data set being used for J1/J3 may be so large that it takes more than two hours to complete the training phase in the depicted example. J3 may be scheduled at a different set of servers SS3 than were used for J1” in paragraph 0198; fig. 30].
As to claims 6, 13, and 20, Lee et al teach wherein the caching of a given iteration of the training of the machine learning model occurs when the given cached iteration is not found in a cache memory [e.g., “Initially, in at least some implementations, an empty parameter vector 5025 may be created. The parameter vector 5025 may be used to store parameters (e.g., real numbers that represent respective weights) assigned to a collection of features or processed variable values, where the features are derived from the observation record contents using one or more feature processing transformations (FPTs) of the types described earlier” in paragraph 0261].
As to claims 7 and 14, Lee et al teach wherein the cache memory is accessible by one or more of: (a) one or more physical processing devices, and (b) one or more virtual processing devices that implement the training of the machine learning model [e.g., “At least in some implementations, a significant portion or all of the learning iterations of a particular model may be intended to be performed on a single MLS server such as server 5160 (e.g., using one or more threads of execution at such a server). In some such implementations, the parameter vector for the model may be required to fit in the main memory 5170 of the MLS server 5160” in paragraph 0267; fig. 76].
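For illustration only, the caching limitations recited in claims 1-4 and 6 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from Lee et al: the function names (iteration_key, train), the toy one-weight linear model, the use of a plain dictionary as the cache memory, and the inclusion of the iteration index in the key are all hypothetical choices made for the example.

```python
import hashlib
import json


def iteration_key(dataset_blob, hyperparams, iteration):
    """Build a cache key from a hash of the training dataset, the
    hyperparameters, and the iteration index (the index is an assumption
    of this sketch, not a recited claim element)."""
    h = hashlib.sha256()
    h.update(hashlib.sha256(dataset_blob).digest())            # hash of the dataset
    h.update(json.dumps(hyperparams, sort_keys=True).encode()) # hyperparameters
    h.update(str(iteration).encode())
    return h.hexdigest()


def train(xs, ys, hyperparams, cache, iterations):
    """Fit y = w * x by gradient descent, caching w after each iteration.

    Returns the final weight and the number of iterations actually computed
    (iterations found in the cache are reused rather than recomputed).
    """
    dataset_blob = json.dumps([xs, ys]).encode()
    w = 0.0
    computed = 0
    for i in range(iterations):
        key = iteration_key(dataset_blob, hyperparams, i)
        cached_w = cache.get(key)        # evaluate the key against the cache
        if cached_w is not None:
            w = cached_w                 # reuse the cached parameter
            continue
        # Mean-squared-error gradient for the single weight w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= hyperparams["lr"] * grad
        cache[key] = w                   # cache only when the key was not found
        computed += 1
    return w, computed
```

In this sketch, a second call with the same dataset, hyperparameters, and cache recomputes nothing, and a later call requesting more iterations resumes from the last cached parameter.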
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: 
Dirac et al [US 2015/0379430 A1] teach training a machine learning model.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ilwoo Park whose telephone number is (571) 272-4155. The examiner can normally be reached on Monday through Friday from 9:00 AM to 5:00 PM. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Dr. Henry Tsai, can be reached on (571) 272-4176. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/ILWOO PARK/
Primary Examiner, Art Unit 2184
7/19/2022