DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 1 and 11 are objected to because of the following informalities:

(1) The claims recite the limitation: “… a first set distributed training systems … a second set distributed training systems …”. The limitation should recite “… a first set of distributed training systems … a second set of distributed training systems …” (emphasis added). Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Claims 1 and 11 recite the limitation: "… specified in the training specification … a candidate hyper parameter set of the plurality of sets of hyper-parameters …" (emphasis added). There is insufficient antecedent basis for this limitation in the claim.

(1) For the purposes of examination, the claims are interpreted as:
“… specified in the model training specification … a candidate hyper parameter set of the plurality of sets …” (emphasis added).
Claims 12-20 recite the limitation: "The computer program product of claim 11 …". There is insufficient antecedent basis for this limitation in the claim.

(1) For the purposes of examination, the claims are interpreted as: “The non-transitory computer readable medium of claim 11 …” (emphasis added).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 10 are rejected under 35 U.S.C. 103 as being unpatentable over Bilenko (US 2014/0344193 A1) in view of Achin et al., hereinafter referred to as Achin (US 2015/0339572 A1) in further view of Breckenridge et al., hereinafter referred to as Breckenridge (US 2015/0170056 A1). 

	As per claim 1, Bilenko discloses a system (Bilenko: Fig. 1, system 100.) comprising:
a device processor (Bilenko: Paras. [0057], [0065] disclose a device processor 602.); and
a non-transitory computer readable medium storing instructions executable by the device processor to (Bilenko: Paras. [0057], [0065] disclose a non-transitory computer readable medium storing instructions 604 executable by the device processor.):
receive a model training specification (Bilenko: Para. [0058] discloses receiving an indication that a hyper-parameter for a computer-executable learning algorithm is to be tuned [i.e., receiving a model training specification].);
determine a plurality of hyper parameter sets, the plurality of hyper parameter sets comprising a first hyper parameter set (214) and a second hyper parameter set (216) for training a type of predictive model (learning algorithm) specified in the training specification (Bilenko: Paras. [0052]-[0053], [0058] disclose that several candidate pairs of hyper-parameter configurations, comprising a first hyper parameter set (214) and a second hyper parameter set (216) for training a type of learning algorithm specified via the hyper-parameter to be tuned, are determined.);
distribute the first hyper parameter set to each of a first set distributed training systems (Bilenko: Paras. [0052]-[0053] disclose the system 100 can operate in a distributed fashion, such that several candidate pairs of hyper-parameter configurations comprising the first hyper parameter set (214) and the second hyper parameter set (216) are distributed to numerous computing devices.);
initiate training of … (206-208) of a complete set of training data (104) for each respective one of the first set of distributed training systems (Bilenko: Paras. [0030], [0052]-[0054] disclose initiating training of multiple pairs of predictive models, in parallel, across numerous computing devices [i.e., first and second sets of distributed training systems], based on the received tuning hyper-parameter indication, using the first and second hyper parameter sets, and using different portions, such as folds (206-208), of a complete set of training data (104) for each respective computing device.);
distribute the second hyper parameter set to each of a second set distributed training systems (Bilenko: Paras. [0052]-[0053] disclose the system 100 can operate in a distributed fashion, such that several candidate pairs of hyper-parameter configurations comprising the first hyper parameter set (214) and the second hyper parameter set (216) are distributed to numerous computing devices.);
initiate training of … (206-208) of a complete set of training data for each respective one of the second set of distributed training systems (Bilenko: Paras. [0030], [0052]-[0054] disclose initiating training of multiple pairs of predictive models, in parallel, across numerous computing devices [i.e., first and second sets of distributed training systems], based on the received tuning hyper-parameter indication, using the first and second hyper parameter sets, and using different portions, such as folds (206-208), of a complete set of training data (104) for each respective computing device.);
select a candidate hyper parameter set of the plurality of sets of hyper-parameters, based on a measure of estimated effectiveness of each of the first plurality of predictive models and second plurality of predictive models (Bilenko: Paras. [0053]-[0054] disclose selecting a candidate hyper parameter set of the plurality of sets of hyper-parameters, based on performance [i.e., a measure of estimated effectiveness] of the plurality of the first and second predictive models (210, 212).); and

However, Bilenko does not explicitly disclose “… training of a first plurality of predictive models, in parallel, by the first set of distributed training systems … training of a second plurality of predictive models, in parallel, by the second set of distributed training systems … generate a production predictive model by training a predictive model using the selected candidate hyper parameter set and the complete set of training data.”
Further, Achin is in the same field of endeavor and teaches training of a first plurality of predictive models, in parallel, by the first set of distributed training systems and training of a second plurality of predictive models, in parallel, by the second set of distributed training systems (Achin: Fig. 3 & Paras. [0123], [0129], [0130] disclose training a first set of predictive models from a subset of predictive models in parallel on a first set of processing nodes and training a second set of predictive models in parallel from the subset of predictive models on a second set of processing nodes from multiple processing nodes.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Bilenko and Achin before him or her, to modify the predictive model system of Bilenko to include the feature of training sets of predictive models by sets of distributed training systems, as described in Achin. The motivation for doing so would have been to improve prediction output scores of prediction models by using development techniques that provide enhancements to accuracy for the desired prediction model.
However, Bilenko-Achin do not explicitly disclose “… generate a production predictive model by training a predictive model using the selected candidate hyper parameter set and the complete set of training data.”
Furthermore, Breckenridge is in the same field of endeavor and teaches generating a production predictive model (trained predictive model) by training a predictive model (best or most optimal predictive model) using the selected candidate hyper parameter set (included in the optimal predictive model) and the complete set of training data (Breckenridge: Fig. 5 & Paras. [0046], [0049], [0053]-[0055] disclose estimating the effectiveness of multiple predictive models that use multiple hyper-parameter configurations, selecting the best model based on that effectiveness, which includes the most optimal hyper-parameter configuration, and finally training the best predictive model using a complete set of K partitions of training data to generate the trained predictive model.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Bilenko-Achin and Breckenridge before him or her, to modify the predictive model system of Bilenko-Achin to include the production predictive model generation feature as described in Breckenridge. The motivation for doing so would have been to improve prediction outputs of predictive models using predictive model optimization by providing multiple training configurations with various data formats.
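
The sequence of limitations mapped for claim 1 (determine hyper parameter sets, train models in parallel on different folds of the training data, select a candidate set by estimated effectiveness, then train a production model on the complete set) can be illustrated with a minimal Python sketch. The sketch is an assumption-laden illustration only: the toy one-dimensional ridge learner, the function names (`train_ridge`, `evaluate`, `select_and_retrain`), and the thread pool standing in for distributed training systems are hypothetical and do not appear in Bilenko, Achin, or Breckenridge.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def train_ridge(data, lam):
    """Fit y ~ w * x with an L2 penalty lam (closed form). Toy stand-in learner."""
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    return sxy / (sxx + lam)


def mse(w, data):
    """Mean squared error of the model w on the given rows."""
    return mean((y - w * x) ** 2 for x, y in data)


def evaluate(hp, folds, held_out):
    """Train on every fold except one and score on the held-out fold."""
    train = [row for i, fold in enumerate(folds) if i != held_out for row in fold]
    return mse(train_ridge(train, hp["lam"]), folds[held_out])


def select_and_retrain(data, hp_sets, k=3):
    """Score each hyper parameter set across k folds in parallel, pick the
    best one, then train the production model on the complete data set."""
    folds = [data[i::k] for i in range(k)]
    # The thread pool is an assumed stand-in for sets of training systems.
    with ThreadPoolExecutor() as pool:
        futures = [
            [pool.submit(evaluate, hp, folds, i) for i in range(k)]
            for hp in hp_sets
        ]
        scores = [mean(f.result() for f in group) for group in futures]
    best = hp_sets[scores.index(min(scores))]  # lower error = more effective
    return best, train_ridge(data, best["lam"])
```

In this sketch each hyper parameter set is scored by its mean held-out error, and only the selected set is used for the final fit on all rows, which mirrors the select-then-retrain ordering of the mapped limitations.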

As per claim 2, Bilenko-Achin-Breckenridge disclose the system of claim 1, wherein the model training specification specifies a machine learning algorithm for training predictive models (Bilenko: Para. [0058] discloses receiving an indication that a hyper-parameter for a computer-executable learning algorithm is to be tuned [i.e., receiving a model training specification] for training predictive models, and Achin: Para. [0065] discloses using a machine executable template to specify an ML algorithm for the training models.).

As per claim 3, Bilenko-Achin-Breckenridge disclose the system of claim 1, wherein the model training specification specifies a hyper-parameter search space for training predictive models (Bilenko: Paras. [0007]-[0009] disclose receiving an indication that a desired hyper-parameter configuration [i.e., hyper-parameter search space] for a computer-executable learning algorithm is to be tuned for training predictive models.).

	As per claim 10, Bilenko-Achin-Breckenridge disclose the system of claim 1, wherein the instructions are configured to initiate training of the first plurality of predictive models (210) and second plurality of predictive models (212) in parallel (Bilenko: Paras. [0030], [0052]-[0054] disclose initiating training of multiple pairs of predictive models, in parallel, across numerous computing devices.).

As per claims 11-13 and 20, the claims recite limitations analogous to those of claims 1-3 and 10 above, and are therefore rejected on the same premise.
 


	
Claims 4, 5, 7-9, 14, 15, 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Bilenko in view of Achin in view of Breckenridge in further view of Brueckner et al., hereinafter referred to as Brueckner (US 2016/0078361 A1).
  
	As per claim 4, Bilenko-Achin-Breckenridge disclose the system of claim 1 (Bilenko: Abstract.).
	However, Bilenko-Achin-Breckenridge do not explicitly disclose “… wherein the model training specification specifies a data source for training predictive models.”
	Further, Brueckner is in the same field of endeavor and teaches wherein the model training specification specifies a data source for training predictive models (Brueckner: Paras. [0055] & [0064] disclose that the template provided by a client specifies a data source 150 for training predictive models.).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Bilenko-Achin-Breckenridge and Brueckner before him or her, to modify the predictive model system of Bilenko-Achin-Breckenridge to include the specifying data source feature as described in Brueckner. The motivation for doing so would have been to improve the quality of machine learning results using additional parameters that can configure learning systems to provide a more tailored model desired by clients of a machine learning service.

As per claim 5, Bilenko-Achin-Breckenridge disclose the system of claim 1, wherein the instructions are executable by the device processor to:
initiate fetching of raw training data 
initiate merging and preprocessing of the fetched raw training data (Breckenridge: Fig. 5 & Paras. [0030], [0042] disclose merging older training data with new training data. The training data is fetched from data repository 214.).
However, Bilenko-Achin-Breckenridge do not explicitly disclose “… raw training data from a plurality of data sources.”
Further, Brueckner teaches raw training data from a plurality of data sources (Brueckner: Paras. [0130]-[0131], [0147] disclose retrieving data sets that are not necessarily formatted [i.e., raw training data], to which the MLS may then assign default variables, and retrieving the raw data from a plurality of data sources.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Bilenko-Achin-Breckenridge and Brueckner before him or her, to modify the predictive model system of Bilenko-Achin-Breckenridge to include the plurality of data sources feature as described in Brueckner. The motivation for doing so would have been to improve the quality of machine learning results using various data types and sets that can configure learning systems to provide a more tailored model desired by clients of a machine learning service. 

As per claim 7, Bilenko-Achin-Breckenridge-Brueckner disclose the system of claim 5, wherein the instructions are executable by the device processor to:
initiate cleaning of the merged and preprocessed fetched raw training data (Brueckner: Paras. [0064], [0131], [0226] disclose performing cleansing of the combined files.).

As per claim 8, Bilenko-Achin-Breckenridge-Brueckner disclose the system of claim 5, wherein the instructions are executable by the device processor to:
initiate storing of the merged and preprocessed fetched raw training data in a caching layer (Achin: Para. [0140] discloses that, when a template T is invoked on a dataset sample S, the template checks the storage structure to determine whether the results of executing that template on that dataset sample are already stored via cache, and Brueckner: Paras. [0131], [0147] disclose storing the combined and preprocessed files [i.e., merged and preprocessed fetched raw training data].).
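
The caching behavior relied upon for claim 8 (checking whether the result of running a template on a dataset sample is already stored before recomputing it) can be sketched as a keyed lookup. This is a minimal illustration under stated assumptions: the `PreprocessCache` class, the hash-based key, and the in-memory dictionary are hypothetical and are not taken from Achin or Brueckner.

```python
import hashlib
import json


class PreprocessCache:
    """Assumed in-memory caching layer keyed by (template id, dataset sample)."""

    def __init__(self):
        self._store = {}
        self.misses = 0  # counts how often the work actually had to be done

    @staticmethod
    def _key(template_id, sample):
        # Serialize the pair deterministically, then hash it into a fixed-size key.
        payload = json.dumps([template_id, sample], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_compute(self, template_id, sample, compute):
        """Return the stored result if present; otherwise compute and store it."""
        key = self._key(template_id, sample)
        if key not in self._store:
            self.misses += 1
            self._store[key] = compute(sample)
        return self._store[key]
```

A second call with the same template identifier and sample returns the stored result without invoking `compute` again, while a different template identifier on the same sample is treated as a separate entry.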

As per claim 9, Bilenko-Achin-Breckenridge-Brueckner disclose the system of claim 5, wherein the distributed training systems comprise software containers that are configured based on the received model training specification (Achin: Fig. 3 & Paras. [0089], [0092], [0094] disclose the multiple processing nodes create software wrappers or containers based on the executable template.).

As per claims 14, 15, and 17-19, the claims recite limitations analogous to those of claims 4, 5, and 7-9 above, and are therefore rejected on the same premise.



Allowable Subject Matter
Claims 6 & 16 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and can be viewed in the list of cited references.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PEET DHILLON, whose telephone number is (571) 270-5647. The examiner can normally be reached M-F, 5am-1:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sath V. Perungavoor, can be reached at 571-272-7455. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/PEET DHILLON/
Primary Examiner, Art Unit 2488
Date: 05-30-2021