Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This non-final office action is responsive to the U.S. patent application no. 16/529,223 filed on August 1, 2019. 
Claims 1-20 are pending.
Claims 1-20 are rejected.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on January 4, 2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement has been considered by the examiner.
Drawings
The drawings are objected to under 37 CFR 1.83(a).  The drawings must show every feature of the invention specified in the claims.  Therefore, the claimed features must be shown or the feature(s) canceled from the claim(s).  No new matter should be entered.
The figures as filed on August 1, 2019 do not show “sets of machine learning models” or “first supersets of the machine learning models” as found in claim 1, or “second supersets of machine learning models” as found in claim 9.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows: 
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 10-15 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  
Claim 10 recites “a machine readable storage device”, which, when given its broadest reasonable interpretation, could be directed to carrier waves that are considered non-statutory under 35 U.S.C. 101. See MPEP 2106.II.
Claims 11-15 depend from claim 10, but fail to further limit the claimed invention to statutory subject matter.  Therefore, claims 11-15 inherit the 35 U.S.C. 101 issue of the independent claim.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1-20 are rejected under 35 U.S.C. 112(a) as failing to comply with the enablement requirement.  The claim(s) contains subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, to make and/or use the invention.
Claim 1. A computer implemented method of training distributed sets of machine learning models, the method comprising: 
training each of the distributed machine learning models on different subsets of a set of training data; 
performing a first layer model synchronization operation in a first layer for each set of machine learning models (the application specification did not disclose what process steps are involved in a first layer model synchronization operation, neither does it disclose how said first layer is created or the structure of said first layer model), wherein each model synchronization operation in the first layer generates first updates for each of the machine learning models in each respective set (the application specification did not disclose the process steps involved in generating first updates, how said first updates for a model are obtained, and what said first updates entail); 
updating the machine learning models based on the first updates (the application specification did not disclose how a machine learning model is updated, for instance, which parameter(s) are updated and how such parameters are identified for update); 
performing a second layer model synchronization operation in a second layer for first supersets of the machine learning models (the application specification did not disclose what process steps are involved in a second layer model synchronization operation, neither does it disclose how said second layer is created and the structure of said second layer model, or what first supersets of the machine learning models are and how they are formed), wherein each first superset comprises different sets of machine learning models (this still does not enable one of ordinary skill in the art to create the first supersets in ways the inventors may have intended, as the specification does not disclose what criteria are used to group different sets of machine learning models into a superset), and wherein each model synchronization in the second layer generates second updates for each of the machine learning models in each respective first superset (the application specification did not disclose the process steps involved in generating second updates, how said second updates for a model are computed, and what said second updates entail); and 
updating each of the machine learning models in the first supersets based on the second updates such that each machine learning model in a respective first superset is the same (the application specification does not disclose what the first supersets comprise, neither does it disclose what is included in said second updates, therefore it does not enable one of ordinary skill in the art to know how to update each of the machine learning models in the first supersets).
2. The method of claim 1 wherein the first and second model synchronization operations each comprise combining gradients of loss from the respective sets and supersets of machine learning models (several issues are present in this claim. First, as explained in the rejection rationale for claim 1 above, the specification does not enable one to know how to create supersets of machine learning models using the criteria the inventors had intended. Second, in neural network training, the concepts of gradient descent and loss function are known; however, “gradients of loss” is not a commonly acknowledged concept, particularly when it is used in the context of “gradients of loss” for sets and supersets of machine learning models. More specific algorithms or formulas for computing said gradients of loss have to be disclosed in the specification in order to enable one of ordinary skill in the art to make or use the claimed invention).
3. The method of claim 2 wherein the first and second model synchronization operations each further comprise computing a mean of the computed gradients and wherein the models in respective sets and supersets are updated based on the mean of the computed gradients (as the application specification lacks detailed disclosure on how the gradients are computed, one of ordinary skill in the art would not know either how to compute a mean of the gradients, or how to update the sets and supersets using the mean.  For instance, one would not know how to update the parameters/coefficients in a neural network model using the mean).
4. The method of claim 1 wherein the second layer model synchronization operation obtains a copy of parameters of the model (the model lacks antecedent basis) that is updated based on the first updates from a processor having a lowest latency connection to the second layer (it is unclear how a processor is related to either the model or the parameters; it is also unclear whether the second layer is a second layer of models or a second layer of processors).
5. The method of claim 1 wherein the first layer model synchronization operation occurs more frequently than second layer model synchronization operations (it is unclear whether this is an inherent outcome of the instant invention, or a criterion by design).
6. The method of claim 1 wherein model training and synchronization is performed on distributed processors, wherein communication connections between processors in the first layer are faster than communication connections between the first and second layers.
7. The method of claim 6 wherein the second layer synchronization operation generates the second updates based on communication with one machine learning model from each set of machine learning models (it remains unclear what is included in said second updates. One skilled in the art is still unable to know how to generate the second updates).
8. The method of claim 7 wherein the one machine learning model from each subset has the lowest communication latency to the second layer (it is unclear what the inventor considers as communication latency and how it is determined in the context of the instant invention).
9. The method of claim 1 and further comprising: performing a third layer model synchronization operation in a third layer for second supersets of machine learning models (the application specification did not disclose what process steps are involved in a third layer model synchronization operation, neither does it disclose how said third layer is created or the structure of said third layer model), wherein each model synchronization in the third layer generates third updates for each of the machine learning models in each respective second superset (the application specification did not disclose the process steps involved in generating third updates, how said third updates for a model are computed, and what said third updates entail); and 
updating each of the machine learning models based on the third updates such that each machine learning model is the same (the application specification does not disclose what is included in said third updates, therefore it does not enable one of ordinary skill in the art to know how to update each of the machine learning models based on the third updates), wherein the first layer model synchronization operation occurs more frequently than second layer model synchronization operations and the second layer model synchronization operations occur more frequently than the third layer model synchronization operations (it is unclear whether this is an inherent outcome of the instant invention, or a criterion by design).
Claims 10-20 recite subject matter that may be found in one or more of claims 1-9, therefore they are rejected based on the same rationale.
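For illustration only, one possible reading of the layered synchronization recited in claims 1-3 can be sketched in Python as follows. The sketch is an assumption made for clarity: it is not taken from the application specification or from any cited reference, and the function names, the grouping of sets into a superset, the learning rate, and the sample gradient values are all hypothetical.

```python
from statistics import fmean


def mean_gradient(grads):
    """Element-wise mean of equally sized gradient vectors."""
    return [fmean(column) for column in zip(*grads)]


def apply_update(params, grad, lr):
    """One plain gradient-descent step: p <- p - lr * g."""
    return [p - lr * g for p, g in zip(params, grad)]


def sync_first_layer(sets_of_grads):
    """First layer (assumed reading): average gradients inside each set."""
    return [mean_gradient(grads) for grads in sets_of_grads]


def sync_second_layer(set_means, supersets):
    """Second layer (assumed reading): average the per-set means
    across each superset. `supersets` lists set indices per superset;
    this grouping is hypothetical."""
    return [
        mean_gradient([set_means[i] for i in members])
        for members in supersets
    ]


# Hypothetical data: two sets of two models, two parameters per model.
sets_of_grads = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
set_means = sync_first_layer(sets_of_grads)
superset_means = sync_second_layer(set_means, [[0, 1]])
# Every model in the superset receives the same update, so models that
# start from identical parameters remain identical afterwards.
new_params = apply_update([0.0, 0.0], superset_means[0], lr=0.5)
```

Under this reading, averaging the per-set means equals the mean over all models only when the sets are of equal size; unequal sets would require a weighted mean.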
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor regards as the invention.
Claim 1. A computer implemented method of training distributed sets of machine learning models, the method comprising: 
training each of the distributed machine learning models on different subsets of a set of training data; 
performing a first layer model synchronization operation in a first layer for each set of machine learning models (it is unclear what components said first layer comprises, i.e. is it a first layer of machine learning models or a first layer of processors; it is also unclear how each set of machine learning models is related to the “each of the distributed machine learning models” in the previous clause), wherein each model synchronization operation in the first layer generates first updates for each of the machine learning models in each respective set; 
updating the machine learning models based on the first updates (it is unclear how “the machine learning models” in this clause is related to “each set of machine learning models” and “each of the distributed machine learning models” in the previous two clauses); 
performing a second layer model synchronization operation in a second layer for first supersets of the machine learning models, wherein each first superset comprises different sets of machine learning models (it is unclear how the first supersets of the machine learning models are related to those machine learning models recited in the previous three clauses), and wherein each model synchronization in the second layer generates second updates for each of the machine learning models in each respective first superset (the application specification did not disclose the process steps involved in generating second updates, how said second updates for a model are computed, and what said second updates entail); and 
updating each of the machine learning models in the first supersets based on the second updates such that each machine learning model in a respective first superset is the same.
2. The method of claim 1 wherein the first and second model synchronization operations each comprise combining gradients of loss from the respective sets and supersets of machine learning models (it is unclear what “gradients of loss” are and what combining the gradients of loss entails, resulting in indefinite claim scope).
3. The method of claim 2 wherein the first and second model synchronization operations each further comprise computing a mean of the computed gradients and wherein the models in respective sets and supersets are updated based on the mean of the computed gradients (said “the computed gradients” lacks antecedent basis).
4. The method of claim 1 wherein the second layer model synchronization operation obtains a copy of parameters of the model (the model lacks antecedent basis) that is updated based on the first updates from a processor having a lowest latency connection to the second layer (it is unclear how a processor is related to either the model or the parameters; it is also unclear whether the second layer is a second layer of models or a second layer of processors).
5. The method of claim 1 wherein the first layer model synchronization operation occurs more frequently than second layer model synchronization operations (“more frequently” is a relative term that results in the scope of the claim element being indefinite).
6. The method of claim 1 wherein model training and synchronization is performed on distributed processors, wherein communication connections between processors in the first layer are faster than communication connections between the first and second layers (it is unclear what is considered as “faster” communication by inventors/applicants).
7. The method of claim 6 wherein the second layer synchronization operation generates the second updates based on communication with one machine learning model from each set of machine learning models (it remains unclear what is included in said second updates. One skilled in the art is still unable to know how to generate the second updates).
8. The method of claim 7 wherein the one machine learning model from each subset has the lowest communication latency to the second layer (it is unclear what the inventor considers as communication latency and how it is determined in the context of the instant invention).
9. The method of claim 1 and further comprising: performing a third layer model synchronization operation in a third layer for second supersets of machine learning models (it is unclear how said second supersets of machine learning models are related to the machine learning models in the previous claims), wherein each model synchronization in the third layer generates third updates for each of the machine learning models in each respective second superset; and 
updating each of the machine learning models based on the third updates such that each machine learning model is the same, wherein the first layer model synchronization operation occurs more frequently than second layer model synchronization operations and the second layer model synchronization operations occur more frequently than the third layer model synchronization operations (it is unclear whether this is an inherent outcome of the instant invention, or a criterion by design).
Claims 10-20 recite subject matter that may be found in one or more of claims 1-9, therefore they are rejected based on the same rationale.
Related Prior Art
The cited prior art, Wang et al. (US 2015/0019214), discloses a method and device for parallel training of a deep neural network (DNN) model using layers of training modules (Wang, Abstract and Fig. 4).  However, in view of the pending claims’ issues under 35 U.S.C. 112, applying the reference to the claims in their current state would introduce a significant amount of confusion and debate with regard to how the reference’s teaching could be properly mapped to the claimed invention. Therefore, Examiner has decided to hold the reference in abeyance.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIRLEY X ZHANG whose telephone number is (571)270-5012.  The examiner can normally be reached on 8:30am - 5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Trost, can be reached on 571-272-7872.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHIRLEY X ZHANG/Primary Examiner, Art Unit 2442
5/26/2022