DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
EXAMINER’S AMENDMENT
2.	An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
          Authorization for this examiner’s amendment was given in a telephone interview with Michael Walters on 3/18/2022.
IN THE CLAIMS:
1. (Currently Amended) A computing system for jointly learning compact models, the computing system comprising: 
one or more processors; and 
one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising: 
obtaining a pre-trained machine-learned model associated with a user; 
obtaining one or more user selections associated with the user; 
generating a training pipeline comprising one or more user-created schema based on the one or more user selections; 
generating a compact model, wherein the compact model comprises an inference speed parameter defined by the user; and 
jointly training the compact model and the pre-trained machine-learned model on a set of training data, wherein the compact model has a smaller model size relative to the pre-trained machine-learned model, wherein jointly training the compact model and the pre-trained machine-learned model comprises backpropagating a gradient of a loss function, wherein the gradient is based on a first comparison between an output of the compact model for an input and a ground truth output and a second comparison between the output of the compact model and an output of the pre-trained machine-learned model, wherein jointly training reduces a number of bits used for one or more model weights and one or more activations, and wherein jointly training comprises: 
validating the one or more user-created schema of a training pipeline; and 
training the compact model and the pre-trained machine-learned model based at least in part on the user-created schema. 
2. (Previously Presented) The computing system of claim 1, wherein jointly training the compact model and the pre-trained machine-learned model comprises backpropagating the gradient of the loss function, wherein the loss function comprises a combined loss function that comprises: 
a first loss term that describes a trainer prediction error; 
a second loss term that describes a student simulation error; and 
a third loss term that describes a student prediction error. 
3. (Previously Presented) The computing system of claim 2, wherein the first loss term describes a distance between the output of the pre-trained machine-learned model and a ground truth. 
4. (Previously Presented) The computing system of claim 2, wherein the second loss term describes a distance between the output of the pre-trained machine-learned model and the output of the compact model. 
5. (Canceled) 
6. (Previously Presented) The computing system of claim 2, wherein the combined loss function comprises first, second, and third weighting values that respectively weight the first, second, and third loss terms relative to each other. 
7. (Previously Presented) The computing system of claim 1, wherein the operations further comprise: 
receiving a user input that selects one of a number of available compact model architectures; 
wherein generating the compact model comprises generating the compact model that has the compact model architecture selected by the user input. 

receiving a user input that specifies one or both of a desired model size or a desired inference speed; 
wherein generating the compact model comprises generating the compact model that exhibits one or both of the desired model size or the desired inference speed. 
9. (Previously Presented) The computing system of claim 1, wherein the operations further comprise: 
receiving a user input that specifies a type of input data; 
wherein generating the compact model comprises generating the compact model that is configured to receive and process the type of input data specified by the user input. 
10. (Previously Presented) The computing system of claim 1, wherein the operations further comprise: 
receiving a user input that specifies a prediction task; 
wherein generating the compact model comprises generating the compact model that is configured to perform the prediction task. 
11. (Previously Presented) The computing system of claim 1, wherein: 
generating the compact model comprises generating a plurality of compact models that have different architectures; and 
jointly training the compact model and the pre-trained machine-learned model comprises jointly training the plurality of compact models and the pre-trained machine-learned model. 
.
13. (Previously Presented) The computing system of claim 1, wherein the compact model comprises a projection neural network. 
14. (Canceled) 
15. (Canceled) 
16. (Previously Presented) The computing system of claim 1, wherein the pre-trained machine-learned model comprises a checkpoint from a production model deployed for cloud-based inference. 
17. (Previously Presented) The computing system of claim 1, wherein the pre-trained machine-learned model is selected by the user from a set of provided pre-trained models. 
18. (Currently Amended) A computer-implemented method, comprising: 
obtaining, by one or more computing devices, a pre-trained machine-learned model associated with a user; 
receiving, by the one or more computing devices, user input that describes one or more desired model characteristics, the one or more desired model characteristics comprising one or more of: a desired model size, a desired inference speed, a type of input data, or a prediction type; 
generating, by the one or more computing devices, a training pipeline comprising one or more user-created schema based on the user input, wherein the user input comprises one or more selections associated with the user; 
automatically generating, by the one or more computing devices, a compact model specification based at least in part on the user input, wherein the compact model comprises an inference speed parameter defined by the user; and jointly training, by the one or more computing devices, a compact model that has the compact model specification and the pre-trained machine-learned model on a set of training data, wherein the compact model has a smaller model size relative to the pre-trained machine-learned model, and wherein the compact model exhibits the one or more desired model characteristics, wherein jointly training the compact model and the pre-trained machine-learned model comprises backpropagating a gradient of a loss function, wherein the gradient is based on a first comparison between an output of the compact model for an input and a ground truth output and a second comparison between the output of the compact model and an output of the pre-trained machine-learned model, wherein jointly training reduces a number of bits used for one or more model weights and one or more activations, and wherein jointly training comprises: 
validating the one or more user-created schema of a training pipeline; and 
training the compact model and the pre-trained machine-learned model based at least in part on the user-created schema. 
19. (Original) The computer-implemented method of claim 18, wherein jointly training, by the one or more computing devices, the compact model that has the compact model specification and the pre-trained machine-learned model comprises backpropagating, by the one or more computing devices, a gradient of a combined loss function that comprises: 
a first loss term that describes a trainer prediction error exhibited between an output of the pre-trained machine-learned model and a ground truth; 

a third loss term that describes a student prediction error exhibited between the output of the compact model and the ground truth. 
20. (Currently Amended) One or more non-transitory computer-readable media that collectively store instructions that, when executed by one or more processors, cause the one or more processors to perform operations, the operations comprising: 
obtaining a pre-trained machine-learned model associated with a user; 
receiving user input that describes one or more desired model characteristics, the one or more desired model characteristics comprising one or more of: a desired model size, a desired inference speed, a type of input data, or a prediction type;
generating, by the one or more computing devices, a training pipeline comprising one or more user-created schema based on the user input, wherein the user input comprises one or more selections associated with the user;
automatically generating a compact model specification based at least in part on the user input, wherein the compact model comprises an inference speed parameter defined by the user; and
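jointly training a compact model that has the compact model specification and the pre-trained machine-learned model on a set of training data, wherein the compact model has a smaller model size relative to the pre-trained machine-learned model, and wherein the compact model exhibits the one or more desired model characteristics, wherein jointly training the compact model and the pre-trained machine-learned model comprises backpropagating a gradient of a loss function, wherein the gradient is based on a first comparison between an output of the compact model for an input and a ground truth output and a second comparison between the output of the compact model and an output of the pre-trained machine-learned model, wherein jointly training reduces a number of bits used for one or more model weights and one or more activations, and wherein jointly training comprises: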
validating the one or more user-created schema of a training pipeline; and
training the compact model and the pre-trained machine-learned model based at least in part on the user-created schema.   
21. (Previously Presented) The computing system of claim 1, wherein the operations further comprise: 
determining one or more particular weights of the compact model that are low-scoring weights; and 
removing the one or more particular weights of the compact model. 
22. (Canceled) 
23. (Canceled)
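
For illustration only, the combined loss function recited in claims 2 – 4, 6 and 19 can be sketched as a weighted sum of three terms. The sketch below is not taken from the application: the function names, the use of squared Euclidean distance as the "distance", and the default weighting values are all assumptions chosen to make the example concrete.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (an assumed choice of distance measure)."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def combined_loss(
    trainer_out: Sequence[float],
    student_out: Sequence[float],
    ground_truth: Sequence[float],
    w_trainer: float = 1.0,   # first weighting value (hypothetical default)
    w_sim: float = 1.0,       # second weighting value (hypothetical default)
    w_student: float = 1.0,   # third weighting value (hypothetical default)
) -> float:
    """Weighted sum of the three loss terms named in the claims."""
    # First term: trainer prediction error (pre-trained model vs. ground truth).
    trainer_pred_err = squared_distance(trainer_out, ground_truth)
    # Second term: student simulation error (pre-trained model vs. compact model).
    student_sim_err = squared_distance(trainer_out, student_out)
    # Third term: student prediction error (compact model vs. ground truth).
    student_pred_err = squared_distance(student_out, ground_truth)
    return (
        w_trainer * trainer_pred_err
        + w_sim * student_sim_err
        + w_student * student_pred_err
    )


if __name__ == "__main__":
    trainer = [1.0, 0.0]   # output of the pre-trained model
    student = [0.5, 0.5]   # output of the compact model
    truth = [1.0, 0.0]     # ground truth output
    print(combined_loss(trainer, student, truth))
```

In a joint-training setting, a gradient of a scalar of this kind would be backpropagated through both models; the sketch only evaluates the loss and does not perform training.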

Allowable Subject Matter
3.	Claims 1 – 4, 6 – 13 and 16 – 21 are allowed.
4.	The following is an examiner’s statement of reasons for allowance: 
          Regarding claim 1, the closest prior art is Kang et al. (Foreign Publication EP 3 144 859 A2), Li et al. (U.S. Publication 2016/0078339) and Hwang et al. (U.S. Publication 2018/0060722). Kang teaches a computing system for jointly learning compact models, the computing system comprising: one or more processors; and one or more non-
	However, the art of record does not teach, nor render obvious, a computing system for jointly learning compact models, the computing system comprising: one or more processors; and one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising: obtaining a pre-trained machine-learned model associated with a user; obtaining one or more user selections associated with the user; generating a training pipeline comprising one or more user-created schema based on the one or more user selections; generating a compact model, wherein the compact model comprises an inference speed parameter defined by the user; and jointly training the compact model and the pre-trained machine-learned model on a set of training data, wherein the compact model has a smaller model size relative to the pre-trained machine-learned model, wherein jointly training the compact model and the pre-trained machine-learned model comprises backpropagating a gradient of a loss function, wherein the gradient is based on a first comparison between an output of the compact model for an input and a ground truth output and a second comparison between the output of the compact model and an output of the pre-trained machine-learned model, wherein jointly training reduces a number of bits used for one or more model weights and one or more activations, and wherein jointly training comprises: validating the one or more user-created schema of a training pipeline; and training the compact model and the pre-trained machine-learned model based at least in part on the user-created schema. 
Claims 18 and 20 are variants of claim 1 and are allowed for at least the reasons of claim 1, as are claims 2 – 4, 6 – 13, 16, 17 and 21, which depend from claim 1, and claim 19, which depends from claim 18.
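
As a purely illustrative aside, the limitation that joint training "reduces a number of bits used for one or more model weights and one or more activations" can be pictured with a simple uniform quantizer. The claims do not specify a quantization scheme, so everything below (symmetric uniform quantization, the function names, the 8-bit default) is an assumption, and the sketch quantizes a fixed list of values rather than performing any training.

```python
from typing import List, Sequence, Tuple


def quantize(values: Sequence[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed integer codes of num_bits bits (assumed scheme)."""
    if num_bits < 2:
        raise ValueError("num_bits must be at least 2")
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        # All values are zero (or the list is empty); any scale works.
        return [0] * len(values), 1.0
    levels = 2 ** (num_bits - 1) - 1      # e.g. 127 for 8 bits
    scale = max_abs / levels              # float value of one integer step
    codes = [round(v * levels / max_abs) for v in values]
    return codes, scale


def dequantize(codes: Sequence[int], scale: float) -> List[float]:
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [1.0, -1.0, 0.0, 0.25]      # hypothetical model weights
    codes, scale = quantize(weights, num_bits=8)
    print(codes)
    print(dequantize(codes, scale))
```

Each code fits in 8 bits instead of the 32 or 64 bits of a floating-point value, at the cost of a rounding error of at most half a step per value.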
          Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion
5.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM C WOOD whose telephone number is (571)272-5285.  The examiner can normally be reached on Monday - Friday, 8:00 am - 4:30 pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chat C Do, can be reached on 571-272-3721. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/WILLIAM C WOOD/
Examiner, Art Unit 2193

/Chat C Do/
Supervisory Patent Examiner, Art Unit 2193