Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 5, 6 and 8-12 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Pat. Pub. No. 2019/0138901 to Meyer et al. (hereinafter Meyer).
Per claim 5, Meyer discloses a method (fig. 1) of compressing a pre-trained deep neural network model (fig. 1 and ¶16…the artificial neural networks (ANNs) used are DNNs: “the ANNs referred to herein are deep neural networks (DNNs)”; ¶39…method 100 involves compression of DNNs: “the methods and systems described herein may also be used for DNN compression…”; fig. 1:110-112 and ¶25…each subsequent iteration of compression method 100 has a baseline ANN that has been trained and tested 110, hence a pre-trained ANN is used in subsequent iterations of method 100: “the current performance baseline is optionally updated based on the candidate ANN if the actual performance characteristics of the candidate ANN do exceed the current performance baseline, then the performance baseline is updated to include the candidate ANN”), the deep neural network model comprising a plurality of layers (DNNs, by definition, have at least two hidden layers; ¶30…example DNN with two hidden layers; ¶3…hyperparameters specifying the DNN include the number of hidden layers), each of the plurality of layers comprising at least one node (¶30…example DNN with two layers with 20 nodes for each layer; ¶3…hyperparameters specifying the DNN include the number of nodes per layer) representing a weight (¶3, 39…weights correspond to each node specified in the hyperparameters for the DNN, i.e., weight sparsification relates to removal of extraneous nodes), the method comprising:
inputting the pre-trained deep neural network model as a candidate model (figs. 1:102-104 and ¶20-21…selecting a candidate set of hyperparameters for a “candidate ANN” that is compared to a baseline ANN, the candidate ANN construed as the candidate model; fig. 1:110-112 and ¶25…baseline ANN being compared can be a previous iteration ANN, which can be supplanted by the current iteration candidate ANN if the latter has better performance than the former…thus baseline ANN can also be construed as the candidate model);
compressing the candidate model by removing at least one node or layer to form a compressed model (¶39…“methods and systems described herein may be used for DNN compression” by pruning: “ANN weight sparsification and removal of extraneous node connections”); and
utilizing the compressed model for inference utilizing a set of unknown data (fig. 1:116 and ¶7, 27…final identified suitable DNN applied to specific/given application; ¶14…example utilization of identified model in autonomous vehicle localization and control where unknown vehicle environment data input in the final identified suitable DNN).
Per claim 6, Meyer discloses claim 5, further disclosing increasing sparsity of the candidate model by removing at least one node from at least one layer before utilizing the compressed model for inference (¶39…“ANN weight sparsification and removal of extraneous node connections”).
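For illustration only, the kind of node removal recited in claim 6 can be sketched as follows. This sketch is not taken from Meyer or from the instant application: the function name prune_nodes, the keep_ratio parameter, and the L1-norm ranking criterion are all assumptions chosen for the example, and each row of the weight matrix is assumed to hold the incoming weights of one node. In a full network, the matching column of the following layer would also have to be removed.

```python
from typing import List


def prune_nodes(weights: List[List[float]], keep_ratio: float) -> List[List[float]]:
    """Drop the nodes (rows) with the smallest L1 norm, keeping at least one.

    Hypothetical helper: the L1-norm criterion is an assumption, not Meyer's method.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, int(len(weights) * keep_ratio))
    # Rank node indices by the L1 norm of their incoming weights, largest first.
    ranked = sorted(
        range(len(weights)),
        key=lambda i: sum(abs(w) for w in weights[i]),
        reverse=True,
    )
    kept = sorted(ranked[:n_keep])  # restore the original node order
    return [weights[i] for i in kept]


# A toy layer with four nodes; two of them carry near-zero weights.
layer = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4], [-0.03, 0.0]]
pruned = prune_nodes(layer, keep_ratio=0.5)
print(pruned)
```

Here the two low-magnitude nodes are removed and the two remaining nodes keep their original order.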
Per claim 8, Meyer discloses claim 5, further disclosing quantizing all remaining weights into fixed-point representation to form the compressed model before utilizing the compressed model for inference (¶39…“ANN weight quantization including, but not limited to, per-layer fixed-point quantization, weight binarization, and weight ternarization”).
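As a minimal sketch of what quantizing weights into a fixed-point representation can look like, the example below rounds each weight to a signed fixed-point grid and saturates values that fall outside the representable range. It is offered under stated assumptions and is not Meyer's implementation: the function name quantize_fixed_point and the default 8-bit word with 4 fractional bits are hypothetical choices made for the example.

```python
from typing import List


def quantize_fixed_point(
    weights: List[float], total_bits: int = 8, frac_bits: int = 4
) -> List[float]:
    """Round weights to a signed fixed-point grid and return the dequantized values.

    Hypothetical helper: the bit widths are assumptions chosen for illustration.
    """
    scale = 1 << frac_bits                # number of grid steps per unit
    q_min = -(1 << (total_bits - 1))      # smallest representable integer code
    q_max = (1 << (total_bits - 1)) - 1   # largest representable integer code
    out = []
    for w in weights:
        code = round(w * scale)               # nearest integer code
        code = max(q_min, min(q_max, code))   # saturate out-of-range values
        out.append(code / scale)
    return out


quantized = quantize_fixed_point([0.30, -1.26, 9.0])
print(quantized)
```

With these assumed bit widths the step size is 1/16, so 0.30 maps to 0.3125, -1.26 maps to -1.25, and 9.0 saturates at 7.9375.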
Per claim 9, Meyer discloses claim 5, further disclosing determining accuracy of the compressed model utilizing an end-user training and validation data set (fig. 1:110 and ¶24…“candidate ANN is trained with corpus of data and tested to obtain actual performance characteristics”).
Per claim 10, Meyer discloses claim 9, further disclosing repeating compressing the candidate model utilizing the respective compressed model as the candidate model when the accuracy improves before utilizing the compressed model for inference (fig. 1:110-112 and ¶25-26…iterating/repeating method 100 based on performance characteristic comparison, e.g., accuracy improvements relative to the baseline, and updating the current baseline model with the candidate ANN model when the candidate ANN model performs better than the current baseline model).
Per claim 11, Meyer discloses claim 9, further disclosing adjusting hyper parameters utilized for compressing the candidate model (¶20…for iterating method 100, each subsequent candidate set of ANN parameters is varied, either by varying only one parameter from a preceding candidate set of ANN parameters or by varying a plurality of parameters) and repeating compressing the candidate model (fig. 1…performing another iteration of method 100) when the accuracy declines before utilizing the compressed model for inference (¶23…when accuracy of the current candidate model is less than the baseline, repeat method 100 for a subsequent candidate ANN with different hyperparameters: “If the candidate ANN does not have performance characteristics which exceed the current performance baseline, the candidate ANN is rejected, and the method 100 returns to step 102 to evaluate a new candidate ANN”).
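The accept-or-adjust iteration discussed for claims 10 and 11 can be sketched roughly as below. This is a toy illustration under stated assumptions, not Meyer's method 100 and not the claimed method: the names iterative_compress, toy_compress and toy_evaluate, the ratio hyperparameter, and the rule for adjusting it are all hypothetical, and the toy accuracy function simply peaks at three retained weights so that the loop has something to converge on.

```python
from typing import Callable, List, Tuple

Model = List[float]


def iterative_compress(
    model: Model,
    compress: Callable[[Model, float], Model],
    evaluate: Callable[[Model], float],
    ratio: float = 0.5,
    max_iters: int = 10,
) -> Tuple[Model, float]:
    """Repeat compression while accuracy improves; otherwise adjust the hyperparameter."""
    best, best_acc = model, evaluate(model)
    for _ in range(max_iters):
        candidate = compress(best, ratio)
        acc = evaluate(candidate)
        if acc > best_acc:
            # Accuracy improved: the compressed model becomes the next candidate.
            best, best_acc = candidate, acc
        else:
            # Accuracy did not improve: reject and compress less aggressively next time.
            ratio = (1.0 + ratio) / 2
    return best, best_acc


def toy_compress(model: Model, ratio: float) -> Model:
    """Keep the largest-magnitude fraction of the weights (stand-in for pruning)."""
    n_keep = max(1, int(len(model) * ratio))
    return sorted(model, key=abs, reverse=True)[:n_keep]


def toy_evaluate(model: Model) -> float:
    """Stand-in accuracy that is highest when exactly three weights remain."""
    return float(-abs(len(model) - 3))


start = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
final_model, final_acc = iterative_compress(start, toy_compress, toy_evaluate)
print(final_model, final_acc)
```

In this toy run the first compression (eight weights to four) is accepted, the next (four to two) is rejected and the ratio is raised, and the following attempt (four to three) is accepted, after which every further candidate is rejected.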
Per claim 12, Meyer discloses claim 9, further disclosing storing the compressed model to a memory (fig. 1:116 and ¶27…identified suitable ANN is stored as part of a collection of candidate ANNs having the most ideal performance characteristics) when the accuracy exceeds an end-user performance metric and target before utilizing the compressed model for inference (fig. 1:106-108…comparing at least one performance characteristic, e.g., accuracy, against a “current performance baseline”; ¶16…“As used herein, a modelling ANN is an ANN that is trained to estimate one or more performance characteristics of a candidate ANN, and may be used for optimizing for one or more performance characteristics, including error (or accuracy) and at least one of computation time, latency, energy efficiency, implementation cost (e.g., time, hardware, power, etc.), computational complexity, and the like”).
Allowable Subject Matter
Claims 1-4 and 14-17 are allowed.
Claim 7 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is the statement of reasons for the indication of allowable subject matter: The prior art disclosed by the applicant and cited by the Examiner fails to teach or suggest, alone or in combination, all the limitations of the independent and intervening claims (claims 1, 5 and 14), as highlighted in Figures 1 and 2 of the instant Drawings.
The most pertinent prior art discovered is US Pat. Pub. No. 2019/0138901 to Meyer, as applied above. However, Meyer does not disclose removing at least one batch normalization layer present in the candidate model before utilizing the compressed model for inference. Pertinent references disclosing this specific aspect of batch normalization layer removal include "Pruning Filters for Efficient Convnets" to Li et al. (Section 4, first paragraph) and US Pat. Pub. No. 2019/0295282 to Smolyanskiy et al. (¶28). However, these references do not appear to teach or suggest several other compression limitations of the instant claims. Additional prior art made of record and not relied upon is considered pertinent to incremental model compression of neural networks.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALAN CHEN whose telephone number is (571) 272-4143. The examiner can normally be reached M-F 10-7.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ALAN CHEN/Primary Examiner, Art Unit 2125