DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 3-8, 10-16 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.


Claim Objections
Claim 1 is objected to because of the following informalities:
In claim 1, line 6, add ~,~ after “stores the compressed deep learning model”.
Appropriate correction is required.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 3-8, 10-16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al. (US2021/0073644) in view of Senn (US2020/0364572) and Wang et al. (CN110119745).
To claim 1, Lin teaches a method for compressing a deep learning model, comprising: 
acquiring a to-be-compressed deep learning model (Fig. 1, 302 of Fig. 3); 
pruning each layer of weights of the to-be-compressed deep learning model in units of channels to obtain a compressed deep learning model (300 of Fig. 3, paragraphs 0047-0048);
wherein the pruning each layer of weights of the to-be-compressed deep learning model in units of channels comprises: 
taking, for each layer of the to-be-compressed deep learning model (paragraph 0075, all layers and/or branches of the trained neural network are processed for compression), the layer of weights, as a first preset number of filters; pruning a second preset number of filters from the first preset number of filters according to an importance of each filter, wherein the second preset number is smaller than the first preset number (paragraph 0006, select certain layers/branches having a threshold number of weights; paragraphs 0048, 0056 determine the complexity of a layer or branch based on a number of weights included in the layer or branch; paragraph 0047, set weights of a filter in a channel to be 0 based on the compression).
But, Lin does not expressly disclose sending the compressed deep learning model to a terminal device, so that the terminal device stores the compressed deep learning model; the first preset number being a value of a dimension of the four-dimensional array and a filter being a three-dimensional array obtained by removing the dimension for the first preset number from the four-dimensional array.
However, Lin does teach filters of different dimensions (paragraph 0061), in view of which a four-dimensional array would have been an obvious implementation.
	Senn teaches a method for deep neural network compression (abstract) comprising: training the DNN, wherein the DNN has at least one layer having a number of filters; clustering the filters of the at least one layer to reduce the number of filters of the at least one layer; applying dimension reduction on the clustered filters of the at least one layer; and retraining the DNN (paragraph 0054), in view of which removing a dimension from a spatial filter, namely the dimension for the first preset number from the four-dimensional array, would have been an obvious implementation.
	Wang teaches sending the compressed deep learning model to a terminal device, so that the terminal device stores the compressed deep learning model (Fig. 1, paragraph 0039).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Senn and Wang into the method of Lin, in order to further deep neural network compression and to deploy the compressed deep learning model on a resource-limited mobile terminal.
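For illustration only, the claimed relationship between the four-dimensional array of layer weights, the first preset number, and the three-dimensional filters may be sketched as follows. The sketch is not drawn from Lin, Senn, or Wang. The axis order (filters, input channels, height, width), the function name prune_filters, and the use of the sum of absolute weights as the importance measure are assumptions made for the example.

```python
import numpy as np

def prune_filters(weights, num_to_prune, importance):
    """Drop the num_to_prune least important filters along axis 0.

    weights is a 4-D array; weights.shape[0] plays the role of the
    "first preset number" and num_to_prune the "second preset number".
    """
    num_filters = weights.shape[0]
    if not 0 <= num_to_prune < num_filters:
        raise ValueError("second preset number must be smaller than the first")
    # Each weights[i] is a 3-D array: the 4-D array with the filter axis removed.
    scores = np.array([importance(weights[i]) for i in range(num_filters)])
    # Keep the highest-scoring filters, preserving their original order.
    keep = np.sort(np.argsort(scores, kind="stable")[num_to_prune:])
    return weights[keep], keep

rng = np.random.default_rng(0)
layer = rng.normal(size=(8, 3, 5, 5))  # assumed layout: (filters, in_channels, height, width)
pruned, kept = prune_filters(layer, 2, lambda f: np.abs(f).sum())
print(layer[0].shape, pruned.shape)
```

Under these assumptions, each filter is a three-dimensional slice of the layer, and pruning two of the eight filters leaves a four-dimensional array whose first dimension is six.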

To claim 8, Lin, Senn and Wang teach an apparatus for compressing a deep learning model (as explained in response to claim 1 above).

To claim 15, Lin, Senn and Wang teach an electronic device (as explained in response to claim 1 above).

To claim 16, Lin, Senn and Wang teach a computer-readable medium, storing a computer program, wherein when the computer program is executed by a processor (as explained in response to claim 1 above).



To claims 3 and 10, Lin, Senn and Wang teach claims 1 and 8.
Lin, Senn and Wang teach wherein the pruning a second preset number of filters from the first preset number of filters comprises: calculating, for each layer of the to-be-compressed deep learning model, an L1-norm of each filter of the layer; and pruning the filters of which the L1 norms are smaller than a preset threshold from the layer (Lin, paragraphs 0072-0073).
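For illustration only, the limitation of calculating an L1-norm of each filter and pruning the filters whose L1 norms are smaller than a preset threshold may be sketched as follows. The sketch is not taken from the cited paragraphs of Lin. The function name l1_threshold_prune, the axis order (filters first), and the example weight values and threshold are assumptions.

```python
import numpy as np

def l1_threshold_prune(weights, threshold):
    """Remove every filter whose L1 norm is smaller than threshold.

    weights is assumed to be a 4-D array with the filter axis first.
    Returns the surviving filters and the L1 norm of every original filter.
    """
    # L1 norm of a filter: the sum of the absolute values of its weights.
    l1_norms = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    keep = l1_norms >= threshold
    return weights[keep], l1_norms

# Four hypothetical filters of shape (2, 3, 3), each filled with a constant.
weights = np.stack([np.full((2, 3, 3), v) for v in (0.01, 0.5, -0.02, 1.0)])
pruned, norms = l1_threshold_prune(weights, threshold=1.0)
print(norms, pruned.shape)
```

With a threshold of 1.0, the two filters filled with 0.01 and -0.02 fall below the threshold and are pruned, while the other two are retained.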

To claims 4 and 11, Lin, Senn and Wang teach claims 1 and 8.
Lin, Senn and Wang teach wherein the pruning each layer of weights of the to-be-compressed deep learning model in units of channels comprises: pruning the to-be-compressed deep learning model layer by layer, and retraining the to-be-compressed deep learning model by using a training sample set each time a layer is pruned (Lin, paragraph 0028, information associated with the nodes is shared among the different layers and each layer retains information as information is processed; paragraphs 0072-0073, input training samples).
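For illustration only, the claimed ordering of pruning the model layer by layer and retraining each time a layer is pruned may be sketched as follows. The sketch is not taken from Lin. The names prune_layer_by_layer, prune_layer, and retrain are hypothetical, and the pruning and retraining steps are placeholders that only record the order of operations.

```python
import numpy as np

def prune_layer_by_layer(layers, prune_layer, retrain):
    """Prune one layer at a time and retrain after each pruning step.

    layers maps a layer name to its weight array; prune_layer and retrain
    are caller-supplied callables (placeholders in this sketch).
    """
    for name in list(layers):
        layers[name] = prune_layer(layers[name])
        retrain(layers)  # retraining happens once per pruned layer
    return layers

retrain_calls = []
model = {"conv1": np.ones((4, 3, 3, 3)), "conv2": np.ones((6, 4, 3, 3))}
result = prune_layer_by_layer(
    model,
    prune_layer=lambda w: w[:-1],  # placeholder: drop the last filter
    retrain=lambda m: retrain_calls.append({k: v.shape[0] for k, v in m.items()}),
)
print(retrain_calls)
```

The placeholder retraining step is invoked once after each layer is pruned, so a two-layer model yields two retraining calls, the first of which sees only the first layer pruned.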

To claims 5 and 12, Lin, Senn and Wang teach claims 4 and 11.
Lin, Senn and Wang teach wherein the pruning the to-be-compressed deep learning model layer by layer comprises: first pruning, for each layer of the to-be-compressed deep learning model, convolutional weights of the layer before batch normalization (paragraphs 0065-0067), and then pruning batch normalization parameters of the layer (Lin, paragraph 0005, pruning certain weights, e.g., set to 0, of a channel of a single-branch or multi-branch neural network).
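For illustration only, the claimed order of first pruning the convolutional weights of a layer and then pruning the batch normalization parameters of that layer may be sketched as follows. The sketch is not taken from the cited references. The function name prune_conv_then_bn, the parameter names gamma, beta, running_mean, and running_var, and the assumption of one batch normalization entry per output filter are hypothetical.

```python
import numpy as np

def prune_conv_then_bn(conv_weights, bn_params, keep):
    """Prune convolution filters first, then the matching batch-norm entries.

    conv_weights: 4-D array with the filter axis first (assumed layout).
    bn_params: dict of 1-D arrays holding one entry per output filter.
    keep: indices of the filters to retain.
    """
    keep = np.asarray(keep)
    # Step 1: prune the convolutional weights.
    pruned_conv = conv_weights[keep]
    # Step 2: prune the batch normalization parameters of the same filters.
    pruned_bn = {name: values[keep] for name, values in bn_params.items()}
    return pruned_conv, pruned_bn

conv = np.arange(4 * 2 * 3 * 3, dtype=float).reshape(4, 2, 3, 3)
bn = {
    "gamma": np.array([1.0, 2.0, 3.0, 4.0]),
    "beta": np.zeros(4),
    "running_mean": np.zeros(4),
    "running_var": np.ones(4),
}
conv_p, bn_p = prune_conv_then_bn(conv, bn, keep=[0, 2])
print(conv_p.shape, bn_p["gamma"])
```

In this sketch the same index set is applied to both the convolutional weights and the batch normalization parameters, so the two remain aligned after pruning.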

To claims 6 and 13, Lin, Senn and Wang teach claims 1 and 8.
Lin, Senn and Wang teach wherein the deep learning model comprises at least one of the following: a head and shoulder detection model, an object detection model, a human detection model, and a target detection model (Lin, paragraphs 0037, 0039, trained neural network for task such as object detection; paragraph 0095, task of compressed model in object detection).

To claims 7 and 14, Lin, Senn and Wang teach claims 6 and 13.
Lin, Senn and Wang teach wherein the deep learning model is a head and shoulder detection model, and the terminal device performs head and shoulder detection by the following steps: acquiring an image of a preset region; and inputting the image to a compressed deep learning model to obtain head and shoulder detection boxes in the image (obvious; although the references do not disclose the specific task of the detection model, a head and shoulder detection model and its related application method are well known in the art, and it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate them as a matter of design preference; hence, Official Notice is taken).


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZHIYU LU whose telephone number is (571)272-2837. The examiner can normally be reached Weekdays: 8:30AM - 5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, EDWARD URBAN can be reached on (571) 272-7899. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

ZHIYU LU
Primary Examiner
Art Unit 2669



/ZHIYU LU/ Primary Examiner, Art Unit 2665
October 17, 2022