The present application, filed on or after 16 March 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION

This office action is in response to Applicant’s submission filed on 1 December 2020.  THIS ACTION IS NON-FINAL.

Status of Claims

Claims 1-20 are pending.
Claims 1-20 are rejected under 35 U.S.C. 101 for being directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.
Claims 1-20 are rejected under 35 U.S.C. 103 as unpatentable.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Judicial Exception
Claims 1-20 of the claimed invention are directed to a judicial exception, an abstract idea, without significantly more. 
 (Independent Claims) With regards to claim 1 / 11 / 20, the claim recites a method / system / product, which falls into one of the statutory categories.
2A – Prong 1: Claim 1 / 11 / 20, in part, recites “determining a number P of weight values to be pruned for each convolution kernel of the target convolution layer based on a number of weight values MxN in the convolution kernel and a target compression ratio, where P is a positive integer smaller than MxN; and setting P weight values with the smallest absolute values in each convolution kernel of the target convolution layer to zero to form a pruned convolution layer” (mental process), which, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components.  That is, other than reciting a computing device, nothing in the claim element precludes the step from practically being performed in the mind.  For example, but for the language about generic computer components, the “determining” and “setting” in the limitation cited above could be performed by a human using pen, paper, or a calculator (e.g., a human statistical data model builder could collect information and adjust model parameters for making statistical inferences); see Appendix 1 to October 2019 Update: Subject Matter Eligibility Life Sciences & Data Processing Examples, Example 43, Step 2A Prong One, p.4, “Note that even if most humans would use a physical aid (e.g., pen and paper, a slide rule, or a calculator) to help them complete the recited calculation, the use of such physical aid does not negate the mental nature of this limitation”.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
2A – Prong 2: This judicial exception is not integrated into a practical application.  In particular, claim 1 / 11 / 20 recites the additional elements: (a) using generic computer elements (like a processor executing a program stored in memory); (b) “obtaining one target convolution layer from the one or more convolution layers in the neural network …” (insignificant extra-solution activity, MPEP 2106.05(g)).  For (a), these computer components are recited at a high level of generality (i.e., as a generic processor performing a generic computer function of determining and setting weight values) such that it amounts to no more than mere instructions to apply the exception using a generic computer component.  For (b), these steps are insignificant extra-solution activity, like mere data gathering, MPEP 2106.05(g).  There are no additional elements showing integration of the abstract idea into a practical application and/or providing anything significantly more than the abstract idea.  Claim 1 / 11 / 20 is directed to an abstract idea.
2B Analysis:  The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the use of one or more generic hardware processors, database(s), and data IO is well-understood, routine, and conventional (WURC) activity and/or insignificant extra-solution activity, and hence does not add anything significant to the abstract idea.  The claim is not patent eligible.
 (Dependent claims) 
Claims 2-10, 19 / 12-18 are dependent on claim 1 / 11 and include all the limitations of claim 1 / 11. Therefore, claims 2-10, 19 / 12-18 recite the same abstract ideas. 
With regards to claims 2-3, 5-10, 19 / 12-13, 15-18, the claims recite further limitations on model handling and do not include additional elements that are sufficient to amount to significantly more than the judicial exception, because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea.  The claims are not patent eligible.
With regards to claims 4 / 14, the claim recites the additional limitation “performing a Hadamard multiplication operation on the mask tensor and the error gradient tensor”, which, as drafted, is a process that, under its broadest reasonable interpretation, covers mathematical concepts but for the recitation of generic computer components.  That is, other than reciting generic computing components, the step of performing a Hadamard multiplication describes mathematical relationships and algorithms.  Mathematical relationships and algorithms have been found by the courts to be abstract ideas, e.g., see MPEP 2106.04(a)(2).  If a claim limitation, under its broadest reasonable interpretation, covers mathematical relationships, then it falls within the “Mathematical Concepts” grouping of abstract ideas.  There are no additional elements showing integration of the abstract idea into a practical application and/or providing anything significantly more than the abstract idea.  The claim is not patent eligible.





Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-3, 5, 7-13, 15, 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over LI et al., US-PGPUB NO.20190050734A1 [hereafter LI] in view of Nurvitadhi, et al., US-PGPUB NO.20190205746A1 [hereafter Nurvitadhi].

With regards to claim 1, LI teaches 
“A method for pruning one or more convolution layers in a neural network, comprising: obtaining one target convolution layer from the one or more convolution layers in the neural network ….. (LI, FIG.2,

); determining a number P of weight values to be pruned for each convolution kernel of the target convolution layer based on a number of weight values MxN in the convolution kernel and a target compression ratio, where P is a positive integer smaller than MxN (LI, FIG.17, [0042], ‘said compression strategy at least comprising: the target compression ratio of each pruning operation within said compression cycle’, [0074], ‘determine the initial densities (or, the initial compression ratio) for each matrix in the neural network’); and setting P weight values with the smallest absolute values …. to zero to form a pruned convolution layer (LI, FIG.17, 

[0014], ‘elements with larger weights represent important connections, while other elements with smaller weights have relatively small impact and can be removed (e.g., set to zero)’); ….”.
LI does not explicitly detail “the target convolution layer comprising C filters each comprising K convolution kernels, and each of the K convolution kernels comprising M rows and N columns of weight values, where C, K, M and N are positive integers greater than or equal to one”, “for each convolution kernel of the target convolution layer based on a number of weight values MxN in the convolution kernel and a target compression ratio, where P is a positive integer smaller than MxN”, and “in each convolution kernel of the target convolution layer”.
However, Nurvitadhi teaches the detailed structure of a convolution layer comprising filters, kernels, rows, and columns of weights (Nurvitadhi, FIG.21A-D, [0194], ‘The nodes in CNN input layer are organized into a set of filters ….the second function can be referred to as the convolution kernel…. The convolution kernel can be a multidimensional array of parameters, where the parameters are adapted by the training process for the neural network’,

).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of LI and Nurvitadhi before him or her, to modify the neural network pruning method and system of LI to include the detailed structure of a convolutional neural network as shown in Nurvitadhi.
The motivation for doing so would have been to facilitate processing of sparse matrices for convolutional neural networks (Nurvitadhi, Abstract).
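
For illustration only, the limitation addressed above (determining P from the MxN weight count and a target compression ratio, then zeroing the P smallest-magnitude weights in each kernel) can be sketched in Python. This sketch is not taken from LI or Nurvitadhi. The function names are hypothetical, and the mapping from compression ratio to P is an assumption: the ratio is treated as the fraction of weights retained, in line with LI's equating of “densities” and “compression ratio” at [0074], and P is clamped to the range 1 to MxN-1.

```python
def num_to_prune(m, n, keep_ratio):
    """Return P, the number of weights to zero in an M x N kernel.

    Assumption: keep_ratio is the fraction of weights retained.
    P is clamped so that it is a positive integer smaller than M*N.
    """
    total = m * n
    if total < 2:
        raise ValueError("M*N must be at least 2 so that 0 < P < M*N")
    p = total - round(total * keep_ratio)
    return max(1, min(p, total - 1))


def prune_kernel(kernel, p):
    """Zero the p entries of a kernel (list of rows) with the smallest absolute values."""
    n_cols = len(kernel[0])
    flat = [w for row in kernel for w in row]
    # Indices in ascending order of magnitude; ties keep their original order.
    order = sorted(range(len(flat)), key=lambda i: abs(flat[i]))
    for i in order[:p]:
        flat[i] = 0.0
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(len(kernel))]


def prune_layer(layer, keep_ratio):
    """layer[c][k] is kernel k of filter c; every kernel loses the same number P."""
    m, n = len(layer[0][0]), len(layer[0][0][0])
    p = num_to_prune(m, n, keep_ratio)
    return [[prune_kernel(kernel, p) for kernel in filt] for filt in layer]
```

With a 2x2 kernel and a ratio of 0.5, the sketch zeroes the two smallest-magnitude weights of every kernel and leaves the other two unchanged.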

With regards to claim 2, LI in view of Nurvitadhi teaches 
“The method of claim 1, further comprising: retraining the target neural network with the pruned convolution layer to form an updated neural network, wherein the updated neural network comprises an updated convolution layer generated by retraining the pruned convolution layer, and weight values of the updated convolution layer at positions corresponding to positions of the weight values set to zero in the pruned convolution layer are zero (Nurvitadhi, FIG.14, [0147] ‘In step 1430, it retrains the network after pruning …’,

)”.

With regards to claim 3, LI in view of Nurvitadhi teaches 
“The method of claim 2, wherein retraining the target neural network with the pruned convolution layer to form an updated neural network comprises: 
generating a mask tensor, wherein each element in the mask tensor corresponds to a respective weight value in the pruned convolution layer, elements of the mask tensor at positions corresponding to the positions of the weight values set to zero in the pruned convolution layer are zero, and elements of the mask tensor at other positions are one (Nurvitadhi, FIG.14, [0145], ‘As is shown in FIG.14, in step 1410, it prunes the network to be compressed nnet and obtains a mask matrix M which records the distribution of non-zero elements in corresponding sparse matrix’) ….”
LI does not explicitly detail “and setting gradient values of an error gradient tensor at positions corresponding to the positions of the weight values set to zero in the pruned convolution layer to zero by using the mask tensor, so as to set the weight values of the updated convolution layer at the positions corresponding to positions of the weight values set to zero in the pruned convolution layer to zero”
However, Nurvitadhi teaches “and setting gradient values of an error gradient tensor at positions corresponding to the positions of the weight values set to zero in the pruned convolution layer to zero by using the mask tensor, so as to set the weight values of the updated convolution layer at the positions corresponding to positions of the weight values set to zero in the pruned convolution layer to zero (Nurvitadhi, [0326], ‘The training algorithm typically runs a forward propagation on the NN for each training sample, then runs a back propagation in the reverse direction in order to compute the gradients.  The weights that are unimportant (e.g., zero values) continue to be multiplied by gradients for no reason. The present GPU architecture can take advantage of this by utilizing hardware techniques to exploit this behavior’, and FIG.41-43, [0328]-[0342] show a learning process using a compressed matrix with weight positions being masked off,

)”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of LI and Nurvitadhi before him or her, to modify the neural network pruning method and system of LI to include learning with weight positions skipped in the matrix as shown in Nurvitadhi.
The motivation for doing so would have been to facilitate processing of sparse matrices for convolutional neural networks (Nurvitadhi, Abstract).

With regards to claim 5, LI in view of Nurvitadhi teaches 
“The method of claim 2, wherein the target compression ratio is set based on a target accuracy (LI, FIG.2, [0013], ‘to obtain a trained neural network with a desired accuracy’, [0014], ‘by fine-tuning … the pruned network, the remaining weights in the matrices may be adjusted, minimizing the accuracy loss’, [0074], ‘determine the initial densities (or, initial compression ratios) for each matrix in the neural network’), and the target compression ratio enables the updated neural network to perform a neural network operation with an accuracy greater than or equal to the target accuracy (LI, FIG.2, FIG.13, [0013], ‘By compressing a dense neural network into a sparse neural network, the computation amount and storage amount can be effectively reduced, achieving acceleration of running an ANN while maintaining its accuracy’)”.

With regards to claim 7, LI in view of Nurvitadhi teaches 
“The method of claim 1, wherein setting P weight values with the smallest absolute values in each convolution kernel of the target convolution layer to zero comprises: …. setting the P weight values with the smallest absolute values among the MxN weight values in each row to zero (LI, FIG.17, 

[0014], ‘elements with larger weights represent important connections, while other elements with smaller weights have relatively small impact and can be removed (e.g., set to zero)’.)”
LI does not explicitly detail “…. expanding all the weight values of the target convolution layer into a two-dimensional matrix with CxK rows and MxN columns; ranking the MxN weight values in each row of the two-dimensional matrix according to their respective absolute values; …. and rearranging the two-dimensional matrix to obtain the pruned convolution layer, wherein the pruned convolution layer comprises C filters corresponding to the target convolution layer, each of the C filters comprises K convolution kernels, and each of the K convolution kernels comprises M rows and N columns of weight values”.
However, Nurvitadhi teaches “expanding all the weight values of the target convolution layer into a two-dimensional matrix with CxK rows and MxN columns; ranking the MxN weight values in each row of the two-dimensional matrix according to their respective absolute values (Nurvitadhi, FIG.21-22,

); …. and rearranging the two-dimensional matrix to obtain the pruned convolution layer, wherein the pruned convolution layer comprises C filters corresponding to the target convolution layer, each of the C filters comprises K convolution kernels, and each of the K convolution kernels comprises M rows and N columns of weight values (Nurvitadhi, FIG.21-22, [0341], ‘encodes a sparse matrix into CSR encoded matrix’,

)”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of LI and Nurvitadhi before him or her, to modify the neural network pruning method and system of LI to include a 2D matrix representation of weights as shown in Nurvitadhi.
The motivation for doing so would have been to facilitate processing of sparse matrices for convolutional neural networks (Nurvitadhi, Abstract).
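
For illustration only, the expand, rank, zero, and rearrange sequence recited in claim 7 can be sketched in Python as follows. This is a hedged sketch under stated assumptions, not the method of either reference: the function names are hypothetical, the layer is assumed to be stored as nested lists indexed layer[c][k][m][n], and ties in magnitude are broken by column position.

```python
def expand(layer):
    """Flatten layer[c][k][m][n] into a matrix with C*K rows and M*N columns."""
    return [[w for row in kernel for w in row]
            for filt in layer for kernel in filt]


def prune_rows(matrix, p):
    """In each row, zero the p entries with the smallest absolute values."""
    pruned = []
    for row in matrix:
        # Rank column indices by ascending magnitude.
        ranked = sorted(range(len(row)), key=lambda j: abs(row[j]))
        drop = set(ranked[:p])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


def rearrange(matrix, c, k, m, n):
    """Inverse of expand: rebuild C filters of K kernels of M x N weights."""
    return [[[matrix[ci * k + ki][mi * n:(mi + 1) * n] for mi in range(m)]
             for ki in range(k)]
            for ci in range(c)]
```

Because each row of the expanded matrix holds exactly one kernel, pruning per row is equivalent to pruning per kernel, and rearrange restores the original C x K x M x N layout.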

With regards to claim 8, LI in view of Nurvitadhi teaches 
“The method of claim 2, wherein the target convolution layer or the updated convolution layer is used to perform a convolution operation with K input channels of an input layer, so as to generate C operation results to be output via C output channels of an output layer (LI, FIG.1,

).”

With regards to claim 9, LI in view of Nurvitadhi teaches 
“The method of claim 1, wherein the neural network is a convolutional neural network (CNN) (LI, FIG.5(a), [0021], ‘The deep learning model shown in FIG.5a includes CNN (Convolutional Neural Network) module’,

)”.

With regards to claim 10, LI in view of Nurvitadhi teaches 
“The method of claim 1, wherein, when the method is used to prune more than one convolution layer in the neural network, the method further comprises: obtaining another target convolution layer from the more than one convolution layer in the neural network, the another target convolution layer comprising C' filters each comprising K' convolution kernels, and each of the K' convolution kernels comprising M' rows and N' columns of weight values, where C', K', M' and N' are positive integers greater than or equal to one; determining a number P' of weight values to be pruned for each convolution kernel of the another target convolution layer based on a number of weight values M'xN' in the convolution kernel and another target compression ratio, where P' is a positive integer smaller than M'xN'; and setting P' weight values with the smallest absolute values in each convolution kernel of the another target convolution layer to zero to form another pruned convolution layer (LI, FIG.12, [0111]-[0121], showing pruning of weights in multiple layers, following the same process as set forth in the rejection of claim 1 for pruning weights in each layer

).”

With regards to claim 19, LI in view of Nurvitadhi teaches 
“The device of claim 10, wherein, when the device is used to prune more than one convolution layer in the neural network, the program instructions further cause the processor to perform: obtaining another target convolution layer from the more than one convolution layer in the neural network, the another target convolution layer comprising C' filters each comprising K' convolution kernels, and each of the K' convolution kernels comprising M' rows and N' columns of weight values, where C', K', M' and N' are positive integers greater than or equal to one; determining a number P' of weight values to be pruned for each convolution kernel of the another target convolution layer based on a number of weight values M'xN' in the convolution kernel and another target compression ratio, where P' is a positive integer smaller than M'xN'; and setting P' weight values with the smallest absolute values in each convolution kernel of the another target convolution layer to zero to form another pruned convolution layer (LI, FIG.12, [0111]-[0121], showing pruning of weights in multiple layers, following the same process as set forth in the rejection of claim 1 for pruning weights in each layer

).”


Claims 11-13, 15, 17-18, 20 are substantially similar to claims 1-3, 5, 7-10, 19. The arguments as given above for claims 1-3, 5, 7-10, 19 are applied, mutatis mutandis, to claims 11-13, 15, 17-18, 20; therefore, the rejection of claims 1-3, 5, 7-10, 19 is applied accordingly.

The combined teaching described above will be referred to as LI + Nurvitadhi hereafter.

Claims 4, 14 are rejected under 35 U.S.C. 103 as being unpatentable over LI et al., US-PGPUB NO.20190050734A1 [hereafter LI] in view of Nurvitadhi, et al., US-PGPUB NO.20190205746A1 [hereafter Nurvitadhi] and Kim et al., “Hadamard Product for Low-Rank Bilinear Pooling”, ICLR 2017 [hereafter Kim].

With regards to claim 4, LI + Nurvitadhi teaches 
“The method of claim 3, wherein setting gradient values of an error gradient tensor at positions corresponding to the positions of the weight values set to zero in the pruned convolution layer to zero by using the mask tensor (as shown in claim 3 rejection)”
LI + Nurvitadhi does not explicitly detail “comprises: performing a Hadamard multiplication operation on the mask tensor and the error gradient tensor”.
However, Kim teaches the Hadamard product (Kim, 1 Introduction, ‘using Hadamard product (element-wise multiplication), which is commonly used in various scientific computing frameworks as one of tensor operations’).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of LI + Nurvitadhi and Kim before him or her, to modify the neural network pruning method and system of LI + Nurvitadhi to include Hadamard multiplication as shown in Kim.
The motivation for doing so would have been to reduce energy consumption for multimodal learning (Kim, Abstract). 

Claim 14 is substantially similar to claim 4. The arguments as given above for claim 4 are applied, mutatis mutandis, to claim 14; therefore, the rejection of claim 4 is applied accordingly.
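
For illustration only, the mask tensor of claim 3 and the Hadamard multiplication of claim 4 can be sketched in Python as follows. This is not the implementation of LI, Nurvitadhi, or Kim. The function names are hypothetical, the tensors are reduced to 2-D nested lists for brevity, the mask is assumed to be derived from the zero entries of the pruned weights, and the plain gradient-descent update is an assumption chosen only to show why a masked gradient keeps pruned weights at zero.

```python
def make_mask(pruned_weights):
    """Mask element is 0 where the pruned weight is zero and 1 elsewhere."""
    return [[0.0 if w == 0.0 else 1.0 for w in row] for row in pruned_weights]


def hadamard(a, b):
    """Element-wise (Hadamard) product of two equally shaped 2-D lists."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def masked_update(weights, grad, mask, lr):
    """One gradient-descent step using the masked gradient.

    Gradients at pruned positions are multiplied by 0, so those weights
    receive no update and remain zero after retraining.
    """
    masked_grad = hadamard(mask, grad)
    return [[w - lr * g for w, g in zip(row_w, row_g)]
            for row_w, row_g in zip(weights, masked_grad)]
```

Applied to pruned weights [[0.5, 0.0], [0.0, -2.0]], the mask is [[1, 0], [0, 1]], and any gradient values at the two pruned positions are discarded before the update.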

Claims 6, 16 are rejected under 35 U.S.C. 103 as being unpatentable over LI et al., US-PGPUB NO.20190050734A1 [hereafter LI] in view of Nurvitadhi, et al., US-PGPUB NO.20190205746A1 [hereafter Nurvitadhi] and Yang et al., “Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning”, CVPR 2017 [hereafter Yang].

With regards to claim 6, LI + Nurvitadhi teaches 
“The method of claim 5”
LI + Nurvitadhi does not explicitly detail “further comprising: obtaining an updated accuracy of a neural network operation performed by the updated neural network; comparing the updated accuracy with the target accuracy; and increasing the target compression ratio and re-determining the number P of weight values to be pruned based on the increased target compression ratio, in response to that the updated accuracy is less than the target accuracy”.
However, Yang teaches “further comprising: obtaining an updated accuracy of a neural network operation performed by the updated neural network; comparing the updated accuracy with the target accuracy (Yang, FIG.2, 

); and increasing the target compression ratio and re-determining the number P of weight values to be pruned based on the increased target compression ratio, in response to that the updated accuracy is less than the target accuracy (Yang, FIG.2, 4. Energy-Aware Pruning, ‘Step 3 and 4 increase the compression ratio’)”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of LI + Nurvitadhi and Yang before him or her, to modify the neural network pruning method and system of LI + Nurvitadhi to include adjusting the compression ratio as shown in Yang.
The motivation for doing so would have been to reduce energy consumption for convolutional neural networks (Yang, Abstract). 

Claim 16 is substantially similar to claim 6. The arguments as given above for claim 6 are applied, mutatis mutandis, to claim 16; therefore, the rejection of claim 6 is applied accordingly.
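
For illustration only, the compare-and-adjust loop recited in claim 6 can be sketched in Python as follows. This is not Yang's energy-aware pruning algorithm. The function name, the callbacks prune_and_retrain and evaluate, the step size, and the round limit are all hypothetical, and the compression ratio is assumed to be the fraction of weights retained, so that increasing it prunes fewer weights.

```python
def tune_ratio(ratio, target_accuracy, prune_and_retrain, evaluate,
               step=0.05, max_rounds=20):
    """Raise the ratio until the retrained network meets the target accuracy.

    prune_and_retrain(ratio) is assumed to re-determine P from the ratio,
    prune, retrain, and return the updated network.
    evaluate(model) is assumed to return the updated accuracy.
    """
    for _ in range(max_rounds):
        model = prune_and_retrain(ratio)
        if evaluate(model) >= target_accuracy:
            return ratio, model
        # Updated accuracy is below target: increase the ratio and try again.
        ratio = min(1.0, ratio + step)
    raise RuntimeError("target accuracy not reached within max_rounds")
```

The loop stops at the first ratio whose retrained network meets the target, so a smaller step trades more retraining rounds for a tighter final ratio.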

Examiner's Note

The Examiner respectfully requests that the Applicant, in preparing responses, fully consider the entirety of the reference(s) as potentially teaching all or part of the claimed invention.  It is noted, REFERENCES ARE RELEVANT AS PRIOR ART FOR ALL THEY CONTAIN.  “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned.  They are part of the literature of the art, relevant for all they contain.”  In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).  A reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including non-preferred embodiments (see MPEP 2123).  The Examiner has cited particular locations in the reference(s) as applied to the claim(s) above for the convenience of the Applicant.  Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claim(s), typically other passages and figures will apply as well.


Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TSU-CHANG LEE whose telephone number is 571-272-3567.  The fax number is 571-273-3567.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached at 571-272-2589.
 Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/TSU-CHANG LEE/
Primary Examiner, Art Unit 2128