The present application, filed on or after 16 March 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION

This Office action is in response to Applicant’s submission filed on 9 October 2020. THIS ACTION IS NON-FINAL.

Status of Claims

Claims 1-28 are pending.
Claims 15-21 are rejected under 35 U.S.C. 101 for being directed to software per se.
Claims 1-14 and 22-28 are rejected under 35 U.S.C. 101 for being directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.
Claims 1-28 are rejected under 35 U.S.C. 103 as unpatentable.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Non-Statutory Subject Matter
Claims 15-21 are directed to non-statutory subject matter.  The claims do not fall within at least one of the four categories of patent-eligible subject matter because they could be implemented as software per se.  See specification, [0078], ‘The various operations of methods described above may be performed by any suitable means capable of performing the corresponding functions.  The means may include various hardware and/or software component(s) and/or module(s) …’

Judicial Exception
Claims 1-14 and 22-28 are directed to a judicial exception (an abstract idea) without significantly more.
 (Independent Claims) With regards to claim 1 / 8 / 22, the claim recites a method / system / product, which falls into one of the statutory categories.
2A – Prong 1: Claim 1 / 8 / 22, in part, recites 
 “… determining a pruning threshold for pruning a first set of pre-trained weights of a plurality of pre-trained weights based on a function of a classification loss and a regularization loss; pruning weights, from the first set of pre-trained weights, with a first value that is less than the pruning threshold; and adjusting a second set of pre-trained weights of the plurality of pre-trained weights in response to a second value of each pre-trained weight in the second set of pre-trained weights being greater than the pruning threshold” (mental process), which, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components.  That is, other than reciting a computing device, nothing in the claim element precludes the step from practically being performed in the mind.  For example, but for the language about generic computer components, the “determining”, “pruning”, and “adjusting” in the limitation cited above could be performed by a human using pen, paper, or a calculator (e.g., a human network model analyzer could adjust the model structure based on the parameters learned); see Appendix 1 to October 2019 Update: Subject Matter Eligibility Life Sciences & Data Processing Examples, Example 43, Step 2A Prong One, p.4, “Note that even if most humans would use a physical aid (e.g., pen and paper, a slide rule, or a calculator) to help them complete the recited calculation, the use of such physical aid does not negate the mental nature of this limitation”.  If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, the claim recites an abstract idea.
2A – Prong 2: This judicial exception is not integrated into a practical application.  In particular, claim 1 / 8 / 22 recites the additional elements of using generic computer elements (such as a processor coupled to a memory, or a program stored in a non-transitory computer readable medium and executed), which are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using a generic computer component.  Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.  Claim 1 / 8 / 22 is directed to an abstract idea.
2B Analysis:  The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the use of one or more generic hardware processors is well-understood, routine, and conventional (WURC) and does not add anything significant to the abstract idea.  The claim is not patent eligible.
 (Dependent claims) 
Claims 2-7 / 9-14 / 23-28 are dependent on claim 1 / 8 / 22 and include all the limitations of claim 1 / 8 / 22. Therefore, claims 2-7 / 9-14 / 23-28 recite the same abstract ideas. 
With regards to claims 2 / 9 / 23, the claim recites the further limitation “further comprising determining a differentiable pruned weight as a product of a pre-trained weight from the plurality of pre-trained weights and a differentiable function, the differentiable function determining a pruning value based on the pre-trained weight, the pruning threshold, and a temperature for smoothing the differentiable function”, which, as drafted, is a process that, under its broadest reasonable interpretation, covers mathematical concepts but for the recitation of generic computer components.  That is, other than reciting generic computers, the claim includes limitations that, under their broadest reasonable interpretation, describe mathematical relationships and algorithms.  Mathematical relationships and algorithms have been found by the courts to be abstract ideas; see, e.g., MPEP 2106.04(a)(2).  The claim is not patent eligible.
With regards to claims 3-7 / 10-14 / 24-28, the claim recites further limitations on data processed, and does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements when considered both individually and as an ordered combination do not amount to significantly more than the abstract idea.  The claim is not patent eligible.


Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-5, 8-12, 15-19, and 22-26 are rejected under 35 U.S.C. 103 as being unpatentable over DENG et al., US-PGPUB NO.20190180184A1 [hereafter DENG] in view of Ramakrishnan et al., “Differentiable Mask for Pruning Convolutional and Recurrent Networks”, arXiv.org, Cornell University Library, 201 Olin Library Cornell University, Ithaca, NY 14853, 10 September 2019 [hereafter Ramakrishnan].

With regards to claim 8, DENG teaches 
“An apparatus, comprising: a memory; and at least one processor coupled to the memory, the at least one processor being configured (DENG, FIG.8,

[Image omitted: media_image1.png]

): to determine a pruning threshold for pruning a first set of pre-trained weights of a plurality of pre-trained weights (DENG, [0023], ‘In one embodiment, a weight-pruning technique uses an analytic threshold function that optimally reduces the number of weights’) based on a function of a classification loss and a regularization loss (DENG, [0007], ‘minimizing a difference between the output and the training data to determine a set of weights w that enhance a speed performance of the neural network, an accuracy of the neural network, or combination thereof, by minimizing a cost function C’, [0024], ‘The cost function may include regularization terms’); to prune weights, from the first set of pre-trained weights, with a first value that is less than the pruning threshold (DENG, FIG.7,

[Image omitted: media_image2.png]

[0023], ‘In one embodiment, a weight-pruning technique uses an analytic threshold function that optimally reduces the number of weights ….The analytic threshold function may be applied to the weights of the various layers of the neural network so that weights having magnitudes that are less than a threshold are set to zero and the weights that are greater than the threshold are not affected’); and to adjust a second set of pre-trained weights of the plurality of pre-trained weights in response to a second value of each pre-trained weight in the second set of pre-trained weights being greater than the pruning threshold (DENG, [0039], The weights may be updated as

[Image omitted: media_image3.png]

)”.
DENG does not explicitly detail “and outputting a prediction based on the extracted history state of the external memory”.
However Allen teaches “and outputting a prediction based on the extracted history state of the external memory (Allen, FIG.6, C4L22-27, ‘A variation of this embodiment provides for filtering transaction records based upon discovered predictive relationships …., and outputs these association rules for viewing by a human …’)”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Lee and Allen before him or her, to modify the associative memory network of Lee to include prediction output as shown in Allen.   
The motivation for doing so would have been for making prediction about data record from an incoming stream of data record (Allen, Abstract). 
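For illustration only, the magnitude thresholding quoted from DENG [0023] above (weights with magnitudes below a threshold are set to zero, and weights above the threshold are not affected) can be sketched in Python as follows.  The function name `hard_prune`, the sample weights, and the threshold value are hypothetical and are not taken from DENG or from the claims.

```python
def hard_prune(weights, threshold):
    """Zero every weight whose magnitude is below `threshold`.

    Weights at or above the threshold are returned unchanged.
    All names and values here are illustrative assumptions.
    """
    return [0.0 if abs(w) < threshold else w for w in weights]


# Hypothetical pre-trained weights and threshold.
weights = [0.8, -0.05, 0.3, -0.6, 0.01]
pruned = hard_prune(weights, threshold=0.1)
print(pruned)  # [0.8, 0.0, 0.3, -0.6, 0.0]
```

This hard cut-off is not differentiable at the threshold, which is the reason a smoothed threshold function is discussed with respect to claim 9.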

With regards to claim 9, DENG in view of Ramakrishnan teaches 
“The apparatus of claim 8, in which the at least one processor is further configured to determine a differentiable pruned weight as a product of a pre-trained weight from the plurality of pre-trained weights and a differentiable function, the differentiable function determining a pruning value based on the pre-trained weight, the pruning threshold (DENG, FIG.7, 

[Image omitted: media_image4.png]


[Image omitted: media_image5.png]

and equations (5) and (6) show a differentiable function,

[Image omitted: media_image6.png]

), and a temperature for smoothing the differentiable function (DENG, [0029], ‘ is a parameter that controls a sharpness of the threshold function …’)”.
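As a rough sketch of the claim 9 limitation, a differentiable pruned weight can be written as the product of a weight and a smooth gate that depends on the weight, the threshold, and a temperature.  The sigmoid gate used below is an assumption chosen for illustration; it is not asserted to be the function recited in the specification or shown in DENG, and the names `soft_mask` and `soft_pruned_weight` are hypothetical.

```python
import math


def soft_mask(w, threshold, temperature):
    """Smooth pruning value in (0, 1).

    Assumed form: sigmoid((|w| - threshold) / temperature).
    A smaller temperature makes the gate closer to a hard step.
    """
    return 1.0 / (1.0 + math.exp(-(abs(w) - threshold) / temperature))


def soft_pruned_weight(w, threshold, temperature):
    """Differentiable pruned weight: the weight times its pruning value."""
    return w * soft_mask(w, threshold, temperature)


# Hypothetical values: a large weight is almost kept, a small one almost removed.
for w in (0.8, 0.3, 0.01):
    print(w, soft_pruned_weight(w, threshold=0.3, temperature=0.05))
```

Under this assumed form, a weight whose magnitude equals the threshold receives a pruning value of exactly 0.5, and lowering the temperature drives the gate toward the hard cut-off.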

With regards to claim 10, DENG in view of Ramakrishnan teaches 
“The apparatus of claim 9, in which the at least one processor is further configured to minimize the classification loss based on a first gradients of the pruning threshold and second gradients of the plurality of pre-trained weights (DENG, [0037] – [0039], 

[Image omitted: media_image7.png]


[Image omitted: media_image3.png]

)”.
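To make the role of the two gradients in claim 10 concrete, the following sketch computes the gradient of a loss with respect to both a weight and the pruning threshold, then takes one gradient-descent step on each.  It assumes a sigmoid-gated pruned weight and a single-sample squared-error loss standing in for the classification loss; both are illustrative assumptions, and the function names, learning rate, and sample values are hypothetical rather than drawn from DENG or the claims.

```python
import math


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def pruned(w, t, temp):
    """Assumed differentiable pruned weight: w * sigmoid((|w| - t) / temp)."""
    return w * sigmoid((abs(w) - t) / temp)


def loss(w, t, temp, x, y):
    """Stand-in loss (squared error on one sample), not a real classification loss."""
    return (pruned(w, t, temp) * x - y) ** 2


def grads(w, t, temp, x, y):
    """Analytic gradients of the stand-in loss w.r.t. the weight and the threshold."""
    s = sigmoid((abs(w) - t) / temp)
    d_loss_d_p = 2.0 * (w * s * x - y) * x
    d_p_d_w = s + abs(w) * s * (1.0 - s) / temp  # valid for w != 0
    d_p_d_t = -w * s * (1.0 - s) / temp
    return d_loss_d_p * d_p_d_w, d_loss_d_p * d_p_d_t


def sgd_step(w, t, temp, x, y, lr=0.01):
    """One gradient-descent update of both the weight and the threshold."""
    g_w, g_t = grads(w, t, temp, x, y)
    return w - lr * g_w, t - lr * g_t


w, t = 0.5, 0.3
new_w, new_t = sgd_step(w, t, temp=0.1, x=1.0, y=1.0)
print(loss(w, t, 0.1, 1.0, 1.0), loss(new_w, new_t, 0.1, 1.0, 1.0))
```

Because the gate is smooth, the threshold receives a gradient just as the weights do, so both can be updated by the same optimizer.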

With regards to claim 11, DENG in view of Ramakrishnan teaches 
“The apparatus of claim 9”.
DENG does not explicitly detail “in which the at least one processor is further configured to determine the regularization loss based on the pruning value determined for each layer of an artificial neural network”.
However Ramakrishnan teaches “in which the at least one processor is further configured to determine the regularization loss based on the pruning value determined for each layer of an artificial neural network (Ramakrishnan, p.222,

[Image omitted: media_image8.png]

)”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of DENG and Ramakrishnan before him or her, to modify the threshold-based network pruning of DENG to include the per-layer regularization shown in Ramakrishnan.
The motivation for doing so would have been for network compression (Ramakrishnan, Abstract). 

With regards to claim 12, DENG in view of Ramakrishnan teaches 
“The apparatus of claim 8”.
DENG does not explicitly detail “in which the at least one processor is further configured to normalize the regularization loss based on a pruning preference value, in which determining the pruning threshold comprises determining a total loss based on the classification loss and the normalized regularization loss”.
However Ramakrishnan teaches “in which the at least one processor is further configured to normalize the regularization loss based on a pruning preference value, in which determining the pruning threshold comprises determining a total loss based on the classification loss and the normalized regularization loss (Ramakrishnan, p.223,

[Image omitted: media_image9.png]


[Image omitted: media_image10.png]

)”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of DENG and Ramakrishnan before him or her, to modify the threshold-based network pruning of DENG to include the normalization shown in Ramakrishnan.
The motivation for doing so would have been for network compression (Ramakrishnan, Abstract). 
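The relationship recited in claims 11 and 12 between the classification loss, the per-layer regularization loss, and the pruning preference value can be pictured with the minimal sketch below.  It assumes the total loss is a simple weighted sum in which the averaged per-layer regularization terms are scaled by the pruning preference value; that form, the function name `total_loss`, and the numbers are assumptions for illustration and are not asserted to match the specification or Ramakrishnan.

```python
def total_loss(classification_loss, layer_reg_losses, pruning_preference):
    """Assumed total loss: classification loss plus scaled regularization.

    layer_reg_losses   -- one regularization term per layer (hypothetical values)
    pruning_preference -- larger values favor more aggressive pruning
    """
    if not layer_reg_losses:
        return classification_loss
    # Average over layers so the term does not grow with network depth.
    mean_reg = sum(layer_reg_losses) / len(layer_reg_losses)
    return classification_loss + pruning_preference * mean_reg


# Hypothetical numbers: two layers, moderate pruning preference.
print(total_loss(0.4, [0.2, 0.6], pruning_preference=0.5))
```

With a pruning preference of zero the sketch reduces to the classification loss alone, which corresponds to training without any pressure to prune.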

Claims 1-5, 15-19, and 22-26 are substantially similar to claims 8-12. The arguments as given above for claims 8-12 are applied, mutatis mutandis, to claims 1-5, 15-19, and 22-26; therefore the rejection of claims 8-12 is applied accordingly.

The combined teaching described above will be referred to as DENG + Ramakrishnan hereafter.

Claims 6-7, 13-14, 20-21, and 27-28 are rejected under 35 U.S.C. 103 as being unpatentable over DENG et al., US-PGPUB NO.20190180184A1 [hereafter DENG] in view of Ramakrishnan et al., “Differentiable Mask for Pruning Convolutional and Recurrent Networks”, arXiv.org, Cornell University Library, 201 Olin Library Cornell University, Ithaca, NY 14853, 10 September 2019 [hereafter Ramakrishnan] and Chai, et al., US-PGPUB NO.20220091837A1 [hereafter Chai].

With regards to claim 13, DENG + Ramakrishnan teaches 
“The apparatus of claim 12”.
DENG + Ramakrishnan does not explicitly detail “in which the at least one processor is further configured: to distribute an artificial neural network to a user device of a federated learning system; and to configure the pruning preference value based on a hardware profile of the user device”.
However Chai teaches “in which the at least one processor is further configured: to distribute an artificial neural network to a user device of a federated learning system; and to configure the pruning preference value based on a hardware profile of the user device (Chai, p.223, [0075], ‘In further implementation, the on-device training can enable participation of the device in a federated learning scheme’, [0140], ‘The framework permits efficient distributed training but can be optimized to produce a neural network model with low memory footprint’)”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of DENG + Ramakrishnan and Chai before him or her, to modify the threshold-based network pruning of DENG + Ramakrishnan to include the federated learning shown in Chai.
The motivation for doing so would have been for generation and management of ML models (Chai, Abstract). 
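Purely as an illustration of what configuring a pruning preference value from a hardware profile could look like, the sketch below maps a device's available memory to a preference value, so that smaller devices receive a more heavily pruned network.  Every detail here (the function name `pruning_preference`, the use of memory as the profile, the linear interpolation, and all numeric bounds) is a hypothetical assumption and is not asserted to be disclosed by Chai, DENG, Ramakrishnan, or the specification.

```python
def pruning_preference(memory_mb, low=0.1, high=1.0,
                       small_device_mb=512, large_device_mb=8192):
    """Map available device memory to a pruning preference value.

    Devices at or below `small_device_mb` get the strongest preference (`high`);
    devices at or above `large_device_mb` get the weakest (`low`);
    values in between are linearly interpolated.
    """
    if memory_mb <= small_device_mb:
        return high
    if memory_mb >= large_device_mb:
        return low
    frac = (memory_mb - small_device_mb) / (large_device_mb - small_device_mb)
    return high - frac * (high - low)


# Hypothetical hardware profiles for three user devices.
for name, mem in (("wearable", 256), ("phone", 4096), ("laptop", 16384)):
    print(name, pruning_preference(mem))
```

In a federated setting, a value computed this way could be supplied as the scaling factor on the regularization loss before the network is distributed to each device.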

With regards to claim 14, DENG + Ramakrishnan teaches 
“The apparatus of claim 13, in which the at least one processor is further configured to exclude the first set of pre-trained weights from the distributed artificial neural network (DENG, FIG.7, [0045], ‘FIG.7 depicts a flow diagram of an example method 700 to prune weights of neural network … to determine the set of weights w that optimize the speed performance of the neural network’)”.

Claims 6-7, 20-21, and 27-28 are substantially similar to claims 13-14. The arguments as given above for claims 13-14 are applied, mutatis mutandis, to claims 6-7, 20-21, and 27-28; therefore the rejection of claims 13-14 is applied accordingly.


Additional Relevant Art

The prior art made of record is considered pertinent to applicant’s disclosure and is recorded on Form PTO-892. Applicant is required under 37 C.F.R. § 1.111 (c) to consider these references fully when responding to this action, with particular attention paid to:
Manessi et al., “Automated Pruning for Deep Neural Network Compression”, arXiv:1712.012721v2 [cs.CV], 6 Jan 2019 [hereafter Manessi] shows Deep Neural Network pruning based on threshold.

Examiner's Note

The Examiner respectfully requests that the Applicant, in preparing responses, fully consider the entirety of the reference(s) as potentially teaching all or part of the claimed invention.  It is noted that REFERENCES ARE RELEVANT AS PRIOR ART FOR ALL THEY CONTAIN.  “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned.  They are part of the literature of the art, relevant for all they contain.”  In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).  A reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including non-preferred embodiments (see MPEP 2123).  The Examiner has cited particular locations in the reference(s) as applied to the claim(s) above for the convenience of the Applicant.  Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claim(s), typically other passages and figures will apply as well.


Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TSU-CHANG LEE whose telephone number is 571-272-3567.  The fax number is 571-273-3567.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached at 571-272-2589.
 Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/TSU-CHANG LEE/
Primary Examiner, Art Unit 2128