DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is responsive to the original application filed on 12/22/2020 and the Remarks and Amendments filed on 2/10/2022 and 1/27/2022.  Acknowledgement is made with respect to priority claimed to Provisional Application 63/082,009 filed on 9/23/2020.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.


This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 

(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Claim Rejections - 35 USC § 101

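35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.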



Claims 1-16, 18-21, 23, and 25-27 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.  The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).

When considering subject matter eligibility under 35 U.S.C. 101, it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter (Step 1). If the claim does fall within one of the statutory categories, the second step in the analysis is to determine whether the claim is directed to a judicial exception (Step 2A). The Step 2A analysis is broken into two prongs. In the first prong (Step 2A, Prong 1), it is determined whether or not the claims recite a judicial exception (e.g., mathematical concepts, mental processes, certain methods of organizing human activity). If it is determined in Step 2A, Prong 1 that the claims recite a judicial exception, the analysis proceeds to the second prong (Step 2A, Prong 2), where it is determined whether or not the claims integrate the judicial exception into a practical application. If it is determined at Step 2A, Prong 2 that the claims do not integrate the judicial exception into a practical application, the analysis proceeds to determining whether the claim is a patent-eligible application of the exception (Step 2B). If an abstract idea is present in the claim, any element or combination of elements in the claim must be sufficient to ensure that the claim amounts to significantly more than the abstract idea itself.

Claim 1
Step 1:  The claim recites a method; thus, it is directed to the statutory category of a process.
Step 2A Prong 1:  The claim recites, inter alia:
determining if one or more additional values of the neural network exist: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of determining whether additional values such as outliers exist in a neural network, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
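determined using a probability density function defining a probability that each additional value belongs to a distribution:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical process of determining additional values or outliers using a probability density function disclosed as math in paragraph [0079] of the originally filed specification.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of “memory”, “storing, in the memory, one or more values of the neural network each as a reference to a representative value”, and “if the one or more additional values exist, storing, in the memory, the one or more additional values of the neural network”.  The additional element of “memory” is a generic computer component recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional elements of “storing, in the memory, one or more values of the neural network each as a reference to a representative value” and “if the one or more additional values exist, storing, in the memory, the one or more additional values of the neural network” are insignificant extra-solution activities that do not integrate the judicial exception into a practical application (see MPEP § 2106.05(g)).  Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea.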
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional element of “memory” is a generic computer component recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional elements of “storing, in the memory, one or more values of the neural network each as a reference to a representative value” and “if the one or more additional values exist, storing, in the memory, the one or more additional values of the neural network” are insignificant extra-solution activities that do not amount to an inventive concept (see MPEP §2106.05(g); “storing data”; and MPEP § 2106.05(d); “Storing and retrieving information in memory” are well‐understood, routine, and conventional functions when they are claimed in a merely generic manner).  Nothing in the claim provides significantly more than that abstract idea.  As such, the claim is ineligible.

Claim 2
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “the reference being the representative value”. This limitation merely places restrictions on the type of data used in the analysis and the technological environment in which the judicial exception is performed, and does not negate the mental nature of the underlying process.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “the reference being the representative value”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 3
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “each of the references being generated by quantizing one of the one or more values of the neural network”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of quantizing data of a neural network, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.  For instance, one can mentally, with the assistance of pen and paper, quantize data from a larger representation into a smaller, more compact representation.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 4
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “storing, in the memory, a reconstruction table storing each representative value for each reference”. This limitation does not negate the mental nature of the underlying process.
Step 2A Prong 2, Step 2B:  The additional element of “memory” is a generic computer component recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional element of “storing, in the memory, a reconstruction table storing each representative value for each reference” is an insignificant extra-solution activity that does not amount to an inventive concept (see MPEP §2106.05(g); “storing data”; and MPEP § 2106.05(d); “Storing and retrieving information in memory” are well‐understood, routine, and conventional functions when they are claimed in a merely generic manner).  Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
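For illustration only, the storing of values as references together with a reconstruction table, as recited in claims 1 and 4, can be sketched as follows. This sketch is not taken from the specification; the function names quantize_to_refs and reconstruct, the nearest-entry rule, and the sample table are hypothetical assumptions made for the example.

```python
def quantize_to_refs(values, table):
    """Replace each value with a reference: the index of the nearest table entry.

    Assumption: "nearest" means smallest absolute difference; the claims do not
    specify how a reference is chosen.
    """
    return [
        min(range(len(table)), key=lambda i: abs(v - table[i]))
        for v in values
    ]


def reconstruct(refs, table):
    """Look up the representative value for each stored reference."""
    return [table[r] for r in refs]


# Hypothetical reconstruction table with four representative values,
# so each reference fits in two bits.
table = [-0.5, 0.0, 0.5, 1.0]
values = [0.1, 0.45, -0.6, 0.9]

refs = quantize_to_refs(values, table)
approx = reconstruct(refs, table)
```

Under these assumptions, only the small integer references and the short table need to be stored, rather than each original value.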


Claim 5
Step 1:  The claim recites a computer-implemented method; therefore, it is directed to the statutory category of a process.
Step 2A Prong 1:  The claim recites, inter alia:
[a]ssigning each of the one or more values of the neural network to a cluster: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of assigning one or more values to a cluster, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
[s]electing a selected value from the cluster as the representative value for each of the one or more values of the neural network of the cluster:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of selecting a value as a representative value, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
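Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of, as disclosed in independent claim 1 from which claim 5 depends, “storing, in the memory, one or more values of the neural network each as a reference to a representative value; and storing, in the memory, one or more additional values of the neural network”.  The additional elements of “storing, in the memory, one or more values of the neural network each as a reference to a representative value; and storing, in the memory, one or more additional values of the neural network” are elements that the courts have recognized as well-understood, routine activities or as insignificant extra-solution activities that do not integrate a judicial exception into a practical application (see MPEP § 2106.05(d)(II)(iv); “Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93”).  Thus the additional elements do not provide any meaningful limits on the execution of the abstract idea. Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application and the claim is thus directed to the abstract idea.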
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional elements of “storing, in the memory, one or more values of the neural network each as a reference to a representative value; and storing, in the memory, one or more additional values of the neural network” are elements that the courts have recognized as well‐understood, routine activities or as insignificant extra-solution activities that do not provide significantly more than the judicial exception (see MPEP § 2106.05(d)(II)(iv); “Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93”).  Nothing in the claim provides significantly more than that abstract idea.  As such, the claim is ineligible.

Claim 6
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “each selected value being a centroid, the centroid being an average of the one or more values of the neural network in the cluster that the selected value is selected from”. This limitation merely places restrictions on the type of data used in the analysis and the technological environment in which the judicial exception is performed, and does not negate the mental nature of the underlying process.
Step 2A Prong 2, Step 2B:  This claim recites “each selected value being a centroid, the centroid being an average of the one or more values of the neural network in the cluster that the selected value is selected from”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 7
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “minimizing a sum of each distance between a) each of the one or more values of the neural network and b) the selected value of the cluster that the value is assigned to by iteratively: performing the assigning, the assigning further comprising reassigning a first value of the one or more values of the neural network from an original cluster of the clusters to a new cluster of the clusters where an original distance of the first value to the selected value of the original cluster is greater than a new distance of the first value to the selected value of the new cluster; and subsequently performing the selecting on at least the original cluster and the new cluster; wherein the first value is a different value of the one or more values of the neural network upon each iteration”. Under its broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of minimizing a sum of each distance, performing assigning and reassigning of values, and selecting based on clusters, each of which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the first value is a different value of the one or more values of the neural network upon each iteration”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
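For illustration only, the iterative assigning and selecting discussed for claims 5 through 7 can be sketched with a one-dimensional version of the standard k-means (Lloyd) procedure. This is a swapped-in textbook technique, not the claimed method: it reassigns all values in each pass rather than a single “first value” per iteration. The function name kmeans_1d, the initial centroids, and the sample values are hypothetical.

```python
def kmeans_1d(values, initial_centroids, max_iters=100):
    """Alternate between assigning values to the nearest centroid and
    re-selecting each centroid as the average of its cluster."""
    centroids = list(initial_centroids)
    assignment = None
    for _ in range(max_iters):
        # Assigning: each value goes to the cluster with the closest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        if new_assignment == assignment:
            break  # no value changed cluster, so the distance sum stopped improving
        assignment = new_assignment
        # Selecting: the representative value is the average of the cluster.
        for k in range(len(centroids)):
            members = [v for v, a in zip(values, assignment) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = sum(members) / len(members)
    return assignment, centroids


# Hypothetical sample values and starting centroids.
values = [0.0, 0.25, 4.0, 5.0]
assignment, centroids = kmeans_1d(values, [0.0, 1.0])
```

Each pass cannot increase the sum of distances between the values and their assigned centroids, which is the quantity the claim language describes minimizing.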

Claim 8
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “generating an output from performing one or more multiply-accumulate operations $A_1B_1 + \cdots + A_nB_n$ on input vectors A and input vectors B, wherein n is the n-th input vector and wherein one or more of input vectors B are each one of the representative values, by accumulating input vectors A to an accumulated sum of input vectors A per input vector B having the same representative value and subsequently multiplying each of the accumulated sums of input vectors A by the representative value of the input vector B.” Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concepts of generating an output from performing multiply-accumulate operations.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
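For illustration only, the mathematical relationship recited in claim 8 can be sketched as follows: inputs A that share the same representative value B are first summed, and each sum is then multiplied once by that representative value. The function names mac_direct and mac_shared and the sample data are hypothetical and are not taken from the specification.

```python
from collections import defaultdict


def mac_direct(a, b):
    """Conventional multiply-accumulate: one multiplication per term."""
    return sum(x * y for x, y in zip(a, b))


def mac_shared(a, refs, table):
    """Accumulate inputs per shared reference, then multiply each
    accumulated sum once by its representative value."""
    sums = defaultdict(float)
    for x, r in zip(a, refs):
        sums[r] += x
    return sum(table[r] * s for r, s in sums.items())


# Hypothetical data: four inputs, two representative values.
a = [1.0, 2.0, 3.0, 4.0]
refs = [0, 1, 0, 1]
table = [0.5, 2.0]
b = [table[r] for r in refs]  # expanded B vector, [0.5, 2.0, 0.5, 2.0]

direct = mac_direct(a, b)
shared = mac_shared(a, refs, table)
```

Both functions compute the same sum by the distributive property; the second uses one multiplication per distinct representative value instead of one per term.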

Claim 9
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites the mental processes of claim 8 from which it depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “storing, in the memory, one or more additional values of the neural network wherein one or more of the input vectors B are each one of the additional values”, which is an element that the courts have recognized as a well‐understood, routine activity or as an insignificant extra-solution activity that does not provide significantly more than the judicial exception (see MPEP § 2106.05(d)(II)(iv); “Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93”). This additional element does not integrate the abstract idea into a practical application nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 10
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites the mental processes of claim 1 from which it depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “each of the additional values satisfying a criterion of a distribution, content, or value count of a component of the neural network”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 11
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites the mental processes of claim 1 from which it depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “each of the additional values being in a component of the neural network and having a probability density function (pdf) less than a threshold value, the pdf defined by $\mathrm{pdf}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, wherein x is the additional value, μ is a mean of one or more parameters in the component of the neural network having x, and σ is a standard deviation of one or more parameters in the component of the neural network having x”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
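For illustration only, and assuming the recited pdf is the standard normal (Gaussian) density with mean μ and standard deviation σ, the thresholding described in claim 11 can be sketched as follows. The function names gaussian_pdf and split_outliers, the use of the population standard deviation, the threshold, and the sample values are hypothetical and are not taken from the specification.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(
        2 * math.pi * sigma ** 2
    )


def split_outliers(values, threshold):
    """Separate values whose density under a fitted normal distribution
    falls below the threshold (treated here as the 'additional values')."""
    mu = sum(values) / len(values)
    # Assumption: population standard deviation of the component's values.
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    if sigma == 0:
        return list(values), []  # all values identical, so no outliers
    inliers, outliers = [], []
    for v in values:
        target = outliers if gaussian_pdf(v, mu, sigma) < threshold else inliers
        target.append(v)
    return inliers, outliers


# Hypothetical component: nine values at 0.0 and one far from the rest.
values = [0.0] * 9 + [10.0]
inliers, outliers = split_outliers(values, threshold=0.01)
```

In this example the mean is 1.0 and the standard deviation is 3.0, so the value 10.0 has a density of roughly 0.0015 and falls below the 0.01 threshold, while the values at 0.0 do not.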

Claim 12
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites the mental processes of claim 1 from which it depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “each of the additional values being outside a threshold range from a distribution fit of both the values and the additional values in a component of the neural network”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 13
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites the mental processes of claim 1 from which it depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “at least one of the references being encoded using three bits or four bits”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 14
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites the mental processes of claim 1 from which it depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “at least one of the one or more values of the neural network being an embedding”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 15
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites the mental processes of claim 1 from which it depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “at least one of the one or more values of the neural network being a weight”, which is a field of use limitation (see MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55 and footnote 32). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 16
Step 1:  The claim recites a system; thus, it is directed to the statutory category of a manufacture.
Step 2A Prong 1:  The claim recites, inter alia:
[a]ccumulate one or more values of the neural network for each of one or more references to an identical representative value to generate an output for each accumulation: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of accumulating values to generate an output, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
[m]ultiply the output with the identical representative value respective to the output to generate a final output:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical process of multiplying an output to generate a final output.
[a]ccumulate one or more of the final outputs with one or more additional values if present: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of accumulating outputs with additional values, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
determined using a probability density function defining a probability that each additional value belongs to a distribution:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical process of determining additional values or outliers using a probability density function disclosed as math in paragraph [0079] of the originally filed specification.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of “one or more accumulators” and “shared multiplier-accumulator”.  The additional elements of “one or more accumulators” and “shared multiplier-accumulator”, in view of the 112(f) interpretation of claim 16 above, are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  Thus the additional elements do not provide any meaningful limits on the execution of the abstract idea. Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional elements of “one or more accumulators” and “shared multiplier-accumulator”, in view of the 112(f) interpretation of claim 16 above, are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  Nothing in the claim provides significantly more than that abstract idea.  As such, the claim is ineligible.

Claim 18
Step 1:  The claim recites a system; thus, it is directed to the statutory category of a manufacture.
Step 2A Prong 1:  The claim recites, inter alia:
determining if one or more additional values of the neural network exist: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of determining whether additional values such as outliers exist in a neural network, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
determined using a probability density function defining a probability that each additional value belongs to a distribution:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical process of determining additional values or outliers using a probability density function disclosed as math in paragraph [0079] of the originally filed specification.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of “a memory; at least one processor in communication with the computer memory”, “storing, in the memory, one or more values of the neural network each as a reference to a representative value”, and “if the one or more additional values exist, storing, in the memory, the one or more additional values of the neural network”.  The additional elements of “a memory; at least one processor in communication with the computer memory” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional elements of “storing, in the memory, one or more values of the neural network each as a reference to a representative value” and “if the one or more additional values exist, storing, in the memory, the one or more additional values of the neural network” are insignificant extra-solution activities that do not integrate the judicial exception into a practical application (see MPEP § 2106.05(g)).  Thus the additional elements do not provide any meaningful limits on the execution of the abstract idea. Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional elements of “a memory; at least one processor in communication with the computer memory” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).   The additional elements of “storing, in the memory, one or more values of the neural network each as a reference to a representative value” and “if the one or more additional values exist, storing, in the memory, the one or more additional values of the neural network” are insignificant extra-solution activities that do not amount to an inventive concept (see MPEP §2106.05(g); “storing data”; and MPEP § 2106.05(d); “Storing and retrieving information in memory” are well‐understood, routine, and conventional functions when they are claimed in a merely generic manner).  Nothing in the claim provides significantly more than that abstract idea.  As such, the claim is ineligible.


Claim 19
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites, inter alia:
[a]ssigning each of the one or more values of the neural network to a cluster: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of assigning one or more values to a cluster, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
[s]electing a selected value from the cluster as the representative value for each of the one or more values of the neural network of the cluster:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of selecting a value as a representative value, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of, as disclosed in independent claim 18 from which claim 19 depends, “a memory; at least one processor in communication with the computer memory, the memory comprising instructions which, when executed by the at least one processor, carries out the steps of”, “storing, in the memory, one or more values of the neural network each as a reference to a representative value; and storing, in the memory, one or more additional values of the neural network”. The additional elements of “a memory; at least one processor in communication with the computer memory, the memory comprising instructions which, when executed by the at least one processor, carries out the steps of” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional elements of “storing, in the memory, one or more values of the neural network each as a reference to a representative value; and storing, in the memory, one or more additional values of the neural network” are elements that the courts have recognized as well-understood, routine activities or as insignificant extra-solution activities that do not integrate a judicial exception into a practical application (see MPEP § 2106.05(d)(II)(iv); “Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93”).  Thus the additional elements do not provide any meaningful limits on the execution of the abstract idea. Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional elements of “a memory; at least one processor in communication with the computer memory, the memory comprising instructions which, when executed by the at least one processor, carries out the steps of” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)). The additional elements of “storing, in the memory, one or more values of the neural network each as a reference to a representative value; and storing, in the memory, one or more additional values of the neural network” are elements that the courts have recognized as well-understood, routine activities or as insignificant extra-solution activities that do not provide significantly more than the judicial exception (see MPEP § 2106.05(d)(II)(iv); “Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93”).  Nothing in the claim provides significantly more than that abstract idea.  As such, the claim is ineligible.

Claim 20
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “minimizing a sum of each distance between a) each of the one or more values of the neural network and b) the selected value of the cluster that the value is assigned to by iteratively: performing the assigning, the assigning further comprising reassigning a first value of the one or more values of the neural network from an original cluster of the clusters to a new cluster of the clusters where an original distance of the first value to the selected value of the original cluster is greater than a new distance of the first value to the selected value of the new cluster; and subsequently performing the selecting on at least the original cluster and the new cluster; wherein the first value is a different value of the one or more values of the neural network upon each iteration”. Under their broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of minimizing a sum of each distance, performing assigning and reassigning of values, and selecting based on clusters, each of which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the first value is a different value of the one or more values of the neural network upon each iteration” (which is a field of use limitation under MPEP § 2106.05(h); MPEP 2106.04(d); 

Claim 21
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “generating an output from performing one or more multiply-accumulate operations $A_1B_1 + \cdots + A_nB_n$ on input vectors A and input vectors B, wherein n is the n-th input vector and wherein one or more of input vectors B are each one of the representative values, by accumulating input vectors A to an accumulated sum of input vectors A per input vector B having the same representative value and subsequently multiplying each of the accumulated sums of input vectors A by the representative value of the input vector B.” Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concepts of generating an output from performing multiply-accumulate operations.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
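
For illustration, the accumulate-then-multiply ordering recited in the limitation quoted above can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the inputs are treated as scalars rather than vectors, and the function names `mac_direct` and `mac_grouped` and the sample data are hypothetical.

```python
from collections import defaultdict


def mac_direct(a, b):
    """Conventional multiply-accumulate: A1*B1 + ... + An*Bn."""
    return sum(x * y for x, y in zip(a, b))


def mac_grouped(a, b):
    """Accumulate the A inputs per distinct B value, then multiply once per value."""
    sums = defaultdict(float)
    for x, y in zip(a, b):
        # Add each A input to the running sum kept for its representative value.
        sums[y] += x
    # One multiplication per distinct representative value.
    return sum(rep * acc for rep, acc in sums.items())


# Hypothetical data: four inputs, but only two distinct representative values in B.
a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, 0.25, 0.5, 0.25]

direct_result = mac_direct(a, b)
grouped_result = mac_grouped(a, b)
```

Both orderings produce the same sum by distributivity; the grouped form performs one multiplication per distinct representative value rather than one per input pair.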

Claim 23
Step 1:  The claim recites a non-transient computer-readable medium; therefore, it is directed to the statutory category of a manufacture.
Step 2A Prong 1:  The claim recites, inter alia:
[a]ssigning each of the one or more values of the neural network to a cluster: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of assigning one or more values to a cluster, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
[s]electing a selected value from the cluster as the representative value for each of the one or more values of the neural network of the cluster:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of selecting a value as a representative value, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application.  Specifically, the additional elements consist of, as disclosed in independent claim 22 from which claim 23 depends, “computer-readable instructions which, when executed by a computer processor, perform a method of”, “storing, in the memory, one or more values of the neural network each as a reference to a representative value; and storing, in the memory, one or more additional values of the neural network”. The additional elements of “computer-readable instructions which, when executed by a computer processor, perform a method of” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)).  The additional elements of “storing, in the memory, one or more values of the neural network each as a reference to a representative value; and storing, in the memory, one or more additional values of the neural network” are elements that the courts have recognized as well-understood, routine activities or as insignificant extra-solution activities that do not integrate a judicial exception into a practical application (see MPEP § 2106.05(d)(II)(iv); “Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93”).  Thus the additional elements do not provide any meaningful limits on the execution of the abstract idea. Even when viewed in combination, these additional elements do not integrate the abstract idea into a practical application and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional element of “computer-readable instructions which, when executed by a computer processor, perform a method of” are generic computer components recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)). The additional elements of “storing, in the memory, one or more values of the neural network each as a reference to a representative value; and storing, in the memory, one or more additional values of the neural network” are elements that the courts have recognized as well‐understood, routine activities or as insignificant extra-solution activities that do not provide significantly more than the judicial exception (see MPEP § 2106.05(d)(II)(iv); “Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93”).  Nothing 

Claim 25
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “minimizing a sum of each distance between a) each of the one or more values of the neural network and b) the selected value of the cluster that the value is assigned to by iteratively: performing the assigning, the assigning further comprising reassigning a first value of the one or more values of the neural network from an original cluster of the clusters to a new cluster of the clusters where an original distance of the first value to the selected value of the original cluster is greater than a new distance of the first value to the selected value of the new cluster; and subsequently performing the selecting on at least the original cluster and the new cluster; wherein the first value is a different value of the one or more values of the neural network upon each iteration”. Under their broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of minimizing a sum of each distance, performing assigning and reassigning of values, and selecting based on clusters, each of which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the first value is a different value of the one or more values of the neural network upon each iteration” (which is a field of use limitation under MPEP § 2106.05(h); MPEP 2106.04(d); 2019 Guidance, 84 FR 50 at 55.  See, 2019 Guidance, 84 FR 50, footnote 32. 

Claim 26
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “generating an output from performing one or more multiply-accumulate operations $A_1B_1 + \cdots + A_nB_n$ on input vectors A and input vectors B, wherein n is the n-th input vector and wherein one or more of input vectors B are each one of the representative values, by accumulating input vectors A to an accumulated sum of input vectors A per input vector B having the same representative value and subsequently multiplying each of the accumulated sums of input vectors A by the representative value of the input vector B.” Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concepts of generating an output from performing multiply-accumulate operations.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 27
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites the mathematical concepts of claim 26, from which it depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “storing, in the memory, one or more additional values of the neural network wherein one or more of the input vectors B are each one of the additional values”, which is an element that the courts have recognized as a well‐understood, routine activity or as an insignificant extra-solution activity that does not provide significantly more than the judicial exception (see MPEP § 2106.05(d)(II)(iv); “Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93”). This additional element does not integrate the abstract idea into a practical application nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-6, 10-13, 15, 18, 19, and 22-24 are rejected under 35 U.S.C. 103 as being obvious over Han et al. (Han et al., “DEEP COMPRESSION: COMPRESSING DEEP NEURAL NETWORKS WITH PRUNING, TRAINED QUANTIZATION AND HUFFMAN CODING”, Feb. 15, 2016, ICLR 2016, pp. 1-14, hereinafter “Han”) in view of Park et al. (Park et al., “Energy-efficient Neural Network Accelerator Based on Outlier-aware Low-precision Computation”, July 23, 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), pp. 688-698, hereinafter “Park”) and Banner et al. (Banner et al., “ACIQ: Analytical Clipping for Integer Quantization of neural networks”, Sep. 27, 2018, ICLR 2019 Conference Blind Submission, pp. 1-11, hereinafter “Banner”).

Regarding claim 1, Han discloses [a] computer-implemented method for memory storage, comprising: storing a neural network in memory by: (Abstract; “we introduce “deep compression”, a three stage pipeline: pruning, trained quantization and Huffman coding, that work together to reduce the storage requirement of neural networks by 35× to 49× without affecting their accuracy”, which discloses a method for improving memory storage through neural network quantization and deep compression.  Note that the experiment of Han is inherently performed on a computer; and Page 2, ¶4; “Our goal is to reduce the storage and energy required to run inference on such large networks so they can be deployed on mobile devices. To achieve this goal, we present “deep compression”: a three-stage pipeline (Figure 1) to reduce the storage required by neural network in a manner that preserves the original accuracy”, which further discloses the method for memory storage)
storing, in the memory, one or more values of the neural network each as a reference to a representative value; and (Page 2, ¶5; “Our main insight is that, pruning and trained quantization are able to compress the network without interfering each other, thus lead to surprisingly high compression rate. It makes the required storage so small (a few megabytes) that all weights can be cached on chip instead of going to off-chip DRAM which is energy consuming”, which, under a broadest reasonable interpretation of the claim language, discloses storing in a memory (in a cache on a chip) one or more values (weights) of a neural network as a reference to a representative value, the reference being a quantized weight; and Page 2, ¶4; “Next, the weights are quantized so that multiple connections share the same weight, thus only the codebook (effective weights) and the indices need to be stored”, which discloses that the one or more values are stored as a reference to a representative value in that the weights are quantized and indexed, and the index is a “reference” to a representative value or quantized value of the weights of the neural network; and Abstract; the abstract makes it clear that we are considering weights of a neural network when quantizing them; and Page 5, §3.3; “An index into the shared weight table is stored for each connection”).
Han fails to explicitly disclose but Park discloses determining if one or more additional values of the neural network exist (Abstract; “In this study, we propose a hardware accelerator, called the outlier-aware accelerator (OLAccel). It performs dense and low-precision computations for a majority of data (weights and activations) while efficiently handling a small number of sparse and high-precision outliers (e.g., amounting to 3% of total data)”, which discloses determining one or more additional values of the neural network, which, as disclosed in the originally filed specification and in Park, are outliers in a distribution of neural network weights; and Page 688, Column 2; “Our proposed accelerator is based on a novel quantization method called outlier-aware quantization [11], which divides the distribution of data (weights or activations) into two regions, of low and high precision. It applies reduced precision, e.g., 4-bit representation, to the low-precision region that contains a majority of the data. The high precision region contains only a small portion (e.g., 3%) of the total data, and maintains the original, high precision, e.g., 16-bit representation”, which further discloses determining the existence of additional values of the neural network in the form of outliers, where the outliers are not quantized like the majority of weights in a distribution of weights in a neural network, as disclosed in the specification)
if the one or more additional values exist, storing, in the memory the one or more additional values of the neural network (Page 691, Column 1; “Note that the outlier activations are stored only in the swarm buffer while outlier weights can be stored in the swarm buffer and the cluster/group weight buffers”, which discloses, under a broadest reasonable interpretation of the claim language, upon determining if the additional values or outliers exist, storing these outlier values in memory such as a buffer; and Page 681, Column 2; “As shown in the box of the cluster weight buffer in the figure, the weights are stored at a granularity of 80-bit weight chunks (entries in the table). The weight chunk consists of 16 4-bit weights (= 4b×16), an 8-bit pointer (OLptr), a 4-bit pointer (OLidx), and the most significant four bits of an outlier weight (OLMSB)”, the storing of the additional values or outlier weights being accomplished using a cluster weight buffer; and Figure 5; “cluster weight buffer”).
Han and Park are analogous art because both are concerned with neural network quantization.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural network quantization to combine the consideration of additional values or outliers as disclosed by Park with the method of Han to yield the predictable result of storing, in the memory, one or more additional values of the neural network. The motivation for doing so would be to implement outlier-aware quantization, which provides a majority of data with fine-grained quantization while maintaining the precision of important outliers (Park; Conclusion).
Han fails to explicitly disclose but Banner discloses wherein the one or more additional values are determined using a probability density function defining a probability that each additional value belongs to a distribution (Page 4, Figure 1; the figure discloses wherein each of the additional values or outlier weights of the neural network are a component or weight of the network and have a probability less than a threshold value (thus being determined using a probability density function), the threshold value being the alpha value in the figure which is used to determine the outliers of the weight distribution for clipping; and Page 4, §4; “Let X be a high precision random variable with a probability density function f(x). Without loss of generality, we assume a prepossessing step has been made so that the average value in the tensor zero i.e., X = µ = 0”; and Page 5, ¶2; “we consider only smooth probability density functions (e.g., Gaussian or Laplace)”, which discloses the probability density function that is a Gaussian distribution.  Note that the equation of the claim is a Gaussian distribution with a mean or mu value of zero).
Han, Park, and Banner are analogous art because all are concerned with neural network quantization.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural network quantization to combine the probability density function as taught by Banner with the method of Han and Park to yield the predictable result of wherein the one or more additional values are determined using a probability density function defining a probability that each additional value belongs to a distribution. The motivation for doing so would be to establish the optimal clipping values under either Gaussian or Laplace distributions for weights of a neural network (Banner; Page 4, §4).


Regarding claim 18, it is a system claim corresponding to the steps of claim 1, and is rejected for the same reasons as claim 1.
	
Regarding claim 22, Han discloses [a] non-transient computer-readable medium containing computer-readable instructions which, when executed by a computer processor, perform a method of: (Abstract; “we introduce “deep compression”, a three stage pipeline: pruning, trained quantization and Huffman coding, that work together to reduce the storage requirement of neural networks by 35× to 49× without affecting their accuracy”, which discloses a method for improving memory storage through neural network quantization and deep compression.  Note that the experiment of Han is inherently performed on a computer with a processor and a non-transient computer-readable medium containing computer-readable instructions; and Page 2, ¶4; “Our goal is to reduce the storage and energy required to run inference on such large networks so they can be deployed on mobile devices. To achieve this goal, we present “deep compression”: a three-stage pipeline (Figure 1) to reduce the storage required by neural network in a manner that preserves the original accuracy”, which further discloses the method for memory storage)
storing, in the memory, one or more values of the neural network each as a reference to a representative value; and (Page 2, ¶5; “Our main insight is that, pruning and trained quantization are able to compress the network without interfering each other, thus lead to surprisingly high compression rate. It makes the required storage so small (a few megabytes) that all weights can be cached on chip instead of going to off-chip DRAM which is energy consuming”, which, under a broadest reasonable interpretation of the claim language, discloses storing in a memory (in a cache on a chip) one or more values (weights) of a neural network as a reference to a representative value, the reference being a quantized weight; and Page 2, ¶4; “Next, the weights are quantized so that multiple connections share the same weight, thus only the codebook (effective weights) and the indices need to be stored”, which discloses that the one or more values are stored as a reference to a representative value in that the weights are quantized and indexed, and the index is a “reference” to a representative value or quantized value of the weights of the neural network; and Abstract; the abstract makes it clear that we are considering weights of a neural network when quantizing them)
storing, in the memory, one or more additional values of the neural network (Page 2, ¶5; “Our main insight is that, pruning and trained quantization are able to compress the network without interfering each other, thus lead to surprisingly high compression rate. It makes the required storage so small (a few megabytes) that all weights can be cached on chip instead of going to off-chip DRAM which is energy consuming”, which, under a broadest reasonable interpretation of the claim language, discloses storing in a memory (in a cache on a chip) one or more values (weights) of a neural network; and Abstract; the abstract makes it clear that we are considering weights of a neural network when quantizing them).
Han fails to explicitly disclose the one or more additional values of the neural network. 
Park discloses one or more additional values of the neural network (Abstract; “In this study, we propose a hardware accelerator, called the outlier-aware accelerator (OLAccel). It performs dense and low-precision computations for a majority of data (weights and activations) while efficiently handling a small number of sparse and high precision outliers (e.g., amounting to 3% of total data)”, which discloses the one or more additional values of the neural network, which as disclosed in the originally filed specification and in Park, are outliers in a distribution of neural network weights; and Page 688, Column 2; “Our proposed accelerator is based on a novel quantization method called outlier-aware quantization [11], which divides the distribution of data (weights or activations) into two regions, of low and high precision. It applies reduced precision, e.g., 4-bit representation, to the low-precision region that contains a majority of the data. The high precision region contains only a small portion (e.g., 3%) of the total data, and maintains the original, high precision, e.g., 16-bit representation”, which further discloses the additional values of the neural network in the form of outliers, where the outliers are not quantized like the majority of weights in a distribution of weights in a neural network, as disclosed in the specification).
The motivation to combine Han and Park is the same as discussed above with respect to claim 1.

Regarding claim 2, the rejection of claim 1 is incorporated and Han further discloses the reference being the representative value (Page 2, ¶4; “Next, the weights are quantized so that multiple connections share the same weight, thus only the codebook (effective weights) and the indices need to be stored”, which discloses that the reference or index is the representative value or quantized weight that is stored in memory).

Regarding claim 3, the rejection of claim 1 is incorporated and Han further discloses each of the references being generated by quantizing one of the one or more values of the neural network (Page 2, ¶4; “Next, the weights are quantized so that multiple connections share the same weight, thus only the codebook (effective weights) and the indices need to be stored”, which discloses that the references are generated by quantizing the one or more values or weights of the neural network; and Page 3, §3, ¶2; “The weights are quantized to 4 bins (denoted with 4 colors), all the weights in the same bin share the same value, thus for each weight, we then need to store only a small index into a table of shared weights”).

Regarding claim 4, the rejection of claim 1 is incorporated and Han further discloses storing, in the memory, a reconstruction table storing each of the representative values for each of the references (Page 2, ¶4; “Next, the weights are quantized so that multiple connections share the same weight, thus only the codebook (effective weights) and the indices need to be stored”, which discloses, under a broadest reasonable interpretation of the claim language, storing a reconstruction table or codebook that contains quantized weights or representative values for each of the references or indices.  This claim is interpreted in view of the 112b rejection above where it is assumed that we are only talking about one reference to one representative value; and Page 3, §3, ¶2; “The weights are quantized to 4 bins (denoted with 4 colors), all the weights in the same bin share the same value, thus for each weight, we then need to store only a small index into a table of shared weights”, the table of shared weights being the reconstruction table).
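
For illustration, the relationship between the stored indices (the references) and the table of shared weights (the reconstruction table) discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Han: the four-entry codebook, the sample weights, and the function names `quantize_to_indices` and `reconstruct` are hypothetical, and the nearest-entry rule is assumed for simplicity.

```python
def quantize_to_indices(weights, codebook):
    """Store each weight as the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda i: abs(w - codebook[i]))
        for w in weights
    ]


def reconstruct(indices, codebook):
    """Look each stored index up in the table of shared weights."""
    return [codebook[i] for i in indices]


# Hypothetical table of four shared weights, so each index fits in 2 bits.
codebook = [-1.0, 0.0, 1.5, 2.0]
weights = [2.09, -0.98, 1.48, 0.09, 0.05, -0.14, -1.08, 2.12]

indices = quantize_to_indices(weights, codebook)
restored = reconstruct(indices, codebook)
```

Only `indices` and `codebook` would need to be kept in memory in this sketch; `restored` shows the representative value that each index refers to.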

Regarding claims 5, 19, and 23, the rejections of claims 1, 18, and 22 are incorporated and Han further discloses assigning each of the one or more values of the neural network to a cluster; and (§ 1, 3.1-3.2 and 6.2; the sections disclose assigning or initializing one or more values of the neural network to a cluster)
for each cluster, selecting a selected value from the cluster as the representative value for each of the one or more values of the neural network of the cluster (§ 1, 3.1-3.2 and 6.2; the sections disclose selecting a selected value as the representative value or “shared weight” for each of the one or more values or weights of the neural network or cluster through an initialization of shared weights for each cluster).

Regarding claims 6 and 24, the rejections of claims 1, 5, and 22 are incorporated and Han further discloses each selected value being a centroid, the centroid being an average of the one or more values of the neural network in the cluster that the selected value is selected from (§ 1, 3.1-3.2, 6.2, and Figure 3; the sections, and particularly Figure 3, disclose that the selected value or weight is a centroid, the centroid being an average, specifically a weighted average, of the one or more values or weights of the neural network in the cluster or cluster index that the selected value is selected from.  Note that Figure 3 discloses that the centroids are a weighted average of the one or more values of the neural network in the cluster.  Further note that claim 24 is interpreted as if it depended from claim 23 as discussed in the 112b rejection above).
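
For illustration, assigning values to clusters and selecting each cluster's centroid as the average of its members, as discussed above for claims 5 and 6, can be sketched as follows. This is a minimal sketch under stated assumptions: it uses the standard batch form of k-means on scalar values, which reassigns all values in each pass and is not necessarily the procedure of Han or of the claims; the function name `kmeans_1d`, the initial centroids, and the sample data are hypothetical.

```python
def kmeans_1d(values, centroids, max_iters=100):
    """Batch k-means on scalars: assign to nearest centroid, then re-average."""
    centroids = list(centroids)
    assignment = None
    for _ in range(max_iters):
        # Assign each value to the cluster whose centroid is closest.
        new_assignment = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        if new_assignment == assignment:
            break  # stable: no value changed cluster
        assignment = new_assignment
        # Select each cluster's representative value as the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, a in zip(values, assignment) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return assignment, centroids


# Hypothetical weights forming two groups; the initial centroids are deliberately poor,
# so 0.2 and 0.4 start in cluster 1 and are reassigned to cluster 0 on the second pass.
values = [0.0, 0.2, 0.4, 3.0, 3.2, 3.4]
assignment, centroids = kmeans_1d(values, centroids=[0.0, 0.3])
```

In this example a value moves to a new cluster only when its distance to the new centroid is smaller than its distance to the original centroid, and the centroids of the affected clusters are then recomputed.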

Regarding claim 10, the rejection of claim 1 is incorporated but Han fails to explicitly disclose each of the additional values satisfying a criterion of a distribution, content, or value count of a component of the neural network.
Banner discloses each of the additional values satisfying a criterion of a distribution, content, or value count of a component of the neural network (Page 4, Figure 1; the figure discloses that each of the additional values or outliers satisfies a criterion of a distribution, specifically a Gaussian distribution, in that the outliers are determined as being an alpha distance away from the mean of the Gaussian distribution, and thus are said to satisfy a criterion of a distribution).


Regarding claim 11, the rejection of claim 1 is incorporated but Han fails to explicitly disclose each of the additional values being in a component of the neural network and having a probability pdf less than a threshold value, the probability pdf defined by $pdf(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$, wherein x is the additional value, μ is a mean of one or more parameters in the component of the neural network having x, and σ is a standard deviation of one or more parameters in the component of the neural network having x.
Banner discloses each of the additional values being in a component of the neural network and having a probability pdf less than a threshold value, the probability pdf defined by $pdf(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$, wherein x is the additional value, μ is a mean of one or more parameters in the component of the neural network having x, and σ is a standard deviation of one or more parameters in the component of the neural network having x (Page 4, Figure 1; the figure discloses wherein each of the additional values or outlier weights of the neural network are a component or weight of the network and have a probability less than a threshold value, the threshold value being the alpha value in the figure which is used to determine the outliers of the weight distribution for clipping; and Page 4, §4; “Let X be a high precision random variable with a probability density function f(x). Without loss of generality, we assume a prepossessing step has been made so that the average value in the tensor zero i.e., X = µ = 0”, which discloses that the mean or mu of Figure 1’s Gaussian distribution is zero; and Page 5, ¶2; “we consider only smooth probability density functions (e.g., Gaussian or Laplace)”, which discloses the probability density function that is a Gaussian distribution.  Note that the equation of the claim is a Gaussian distribution with a mean or mu value of zero; and Page 6, §4.2; the section discloses the equation of the claim under equation 11, where the value of “mu” is zero).
The motivation to combine Han, Park, and Banner is the same as discussed above with respect to claim 10.
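
For illustration, identifying additional values as those whose Gaussian probability density falls below a threshold, as discussed above for claim 11, can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Banner or from the specification: the normalized form of the Gaussian density is assumed, the mean and standard deviation are estimated from the sample itself, and the function names, the threshold of 0.05, and the sample weights are hypothetical.

```python
import math
import statistics


def gaussian_pdf(x, mu, sigma):
    """Normalized Gaussian density evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def split_outliers(weights, threshold):
    """Separate weights whose density under the fitted Gaussian is below threshold."""
    mu = statistics.mean(weights)        # mean of the component's parameters
    sigma = statistics.pstdev(weights)   # population standard deviation
    regular, outliers = [], []
    for w in weights:
        if gaussian_pdf(w, mu, sigma) < threshold:
            outliers.append(w)   # low-probability value: kept as an additional value
        else:
            regular.append(w)    # high-probability value: candidate for quantization
    return regular, outliers


# Hypothetical layer weights: six values near zero and one far from the rest.
weights = [-0.2, -0.1, 0.0, 0.0, 0.1, 0.2, 3.0]
regular, outliers = split_outliers(weights, threshold=0.05)
```

With these sample values only the weight 3.0 falls below the assumed threshold; the remaining six weights would be the ones stored as references to representative values.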

Regarding claim 12, the rejection of claim 1 is incorporated but Han fails to explicitly disclose each of the additional values being outside a threshold range from a distribution fit of both the values and the additional values in a component of the neural network.
Banner discloses each of the additional values being outside a threshold range from a distribution fit of both the values and the additional values in a component of the neural network (Page 4, Figure 1; the figure discloses that each of the additional values or outlier weights of the neural network is outside of a threshold range from a distribution fit (Gaussian distribution) of both the values (the weights that will be quantized) and the additional values (outlier weights) in a component of the neural network, the component being a weight in a layer of the neural network).
The motivation to combine Han, Park, and Banner is the same as discussed above with respect to claim 10.


Regarding claim 13, the rejection of claim 1 is incorporated but Han fails to explicitly disclose at least one of the references being encoded using three bits or four bits.
Park discloses at least one of the references being encoded using three bits or four bits (Page 688, Column 2; “It applies reduced precision, e.g., 4-bit representation, to the low-precision region that contains a majority of the data”, which discloses that one of the references or quantized weights is encoded or quantized to a four bit representation).
The motivation to combine Han and Park is the same as discussed above with respect to claim 1.

Regarding claim 15, the rejection of claim 1 is incorporated and Han further discloses at least one of the one or more values of the neural network being a weight (Page 2, ¶4; “Next, the weights are quantized so that multiple connections share the same weight, thus only the codebook (effective weights) and the indices need to be stored”, which discloses that the one or more values of the neural network is a weight that is ultimately quantized; and Abstract; “Next, we quantize the weights to enforce weight sharing, finally, we apply Huffman coding”).

Claims 7, 20, 25 are rejected under 35 U.S.C. 103 as being obvious over Han in view of Park and Banner and further in view of Teknomo (Teknomo, “K-Means Clustering Tutorial”, Jul. 2007, pp. 1-12, hereinafter “Teknomo”).

Regarding claims 7, 20, and 25, the rejection of claims 1, 5, 18, 19, 22, and 24 are incorporated and Han further discloses the analysis of neural networks (Abstract; the abstract and the paper in general discloses the analysis of neural network quantization).
Han fails to explicitly disclose minimizing a sum of each distance between a) each of the one or more values of the neural network and b) the selected value of the cluster that the value is assigned to by iteratively: performing the assigning, the assigning further comprising reassigning a first value of the one or more values of the neural network from an original cluster of the clusters to a new cluster of the clusters where an original distance of the first value to the selected value of the original cluster is greater than a new distance of the first value to the selected value of the new cluster; and subsequently performing the selecting on at least the original cluster and the new cluster; wherein the first value is a different value of the one or more values of the neural network upon each iteration.
Teknomo discloses minimizing a sum of each distance between a) each of the one or more values of the neural network and b) the selected value of the cluster that the value is assigned to by iteratively: (Page 1, ¶1; “Simply speaking it is an algorithm to classify or to group your objects based on attributes/features into K number of group. K is positive integer number. The grouping is done by minimizing the sum of squares of distances between data and the corresponding cluster centroid”; and Page 11; “There are a lot of applications of the K-mean clustering, range from unsupervised learning of neural network”; and Page 1 at the bottom; “iterate until stable”)
performing the assigning, the assigning further comprising reassigning a first value of the one or more values of the neural network from an original cluster of the clusters to a new cluster of the clusters where an original distance of the first value to the selected value of the original cluster is greater than a new distance of the first value to the selected value of the new cluster; and (Page 1 at the bottom; “Iterate until stable (= no object move group): 1. Determine the centroid coordinate 2. Determine the distance of each object to the centroids 3. Group the object based on minimum distance (find the closest centroid)”; and Page 2, Figure; the figure discloses the assigning and reassigning through each iteration of the k means clustering algorithm, and the assigning is based on distances that are new distances that are smaller than an original distance)
subsequently performing the selecting on at least the original cluster and the new cluster; (Page 2, Figure; the figure discloses the selecting on at least the original cluster and the new cluster)
wherein the first value is a different value of the one or more values of the neural network upon each iteration (Page 2, Figure; the figure discloses that the values are different upon each iteration as the values are based on a distance to a centroid; and Page 3-5; steps 1-9 show that the values are different for the objects in each iteration of the k-means algorithm).
Han, Park, Banner, and Teknomo are analogous art because all are concerned with artificial intelligence algorithms such as neural networks.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural networks and artificial intelligence to combine the k-means clustering algorithm of Teknomo with the method of Han and Park and Banner to yield the predictable result of minimizing a sum of each distance between a) each of the one or more values of the neural network and b) the selected value of the cluster that the value is assigned to by iteratively: performing the assigning, the assigning further comprising reassigning a first value of the one or more values of the neural network from an original cluster of the clusters to a new cluster of the clusters where an original distance of the first value to the selected value of the original cluster is greater than a new distance of the first value to the selected value of the new cluster; and subsequently performing the selecting on at least the original cluster and the new cluster; wherein the first value is a different value of the one or more values of the neural network upon each iteration. The motivation for doing so would be to classify or to group your objects based on attributes/features into K numbers of groups (Teknomo; Page 1).
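For illustration only, the iteration quoted from Teknomo (determine the centroids, determine the distances, group by minimum distance, iterate until no object moves group) can be sketched for scalar values as below. This is a minimal sketch under stated assumptions, not code from Teknomo or Han: the function name `kmeans_1d`, the sample values, and the initial centroids are hypothetical.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Plain k-means on scalars: assign to the nearest centroid, then re-select centroids.

    Hypothetical helper; stops when no value changes cluster ("iterate until stable").
    """
    centroids = list(centroids)
    assignment = None
    for _ in range(max_iter):
        # Assign (or reassign) each value to the cluster with the closest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            for v in values
        ]
        if new_assignment == assignment:
            break  # stable: no object moved group
        assignment = new_assignment
        # Re-select the representative value (the mean) of each non-empty cluster.
        for j in range(len(centroids)):
            members = [v for v, a in zip(values, assignment) if a == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, assignment


# Assumed sample data: two obvious groups and deliberately poor initial centroids.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, assignment = kmeans_1d(values, [0.0, 2.0])
```

With these assumed inputs, the value 1.2 is first assigned to the second cluster and is reassigned to the first cluster on the next pass, because its distance to the updated first centroid is smaller than its distance to the updated second centroid.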

Claims 8, 9, 21, 26, and 27 are rejected under 35 U.S.C. 103 as being obvious over Han in view of Park and Banner and further in view of Garland et al. (Garland et al., “Low Complexity Multiply Accumulate Unit for Weight-Sharing Convolutional Neural Networks”, Jan. 22, 2017, IEEE COMPUTER ARCHITECTURE LETTERS, VOL. 16, NO. 2, pp. 132-135, hereinafter “Garland”).

Regarding claims 8, 21, and 26, the rejection of claims 1, 18, and 22 are incorporated but Han fails to explicitly disclose generating an output from performing one or more multiply-accumulate operations A1B1 + … + AnBn on input vectors A and input vectors B, wherein n is the n-th input vector and wherein one or more of input vectors B are each one of the representative values, by accumulating input vectors A to an accumulated sum of input vectors A per input vector B having the same representative value and subsequently multiplying each of the accumulated sums of input vectors A by the representative value of the input vector B.
Garland discloses generating an output from performing one or more multiply-accumulate operations A1B1 + … + AnBn on input vectors A and input vectors B, wherein n is the n-th input vector and wherein one or more of input vectors B are each one of the representative values, by accumulating input vectors A to an accumulated sum of input vectors A per input vector B having the same representative value and subsequently multiplying each of the accumulated sums of input vectors A by the representative value of the input vector B (Page 132, §3; “We propose to reduce the area and power consumption of the MACs by re-architecting the MAC to do the accumulation first followed by a shared post-pass multiplication. Rather than computing the Sum Of Products (SOP) in the MAC directly, we instead count how many times each of the b weight indexes appears and store the corresponding image value in a register bin. For example, if the shared weight with index 2 had the value 19 and were multiplied and accumulated with the image value 25, then a weight sharing MAC would compute 19 × 25 = 475 and add this value to the accumulator. Instead we keep b separate accumulators, one for each weight value. If we encounter the shared weight with index 2, value 19 and image value 25, then rather than performing any multiplication, we instead add 25 to accumulator number 2 in the local b-entry register file. Storing this result in a register file that is local the MAC unit reduces unnecessary data movement . . . The system computes the dot product by computing the total of how many of each of the weights appear in the sum. This turns the multiply-accumulate step into an array-index and-add operation” (emphasis added), which discloses, under a broadest reasonable interpretation of the claim language, accumulating vectors or weights in an accumulated sum of input vectors or weights and then multiplying the accumulated sums by a representative value of the input vector or weight.
The paper further discloses the dot product operation as claimed, which is the A1B1 -operation.  Note that this is in view of the 112b indefiniteness rejection above where “n” is interpreted to be an index of a vector or weight; and Figure 3).
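For illustration only, the accumulate-first, multiply-afterwards ordering quoted from Garland can be sketched as below and compared against a direct sum of products. This is a minimal sketch, not Garland's circuit: the function names `direct_mac` and `accumulate_then_multiply` and the sample codebook, inputs, and indices are assumptions, apart from the shared weight value 19 at index 2 and the image value 25, which are taken from the quoted passage.

```python
def direct_mac(inputs, indices, codebook):
    """Conventional sum of products: one multiplication per input."""
    return sum(x * codebook[i] for x, i in zip(inputs, indices))


def accumulate_then_multiply(inputs, indices, codebook):
    """Accumulate inputs per shared-weight index, then multiply once per weight."""
    bins = [0] * len(codebook)  # one accumulator per representative value
    for x, i in zip(inputs, indices):
        bins[i] += x  # array-index-and-add; no multiplication here
    # Shared post-pass multiplication: one multiply per representative value.
    return sum(b * w for b, w in zip(bins, codebook))


# Assumed sample data; index 2 holds the shared weight 19 from the quoted example.
codebook = [3, 7, 19]
inputs = [25, 4, 10, 6]
indices = [2, 0, 2, 1]

result_direct = direct_mac(inputs, indices, codebook)
result_binned = accumulate_then_multiply(inputs, indices, codebook)  # both are 719
```

The two orderings give the same result because multiplication distributes over addition; the second form performs one multiplication per representative value rather than one per input.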
Han, Park, Banner, and Garland are analogous art because all are concerned with neural network computing.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural networks to combine the accumulate then multiply algorithm of Garland with the method of Han and Park and Banner to yield 

Regarding claims 9 and 27, the rejection of claims 1, 8, 22, and 26 are incorporated and Han further discloses storing, in the memory, one or more additional values of the neural network wherein one or more of the input vectors B are each one of the additional values (Page 2, ¶5; “Our main insight is that, pruning and trained quantization are able to compress the network without interfering each other, thus lead to surprisingly high compression rate. It makes the required storage so small (a few megabytes) that all weights can be cached on chip instead of going to off-chip DRAM which is energy consuming”, which, under a broadest reasonable interpretation of the claim language, discloses storing in a memory (in a cache on a chip) one or more values (weights) of a neural network, wherein one of the input vectors B is one of the values or weights; and Abstract; the abstract makes it clear that we are considering weights of a neural network when quantizing them).
Han fails to explicitly disclose the one or more additional values of the neural network. 
Park discloses one or more additional values of the neural network (Abstract; “In this study, we propose a hardware accelerator, called the outlier-aware accelerator (OLAccel). It performs dense and low-precision computations for a majority of data (weights and activations) while efficiently handling a small number of sparse and high precision outliers (e.g., amounting to 3% of total data)”, which discloses the one or more additional values of the neural network, which as disclosed in the originally filed specification and in Park, are outliers in a distribution of neural network weights; and Page 688, Column 2; “Our proposed accelerator is based on a novel quantization method called outlier-aware quantization [11], which divides the distribution of data (weights or activations) into two regions, of low and high precision. It applies reduced precision, e.g., 4-bit representation, to the low-precision region that contains a majority of the data. The high precision region contains only a small portion (e.g., 3%) of the total data, and maintains the original, high precision, e.g., 16-bit representation”, which further discloses the additional values of the neural network in the form of outliers, where the outliers are not quantized like the majority of weights in a distribution of weights in a neural network, as disclosed in the specification).
The motivation to combine Han and Park is the same as discussed above with respect to claim 1.
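For illustration only, the two-region scheme quoted from Park (a 4-bit representation for the majority of the data, with a small number of outliers kept at their original precision) can be sketched as below. This is a minimal sketch under stated assumptions and not Park's method: the function names, the fixed `threshold`, the uniform quantization grid, and the sample values are all hypothetical.

```python
def outlier_aware_quantize(values, threshold, bits=4):
    """Quantize values within [-threshold, threshold] to `bits`-bit codes.

    Values outside that range are stored separately at full precision.
    Hypothetical helper using an assumed uniform grid.
    """
    step = 2 * threshold / (2 ** bits - 1)
    codes, outliers = [], {}
    for idx, v in enumerate(values):
        if abs(v) > threshold:
            outliers[idx] = v   # high-precision region: keep the original value
            codes.append(None)  # no low-precision code for this position
        else:
            codes.append(round((v + threshold) / step))  # integer in 0..2**bits - 1
    return codes, outliers, step


def dequantize(codes, outliers, step, threshold):
    """Rebuild values from low-precision codes plus the stored outliers."""
    return [
        outliers[i] if c is None else c * step - threshold
        for i, c in enumerate(codes)
    ]


# Assumed sample data: four small values and one outlier at position 3.
values = [0.1, -0.4, 0.25, 3.0, -0.05]
codes, outliers, step = outlier_aware_quantize(values, threshold=0.5)
restored = dequantize(codes, outliers, step, threshold=0.5)
```

In this sketch the outlier is recovered exactly, while each remaining value is recovered to within half a quantization step.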



Claim 14 is rejected under 35 U.S.C. 103 as being obvious over Han in view of Park and Banner and further in view of Shen et al., (Shen et al., “Q-BERT: Hessian 

Regarding claim 14, the rejection of claim 1 is incorporated but Han fails to explicitly disclose at least one of the one or more values of the neural network being an embedding.
Shen discloses at least one of the one or more values of the neural network being an embedding (Abstract; “We can achieve comparable performance to baseline with at most 2.3% performance degradation, even with ultra-low precision quantization down to 2 bits, corresponding up to 13× compression of the model parameters, and up to 4× compression of the embedding table as well as activations”, which discloses that one or more values of the neural network being an embedding contained in an embedding table; and Page 8816, Column 1; “As will be discussed in Sec. 4.1, we find that the embedding layer is much more sensitive to quantization than the encoder layers”, further disclosing that one of the values is an embedding contained in an embedding layer; and Page 8817, Column 2; “Assume that the input sequence has n words and each word has a d-dim embedding vector”).
Han, Park, Banner, and Shen are analogous art because all are concerned with neural network quantization.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural network quantization to combine the embedding value of Shen with the method of Han and Park and Banner to yield the predictable result of at least one of the one or more values of the neural network being 

Claim 16 is rejected under 35 U.S.C. 103 as being obvious over Garland in view of Park and Banner.

Regarding claim 16, Garland discloses [a] system for computation of layers in a neural network, comprising: (Abstract; “One approach to reducing data sizes and memory traffic in CNN accelerators is “weight sharing”, where the full range of values in a trained CNN are put in bins and the bin index is stored instead of the original weight value. In this paper we propose a novel MAC circuit that exploits binning in weight-sharing CNNs. Rather than computing the MAC directly we instead count the frequency of each weight and place it in a bin. We then compute the accumulated value in a subsequent multiply phase”, which discloses the system for a computation of layers in a neural network by weight sharing)
one or more accumulators each configured to accumulate one or more values of the neural network for each of one or more references to an identical representative value to generate an output for each accumulation; and (Page 132, §3; “We propose to reduce the area and power consumption of the MACs by re-architecting the MAC to do the accumulation first followed by a shared post-pass multiplication. Rather than computing the Sum Of Products (SOP) in the MAC directly, we instead count how many times each of the b weight indexes appears and store the corresponding image value in a register bin. For example, if the shared weight with index 2 had the value 19 and were multiplied and accumulated with the image value 25, then a weight sharing MAC would compute 19 × 25 = 475 and add this value to the accumulator. Instead we keep b separate accumulators, one for each weight value. If we encounter the shared weight with index 2, value 19 and image value 25, then rather than performing any multiplication, we instead add 25 to accumulator number 2 in the local b-entry register file. Storing this result in a register file that is local the MAC unit reduces unnecessary data movement . . . The system computes the dot product by computing the total of how many of each of the weights appear in the sum. This turns the multiply-accumulate step into an array-index and-add operation” (emphasis added), which discloses, under a broadest reasonable interpretation of the claim language, accumulating one or more values of the neural network for each of one or more references to an identical representative value to generate an output for each accumulation; and Figure 3; the figure discloses the accumulation operation and processing elements to perform the accumulation, which are generic computing elements as discussed in the 112f interpretation above).
a shared multiplier-accumulator configured to, for each of the outputs from each processing element, multiply the output with the identical representative value respective to the output to generate a final output (Page 132, §3; “We propose to reduce the area and power consumption of the MACs by re-architecting the MAC to do the accumulation first followed by a shared post-pass multiplication. Rather than computing the Sum Of Products (SOP) in the MAC directly, we instead count how many times each of the b weight indexes appears and store the corresponding image value in a register bin. For example, if the shared weight with index 2 had the value 19 and were multiplied and accumulated with the image value 25, then a weight sharing MAC would compute 19 × 25 = 475 and add this value to the accumulator. Instead we keep b separate accumulators, one for each weight value. If we encounter the shared weight with index 2, value 19 and image value 25, then rather than performing any multiplication, we instead add 25 to accumulator number 2 in the local b-entry register file. Storing this result in a register file that is local the MAC unit reduces unnecessary data movement . . . The system computes the dot product by computing the total of how many of each of the weights appear in the sum. This turns the multiply-accumulate step into an array-index and-add operation” (emphasis added), which discloses, under a broadest reasonable interpretation of the claim language, for each of the outputs from each processing element, multiply the output with the identical representative value respective to the output to generate a final output; and Figure 3; the figure discloses the multiplication operation and shared processing elements to perform the multiplication, which are generic computing elements as discussed in the 112f interpretation above; and Abstract; “One approach to reducing data sizes and memory traffic in CNN accelerators is “weight sharing”, where the full range of values in a trained CNN are put in bins and the bin index is stored instead of the original weight value. In this paper we propose a novel MAC circuit that exploits binning in weight-sharing CNNs. Rather than computing the MAC directly we instead count the frequency of each weight and place it in a bin. We then compute the accumulated value in a subsequent multiply phase”)
Garland fails to explicitly disclose but Park discloses wherein the shared multiplier-accumulator is further configured to accumulate one or more of the final outputs with one or more additional values if present (Page 690, Figure 4; the figure discloses accumulating one or more final outputs (normal PE groups) with one or more additional values or outliers (outlier PE group)).
Garland and Park are analogous art because both are concerned with neural networks.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural networks to combine the accumulation of outputs with additional values as disclosed by Park with the system of Garland to yield the predictable result of wherein the shared multiplier-accumulator is further configured to accumulate one or more of the final outputs with one or more additional values if present. The motivation for doing so would be to implement outlier-aware quantization, which provides a majority of data with fine-grained quantization while maintaining the precision of important outliers (Park; Conclusion).
Garland fails to explicitly disclose but Banner discloses the one or more additional values being determined using a probability density function defining a probability that each additional value belongs to a distribution (Page 4, Figure 1; the figure discloses wherein each of the additional values or outlier weights of the neural network are a component or weight of the network and have a probability less than a threshold value (thus being determined using a probability density function), the threshold value being the alpha value in the figure which is used to determine the outliers of the weight distribution for clipping; and Page 4, §4; “Let X be a high precision random variable with a probability density function f(x). Without loss of generality, we assume a prepossessing step has been made so that the average value in the tensor zero i.e., X = µ = 0”; and Page 5, ¶2; “we consider only smooth probability density functions (e.g., Gaussian or Laplace)”, which discloses the probability density function that is a Gaussian distribution.  Note that the equation of the claim is a Gaussian distribution with a mean or mu value of zero).
Garland, Park, and Banner are analogous art because all are concerned with neural network computations.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural networks to combine the probability density function as taught by Banner with the system of Garland and Park to yield the predictable result of the one or more additional values being determined using a probability density function defining a probability that each additional value belongs to a distribution. The motivation for doing so would be to establish the optimal clipping values under either Gaussian or Laplace distributions for weights of a neural network (Banner; Page 4, §4).

Response to Arguments

Applicant’s arguments, filed on 1/27/2022, and amendments, filed on 2/10/2022, with respect to the 35 USC § 112(f) interpretation of claim 16 have been fully considered and are not persuasive.  

Applicant argues on page 10, first paragraph, of the Remarks, filed on 1/27/2022, that “Claim 16 has also been amended to no longer be interpreted under 35 U.S.C. 112(f) by replacing "processing elements" with "accumulators" and "shared processing unit" with "shared multiplier-accumulator"”.  Examiner respectfully disagrees.  Applicant has not provided any explanation as to why the terms “accumulator” and “multiplier-

Applicant’s arguments, filed on 1/27/2022, and amendments, filed on 2/10/2022, with respect to the 35 USC § 103 rejection of claims 1-6, 13, 15, 18, 19, and 22-24 and the 35 USC § 102(a)(1) rejection of claim 16 have been fully considered and are not persuasive.  

Beginning on page 11, second full paragraph of the remarks, filed on 1/27/2022, Applicant argues, inter alia, that the art of record, specifically Park “does not provide advantages such as those demonstrated experimentally by the present Application”, and Applicant cites to various paragraphs of the originally filed specification to prove the alleged advantages of the present invention.  Examiner respectfully disagrees with Applicant’s prior art analysis.

In response to applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., maintaining model accuracy without reducing compression) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

Applicant has not cited to or argued against any specific teaching of any identified prior art for any specific claim language of the present application.  Applicant merely makes a generalization that the advantages or motivations for the experiments in the prior art are not the same as the advantages of the techniques contained in the present invention.  This argument is not persuasive because the Applicant has not alleged any improvement that is reflected in the specific claim language of the present application.  Applicant has also failed to provide any specific evidence or arguments as to why the art of record does not teach any aspect of the claim language of the present invention.  

For these reasons, Applicant’s arguments against the prior art rejections are not persuasive, and the prior art rejections STAND.

Applicant’s arguments, filed on 1/27/2022, with respect to the 35 USC § 101 rejection of the claims have been fully considered and are not persuasive.  

Beginning on page 12, second paragraph of the remarks, filed on 10/7/2021, Applicant argues that “the subject-matter of these claims as amended are integrated into a practical application and provide significantly more than an abstract idea such as by providing improved computer functionality, improved processing speed, and 

First, a conclusory statement about improving the processing of neural networks does not guarantee eligibility (see MPEP §2106.05(g)).  Second, Applicant has not identified any particular additional element from any particular claim beyond an identified abstract idea/judicial exception that either integrates the abstract ideas into a practical application or provides significantly more than an abstract idea. Rather, Applicant has made conclusory statements that the subject matter of the claims provides “improvements to computational accuracy, reduction in computational processing time, and reduction in computational resources” without any reference to specific elements in the claim language that reflect these improvements.

Applicant further asserts on page 12, third paragraph of the remarks, filed on 1/27/2022, that “determining the values and the additional values cannot be performed accurately and efficiently without the particular computer-implemented method or system claimed. The described advantages, such as in reducing computational processing time and computational resources, are necessarily particular to a computer-implemented method or system, as such improvements are on a scale inconceivable to be carried out abstractly without computer implementation”.  Again, Applicant has failed to identify any specific claim language or evidence from the originally filed specification that demonstrates or proves these assertions.



Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Brent Hoover whose telephone number is (303)297-4403. The examiner can normally be reached Monday - Friday 9-5 MST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached on 571-270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/BRENT JOHNSTON HOOVER/Examiner, Art Unit 2127                                                                                                                                                                                                        

1 Note that the Specification appears to provide sufficient structural support for the “one or more accumulators each configured to accumulate” and “a shared multiplier-accumulator configured to . . . multiply” in at least Figure 6A and paragraphs [0127] and [0129] of the originally filed specification, and all of the components appear to be generic processing elements that contain adders and register files.