DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is responsive to the original application filed on 10/8/2019. Acknowledgement is made with respect to a claim of priority to Provisional Application No. 62/880,475 filed on 7/30/2019.

Claim Objections

Claim 8 is objected to because of the following informalities: Claim 8 recites the limitation “wherein the floating-point network is a recurrent neural network and all time steps use a same set of scaling and shift values” which should read as “wherein the floating-point neural network is a recurrent neural network and all time steps use a same set of scaling and shift values” (emphasis added) for better clarity.  Appropriate correction is required.

Claim 9 is objected to because of the following informalities: Claim 9 recites the limitation “wherein quantizing the floating point network produces a neural network using values quantized to a particular range for execution by a neural network inference circuit, the method further comprising generating a set of program instructions for executing the quantized neural network on the neural network inference circuit” which should read as “wherein quantizing the floating point neural network produces a neural network using values quantized to a particular range for execution by a neural network inference circuit, the method further comprising generating a set of program instructions for executing the quantized neural network on the neural network inference circuit” (emphasis added) for better clarity.  Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 1 recites the limitations “for each layer of a set of the layers: determining a distribution of values for a set of input value sets of the layer; based on the determined distribution, selecting a set of scaling and shift values for application to input values of the layer” (emphasis added). It is unclear whether the selected set of scaling and shift values is applied to the set of input value sets or to some different input values of the layer.  Please explain.  For examination purposes, the limitation will be interpreted to read “for each layer of a set of the layers: determining a distribution of values for input values of the layer; based on the determined distribution, selecting a set of scaling and shift values for application to the input values of the layer”.
For the above reasons, claim 1 is rejected as being indefinite. This rejection applies equally to independent claim 14, as well as to dependent claims 2-13 and 15-20. Appropriate correction is required.
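
For illustration only, the interpretation adopted above for claim 1 can be sketched in Python. The sketch assumes that the "distribution" is summarized by the minimum and maximum of the observed input values and that the target range is unsigned 8-bit. The function names, the sample values, and the min/max rule are hypothetical and are not taken from the claims or the specification.

```python
def select_scale_and_shift(values, num_bits=8):
    """Choose (scale, shift) so the observed range maps onto [0, 2**num_bits - 1].

    Assumption: the "distribution" is summarized by its minimum and maximum.
    """
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    span = hi - lo
    scale = levels / span if span > 0 else 1.0
    shift = -lo * scale
    return scale, shift


def quantize(values, scale, shift, num_bits=8):
    """Apply the affine map, round to an integer, and clamp to the target range."""
    levels = 2 ** num_bits - 1
    return [min(levels, max(0, round(v * scale + shift))) for v in values]


# Hypothetical input values observed for one layer.
layer_inputs = [-1.0, -0.5, 0.0, 0.5, 1.0]
scale, shift = select_scale_and_shift(layer_inputs)
quantized = quantize(layer_inputs, scale, shift)
```

Under this reading, the same input values of the layer are used both to determine the distribution and to receive the selected scaling and shift values.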

Claim 2 recites the limitation “wherein the set of constraints is based on the type of computation that the layer performs” (emphasis added). There is insufficient antecedent basis for this element in the claim.  For examination purposes, the limitation will be interpreted to read “wherein the set of constraints is based on a type of computation that the layer performs” (emphasis added) for better clarity.  This rejection applies equally to dependent claims 3-13 and 15.  Appropriate correction is required.

Claim 3 recites the limitation “wherein a particular layer performs an element-wise addition and the set of constraints comprises a constraint that the inputs must be scaled by a same scaling value” (emphasis added). It is unclear whether this element in claim 3 refers to the “set of input value sets” or the “input values of the layer” of claim 1.  Please explain.  For examination purposes, the limitation will be interpreted to read “wherein a particular layer performs an element-wise addition and the set of constraints comprises a constraint that the input values of the layer must be scaled by a same scaling value” (emphasis added) for better clarity.  Appropriate correction is required.

Claim 4 recites the limitation “wherein a particular layer performs an element-wise multiplication and the set of constraints comprises a constraint that the shift value must be zero” (emphasis added). It is unclear as to whether the shift value must be zero for the particular layer or each layer in the neural network.  Please explain.  For examination purposes, the limitation will be interpreted to read “wherein a particular layer performs an element-wise multiplication and the set of constraints comprises a constraint that the shift value of the particular layer/for each layer must be zero” (emphasis added) for better clarity.  Appropriate correction is required.
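
As an arithmetic illustration of the constraints as interpreted above for claims 3 and 4, the following Python sketch shows why a shared scaling value keeps an element-wise sum recoverable and why a zero shift keeps an element-wise product recoverable. The helper name and the numeric values are hypothetical, and the rationale shown is one plausible reading rather than anything stated in the claims.

```python
def scale_shift(x, scale, shift=0.0):
    """Apply a scaling value and a shift value to a single input value."""
    return x * scale + shift


a, b = 1.5, -0.25

# Element-wise addition: both inputs share the scaling value s.
# (s*a + shift_a) + (s*b + shift_b) = s*(a + b) + (shift_a + shift_b)
s = 4.0
shift_a, shift_b = 2.0, 3.0
scaled_sum = scale_shift(a, s, shift_a) + scale_shift(b, s, shift_b)
recovered_sum = (scaled_sum - (shift_a + shift_b)) / s

# Element-wise multiplication: the shift is zero, so the scales factor out.
# (s_a*a) * (s_b*b) = (s_a*s_b) * (a*b)
s_a, s_b = 4.0, 8.0
scaled_product = scale_shift(a, s_a) * scale_shift(b, s_b)
recovered_product = scaled_product / (s_a * s_b)

# A nonzero shift on a product leaves cross terms that no single
# scale can undo: (s_a*a + t) * (s_b*b) = s_a*s_b*a*b + t*s_b*b
shifted_product = scale_shift(a, s_a, 1.0) * scale_shift(b, s_b)
```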

Claim 6 recites the limitation “wherein the selected set of scaling and shift values are a first set of scaling and shift values applied to inputs to the layers,” (emphasis added). It is unclear as to whether the inputs in this claim are any of the same inputs as disclosed in claim 1.  Please explain.  For examination purposes, the limitation will be interpreted to read “wherein the selected set of scaling and shift values are a first set of scaling and shift values applied to the input values of the layer” (emphasis added) for better clarity.  Claim 6 further recites “selecting a second set of scaling and shift values to apply to the output of the layers” (emphasis added).  There is insufficient antecedent basis for this element in the claim.  For examination purposes, the limitation will be interpreted to read “selecting a second set of scaling and shift values to apply to an output of the layers” (emphasis added) for better clarity. This rejection applies equally to dependent claims 7 and 16. Appropriate correction is required.

Claim 7 recites “wherein the second set of scaling and shift values is selected to cancel the effect of the first set of scaling and shift values on the output of the layers” (emphasis added).  There is insufficient antecedent basis for this element in the claim.  For examination purposes, the limitation will be interpreted to read “wherein the second set of scaling and shift values is selected to cancel an effect of the first set of scaling and shift values on the output of the layers” (emphasis added) for better clarity. This rejection applies equally to dependent claim 17. Appropriate correction is required.
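
For illustration only, one way a second set of scaling and shift values could cancel the effect of a first set is sketched below in Python for a single linear layer. The layer form (weight times input plus bias), the function names, and all numeric values are assumptions made for the sketch and are not drawn from the claims or the specification.

```python
def linear_layer(x, weight, bias):
    """Hypothetical layer: a single multiply-add."""
    return weight * x + bias


def apply_scale_shift(x, scale, shift):
    return x * scale + shift


weight, bias = 3.0, 1.0

# First set: applied to the input of the layer.
first_scale, first_shift = 2.0, 4.0

# The layer output on the scaled input is
#   weight*first_scale*x + weight*first_shift + bias.
# The second set is chosen so that applying it to that output
# returns weight*x + bias, the output on the unscaled input.
second_scale = 1.0 / first_scale
second_shift = bias - (weight * first_shift + bias) / first_scale

x = 0.75
scaled_input = apply_scale_shift(x, first_scale, first_shift)
raw_output = linear_layer(scaled_input, weight, bias)
restored_output = apply_scale_shift(raw_output, second_scale, second_shift)
reference_output = linear_layer(x, weight, bias)
```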

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.  The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).

When considering subject matter eligibility under 35 U.S.C. 101, it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter (Step 1). If the claim does fall within one of the statutory categories, the second step in the analysis is to determine whether the claim is directed to a judicial exception (Step 2A). The Step 2A analysis is broken into two prongs. In the first prong (Step 2A, Prong 1), it is determined whether or not the claims recite a judicial exception (e.g., mathematical concepts, mental processes, certain methods of organizing human activity). If it is determined in Step 2A, Prong 1 that the claims recite a judicial exception, the analysis proceeds to the second prong (Step 2A, Prong 2), where it is determined whether or not the claims integrate the judicial exception into a practical application. If it is determined at Step 2A, Prong 2 that the claims do not integrate the judicial exception into a practical application, the analysis proceeds to determining whether the claim is a patent-eligible application of the exception (Step 2B). If an abstract idea is present in the claim, any element or combination of elements in the claim must be sufficient to ensure that the claim integrates the judicial exception into a practical application, or else amounts to significantly more than the abstract idea itself.

Claim 1
Step 1:  The claim recites a method; therefore, it is directed to the statutory category of a process.
Step 2A Prong 1:  The claim recites, inter alia:
for each layer of a set of the layers: determining a distribution of values for a set of input value sets of the layer: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of determining a distribution of values for a set of input value sets for a layer of a neural network, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
based on the determined distribution, selecting a set of scaling and shift values for application to input values of the layer:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of selecting a set of scaling and shift values for application to input values, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
quantizing the floating-point neural network using the selected set of scaling and shift values: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of quantizing a neural network using values, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper or a mathematical concept in the form of a mathematical calculation as evidenced by equations (5) and (21) of the originally filed specification.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application. Specifically, the additional element consists of “receiving a floating-point neural network definition comprising a plurality of layers”. The additional element of “receiving a floating-point neural network definition comprising a plurality of layers” is insignificant extra-solution activity that does not amount to an inventive concept (see MPEP § 2106.05(g); “mere data gathering”). Thus, the claim does not recite any additional elements that integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional element of “receiving a floating-point neural network definition comprising a plurality of layers” is insignificant extra-solution activity that does not amount to an inventive concept (see MPEP § 2106.05(g); “mere data gathering”), and is well-understood, routine, conventional activity that does not provide significantly more than an abstract idea (see MPEP § 2106.05(d); “Receiving or transmitting data over a network”). This claim does not recite any additional elements that provide significantly more than the above identified abstract ideas.  As such, the claim is ineligible.

Claim 2
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “determining a set of constraints on the scaling and shift values for each layer, wherein the set of constraints is based on the type of computation that the layer performs, and wherein the set of scaling and shift values for the layer are selected based on the set of constraints for the layer”. Under its broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of determining constraints on values, which are observations or evaluations that are practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 3
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “wherein a particular layer performs an element-wise addition”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concept of performing addition.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 4
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “wherein a particular layer performs an element-wise multiplication”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concept of performing multiplication.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 5
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “wherein a particular layer performs a concatenation”. Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mathematical concept of performing concatenation.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 6
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “based on the selected first set of scaling and shift values, selecting a second set of scaling and shift values to apply to the output of the layers, wherein quantizing the floating-point network is based on the first and second selected set of scaling and shift values”. Under its broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of selecting a second set of shift and scaling values, which are observations or evaluations that are practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 7
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “wherein the second set of scaling and shift values is selected to cancel the effect of the first set of scaling and shift values on the output of the layers”. Under its broadest reasonable interpretation in light of the specification, this limitation just further describes the underlying mental process of selecting from which the claim depends.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 8
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “wherein the floating-point network is a recurrent neural network and all time steps use a same set of scaling and shift values”. This limitation merely places restrictions on the type of data used in the analysis and the technological environment in which the judicial exception is performed, and does not negate the mental nature of the underlying process.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the floating-point network is a recurrent neural network and all time steps use a same set of scaling and shift values”, which is a field of use limitation under MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55.  See 2019 Guidance, 84 FR 50, footnote 32. Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 9
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “wherein quantizing the floating point network produces a neural network using values quantized to a particular range for execution by a neural network inference circuit”. Under its broadest reasonable interpretation in light of the specification, this limitation just further describes the underlying mathematical concept of quantizing from which the claim depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “for execution by a neural network inference circuit”, which is a field of use limitation under MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55.  See 2019 Guidance, 84 FR 50, footnote 32. The additional element of “generating a set of program instructions for executing the quantized neural network on the neural network inference circuit” is a generic computer component recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 10
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “wherein the quantized values are one of 8-bit and 4-bit values”. Under its broadest reasonable interpretation in light of the specification, this limitation just further describes the underlying mental processes and mathematical concepts from which the claim depends.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 11
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites the underlying mental processes and mathematical concept from which it depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the floating point values used by the neural network are stored using a variable position of a binary point used to represent the floating point value”, which is insignificant extra-solution activity that does not amount to an inventive concept (see MPEP § 2106.05(g)), and is well-understood, routine, conventional activity that does not provide significantly more than an abstract idea (see MPEP § 2106.05(d); “Storing and retrieving information in memory”). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 12
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “wherein the values quantized to a particular range use a fixed binary point position”. Under its broadest reasonable interpretation in light of the specification, this limitation just further describes the underlying mental processes and mathematical concepts from which the claim depends.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 13
Step 1:  A method, as above.
Step 2A Prong 1:  The claim recites “constraining the set of weights to a set of ternary values comprising -1, 0, and 1 and constraining the set of output values of each layer to a set of quantized values using a number of bits less than used by a floating point value and using a fixed binary point position”. Under its broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of constraining weights to ternary values and constraining output values to quantized values, which are observations or evaluations that are practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the floating-point neural network comprises a set of weights associated with each layer, wherein the set of weights and a set of output values of each layer are floating-point values”, which is a field of use limitation under MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55.  See 2019 Guidance, 84 FR 50, footnote 32. Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.
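
For illustration only, the two constraining operations recited in claim 13 can be sketched in Python. The claim does not say how a weight is mapped to -1, 0, or 1, so the magnitude threshold used here is an assumption, as are the function names, the unsigned 4-bit width, the two fractional bits, and the sample values.

```python
def ternarize(weights, threshold):
    """Map each floating-point weight to -1, 0, or 1.

    Assumption: weights with magnitude below `threshold` become 0.
    """
    return [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]


def quantize_fixed_point(values, num_bits=4, frac_bits=2):
    """Round each value to an unsigned fixed-point grid and clamp to num_bits.

    With frac_bits=2 the binary point is fixed two places from the right,
    so the representable step is 0.25.
    """
    step = 2.0 ** -frac_bits
    max_code = 2 ** num_bits - 1
    codes = [min(max_code, max(0, round(v / step))) for v in values]
    return [code * step for code in codes]


# Hypothetical floating-point weights and layer outputs.
float_weights = [0.8, -0.05, -0.6, 0.02]
float_outputs = [0.3, 1.26, 5.0]

ternary_weights = ternarize(float_weights, threshold=0.1)
fixed_outputs = quantize_fixed_point(float_outputs)
```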

Claim 14
Step 1:  The claim recites a non-transitory machine readable medium; therefore, it is directed to the statutory category of a manufacture.
Step 2A Prong 1:  The claim recites, inter alia:
for each layer of a set of the layers: determining a distribution of values for a set of input value sets of the layer: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of determining a distribution of values for a set of input value sets for a layer of a neural network, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
based on the determined distribution, selecting a set of scaling and shift values for application to input values of the layer:  Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of selecting a set of scaling and shift values for application to input values, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper.
quantizing the floating-point neural network using the selected set of scaling and shift values: Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of quantizing a neural network using values, which is an evaluation or observation that is practically capable of being performed in the human mind with the assistance of pen and paper or a mathematical concept in the form of a mathematical calculation as evidenced by equations (5) and (21) of the originally filed specification.
Step 2A Prong 2:  This judicial exception is not integrated into a practical application. Specifically, the additional elements consist of “storing a program for execution by a set of processing units, the program for transforming a neural network that uses floating point values into a neural network that uses values quantized to a particular range, the program comprising sets of instructions for” and “receiving a floating-point neural network definition comprising a plurality of layers”. The additional element of “storing a program for execution by a set of processing units, the program for transforming a neural network that uses floating point values into a neural network that uses values quantized to a particular range, the program comprising sets of instructions for” is recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)). The additional element of “receiving a floating-point neural network definition comprising a plurality of layers” is insignificant extra-solution activity that does not amount to an inventive concept (see MPEP § 2106.05(g); “mere data gathering”). Thus, the claim does not recite any additional elements that integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea.
Step 2B:  The claim does not contain significantly more than the judicial exception.  The additional element of “storing a program for execution by a set of processing units, the program for transforming a neural network that uses floating point values into a neural network that uses values quantized to a particular range, the program comprising sets of instructions for” is recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)). The additional element of “receiving a floating-point neural network definition comprising a plurality of layers” is insignificant extra-solution activity that does not amount to an inventive concept (see MPEP § 2106.05(g); “mere data gathering”), and is well-understood, routine, conventional activity that does not provide significantly more than an abstract idea (see MPEP § 2106.05(d); “Receiving or transmitting data over a network”). This claim does not recite any additional elements that provide significantly more than the above identified abstract ideas.  As such, the claim is ineligible.

Claim 15
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “determining a set of constraints on the scaling and shift values for each layer, wherein the set of constraints is based on the type of computation that the layer performs, and wherein the set of scaling and shift values for the layer are selected based on the set of constraints for the layer”. Under its broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of determining constraints on values, which are observations or evaluations that are practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 16
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “based on the selected first set of scaling and shift values, selecting a second set of scaling and shift values to apply to the output of the layers, wherein quantizing the floating-point network is based on the first and second selected set of scaling and shift values”. Under its broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of selecting a second set of shift and scaling values, which are observations or evaluations that are practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 17
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “wherein the second set of scaling and shift values is selected to cancel the effect of the first set of scaling and shift values on the output of the layers”. Under its broadest reasonable interpretation in light of the specification, this limitation just further describes the underlying mental process of selecting from which the claim depends.
Step 2A Prong 2, Step 2B:  This claim does not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 18
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “wherein the floating-point network is a recurrent neural network and all time steps use a same set of scaling and shift values”. This limitation merely places restrictions on the type of data used in the analysis and the technological environment in which the judicial exception is performed, and does not negate the mental nature of the underlying process.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the floating-point network is a recurrent neural network and all time steps use a same set of scaling and shift values”, which is a field of use limitation under MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55.  See 2019 Guidance, 84 FR 50, footnote 32. Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 19
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “wherein quantizing the floating point network produces a neural network using values quantized to a particular range for execution by a neural network inference circuit”. Under its broadest reasonable interpretation in light of the specification, this limitation just further describes the underlying mathematical concept of quantizing from which the claim depends.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “for execution by a neural network inference circuit”, which is a field of use limitation under MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55.  See 2019 Guidance, 84 FR 50, footnote 32. The additional element of “generating a set of program instructions for executing the quantized neural network on the neural network inference circuit” is a generic computer component recited in a manner that represents no more than mere instructions to apply the judicial exception on a computer (see MPEP § 2106.05(f)). Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.

Claim 20
Step 1:  A manufacture, as above.
Step 2A Prong 1:  The claim recites “constraining the set of weights to a set of ternary values comprising -1, 0, and 1 and constraining the set of output values of each layer to a set of quantized values using a number of bits less than used by a floating point value and using a fixed binary point position”. Under its broadest reasonable interpretation in light of the specification, these limitations encompass the mental processes of constraining weights to ternary values and constraining output values to quantized values, which are observations or evaluations that are practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2, Step 2B:  This claim recites the additional element of “wherein the floating-point neural network comprises a set of weights associated with each layer, wherein the set of weights and a set of output values of each layer are floating-point values”, which is a field of use limitation under MPEP § 2106.05(h); MPEP § 2106.04(d); 2019 Guidance, 84 FR 50 at 55.  See 2019 Guidance, 84 FR 50, footnote 32. Nothing in the claim integrates the abstract idea into a practical application, nor does it provide significantly more than the abstract idea, and thus the claim is subject-matter ineligible.


Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 9-12, 14, and 19 are rejected under 35 U.S.C. § 103 as being obvious over Baum et al. (US 20180285736 A1, hereinafter “Baum”) in view of Lee et al. (US 20190042948 A1, hereinafter “Lee”).

Regarding claim 1, Baum discloses [a] method for transforming a neural network [[that uses floating point values]] into a neural network that uses values quantized to a particular range, the method comprising: ([0002]; “a system and method of data driven quantization optimization of weights and input data in an artificial neural network (ANN)”; and [0017]; and [0098]; “Now, assume it is desired to quantize the weights such that the output value range is narrower”; and [0114])
for each layer of a set of the layers: determining a distribution of values for a set of input value sets of the layer; ([0105]; “A flow diagram illustrating a first example quantization scheme is shown in FIG. 11. Based on the input data distribution, the scale factor γ and shift or bias β parameters are calculated” (emphasis added); and [0102]; “Statistics are gathered using a set of counters that count the level of activity observed at the neurons at each layer and which can be performed gradually as per resource availability, i.e. a larger network will take more time for the data collection”, which discloses that the statistics used for the input distribution values are observed on a per-layer basis)
based on the determined distribution, selecting a set of scaling and shift values for application to input values of the layer; and ([0105]; “A flow diagram illustrating a first example quantization scheme is shown in FIG. 11. Based on the input data distribution, the scale factor γ and shift or bias β parameters are calculated . . . The scale factor indicates the custom range selected and the bias indicates the shift” (emphasis added); and Figure 11)
quantizing the floating-point neural network using the selected set of scaling and shift values ([0106]; “Note that the quantization can be applied linearly or nonlinearly (step 192). The scale and shift parameters are then applied to the current weight quantization (step 194). The scale and shift parameters are applied to the current input data quantization (step 196). Note that the new quantization scheme can be applied to the weight, input data, or both”; and Figure 11, Elements 194 and 196).
Baum fails to explicitly disclose but Lee discloses a neural network that uses floating point values ([0005]; “analyzing a statistical distribution for each channel of floating-point parameter values of feature maps and a kernel for each channel from data of a pre-trained floating-point neural network”; and [0080]; “the initial neural network has floating-point parameters, for example, parameters of 32-bit floating-point precision”; and [0098]; “As can be seen from the comparison, although 3 bits are allotted to both the fixed-point expression 610 of Q2.0 and the fixed-point expression 620 of Q1.1, Q2.0 is able to express a wider range of fixed-point values than Q1”; and [0107])
 receiving a floating-point neural network definition comprising a plurality of layers; (Figure 2; the figure discloses the initial floating point neural network with four layers; and [0068]; “In the example illustrated in FIG. 2, the neural network 2 is a DNN including an input layer Layer 1, two hidden layers Layer 2 and Layer 3, and an output layer Layer 4”; and Figure 7; the figure discloses the FPNN with defined layers; and [0100]).
Baum and Lee are analogous art because both are concerned with quantizing neural networks.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural network quantization to combine the floating point neural network with a layer definition as taught by Lee with the method of Baum to yield the predictable result of [a] method for transforming a neural network that uses floating point values into a neural network that uses values quantized to a particular range and receiving a floating-point neural network definition comprising a plurality of layers. The motivation for doing so would be to generate a fixed-point quantized neural network from a pre-trained floating point neural network (Lee; [0005]).
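
For illustration only, the claim 1 steps as mapped above (determine a per-layer input distribution, select scaling and shift values from it, then quantize) can be sketched as follows. This sketch is not taken from Baum or Lee. It assumes a simple min/max summary of the distribution and an unsigned integer output range, and the function names and the `num_bits` parameter are hypothetical.

```python
from typing import List, Tuple


def select_scale_shift(samples: List[float], num_bits: int = 8) -> Tuple[float, float]:
    """Select a scale and shift from the observed input values of one layer.

    Assumption: the distribution is summarized by its minimum and maximum,
    and the target range is the unsigned integers [0, 2**num_bits - 1].
    """
    lo, hi = min(samples), max(samples)
    levels = (1 << num_bits) - 1
    # Guard against a constant input, which would give a zero-width range.
    scale = (hi - lo) / levels if hi > lo else 1.0
    shift = lo
    return scale, shift


def quantize(x: float, scale: float, shift: float, num_bits: int = 8) -> int:
    """Apply the shift and scale, round, and clamp to the quantized range."""
    levels = (1 << num_bits) - 1
    q = round((x - shift) / scale)
    return max(0, min(levels, q))


def dequantize(q: int, scale: float, shift: float) -> float:
    """Map a quantized integer back to an approximate floating-point value."""
    return q * scale + shift


# Example: observed inputs of one layer, quantized to 4 bits (16 levels).
layer_inputs = [-1.0, 0.0, 3.0]
scale, shift = select_scale_shift(layer_inputs, num_bits=4)
codes = [quantize(x, scale, shift, num_bits=4) for x in layer_inputs]
```

In this sketch the shift moves the smallest observed value to code 0 and the scale stretches the observed range over the available levels; values outside the observed range are clamped.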

Regarding claim 14, it is a non-transitory machine readable medium claim corresponding to the steps of claim 1, and is rejected for the same reasons as claim 1.

Regarding claims 9 and 19, the rejections of claims 1 and 14 are incorporated and Baum further discloses wherein quantizing the [[floating point network]] produces a neural network using values quantized to a particular range ([0098]; [0114]) for execution by a neural network inference circuit, ([0118]; and Figure 18)
the method further comprising generating a set of program instructions for executing the quantized neural network on the neural network inference circuit ([0118]; and Figure 18).


Regarding claim 10, the rejections of claims 1 and 9 are incorporated and Baum further discloses wherein the quantized values are one of 8-bit and 4-bit values ([0096]; “The optimum quantization is to have the weights represented with 4-bits corresponding to each level out of the 16 possible levels, i.e. 1/16, 2/16, . . . , 16/16”).
Regarding claim 11, the rejection of claim 1 is incorporated and Baum fails to explicitly disclose but Lee discloses wherein the floating point values used by the neural network are stored using a variable position of a binary point used to represent the floating point value (Figure 5; the figure discloses, under a broadest reasonable interpretation of the claim language, storing FP values using a variable position of a binary point (520) used to represent the FP number (510)).
The motivation to combine Baum and Lee is the same as discussed above with respect to claim 1.

Regarding claim 12, the rejection of claim 1 is incorporated and Baum fails to explicitly disclose but Lee discloses wherein the values quantized to a particular range use a fixed binary point position (Figure 5, Element 520; and [0094]; “Furthermore, fixed-point values 520 are expressed by “Qm.n”, where m and n are natural numbers”).
The motivation to combine Baum and Lee is the same as discussed above with respect to claim 1.
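
For illustration only, the “Qm.n” fixed binary point notation quoted from Lee can be sketched as below. The sketch assumes one sign bit plus m integer bits and n fractional bits, which is consistent with Lee's statement that Q2.0 and Q1.1 each use 3 bits but is an assumption here rather than a quotation. The function names are hypothetical.

```python
def to_fixed(x: float, m: int, n: int) -> int:
    """Convert x to a signed Qm.n integer code (assumed: 1 sign bit, m integer bits, n fractional bits)."""
    raw = round(x * (1 << n))          # the binary point sits n bits from the right
    lo = -(1 << (m + n))               # most negative representable code
    hi = (1 << (m + n)) - 1            # most positive representable code
    return max(lo, min(hi, raw))       # saturate instead of wrapping


def from_fixed(raw: int, n: int) -> float:
    """Convert a Qm.n integer code back to its real value."""
    return raw / (1 << n)


# The same 3-bit budget split two ways:
wide = from_fixed(to_fixed(3.0, m=2, n=0), n=0)    # Q2.0 keeps 3.0
narrow = from_fixed(to_fixed(3.0, m=1, n=1), n=1)  # Q1.1 saturates at 1.5
```

Because m and n are fixed in advance, the binary point does not move from value to value, in contrast to a floating-point representation in which the exponent positions it per value.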


Claims 2 and 15 are rejected under 35 U.S.C. § 103 as being obvious over Baum in view of Lee and further in view of Bourges-Sevenier et al. (US 20190325314 A1, hereinafter “Bourges”).

Regarding claims 2 and 15, the rejections of claims 1 and 14 are incorporated and Baum further discloses the scaling and shift values for each layer ([0105]; “A flow diagram illustrating a first example quantization scheme is shown in FIG. 11. Based on the input data distribution, the scale factor γ and shift or bias β parameters are calculated . . . The scale factor indicates the custom range selected and the bias indicates the shift” (emphasis added); and Figure 11)
wherein the set of scaling and shift values for the layer are selected based on the set of constraints for the layer ([0105]; “The scale factor indicates the custom range selected and the bias indicates the shift”, the customization and selection is based on a user-supplied constraint for the layer).
Baum fails to explicitly disclose but Bourges discloses determining a set of constraints [[on the scaling and shift values]] for each layer, wherein the set of constraints is based on the type of computation that the layer performs ([0050-0051]; “the example quantizer 235 identifies constraints associated with execution of the model. (Block 418). In examples disclosed herein, the constraints may be user-supplied constraints (e.g., energy consumption preferences, preferences for executing on particular hardware, etc.). In some examples, the constraints represent hardware limitations (e.g., bit limitations, memory limitations, whether fast instructions of 8-bit values is enabled, vectorization, tiling, ALU vs. FPU performance, etc.). The example quantizer 235 identifies (e.g., selects) a layer of the model for processing. (Block 420) . . . The example quantizer 235 performs quantization of the layer based on the constraints. (Block 430)”, which discloses, under a broadest reasonable interpretation of the claim language, determining a set of constraints for each layer of the NN, wherein the constraints are based on the computation the layer performs).
Baum, Lee, and Bourges are analogous art because all are concerned with quantizing neural networks.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural network quantization to combine the layer wise constraints as taught by Bourges with the method of Baum and Lee to yield the predictable result of determining a set of constraints on the scaling and shift values for each layer, wherein the set of constraints is based on the type of computation that the layer performs, and wherein the set of scaling and shift values for the layer are selected based on the set of constraints for the layer. The motivation for doing so would be to identify constraints associated with the execution of the ML model (Bourges; [0050]).

Claim 3 is rejected under 35 U.S.C. § 103 as being obvious over Baum in view of Lee and further in view of Olmschenk et al. (US 20190340500 A1, hereinafter “Olmschenk”).

Regarding claim 3, the rejection of claim 1 is incorporated and Baum fails to explicitly disclose but Lee discloses wherein a particular layer performs an element-wise addition ([0007]; “The convolution operation may include a partial sum operation between a plurality of channels, the partial sum operation may include a plurality of multiply-accumulate (MAC) operations and an Add operation”; and [0009]).
The motivation to combine Baum and Lee is the same as discussed above with respect to claim 1.
Baum fails to explicitly disclose but Olmschenk discloses the set of constraints comprises a constraint that the inputs must be scaled by a same scaling value ([0035]; “As one may realize, because hardware 30 for implementing neural networks typically imposes constraints in terms of coefficients at the layer level (or equivalently per weight matrix level), the scaling coefficients may advantageously be defined and updated on that same level. I.e., a single scaling value α need thus be used (at each iterative step of the training) for each layer of the network”).
Baum, Lee, and Olmschenk are analogous art because all are concerned with quantizing neural networks.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural network quantization to combine the same scaling values as taught by Olmschenk with the method of Baum and the addition operations of Lee to yield the predictable result of a particular layer performing an element-wise addition, wherein the set of constraints comprises a constraint that the inputs must be scaled by a same scaling value. The motivation for doing so would be to define the scaling coefficients at the same layer level at which neural network hardware typically imposes its coefficient constraints (Olmschenk; [0035]).
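
For illustration only, the reason an element-wise addition layer would call for a same scaling value on its inputs can be sketched as below. The sketch is not taken from Baum, Lee, or Olmschenk. It assumes a shift of zero and a single shared scale, and the function names are hypothetical.

```python
from typing import List


def quantize_shared(values: List[float], scale: float) -> List[int]:
    """Quantize every input with one shared scale (assumed shift of zero)."""
    return [round(v / scale) for v in values]


def add_quantized(qa: List[int], qb: List[int]) -> List[int]:
    """Element-wise integer addition; meaningful only if both inputs share a scale."""
    return [x + y for x, y in zip(qa, qb)]


def dequantize_shared(q: List[int], scale: float) -> List[float]:
    """Map the integer sums back to real values using the same shared scale."""
    return [v * scale for v in q]


scale = 0.25
a = [0.5, 1.0]
b = [0.25, -0.5]
summed = dequantize_shared(
    add_quantized(quantize_shared(a, scale), quantize_shared(b, scale)), scale
)
```

With a shared scale, the integer sum represents the real sum directly. If the two inputs used different scales, the integer codes would be in different units and could not be added without first rescaling one of them.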


Claims 13 and 20 are rejected under 35 U.S.C. § 103 as being obvious over Baum in view of Lee and further in view of Chai et al. (US 20200134461 A1, hereinafter “Chai”).

Regarding claims 13 and 20, the rejections of claims 1 and 14 are incorporated and Baum fails to explicitly disclose but Lee discloses constraining the set of output values of each layer to a set of quantized values using a number of bits less than used by a floating point value and using a fixed binary point position ([0090]; “in the neural network quantization apparatus (10 of FIG. 3) such as a PC or a server, the processor (110 of FIG. 3), which may be a GPU, pre-trains a floating-point neural network 410, for example, a 32-bit floating-point neural network. The neural network 410 that is pre-trained cannot be efficiently processed in a low power or low performance hardware accelerator because of its floating-point parameters. Accordingly, the processor 110 of the neural network quantization apparatus 10 quantizes the floating-point neural network 410 to a fixed-point neural network 420, for example, a 16-bit or low fixed-point type” (emphasis added); and Figure 5; and [0093-0094]).
The motivation to combine Baum and Lee is the same as discussed above with respect to claim 1.
Baum fails to explicitly disclose but Chai discloses wherein the floating-point neural network comprises a set of weights associated with each layer, ([0007]; “the DNN including a plurality of layers, wherein for each layer of the plurality of layers, the set of weights includes weights of the layer and a set of bit precision values includes a bit precision value of the layer”; and [0057]; “the floating point representation of high-precision weights”) wherein the set of weights and a set of output values of each layer are floating-point values, ([0057]; “the floating point representation of high-precision weights”) wherein quantizing the floating-point neural network comprises constraining the set of weights to a set of ternary values comprising -1, 0, and 1 ([0078]; “For the purpose of exposition, consider two bins at {0, 1, −1} (i.e., ±1) and consider some weight that is equal to zero”, which discloses constraining the weights to ternary values).
Baum, Lee, and Chai are analogous art because all are concerned with quantizing neural networks.  Before the effective filing date of the claimed invention, it would have been obvious to one skilled in neural network quantization to combine the constraining of weights to ternary values as taught by Chai with the method of Baum and the floating point quantization of Lee to yield the predictable result of wherein the floating-point neural network comprises a set of weights associated with each layer, wherein the set of weights and a set of output values of each layer are floating-point values, wherein quantizing the floating-point neural network comprises constraining the set of weights to a set of ternary values comprising -1, 0, and 1 and constraining the set of output values of each layer to a set of quantized values using a number of bits less than used by a floating point value and using a fixed binary point position. The motivation for doing so would be to train a deep neural network (DNN) for reduced computational resource requirements (Chai; Abstract).
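
For illustration only, constraining floating-point weights to the ternary values -1, 0, and 1 can be sketched as below. The sketch is not taken from Chai. It assumes a simple symmetric magnitude threshold, and the function name and the `threshold` parameter are hypothetical.

```python
from typing import List


def ternarize(weights: List[float], threshold: float = 0.05) -> List[int]:
    """Map each floating-point weight to -1, 0, or 1.

    Assumption: weights whose magnitude is at most `threshold` become 0,
    and all other weights keep only their sign.
    """
    result = []
    for w in weights:
        if w > threshold:
            result.append(1)
        elif w < -threshold:
            result.append(-1)
        else:
            result.append(0)
    return result


ternary = ternarize([0.4, -0.3, 0.01, 0.0])
```

Here `ternary` is `[1, -1, 0, 0]`: the two large-magnitude weights keep their sign and the two near-zero weights collapse to 0.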

Conclusion

Claims 4-8 and 16-18 have been searched, but no prior art was uncovered.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Brent Hoover whose telephone number is (303)297-4403.  The examiner can normally be reached on Monday - Friday 9-5 MST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached on 571-270-3169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
 
/BRENT JOHNSTON HOOVER/Examiner, Art Unit 2127