DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings
The drawings are objected to under 37 CFR 1.83(a).  The drawings must show every feature of the invention specified in the claims.  Therefore, the quantizer adapted to receive a plurality of input values and further configured to quantize input weights to generate the plurality of weights must be shown as in claim 7, and the quantizer adapted to receive a plurality of input values and the quantizer corresponds to one or more activation layer of the artificial neural network must be shown as in claim 8, and the quantizer adapted to receive a plurality of input values and the quantizer corresponds to one or more weight layer of the artificial neural network must be shown as in claim 9 or the features canceled from the claims.  No new matter should be entered.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Objections
Claims 1-20 are objected to because of the following informalities.  
Claim 1 lines 7-8, claim 17 lines 8-9, and claim 20 lines 11-12 recite the limitation “the quantized input values”.  This limitation lacks antecedent basis.  Antecedent basis is present for “the plurality of quantized input values”.  Claims 2-16 inherit the same deficiency as claim 1 by reason of dependence. Claims 18-19 inherit the same deficiency as claim 17 by reason of dependence.
Claim 1 line 9, claim 17 line 10, and claim 20 line 13 recite the limitation “the output values”.  This limitation lacks antecedent basis.  Antecedent basis is present for “the plurality of output values”.  Claims 2-16 inherit the same deficiency as claim 1 by reason of dependence. Claims 18-19 inherit the same deficiency as claim 17 by reason of dependence.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.



Claims 8-10, 13, and 19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Claim 8 recites “the quantizer corresponds to one or more activation layer”.  It is unclear what is meant by “corresponds”.  Furthermore, the specification provides no further description of correspondence with respect to the quantizer and one or more activation layer to support a determination as to the scope of the claim.  For purposes of examination, Examiner interprets “corresponds” to mean one or more activation layer comprises a quantizer or is operatively connected to a quantizer.
Claim 9 and claim 19 recite “the quantizer corresponds to one or more weight layer”.  It is unclear what is meant by “corresponds”.  Furthermore, the specification provides no further description of correspondence with respect to the quantizer and one or more weight layer to support a determination as to the scope of the claim.  For purposes of examination, Examiner interprets “corresponds” to mean one or more weight layer comprises a quantizer or is operatively connected to a quantizer.
Claim 10 recites “the artificial neural network further comprises a quantizer for each weight layer and activation layer”.  It is unclear whether the element “each” modifies both the weight layer and the activation layer or just the weight layer.  If the former, Examiner recommends amending to recite “the artificial neural network further comprises a quantizer for each weight layer and each activation layer”. For purposes of examination, Examiner interprets the former construct.
Claim 13 recites “the computing node is configured to scale the gradient according to an inverse of the square root … and on an inverse square root” (emphasis added). It is unclear whether the terms “according to” and “and on” are to be interpreted to have different scope.  If the same scope, Examiner suggests amending to recite the same modifiers or in a manner similar to claim 16 language.  For purposes of examination, Examiner interprets these modifiers to have the same scope consistent with “according to”.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1-6, 10-11, 14 and 17-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by US 20200202218 Csefalvay (hereinafter “Csefalvay”).

Regarding claim 1, Csefalvay teaches the following:
an artificial neural network (Fig 1, [0001-0002] implemented in a series of hardware passes as in fig 12, [0136-0137]) comprising 
a quantizer having a configurable step size, the quantizer adapted to receive a plurality of input values and quantize the plurality of input values according to the configurable step size to produce a plurality of quantized input values (fig 4-422,[0049-0050] parameter b for configurable step size, receiving input values X1, producing quantized input values X1Q) 
at least one matrix multiplier configured to receive the plurality of quantized input values from the quantizer and to apply a plurality of weights to the quantized input values to determine a plurality of output values having a first precision ([0007] wherein deep neural network includes matrix multiplication, fig 12-1202, [0136] wherein fig 12 illustrates hardware logic configured to implement the DNN, [0141] convolution engine 1202 includes multipliers for performing matrix multiplication of the DNN according to the method of fig 10 and as in fig 4), and 
a multiplier configured to scale the output values to a second precision ([0059] multiplier in second layer as in figure 4 implemented by a multiplier in convolution engine 1202 in a hardware pass as in figure 12 with a second subset of fixed point format with different quantization parameter b2); 
a computing node operatively coupled to the artificial neural network ([0155], fig 13-1304, implementing fig 5 and fig 10 steps, coupled to DNN via quantisation parameter inputs as in figure 4) and configured to: 
provide training input data to the artificial neural network, and optimize the configurable step size based on a gradient through the quantizer and the training input data ([0047-0050], [0052], [0076-0083], [0113-0133]).
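
For illustration only, the arrangement of limitations addressed above (a quantizer with a configurable step size, a matrix multiplier operating on the quantized input values, and a multiplier that rescales the outputs) can be sketched as follows. This is a minimal sketch under stated assumptions and is not a characterization of Csefalvay's implementation or of Applicant's disclosure: the function names, the 4-bit signed range, the integer weights, and the choice to rescale by the step size are all hypothetical.

```python
# Hypothetical sketch: quantize -> integer matrix multiply -> rescale.
# All names and numeric ranges are illustrative assumptions.

def quantize(values, step, q_min=-8, q_max=7):
    """Map each value to an integer level round(v / step), clamped to [q_min, q_max]."""
    return [max(q_min, min(q_max, round(v / step))) for v in values]


def matvec(weights, inputs):
    """Apply a weight matrix (list of rows) to a vector of quantized inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def rescale(outputs, scale):
    """Scale low-precision integer outputs to a second (higher) precision."""
    return [o * scale for o in outputs]


step = 0.25                                  # configurable step size
inputs = [0.30, -0.75, 1.20]                 # full-precision input values
weights = [[1, 0, -1], [2, 1, 0]]            # assumed already-quantized weights

q_inputs = quantize(inputs, step)            # integer levels
int_outputs = matvec(weights, q_inputs)      # first precision (integers)
outputs = rescale(int_outputs, step)         # second precision (floats)
```

In this sketch the step size is the only tunable parameter of the quantizer; a smaller step gives finer resolution but a narrower representable range for a fixed number of levels.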

Regarding claim 2, in addition to the teachings addressed in the claim 1 analysis,  Csefalvay teaches the following:
wherein the computing node is configured to optimize the configurable step size by backpropagation (title, abstract, [0046], [0050-0052], [0075-0076], [0105-0108]). 

Regarding claim 3, in addition to the teachings addressed in the claim 2 analysis, Csefalvay teaches the following:
wherein said backpropagation applies gradient descent (abstract, gradient of a cost metric, fig 8, fig 9, [0050], [0076-0077]).

Regarding claim 4, in addition to the teachings addressed in the claim 3 analysis, Csefalvay teaches the following:
the quantizer comprising a round function, the round function being a pass-through function in said back propagation ([0083-0084]).
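
For illustration only, a round function that acts as a pass-through in back propagation (commonly called a straight-through estimator) can be sketched as follows. This is one common formulation, shown under stated assumptions; it is not asserted to be the formulation used by Csefalvay or by Applicant. The function names and the clamping range are hypothetical.

```python
# Hypothetical sketch of a quantizer whose round() is treated as the
# identity during the backward pass (straight-through estimator).

def fake_quantize(x, step, q_min=-8, q_max=7):
    """Forward pass: round x / step to an integer level, clamp, map back to x's scale."""
    level = max(q_min, min(q_max, round(x / step)))
    return level * step


def grad_wrt_input(x, step, q_min=-8, q_max=7):
    """Backward pass w.r.t. x: round() passes the gradient through unchanged
    inside the clamping range, and the gradient is zero where x is clipped."""
    return 1.0 if q_min < x / step < q_max else 0.0


def grad_wrt_step(x, step, q_min=-8, q_max=7):
    """Backward pass w.r.t. the step size, with round() treated as identity.
    Inside the range this is round(v) - v with v = x / step; when clipped,
    the output is q_min * step or q_max * step, so the gradient is the bound."""
    v = x / step
    if v <= q_min:
        return float(q_min)
    if v >= q_max:
        return float(q_max)
    return round(v) - v
```

Because the true derivative of a round function is zero almost everywhere, treating it as a pass-through is what allows a gradient to reach the step size at all.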

Regarding claim 5, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches the following:
wherein the gradient is based on a neural network layer size and precision (fig 5, [0135] see especially last sentence).

Regarding claim 6, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches the following:
the artificial neural network further comprises a second quantizer, the second quantizer adapted to quantize input weights according to the configurable step size to thereby generate the plurality of weights (Figure 4, Q2 for second quantizer, Weights W1 for input weights, configurable step size b2, plurality of weights W12 that are quantized).

Regarding claim 10, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches the following:
wherein the artificial neural network comprises at least one weight layer and at least one activation layer, and wherein the artificial neural network further comprises a quantizer for each weight layer and activation layer (Fig 1, Fig 4 showing multiple layers, layer 1 for weight layer, layer 2 for activation layer, fig 12 showing DNN accelerator configured to compute the various layers in respective passes, wherein configuration includes weight layers, 1202, 1216, 1202, 1204, and activation layer 1202, 1208, [0135-0154]).

Regarding claim 11, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches the following:
wherein the computing node is further configured to scale the gradient according to a number of features in an activation layer of the artificial neural network (Fig 1, Fig 4 showing multiple layers, layer 2 for activation layer, fig 12 showing DNN accelerator configured to compute the various layers in respective passes, wherein configuration includes activation layer 1202, 1208, wherein accumulated data output of accumulation buffer 1204 for number of features in an activation layer, and wherein quantization scaling the gradient is applied to input values (also called activations in some contexts), context being activation layer [0135-0154]).

Regarding claim 14, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches the following:
wherein the computing node is further configured to scale the gradient according to a number of weights in a weight layer of the artificial neural network (Fig 1, Fig 4 showing multiple layers, layer 1 for weight layer, fig 12 showing DNN accelerator configured to compute the various layers in respective passes, wherein configuration includes weight layers, 1202, 1216, 1202, 1204, [0046-0059], [0071-0076], [0135-0154]).

Claim 17 is directed to a method that is executable by the system as in claim 1.  All steps performed in claim 17 are performed by the system as configured in claim 1.  The claim 1 analysis applies equally to claim 17.

Claim 18 is directed to a method that is executable by the system as in claim 4.  All steps performed in claim 18 are performed by the system as configured in claim 4.  The claim 4 analysis applies equally to claim 18.

Claim 19 is directed to a method that is executable by the system as in claim 10.  All steps performed in claim 19 are performed by the system as configured in claim 10.  The claim 10 analysis applies equally to claim 19.

Claim 20 is directed to a computer product comprising a computer readable storage medium having program instructions executable by a processor to cause the processor to perform a method that is executable by the system as in claim 1.  All steps performed in claim 20 are performed by the system as configured in claim 1.  The claim 1 analysis applies equally to claim 20.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 7-9 are rejected under 35 U.S.C. 103 as being unpatentable over Csefalvay.

Regarding claim 7, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches a second quantizer configured to quantize input weights according to the configurable step size to thereby generate the plurality of weights (Figure 4, Q2 for second quantizer, Weights W1 for input weights, configurable step size b2, plurality of weights W12 that are quantized).  
Csefalvay does not, however, explicitly disclose in the embodiment of figure 5 the quantizer and the second quantizer comprised within a single quantizer.  However, in a separate section of the disclosure, Csefalvay discloses a teaching, suggestion, or motivation that the DNN accelerator being shown as comprising a number of functional blocks is not intended to define a strict division between different logic elements of such entities, but that each functional block may be provided in any suitable manner ([0160]).  It would therefore have been obvious to one of ordinary skill in the art before the effective filing date to combine the quantizer and the second quantizer into one quantizer to provide a single functional block suitable for functions of quantizing.  

Regarding claim 8, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches that a second quantizer corresponds to one or more activation layer of the artificial neural network (Fig 1, Fig 4 showing multiple layers, layer 2 for activation layer, fig 12 showing DNN accelerator configured to compute the various layers in respective passes, wherein configuration includes activation layer 1202, 1208, Figure 4, Q2 for second quantizer, and wherein quantization scaling the gradient is applied to input values (also called activations in some contexts), context being activation layer [0135-0154]).
Csefalvay does not, however, explicitly disclose in the embodiment of figure 5 the quantizer and the second quantizer comprised within a single quantizer.  However, in a separate section of the disclosure, Csefalvay discloses a teaching, suggestion, or motivation that the DNN accelerator being shown as comprising a number of functional blocks is not intended to define a strict division between different logic elements of such entities, but that each functional block may be provided in any suitable manner ([0160]).  It would therefore have been obvious to one of ordinary skill in the art before the effective filing date to combine the quantizer and the second quantizer into one quantizer to provide a single functional block suitable for functions of quantizing.  

Regarding claim 9, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches that a second quantizer corresponds to one or more weight layer of the artificial neural network (Figure 4, Q2 for second quantizer).  
Csefalvay does not, however, explicitly disclose in the embodiment of figure 5 the quantizer and the second quantizer comprised within a single quantizer.  However, in a separate section of the disclosure, Csefalvay discloses a teaching, suggestion, or motivation that the DNN accelerator being shown as comprising a number of functional blocks is not intended to define a strict division between different logic elements of such entities, but that each functional block may be provided in any suitable manner ([0160]).  It would therefore have been obvious to one of ordinary skill in the art before the effective filing date to combine the quantizer and the second quantizer into one quantizer to provide a single functional block suitable for functions of quantizing.  

Claims 12-13 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Csefalvay in view of Q. Jin, Towards Efficient Training for Neural Network Quantization, arXiv:1912.10207v1 [cs.CV] 2019 (hereinafter “Jin”).

Regarding claim 12, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches 
wherein the computing node is further configured to scale the gradient according to a number of features in an activation layer of the artificial neural network (Fig 1, Fig 4 showing multiple layers, layer 2 for activation layer, fig 12 showing DNN accelerator configured to compute the various layers in respective passes, wherein configuration includes activation layer 1202, 1208, wherein accumulated data output of accumulation buffer 1204 for number of features in an activation layer, and wherein quantization scaling the gradient is applied to input values (also called activations in some contexts), context being activation layer [0135-0154]).
Csefalvay does not, however, explicitly disclose scaling the gradient according to an inverse of the square root. However, in the same field of endeavor, Jin discloses scaling according to an inverse square root (section 3.3.1, equation 7).  Although Jin discloses scaling with respect to weights, Csefalvay discloses applying the scaling to both weights and features of an activation layer as appropriate (Csefalvay [0135-0154]).  It would therefore have been obvious to one of ordinary skill in the art before the effective filing date to apply Jin’s scaling according to an inverse square root to Csefalvay’s scaling of a number of features in an activation layer to achieve the benefit of comparable or even better performance than full precision neural networks (Jin, abstract).

Regarding claim 13, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches 
wherein the computing node is further configured to scale the gradient according to a number of features in an activation layer of the artificial neural network (Fig 1, Fig 4 showing multiple layers, layer 2 for activation layer, fig 12 showing DNN accelerator configured to compute the various layers in respective passes, wherein configuration includes activation layer 1202, 1208, wherein accumulated data output of accumulation buffer 1204 for number of features in an activation layer, and wherein quantization scaling the gradient is applied to input values (also called activations in some contexts), context being activation layer [0135-0154]).
Csefalvay does not, however, explicitly disclose scaling the gradient according to an inverse of the square root and on an inverse of the square root of a number of quantization levels of the quantizer. However, in the same field of endeavor, Jin discloses scaling according to an inverse square root of weights in a layer (section 3.3.1, equation 7).  Although Jin discloses scaling with respect to weights, Csefalvay discloses applying the scaling to both weights and features of an activation layer as appropriate (Csefalvay [0135-0154]).  It would therefore have been obvious to one of ordinary skill in the art before the effective filing date to apply Jin’s scaling according to an inverse square root to Csefalvay’s scaling of a number of features in an activation layer to achieve the benefit of comparable or even better performance than full precision neural networks (Jin, abstract).
Furthermore, Jin discloses scaling with respect to an inverse of the square root of a number of quantization levels of the quantizer (Jin 3.3.2, equation 10). It would therefore have been obvious to one of ordinary skill in the art before the effective filing date to apply Jin’s scaling according to an inverse square root of a number of quantization levels of the quantizer to Csefalvay’s scaling of a number of features in an activation layer and on an inverse of the square root of a number of quantization levels of a quantizer to achieve the benefit of comparable or even better performance than full precision neural networks (Jin, abstract).

Regarding claim 15, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches 
wherein the computing node is further configured to scale the gradient according to a number of weights in a weight layer of the artificial neural network (Fig 1, Fig 4 showing multiple layers, layer 1 for weight layer, Q2 for second quantizer, Weights W1 for input weights, configurable step size b2, plurality of weights W12 that are scaled, fig 12 showing weight layer 1216, 1212, 1204, [0071-0076]).
Csefalvay does not, however, explicitly disclose scaling the gradient according to an inverse of the square root. However, in the same field of endeavor, Jin discloses scaling according to an inverse square root with respect to weights (section 3.3.1, equation 7).  It would therefore have been obvious to one of ordinary skill in the art before the effective filing date to apply Jin’s scaling according to an inverse square root to Csefalvay’s scaling of a number of weights in a weight layer to achieve the benefit of comparable or even better performance than full precision neural networks (Jin, abstract).

Regarding claim 16, in addition to the teachings addressed in the claim 1 analysis, Csefalvay teaches 
wherein the computing node is further configured to scale the gradient according to a number of weights in a weight layer of the artificial neural network (Fig 1, Fig 4 showing multiple layers, layer 1 for weight layer, Q2 for second quantizer, Weights W1 for input weights, configurable step size b2, plurality of weights W12 that are scaled, fig 12 showing weight layer 1216, 1212, 1204, [0071-0076]).
Csefalvay does not, however, explicitly disclose scaling the gradient according to an inverse of the square root. However, in the same field of endeavor, Jin discloses scaling according to an inverse square root with respect to weights (section 3.3.1, equation 7).  It would therefore have been obvious to one of ordinary skill in the art before the effective filing date to apply Jin’s scaling according to an inverse square root to Csefalvay’s scaling of a number of weights in a weight layer to achieve the benefit of comparable or even better performance than full precision neural networks (Jin, abstract).
Furthermore, Jin discloses scaling with respect to an inverse of the square root of a number of quantization levels of the quantizer (Jin 3.3.2, equation 10). It would therefore have been obvious to one of ordinary skill in the art before the effective filing date to apply Jin’s scaling according to an inverse square root of a number of quantization levels of the quantizer to Csefalvay’s scaling of a number of features in an activation layer and on an inverse of the square root of a number of quantization levels of a quantizer to achieve the benefit of comparable or even better performance than full precision neural networks (Jin, abstract).
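
For illustration only, the gradient scaling discussed with respect to claims 13 and 16, as interpreted for purposes of examination, can be sketched as the product of two inverse square roots. This is a minimal sketch under stated assumptions: the function names are hypothetical, `num_elements` stands for either the number of features in an activation layer or the number of weights in a weight layer, and the sketch is not a reproduction of any equation in Jin or Csefalvay.

```python
import math

# Hypothetical sketch: scale a step-size gradient by
# 1 / sqrt(num_elements) and by 1 / sqrt(num_levels).

def gradient_scale(num_elements, num_levels):
    """Return the product of the two inverse square roots."""
    if num_elements <= 0 or num_levels <= 0:
        raise ValueError("counts must be positive")
    return (1.0 / math.sqrt(num_elements)) * (1.0 / math.sqrt(num_levels))


def scale_gradient(raw_gradient, num_elements, num_levels):
    """Apply the scale factor to an unscaled step-size gradient."""
    return raw_gradient * gradient_scale(num_elements, num_levels)


# Example: a layer with 64 weights and a quantizer with 4 levels.
factor = gradient_scale(64, 4)
scaled = scale_gradient(8.0, 64, 4)
```

The product of the two factors equals a single inverse square root of the product of the two counts, which is consistent with reading “according to” and “and on” as having the same scope.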

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 20200193273 A1 Chung et al. discloses quantized precision operations in a neural network including training a neural network for use with a quantized model and quantization of inputs and weights in a layer (abstract, fig 6, fig 7).
US 20200097818 A1 Li et al. discloses training a neural network including quantization operations on feature map and weight tensors (abstract, fig 6A, 6F).
S. Wu et al., Training and Inference with Integers in Deep Neural Networks, arXiv:1802.04680v1 [cs.LG], 2018 discloses discretizing, quantizing weights, activations, gradients among layers of deep learning accelerators (abstract, fig 1).
S.K. Esser et al., Learned Step Size Quantization, arXiv:1902.08153v3 [cs.LG] 2020 (hereinafter “Esser”) is a disclosure by Applicant of aspects of the claimed invention (abstract, fig 1, fig 2, entire disclosure).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to EMILY E LAROCQUE whose telephone number is (469)295-9289.  The examiner can normally be reached on 10:00am - 12:00pm, 2:00pm - 8:00pm ET M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor Jyoti Mehta can be reached on 571-270-3995.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/EMILY E LAROCQUE/Primary Examiner, Art Unit 2182