DETAILED ACTION
This action is responsive to the Application filed on 01/31/2022. Claims 1-3, 6-9, 11-15, 18-20 are pending in the case. Claims 1, 9, and 18 are independent claims. Claims 1, 6, 9, 11, 15, 18, 19, 20 are amended.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

	Response to Arguments
Applicant's arguments filed 01/31/2022 with respect to the rejections under 35 U.S.C. 103 have been fully considered, but they are not persuasive.
	With respect to claims 1/9/18,
	Applicant argues that “Hsu, at Section 3.2, Paragraph 2, explicitly states that "Algorithm 1 presents the mantissa-quantization, which is executed after the back-propagation." Back-propagation occurs after a forward training pass. Hsu, therefore, does not teach or suggest that mantissa-quantization is performed "during a forward training pass of an artificial neural network (ANN)"” (top of page 10).
Examiner respectfully disagrees. Hsu states “direct quantization approach may cause performance degradations. To overcome this issue, we intend to make the model learn how to quantize the parameters during the training phase” (Section 3.1 ¶01). Later, Hsu states “After one epoch is completed, we quantize all the parameters and force them to be less bitwidth parameters… Then, the model must use the quantized parameters in the feed-forward part in the following epoch.” (Section 3.1 ¶02).

The rejection presented relies upon Hsu for the teaching of mantissa quantization during the training phase, while Raghaven/Choi indicate that quantization occurs during, or as part of, the forward pass of the training phase.

Applicant’s other arguments filed 01/31/2022, with respect to the rejections under 35 U.S.C. 101 and 35 U.S.C. 112, have been fully considered. In light of applicant’s amendments, the rejections under 101 and 112 have been withdrawn.

Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA  35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the 

The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3, 6, 8, 9, 11, 13, 15, 18, 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Raghaven et al. “bit-regularized optimization of neural nets” hereinafter Raghaven, and further in view of Choi et al. “Pact: parameterized clipping activation for quantized neural networks” hereinafter Choi, and further in view of Hsu et al. “a study on speech enhancement using exponent-only floating point quantized neural network (eofp-qnn)” hereinafter Hsu.

Claim 1
Raghaven teaches, A computer-implemented method, comprising (Section 5 “We evaluate our algorithm, ‘BitNet’, on two benchmarks used for image classification, namely MNIST and CIFAR-10. MNIST: The MNIST… database of handwritten digits has a total of 70,000 grayscale images of size 28 × 28. We used 50,000 images for training and 10,000 for testing” The algorithm described, which utilizes the dense MNIST data set, would need to be implemented by a computer to handle such data.) during a forward training pass of an artificial neural network (ANN), executing a quantizing function to quantize… weights for a layer of the ANN using a first bit width (Section 4 ¶01 “We adopt layer-wise quantization to learn one W˜ (n) and B (n) for each layer n of the CNN… the sum of squared quantization errors defined in (2) is a continuous and piecewise differentiable function” For each layer, a weight with an associated first bit width is quantized. The quantization is done during the forward pass, or in the quantization stage, to calculate the quantization error.) computing a first gradient for the first bit width; computing a new first bit width for quantizing the… weights for the layer of the ANN based on the first gradient; during a backward training pass of the ANN (Section 4 ¶05 “Overall, the update rule (5) followed by quantization (1) implements a projected gradient descent algorithm.” 
[Equation image: media_image1.png]
The equation illustrates that the bit width is updated according to gradient descent. Computation of gradients is done during the backward pass, after the quantization step (the forward pass mapped previously).)
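For illustration only, the general pattern of a gradient update followed by quantization (a projected gradient descent step) can be sketched as below. This is not a reproduction of Raghaven's equations (1) or (5), which appear in this action only as an image. The function names, the uniform min-to-max quantizer, the lower bound of one bit, and the caller-supplied gradients are all assumptions made for the sketch.

```python
import math


def quantize_uniform(weights, bits):
    """Snap each weight to one of 2**bits evenly spaced levels (assumed quantizer)."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)
    step = (hi - lo) / ((1 << bits) - 1)
    # floor(x + 0.5) rounds to the nearest level, with ties going up.
    return [lo + math.floor((w - lo) / step + 0.5) * step for w in weights]


def projected_step(weights, bit_width, grad_w, grad_b, lr):
    """One gradient step on the weights and the bit width, then quantization.

    grad_w and grad_b are hypothetical inputs: the gradients of some loss
    with respect to the weights and the (real-valued) bit width.
    """
    new_w = [w - lr * g for w, g in zip(weights, grad_w)]
    # Keep the bit width at one bit or more (assumption of this sketch).
    new_b = max(1.0, bit_width - lr * grad_b)
    # Quantize with the integer part of the updated bit width.
    return quantize_uniform(new_w, math.floor(new_b)), new_b
```

In this sketch the bit width is kept as a real number between steps and only its integer part is used when quantizing, so small gradient updates can accumulate before the integer bit width changes.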
Raghaven does not explicitly teach, and executing the quantizing function to quantize… activation values input to the layer of the ANN using a second bit width; and a gradient for the second bit width, and computing a new second bit width for quantizing… the activation values input to the layer of the ANN based on the second gradient; and quantizing… weights and activation values for the ANN at inference time using the new first bit width and the new second bit width; wherein the first bit width defines a bit width for a mantissa for storing the weights.
Choi, however, when addressing issues related to quantizing weights and activations of a neural network, teaches, and executing the quantizing function to quantize… activation values input to the layer of the ANN using a second bit width; (Section 4 ¶01 “The truncated activation output is then linearly quantized to k bits for the dot-product computations” Activation output corresponds to activation values. Both truncation and “linearly quantized” amount to executing a quantizing function. “To k bits” corresponds to using a second bit width.) computing a second gradient for the second bit width, and computing a new second bit width for quantizing the… activation values input to the layer of the ANN based on the second gradient; (Section 4 “α is dynamically adjusted via gradient descent-based training with the objective of minimizing the accuracy degradation arising from quantization… where α limits the range of activation to [0, α]. The truncated activation output is then linearly quantized to k bits for the dot-product computations… With this new activation function, α is a variable in the loss function, whose value can be optimized during training. For back-propagation, gradient ∂y/∂α can be computed using the Straight-Through Estimator” The gradient for alpha, which corresponds to “the gradient for the second bit width”, is used to update the quantization values and associated bit widths of the ANN in a manner similar to traditional backpropagation for weight updates.) and quantizing weights and activation values for the ANN at inference time using the new first bit width and the new second bit width. (Section 5.2 “In this section, we demonstrate that although PACT targets activation quantization, it does not preclude us from using weight quantization as well. We used PACT to quantize activation of CNNs, and DoReFa scheme to quantize weights…. We also show the accuracy of CNNs when both the weight and activation are quantized by DoReFa’s scheme. 
As can be seen, with 4 bit precision for both weights and activation, PACT achieves full-precision accuracy consistently across the networks tested.” In order to determine the accuracy of the scheme, the quantized weights and activations must have been used at inference time.)
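For illustration only, the clipping to [0, α] and the linear quantization to k bits quoted from Choi can be sketched as below. The function names, the tie-breaking in the rounding, and the scaling by α / (2^k − 1) are assumptions made for this sketch and are not quoted from the reference.

```python
import math


def pact_quantize(x, alpha, k):
    """Clip x to [0, alpha], then quantize linearly to k bits."""
    if alpha <= 0 or k < 1:
        raise ValueError("alpha must be positive and k at least 1")
    y = min(max(x, 0.0), alpha)      # truncated activation output
    levels = (1 << k) - 1            # number of steps between 0 and alpha
    step = alpha / levels
    return math.floor(y / step + 0.5) * step


def pact_grad_alpha(x, alpha):
    """Straight-through estimate of dy/d(alpha): 1 where x is clipped at alpha, else 0."""
    return 1.0 if x >= alpha else 0.0
```

With α = 3 and k = 2 the representable outputs are 0, 1, 2, and 3, so an input of 1.2 maps to 1.0 and any input at or above 3 maps to 3.0. Only inputs clipped at α contribute a nonzero gradient with respect to α.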
	It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a quantization scheme that is able to quantize both activations and weights as taught by Choi to the disclosed invention of Raghaven.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a quantizing scheme in which “both weights and activations can be quantized to 4-bits of precision while still achieving accuracy comparable to full precision networks across a range of popular models and datasets” (Abstract Choi).
	Raghaven/Choi does not explicitly teach, wherein the first bit width defines a bit width for a mantissa for storing the weights, wherein the second bit width defines a bit width for a mantissa for storing the activation values; quantize a mantissa;
	Hsu, however, when addressing issues related to mantissa quantization of neural network parameters, teaches, wherein the first bit width defines a bit width for a mantissa for storing the weights; quantize a mantissa (Section 3.2 ¶02 “Algorithm 1 presents the mantissa-quantization… For each parameter p in the layer of the model, a conditional rounding arithmetic is used to quantize the value of the mantissa part… The last n bits are removed by taking the intersection with the binary mask, and the binary bits is converted back to the floating point p. After all the parameters are updated (quantized), the feed-forward process is then performed using the quantized model.” Section 4.2.1 ¶01 “Another way to remove bits is directly chopping, namely, keeping the first (32 − n) target bits and directly chopping the other n bits” This describes a quantization scheme that quantizes the mantissa part of a parameter, in this case the weight.) wherein the second bit width defines a bit width for a mantissa for storing the activation values; quantize a mantissa (Section 3.2 ¶02 “Algorithm 1 presents the mantissa-quantization… For each parameter p in the layer of the model, a conditional rounding arithmetic is used to quantize the value of the mantissa part… The last n bits are removed by taking the intersection with the binary mask, and the binary bits is converted back to the floating point p. After all the parameters are updated (quantized), the feed-forward process is then performed using the quantized model.” Section 4.2.1 ¶01 “Another way to remove bits is directly chopping, namely, keeping the first (32 − n) target bits and directly chopping the other n bits” This describes a quantization scheme that quantizes the mantissa part of a parameter, in this case the activation value.)
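For illustration only, the "directly chopping" variant quoted from Hsu (keeping the first 32 − n bits and removing the last n bits with a binary mask) can be sketched as below for a 32-bit float. The function name and the restriction of n to the 23 explicit mantissa bits of an IEEE 754 single-precision value are assumptions of this sketch; the conditional rounding step that Hsu also describes is not shown.

```python
import struct


def chop_mantissa(value, n):
    """Zero the last n mantissa bits of value's float32 representation."""
    if not 0 <= n <= 23:
        raise ValueError("float32 has 23 explicit mantissa bits")
    # Reinterpret the float32 bit pattern as a 32-bit unsigned integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # Binary mask: ones in the first (32 - n) bits, zeros in the last n bits.
    mask = (0xFFFFFFFF >> n) << n
    # Intersect with the mask and convert the bits back to a float.
    (chopped,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return chopped
```

For example, 1.75 has a mantissa whose two leading bits are set. Chopping 22 bits keeps only the first of them, giving 1.5, and chopping all 23 bits leaves 1.0. The sign and exponent bits are untouched.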
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a quantization scheme that is implemented on an integer and fractional part of floating point parameters of a neural network as taught by Hsu to the disclosed invention of Raghaven/Choi.
One of ordinary skill in the art would have been motivated to make this modification because “EOFP-QNN learns how to quantize the mantissa bits of the model parameters while preserving the regression accuracy in the least mantissa precision… Experimental results showed that the model sizes can be significantly reduced” (Abstract Hsu).

Claim 3
	Raghaven/Choi/Hsu teaches claim 1.
Raghaven teaches, wherein the weights are learned during the forward training pass. (
[Equation image: media_image2.png]
as described above in equation 5, the weights are learned at the same time as the execution of the quantization during the forward pass. Learning the weights requires a forward pass in order to calculate the loss in the subsequent backward pass.)

Claim 6
	Raghaven/Choi/Hsu teaches claim 1.
	Hsu further teaches, when addressing issues related to mantissa quantization of neural network parameters, wherein the bit width for the mantissa for storing the weights and the bit width for the mantissa for storing the activation values comprise fewer bits than a mantissa in a normal-precision floating-point representation. (Section 3.2 ¶02 “Algorithm 1 presents the mantissa-quantization… For each parameter p in the layer of the model, a conditional rounding arithmetic is used to quantize the value of the mantissa part… The last n bits are removed by taking the intersection with the binary mask, and the binary bits is converted back to the floating point p. After all the parameters are updated (quantized), the feed-forward process is then performed using the quantized model.” Section 4.2.1 ¶01 “Another way to remove bits is directly chopping, namely, keeping the first (32 − n) target bits and directly chopping the other n bits” This describes a quantization scheme that quantizes the mantissa part of a parameter, in this case the activation value.)

Claim 8
	Raghaven/Choi/Hsu teaches claim 1.
	Further, Choi teaches, wherein the quantizing function applies a weight decay to the new first bit width and the new second bit width (pg. 13 ¶04 “L2-regularizer with decay factor of 5 × 10^−6 is applied to weight.” Weight decay is commonly implemented by the L2 regularizer during training to prevent the weights from “blowing up” or overfitting. A regularizer is added to a loss function which takes the new weights, first bit width and second bit width, as input. The quantizing function is part of the loss function, which includes both the quantization function and the L2 regularization term.)

Claim 9
Raghaven teaches, one or more processors; and at least one computer storage media having computer-executable instructions stored thereupon which, when executed by the one or more processors, will cause the computing device to (Section 5 “We evaluate our algorithm, ‘BitNet’, on two benchmarks used for image classification, namely MNIST and CIFAR-10. MNIST: The MNIST… database of handwritten digits has a total of 70,000 grayscale images of size 28 × 28. We used 50,000 images for training and 10,000 for testing” The algorithm that utilizes the MNIST data set is to be implemented by a computer. The computer necessarily includes storage media, instructions, and processors working to implement the method.)
Further, the remaining limitations of claim 9 are rejected under Raghaven/Choi/Hsu for at least the same reasons as presented in claim 1.
Claim 11
	Claim 11 is rejected for at least the reasons set forth in the rejections of claim 6 in view of claim 9.
Claim 13
	Claim 13 is rejected for at least the reasons set forth in the rejection of claim 3 in view of claim 9

Claim 15
	Raghaven teaches, computer storage media having computer-executable instructions stored thereupon which, when executed by one or more processors, will cause a computing device to: (Section 5 “We evaluate our algorithm, ‘BitNet’, on two benchmarks used for image classification, namely MNIST and CIFAR-10. MNIST: The MNIST… database of handwritten digits has a total of 70,000 grayscale images of size 28 × 28. We used 50,000 images for training and 10,000 for testing” The algorithm that utilizes the MNIST data set is to be implemented by a computer, which includes the computer storage media necessary to process the images accordingly.)
	Further, the remaining limitations of claim 15 are rejected under Raghaven/Choi/Hsu for at least the same reasons as presented in claim 1.
Claim 18
	Claim 18 is rejected for at least the reasons set forth in the rejections of claim 6 in view of claim 15

Claim 20
	Claim 20 is rejected for at least the reasons set forth in the rejection of claim 3 in view of claim 15 

Claims 2, 12, 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Raghaven/Choi/Hsu, and further in view of Yoojin et al. “Towards the limit of network quantization” hereinafter Yoojin.

Claim 2
	Raghaven/Choi/Hsu teaches claim 1.
	Raghaven/Choi/Hsu does not explicitly teach, wherein the weights are learned prior to the forward training pass
	Yoojin, however, when addressing issues related to neural network quantization, teaches, wherein the weights are learned prior to the forward training pass (Section 2 ¶01 “We consider a neural network that is already trained, pruned if employed and fine-tuned before quantization. If no network pruning is employed, all parameters in a network are subject to quantization. For pruned networks, our focus is on quantization of unpruned parameters.” The neural network weights are pretrained before implementing the quantization passes.)
	It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a quantization scheme that is implemented on a pre-trained neural network as taught by Yoojin to the disclosed invention of Raghaven/Choi/Hsu.
One of ordinary skill in the art would have been motivated to make this modification in order to “[use] the simple uniform quantization followed by Huffman coding, we show from our (Abstract Yoojin)
For the reasons to combine Raghaven/Choi/Hsu, see the rejection of claim 1.
Claim 12
	Claim 12 is rejected for at least the reasons set forth in the rejection of claim 2 in view of independent claim 9
Claim 19
	Claim 19 is rejected for at least the reasons set forth in the rejection of claim 2 in view of independent claim 15

Claims 7 and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Raghaven/Choi, and further in view of Hirose et al. “Quantization Error-Based Regularization in Neural Networks” hereinafter Hirose.
Claim 7
	Raghaven/Choi teaches claim 1.
	Raghaven/Choi does not explicitly teach, wherein the quantizing function applies a floor function to round the new first bit width and the new second bit width down to an integer value
Hirose, however, when addressing issues related to quantization of floating point values for neural networks, teaches, wherein the quantizing function applies a floor function to round the new first bit width and the new second bit width down to an integer value (Section 2 ¶01-¶02 “Quantization in neural network represents original floating point values by using reduced information, such as fixed point and binary…Quantized values are obtained by the round operation. The typical round operation is defined as follows: …”
[Equation image: media_image3.png]
In typical quantization, the floor function is utilized to round down to an integer value or fixed point value.)
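For illustration only, a floor-based round operation and its use to bring a real-valued bit width down to an integer can be sketched as below. Hirose's exact equation is not reproduced in this action, so the common definition round(x) = floor(x + 0.5) is an assumption here, as are the function names and the lower bound of one bit.

```python
import math


def round_half_up(x):
    """A common round operation built on floor (assumed definition)."""
    return math.floor(x + 0.5)


def floor_bit_width(bit_width, minimum=1):
    """Round a real-valued bit width down to an integer, never below `minimum`."""
    return max(minimum, math.floor(bit_width))
```

Under these assumptions a learned bit width of 4.9 becomes 4, while round_half_up(4.9) would give 5, which shows the difference between rounding down and rounding to nearest.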
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a quantization scheme that implements the floor function as part of the round function as taught by Hirose to the disclosed invention of Raghaven/Choi.
One of ordinary skill in the art would have been motivated to make this modification in order to implement “typical round operations…quantized values are obtained by the round operation” (Section 2 ¶02 Hirose).
For the reasons to combine Raghaven/Choi/Hsu, see the rejection of claim 1.
Claim 14
	Claim 14 is rejected for at least the reasons set forth in the rejection of claim 7 and claim 8 in view of independent claim 9





Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached on Monday-Friday 7:30 am – 4:00 pm (EST).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at telephone number (571)272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://portal.uspto.gov/external/portal. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
	

/NICHOLAS KLICOS/Primary Examiner, Art Unit 2145