Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Remarks
Claims 1-20 are pending in this application.
The priority date is 12/20/2017.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitations use a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: “circuitry configured” in claim 1, and “comprising circuitry” in claims 2 and 5-7.

If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



Claims 1, 4-6, 8, 11-13, 15, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lin (US 20160328646 A1).
Regarding claim 1,
Lin teaches a processor configured for (Lin, in paragraph 0095, recites “As another alternative, the processing system may be implemented with an application specific integrated circuit (ASIC) with the processor, the bus interface, the user interface, supporting circuitry, and at least a portion of the machine-readable media integrated into a single chip, or with one or more field programmable gate arrays (FPGAs),”)
adaptive quantization in an artificial neural network (ANN), the processor comprising:  (Lin, in paragraph 83, recites in part “In block 804, quantizer parameters for quantizing values of the floating point machine learning network are determined […]” where the floating point machine learning network is an artificial neural network)  
circuitry configured to calculate a distribution of ANN information; (Lin, in paragraph 83, recites in part “based on the selected moment of the input distribution of the floating point machine learning network.” where having a ‘selected moment of the input distribution’ shows that the distribution has been calculated) 
circuitry configured to select a quantization function from a set of quantization functions based on the distribution; (combining the two earlier quotes from Lin, paragraph 83 recites in part “In block 804, quantizer parameters for quantizing values of the floating point machine learning network are determined based on the selected moment of the input distribution of the floating point machine learning network.” Determining quantizer parameters constitutes selecting a quantization function, because the quantizer parameters change (select) the function. The specification of the instant application, in paragraph 0046, recites in part “It is noted that any suitable set of possible quantization functions can be used.” The teachings of Lin cover a set of functions distinguished by the quantizer parameters.)
circuitry configured to apply the quantization function to the ANN information to generate quantized ANN information; (Lin, in paragraph 0082, recites in part “In one configuration, after quantizing the floating point model into a fixed point model […]”)
circuitry configured to load the quantized ANN information into the ANN; (where the next teaching shown below shows that quantized ANN information must have been loaded into the ANN before it can be used to generate an output)
and circuitry configured to generate an output based on the quantized ANN information. (Lin, in paragraph 0082, recites in part “the fixed point network is fine-tuned via additional training to further improve the network performance. Fine-tuning may include training via back-propagation.” Where training via back-propagation after quantizing shows that the quantized data has been loaded into the neural network and that the network is being trained, which inherently generates an output.)
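For illustration only, the sequence of steps mapped above (calculate a distribution, select a quantization function based on it, apply the function, generate an output) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Lin or from the instant application: the function names, the rule tying the step size to the standard deviation, the number of levels, and the sample values are all hypothetical.

```python
import statistics


def calculate_moments(weights):
    """Summarize the weight distribution by its mean and standard deviation."""
    return statistics.fmean(weights), statistics.pstdev(weights)


def make_uniform_quantizer(step):
    """Return a symmetric uniform quantizer with the given step size."""
    def quantize(x):
        return round(x / step) * step
    return quantize


def select_quantizer(std, levels=16):
    """Select a quantizer from the family of uniform quantizers.

    Hypothetical rule: spread `levels` steps over roughly +/- 3 standard
    deviations. Changing the step size changes the quantization function.
    """
    step = (6.0 * std / levels) if std > 0 else 1.0
    return make_uniform_quantizer(step)


# Hypothetical ANN information (link weights) and a single input vector.
weights = [0.12, -0.40, 0.33, 0.05, -0.27, 0.51]
inputs = [1.0, 0.5, -1.0, 2.0, 0.0, 1.5]

mean, std = calculate_moments(weights)         # calculate the distribution
quantize = select_quantizer(std)               # select based on the distribution
quantized = [quantize(w) for w in weights]     # apply the quantization function
output = sum(q * x for q, x in zip(quantized, inputs))  # generate an output
```

In this sketch the "set of quantization functions" is the family of uniform quantizers indexed by step size, which mirrors the reasoning above that determining quantizer parameters selects a function.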
Regarding claim 4,
Lin has already taught the processor of claim 1, (see the discussion of claim 1 above) and further teaches wherein the ANN information comprises a plurality of link weights. (Lin, in paragraph 0032, recites in part “In some artificial neural networks (ANNs), such as a deep convolutional network (DCN), quantization may be applied to activations of the normalization layer, weights, biases, and activations of the fully connected layer; and/or weights, biases, and activations of the convolution layer.”)
Regarding claim 5,
Lin has already taught the processor of claim 4 (see the discussion of claim 4 above), and further teaches further comprising circuitry configured to: calculate a distribution of link weights for each of a plurality of layers of the ANN; (Lin, paragraph 0064, recites in part “FIGS. 5A and 5B illustrate distributions of activation values and weights in different layers of an exemplary deep convolutional network. FIG. 5A shows the activation values for convolution layers zero to five (conv0, ..., conv5) and fully connected layers one and two (fc1, fc2). FIG. 5B shows the weights 550 for convolution layers one to five (conv1, ..., conv5) and fully connected layers one and two (fc1, fc2).” This shows that the distributions of weights have been created on a layer by layer basis for each of a plurality of layers.)
select a quantization function to the plurality of link weights for each of the plurality of layers of the ANN based on each distribution; (Lin continues from paragraph 0064 with “For example, the step sizes of a symmetric uniform quantizer for Gaussian, Laplacian, and Gamma distributions may be calculated with a deterministic function of the standard deviation of the input distribution, if it is assumed that the distributions have zero mean and unit variance.” Where calculating the step size of a quantizer based on a distribution is equivalent to selecting a quantization function, since changing the step size changes the function.)
and apply the respective quantization function to the link weights for each of the plurality of layers. (Lin, in paragraph 0082, recites in part “In one configuration, after quantizing the floating point model into a fixed point model, the fixed point network is fine-tuned via additional training to further improve the network performance.” This shows that the network is retrained after the weights have been quantized, which means the weights must have been loaded back into each layer of the network.)
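As a hedged illustration of the layer-by-layer reading applied above, the following sketch computes a separate step size from the standard deviation of each layer's weights and quantizes each layer with its own step. The layer names echo those quoted from Lin, but the weight values, the `levels` parameter, and the step-size rule are assumptions made for this example only.

```python
import statistics


def step_size_for(weights, levels=16):
    """Hypothetical rule: step size is a deterministic function of the
    standard deviation of the layer's weights."""
    std = statistics.pstdev(weights)
    return (6.0 * std / levels) if std > 0 else 1.0


def quantize_per_layer(layers):
    """Select and apply a uniform quantizer separately for each layer."""
    quantized, steps = {}, {}
    for name, weights in layers.items():
        step = step_size_for(weights)          # select per-layer function
        steps[name] = step
        quantized[name] = [round(w / step) * step for w in weights]
    return quantized, steps


# Hypothetical weights: a narrow distribution and a much wider one.
layers = {
    "conv1": [0.02, -0.01, 0.03, -0.04],
    "fc1": [1.5, -2.0, 0.7, -0.3],
}
quantized, steps = quantize_per_layer(layers)
```

Because the two layers have different spreads, they receive different step sizes, which is the sense in which a respective quantization function is selected per layer.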
Regarding claim 6,
Lin has already taught the processor of claim 4 (see the discussion of claim 4 above), and further teaches further comprising circuitry configured to: calculate a distribution of link weights for each of a plurality of subsets of layers of the ANN; (Lin, paragraph 0064, recites in part “FIGS. 5A and 5B illustrate distributions of activation values and weights in different layers of an exemplary deep convolutional network. FIG. 5A shows the activation values for convolution layers zero to five (conv0, ..., conv5) and fully connected layers one and two (fc1, fc2). FIG. 5B shows the weights 550 for convolution layers one to five (conv1, ..., conv5) and fully connected layers one and two (fc1, fc2).” This shows that the distributions of weights have been created on a layer by layer basis.)
select a quantization function to the plurality of link weights for each of the plurality of subsets of layers of the ANN based on each distribution; (Lin, in paragraph 84, recites in part “In block 804, quantizer parameters for quantizing values of the floating point machine learning network are determined based on the selected moment of the input distribution of the floating point machine learning network.” Where selecting quantizing parameters constitutes selecting a function, since the parameters change the function used for quantization.)
and apply the respective quantization function to the link weights for each of the plurality of subsets of layers. (Lin, in paragraph 0082, recites in part “In one configuration, after quantizing the floating point model into a fixed point model, the fixed point network is fine-tuned via additional training to further improve the network performance.” This shows that the network is retrained after the weights have been quantized, which means the weights must have been loaded back into each layer of the network.)
Regarding claim 8,
Claim 8 is substantially identical to claim 1, and as such the rejections for claim 1 apply to claim 8 as well.
Regarding claim 11, 
Claim 11 is substantially identical to claim 4, and as such the rejections for claim 4 apply to claim 11 as well.
Regarding claim 12,
Claim 12 is substantially identical to claim 5, and as such the rejections for claim 5 apply to claim 12 as well.
Regarding claim 13, 
Claim 13 is substantially identical to claim 6, and as such the rejections for claim 6 apply to claim 13 as well.
Regarding claim 15, 
Claim 15 is substantially identical to claim 1, and as such the rejections for claim 1 apply to claim 15 as well.
Additionally, claim 15 adds the limitation of a non-transitory computer-readable medium comprising instructions thereon which when executed by a processor.
Lin teaches a non-transitory computer-readable medium comprising instructions thereon which when executed by a processor (Lin, in paragraph 0013, recites in part “A non-transitory computer-readable medium having program code recorded thereon for quantizing a floating point machine learning network to obtain a fixed point machine learning network using a quantizer when executed by a processor may include program code to select at least one moment of an input distribution of the floating point machine learning network.”)
Regarding claim 18, 
Claim 18 is substantially identical to claim 4, and as such the rejections for claim 4 apply to claim 18 as well.
Regarding claim 19, 
Claim 19 is substantially identical to claim 5, and as such the rejections for claim 5 apply to claim 19 as well.
Regarding claim 20, 
Claim 20 is substantially identical to claim 6, and as such the rejections for claim 6 apply to claim 20 as well.
Claims 2, 9, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Lin as applied to claim 1 above, and further in view of Ko et al. (“Adaptive weight compression for memory-efficient neural networks”).
Regarding claim 2,
Lin has taught the processor of claim 1 (as shown in the discussion of claim 1 above).
Lin teaches recalculate the distribution of ANN information; (Lin, in paragraph 0083, recites in part “At block 806, it is determined whether additional moments are available for the input distribution.   If so, blocks 802 and 804 may be repeated for each moment of the input distribution including, for example, the mean, the variance or other like moments of the input distribution.” Where block 802 is used to reselect (and recalculate) the distribution of weights (ANN information))
and reselect the quantization function from the set of quantization functions based on the recalculated distribution. (Lin, in paragraph 0083, recites in part “In block 804, quantizer parameters for quantizing values of the floating point machine learning network are determined based on the selected moment of the input distribution of the floating point machine learning network. At block 806, it is determined whether additional moments are available for the input distribution. If so, blocks 802 and 804 may be repeated for each moment of the input distribution including, for example, the mean, the variance or other like moments of the input distribution.”  Where determining the quantizer parameters for quantizing values on the reselected distribution will reselect from a set of quantization functions (the set of quantization functions being described by the quantizing parameters).)
However, Lin has not been shown to teach that the recalculation and reselection is based on a condition that the output does not sufficiently correlate with a known correct output.
Ko, in the same field of quantization of artificial neural network data, teaches further comprising circuitry configured to, on a condition that the output does not sufficiently correlate with a known correct output: (Ko, in section VII, recites in part “By adaptively determining the quantization level of JPEG based on the error-sensitivity of the weights, we show that the proposed approach achieves high compression ratio while preserving error-sensitive information.”  This shows that the system of Ko can detect when errors caused by quantization do not correlate with the correct output since it is aware of the error-sensitivity of weights.)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Lin with the teachings of Ko in order to be able to change the level of quantization based on decreases in output accuracy (the conclusion section of Ko recites in part “By adaptively determining the quantization level of JPEG based on the error-sensitivity of the weights, we show that the proposed approach achieves high compression ratio while preserving error-sensitive information.”).
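To make the combined limitation concrete, the conditional recalculation and reselection discussed above can be sketched as a loop that compares the output of the quantized network with a known correct output and, only when the two do not agree within a tolerance, recalculates the distribution and reselects a finer quantizer. This is an illustrative sketch, not the method of Lin or Ko: the single-neuron `forward` function, the tolerance, the halving rule, and all names are hypothetical.

```python
import statistics


def quantize(values, step):
    """Symmetric uniform quantization of a list of values."""
    return [round(v / step) * step for v in values]


def forward(weights, sample):
    """Hypothetical one-neuron network: a plain weighted sum."""
    return sum(w * x for w, x in zip(weights, sample))


def adapt_quantization(weights, samples, targets, tolerance=0.05, max_rounds=8):
    """Reselect the quantizer while outputs disagree with known correct outputs."""
    quantized, step = list(weights), 0.0
    for round_index in range(max_rounds):
        # Recalculate the distribution. In this simplified sketch the weights
        # never change, so the recomputed moment is the same on every round.
        std = statistics.pstdev(weights)
        # Reselect from the family of uniform quantizers (finer each round).
        step = (std / 2 ** round_index) if std > 0 else 1.0
        quantized = quantize(weights, step)
        errors = [abs(forward(quantized, s) - t) for s, t in zip(samples, targets)]
        if max(errors) <= tolerance:   # output sufficiently matches; stop
            break
    return quantized, step
```

A caller would pass full-precision weights, a few input samples, and the known correct outputs for those samples; the function returns the quantized weights and the step size that was finally selected.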
Regarding claim 9, 
Claim 9 is substantially identical to claim 2, and as such the rejections for claim 2 apply to claim 9 as well.
Regarding claim 16, 
Claim 16 is substantially identical to claim 2, and as such the rejections for claim 2 apply to claim 16 as well.
Claims 3, 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Lin as applied to claim 1 above, and further in view of Xu et al. (“Design Interpretable Neural Network Trees Through Self-Organized Learning of Features”).
Regarding claim 3,
Lin has taught the processor of claim 1, (see the discussion of claim 1 above) but does not teach wherein the ANN information comprises a set of training data.
Xu, in the same field of quantization of artificial neural network data, teaches wherein the ANN information comprises a set of training data. (Xu, in the beginning of section V, recites in part “Experiments were performed to verify the efficiency of NNTrees obtained from quantized datasets. Four datasets, dermatology, ecoli, ionosphere and liver are used in the experiments.” Where NNTrees is a type of neural network (Neural Network Trees) and where quantized datasets constitute quantized training data (Xu, in section VI, recites in part “Experimental results show that the NNTrees built from the quantized training data are as good as those obtained from the original data.”)).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Lin with the teachings of Xu with the motivation of being able to compress the input data using quantization without reducing the quality of the results (Xu, in the start of the conclusion and remarks section, recites in part “In this paper, we quantized the continuous inputs using self-organized learning in each dimension, and constructed the NNTrees from the quantized datasets. Experimental results show that the NNTrees built from the quantized training data are as good as those obtained from the original data.”).
Regarding claim 10, 
Claim 10 is substantially identical to claim 3, and as such the rejections for claim 3 apply to claim 10 as well.
Regarding claim 17, 
Claim 17 is substantially identical to claim 3, and as such the rejections for claim 3 apply to claim 17 as well.
Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Lin as applied to claim 1 above, and further in view of Khan (US 5,448,681).
Regarding claim 7,
Lin has taught the processor of claim 1 (see the discussion of claim 1 above).
Lin teaches recalculate the distribution of ANN information; (Lin, in paragraph 83, recites in part “In block 802, at least one moment of an input distribution of a floating point machine learning network is selected.” This selection of input distribution demonstrates that the distribution must have been recalculated.)
and reselect the quantization function from the set of quantization functions based on the recalculated distribution. (Lin, in paragraph 83, recites in part “In block 804, quantizer parameters for quantizing values of the floating point machine learning network are determined based on the selected moment of the input distribution of the floating point machine learning network.” If the heuristic from Khan is satisfied (by being the most accurate), then a new moment of the distribution may be selected, which changes the quantizer parameters and hence changes (reselects) the quantization function.)
However, Lin does not teach further comprising circuitry configured to apply a heuristic to the output and a known correct output, and on a condition that the heuristic is satisfied, to perform these steps.
Khan, in the same field of artificial neural networks, teaches further comprising circuitry configured to apply a heuristic to the output and a known correct output, and on a condition that the heuristic is satisfied, to: (Khan, from line 66 of column 5 to line 23 of column 6, recites in part “a plant control system 100 using a controller in accordance with the present invention includes an "unsupervised" action network 102, a critic network 104 and a plant 106 (e.g. robotic mechanism), connected substantially as shown. […] The state signal 110 represents the state of a quantitative plant performance parameter (e.g. current draw, motor speed, temperature, etc.). The performance signal 111 represents a selected qualitative plant performance parameter (e.g. "good/bad", "successful/failure", etc.). As discussed further below, the critic network 104, in selective accordance with the received plant state signal 110a ("Xr") and plant performance signal 111a ("R"), provides a reinforcement signal 112 ("R") to the unsupervised action network 102. Referring to FIG. 8A, an unsupervised action network 102 in accordance with the present invention comprises a three-layer neural network […]”
Where the critic network is applying a heuristic based on the difference between the output of a neural network and the known correct output (note that in the above quote signal 111 indicates whether signal 110a is a correct value, which indicates that the output of neural network 203 is also correct since unit 106 is producing the desired response based on the 108b output from 102 (see figure 7 below)). Further note that the critic network sends the reinforcement signal 112 to the neural network (the unsupervised action network 102) based on the heuristic judgment the critic network makes, hence covering the limitation of ‘and on a condition that the heuristic is satisfied, to’.)
Khan further teaches in claim 5 “A plant controller as recited in claim 1, wherein the critic network comprises a neural network,” indicating that the critic network is a neural network, which is a type of heuristic.
[Image: FIG. 7 of Khan (greyscale figure not reproduced)]



It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Lin with the teachings of Khan with the motivation of enhancing learning (Khan, in the abstract, recites in part “Learning is enhanced within the action network by using a neural network configured to operate according to unsupervised learning techniques based upon a Kohonen Feature Map. Learning is enhanced within the critic network by using a distance parameter which represents the difference between the actual and desired states of the quantitative performance, or output, of the plant when generating the reinforcement signal for the action network.”).
Regarding claim 14, 
Claim 14 is substantially identical to claim 7, and as such the rejections for claim 7 apply to claim 14 as well.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL EDWARD SHIPLEY whose telephone number is (408) 918-7530.  The examiner can normally be reached on Monday-Thursday and alternate Fridays 7:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Huang, Miranda can be reached on 571 270 7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/P.E.S./Examiner, Art Unit 2124                                                                                                                                                                                                        

/MIRANDA M HUANG/             Supervisory Patent Examiner, Art Unit 2124