DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 12/14/2021 has been entered.
 
Status of Claims
This action is in response to the amendment filed on 10/14/2021 for application 15/838,552 filed on 12/12/2017. Claims 1 – 20 are pending and have been examined.

The claim rejection based on 35 U.S.C. 112(a) has been withdrawn.

Part of the claim rejection based on 35 U.S.C. 112(b) has been withdrawn.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, 365(c), or 386(c) is acknowledged. Applicant has not complied with one or more conditions for receiving the benefit of an earlier filing date under 35 U.S.C. 119(e). The disclosures of the prior-filed provisional applications, Application Nos. 62/481,492 and 62/531,372, fail to provide adequate support or enablement in the manner provided by 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, for one or more claims of this application.
Claims 1, 9 and 16 recite a method of optimizing quantization in an artificial neural network (ANN). However, quantization and the steps of collecting statistics, determining a distribution and determining quantization are not mentioned in provisional application 62/481,492, and the limitations in Claims 1 and 16 that the scale factor indicates a custom range… and that the shift indicates where the custom range begins are not mentioned in provisional application 62/531,372. Therefore, Claims 1 and 16 and their dependent claims are not given the effective filing date of either provisional application. Claims 9 – 15 are given the effective filing date of provisional application 62/531,372.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 9/29/2021, 10/6/2021, 11/1/2021, and 12/14/2021 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Objections
Claim 9 is objected to because of the following informalities: Claim 9 recites the repeated word “determining determining” in line 8. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.



Claims 10 and 11 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Claim 10 recites the limitation “said input data”. There is insufficient antecedent basis for this limitation in the claim or in the claim from which it depends. For examination purposes, the examiner interprets the limitation as “input data”.

Claim 11 recites the limitation “said data”. More than one data is recited in the claim or in the claim from which it depends, so it is unclear which data is referenced. For examination purposes, the examiner interprets the limitation as “data”.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1 – 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.

As to Claim 1, under Step 1 of the Subject Matter Eligibility Test, the claimed method in Claim 1 as a whole falls within one or more statutory categories. However, Claim 1 as a whole does not provide sufficient evidence of an improvement to technology or functionality, and thus requires further analysis at Step 2A to determine whether the claim is directed to a judicial exception.
In the Subject Matter Eligibility Test Step 2A Prong One, the claimed method in Claim 1 recites the abstract ideas in the following limitations:
a. collecting histogram statistics during inference mode of operation by counting an activity level at a plurality of neurons in one or more layers of said ANN
b. based on said statistics collected, determining at least one histogram thereof and calculate a scale factor r and shift B
c. scale factor r indicate a custom range including an upper and lower bound whereby a full range of available quantization values are compressed and spread either linearly or nonlinearly throughout said custom range between said upper and lower bound
d. shift indicates a quantization value where said custom range begins
e. an original number of quantization values are reassigned to a narrower region compare to an original assignment to allow data between said upper and lower bounds to be represented by a larger number of values
f. quantizing data … utilizing said reassigned quantization values
The steps of collecting statistics and determining a histogram recite observation and evaluation, which fall within the Mental Processes group of abstract ideas and can practically be performed in the human mind with or without a physical aid. The steps of calculating and quantizing recite mathematical calculations and fall within the Mathematical Concepts group of abstract ideas. Limitations c., d., e. and f. recite mathematical relationships and calculations and likewise fall within the Mathematical Concepts group of abstract ideas. Thus, the claim recites a judicial exception in the form of abstract ideas and requires further analysis under Step 2A Prong Two.
In the Subject Matter Eligibility Test Step 2A Prong Two, Claim 1 recites the following additional elements along with the abstract ideas:
a. via a plurality of bin counters placed in one or more neurons in said ANN
b. said ANN implemented in a first circuit in a neural network processor integrated circuit (IC)
c. via a control circuit in said neural network processor IC
d. thereby reducing quantization error in said ANN by adjusting quantization to optimize the representation of the data actually observed by said ANN
The recited additional elements “via a plurality of bin counters” and “via a control circuit” are highly generic and are shown merely as black boxes in the disclosed drawing, which amounts to no more than a recitation of the words “apply it” (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a generic computer component. The additional elements of implementing the ANN in a neural network processor IC and placing the control circuit in the neural network processor IC are also highly generic and add only insignificant extra-solution activity to the judicial exception. Additional limitation d. recites an intended result and does not bear patentable weight. Thus, the additional elements in Claim 1 do not integrate the abstract idea into a practical application, and the claim as a whole is directed to the judicial exception and requires further analysis under Step 2B.
In the Subject Matter Eligibility Test Step 2B, the recited additional elements “via a plurality of bin counters” and “via a control circuit” are highly generic and are shown merely as black boxes in the disclosed drawing, which amounts to no more than a recitation of the words “apply it” (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a generic computer component. The additional elements of implementing the ANN in a neural network processor IC and placing the control circuit in the neural network processor IC are well-understood, routine, and conventional (Sze, Efficient Processing of Deep Neural Networks: A Tutorial and Survey, sec. V, para. 1, where recent hardware platforms have special features that target DNN processing … DNN inference has also been demonstrated on various embedded System-on-Chip (SoC)). Additional limitation d. recites an intended result and does not bear patentable weight. Thus, the additional elements in Claim 1 do not contribute an inventive concept, and Claim 1 is not eligible subject matter under 35 U.S.C. 101, as pointed out in the Step 2B analysis.
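
For illustration only, the arithmetic character of the limitations of Claim 1 identified under Step 2A Prong One can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the function names, the rule that the custom range spans the first to the last non-empty bin, and the definition of the scale factor r as the step between adjacent quantization values are all hypothetical choices made here.

```python
import math
from typing import List, Sequence, Tuple


def collect_histogram(values: Sequence[float], num_bins: int,
                      lo: float, hi: float) -> List[int]:
    """Bin counters: count observed values per equal-width bin over [lo, hi)."""
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        idx = int((v - lo) // width)
        # Values outside [lo, hi) are counted in the nearest edge bin.
        counts[min(max(idx, 0), num_bins - 1)] += 1
    return counts


def scale_and_shift(counts: Sequence[int], lo: float, hi: float,
                    levels: int) -> Tuple[float, float]:
    """Derive (r, B) from the occupied bins (assumed rule, see lead-in)."""
    width = (hi - lo) / len(counts)
    occupied = [i for i, c in enumerate(counts) if c > 0]
    if not occupied:
        raise ValueError("no data observed")
    lower = lo + occupied[0] * width          # lower bound of the custom range
    upper = lo + (occupied[-1] + 1) * width   # upper bound of the custom range
    r = (upper - lower) / (levels - 1)        # assumed: step between adjacent values
    return r, lower                           # shift B: where the custom range begins


def quantize(x: float, r: float, shift: float, levels: int) -> int:
    """Map x to the nearest of `levels` codes spread linearly over the custom range."""
    code = math.floor((x - shift) / r + 0.5)
    return min(max(code, 0), levels - 1)      # saturate values outside the range


observed = [2.5, 3.0, 3.5, 4.25, 5.5]
counts = collect_histogram(observed, num_bins=8, lo=0.0, hi=8.0)
r, B = scale_and_shift(counts, lo=0.0, hi=8.0, levels=5)
codes = [quantize(x, r, B, levels=5) for x in observed]
```

In this example the five quantization values would be spaced 2.0 apart if spread over the original range [0, 8], but are spaced 1.0 apart over the custom range [2, 6], so the observed data is represented by more values.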

As to Claim 2, which depends on Claim 1, Claim 2 recites the following additional abstract idea:
generating a distribution of … data input, weight distribution, data output
Generating a distribution or values is a mathematical calculation that falls within the Mathematical Concepts group of abstract ideas.
The additional element of “one or more layers” does not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrates the abstract idea into a practical application under Step 2A Prong Two nor contributes an inventive concept under Step 2B. Claim 2 is rejected under the same rationale as Claim 1.

As to Claim 3, which depends on Claim 1, Claim 3 recites the following additional abstract ideas:
calculating a … error
determining an updated … level
The determining can be performed in the human mind and falls within the Mental Processes group of abstract ideas. The calculating is a mathematical calculation that falls within the Mathematical Concepts group of abstract ideas.
The additional elements of quantization error and periodic utilization do not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrate the abstract idea into a practical application under Step 2A Prong Two nor contribute an inventive concept under Step 2B. Claim 3 is rejected under the same rationale as Claim 1.

As to Claim 4, which depends on Claim 1, Claim 4 recites the following additional abstract ideas:
analyzing
selecting … level that minimize … error
The analyzing and selecting can be performed in the human mind and fall within the Mental Processes group of abstract ideas.
The additional elements of data input, data output, weights, quantization level and quantization error do not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrate the abstract idea into a practical application under Step 2A Prong Two nor contribute an inventive concept under Step 2B. Claim 4 is rejected under the same rationale as Claim 1.

As to Claim 5, which depends on Claim 1, Claim 5 recites the following additional abstract idea:
assigning … value … linear or nonlinear manner … to reduce … error
Assigning values in a particular manner recites a mathematical relationship, which falls within the Mathematical Concepts group of abstract ideas.
The additional elements of bit values, range of quantized weights and input data do not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrate the abstract idea into a practical application under Step 2A Prong Two nor contribute an inventive concept under Step 2B. Claim 5 is rejected under the same rationale as Claim 1.

As to Claim 6, which depends on Claim 1, Claim 6 recites the following additional abstract idea:
generating … scaling and bias parameter
Generating a parameter is a mathematical calculation and falls within the Mathematical Concepts group of abstract ideas.
The additional elements of the scaling and bias parameters do not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrate the abstract idea into a practical application under Step 2A Prong Two nor contribute an inventive concept under Step 2B. Claim 6 is rejected under the same rationale as Claim 1.

As to Claim 7, which depends on Claim 1, Claim 7 recites the following additional abstract idea:
dropping … bits
Dropping bits is a mathematical calculation and falls within the Mathematical Concepts group of abstract ideas.
The additional elements of quantized weights, input data and memory utilization do not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrate the abstract idea into a practical application under Step 2A Prong Two nor contribute an inventive concept under Step 2B. Claim 7 is rejected under the same rationale as Claim 1.
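
As a hedged illustration of why dropping bits is characterized above as a mathematical calculation, the operation can be written as an integer shift. The sketch assumes that the low-order bits are the ones dropped; the claim as paraphrased here does not say which bits are dropped, and the function names are hypothetical.

```python
def drop_low_bits(code: int, n: int) -> int:
    """Discard the n least-significant bits of a non-negative integer code."""
    return code >> n


def restore(code: int, n: int) -> int:
    """Re-expand a shortened code; the dropped bits come back as zeros."""
    return code << n


original = 0b10110111                  # 183, an 8-bit value
stored = drop_low_bits(original, 3)    # kept in 5 bits instead of 8
approx = restore(stored, 3)            # value seen after re-expansion
error = original - approx              # always between 0 and 2**3 - 1
```

The stored value needs three fewer bits per entry, at the cost of an error of at most 2**n - 1 in the restored value.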

As to Claim 8, which depends on Claim 1, Claim 8 recites the following additional abstract idea:
input data value below said lower bound are set to a minimum value representation and above said upper bound are set to maximum value representation
This abstract idea recites a mathematical relationship, which falls within the Mathematical Concepts group of abstract ideas.
No additional elements are recited in this claim.

As to Claim 9, under Step 1 of the Subject Matter Eligibility Test, the claimed method in Claim 9 as a whole falls within one or more statutory categories. However, Claim 9 as a whole does not provide sufficient evidence of an improvement to technology or functionality, and thus requires further analysis at Step 2A to determine whether the claim is directed to a judicial exception.
In the Subject Matter Eligibility Test Step 2A Prong One, the claimed method in Claim 9 recites the abstract ideas in the following limitations:
a. collecting histogram statistics during inference mode of operation by counting an activity level at a plurality of neurons in one or more layers of said ANN
b. based on said statistics collected, determining determining at least one histogram thereof … a number of bits to drop from a value representation of data to yield a modified value representation
c. quantizing data … utilizing said reassigned quantization values
The steps of collecting histogram statistics and determining a histogram recite observation and evaluation, which fall within the Mental Processes group of abstract ideas and can practically be performed in the human mind with or without a physical aid. The steps of determining a number of bits to drop and quantizing recite mathematical calculations and fall within the Mathematical Concepts group of abstract ideas. Thus, the claim recites a judicial exception in the form of abstract ideas and requires further analysis under Step 2A Prong Two.
In the Subject Matter Eligibility Test Step 2A Prong Two, Claim 9 recites the following additional elements along with the abstract ideas:
a. via a plurality of bin counters placed in one or more neurons in said ANN
b. said ANN implemented in a first circuit in a neural network processor integrated circuit (IC)
c. via a control circuit in said neural network processor IC
d. configuring a memory system within said first circuit in said neural network processor IC implementing said ANN to operate with said modified value representation thereby improving memory utilization
e. thereby reducing quantization error in said ANN by adjusting quantization to optimize the representation of the data actually observed by said ANN
The recited additional elements “via a plurality of bin counters” and “via a control circuit” are highly generic and are shown merely as black boxes in the disclosed drawing, which amounts to no more than a recitation of the words “apply it” (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a generic computer component. The additional elements of implementing the ANN in a neural network processor IC and placing the control circuit in the neural network processor IC are also highly generic and add only insignificant extra-solution activity to the judicial exception. The additional element of configuring the memory system to operate within said ANN recites the operation of memory for storing and retrieving information, is recited at a high level of generality, and adds only insignificant extra-solution activity to the judicial exception. Additional limitation e. recites an intended result and does not bear patentable weight. Thus, the additional elements in Claim 9 do not integrate the abstract idea into a practical application, and the claim as a whole is directed to the judicial exception and requires further analysis under Step 2B.
In the Subject Matter Eligibility Test Step 2B, the recited additional elements “via a plurality of bin counters” and “via a control circuit” are highly generic and are shown merely as black boxes in the disclosed drawing, which amounts to no more than a recitation of the words “apply it” (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a generic computer component. The additional elements of implementing the ANN in a neural network processor IC and placing the control circuit in the neural network processor IC are also highly generic and add only insignificant extra-solution activity to the judicial exception. The additional element of configuring the memory system to operate within said ANN recites the operation of memory for storing and retrieving information, which is well-understood, routine, and conventional activity recognized by the courts (storing and retrieving information in memory, MPEP 2106.05(d)). Additional limitation e. recites an intended result and does not bear patentable weight. Thus, the additional elements in Claim 9 do not contribute an inventive concept, and Claim 9 is not eligible subject matter under 35 U.S.C. 101, as pointed out in the Step 2B analysis.
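
The step of determining a number of bits to drop based on collected statistics can likewise be illustrated with a short sketch. It is offered only as one possible reading: it assumes unsigned integer data, assumes the histogram is keyed by bit length, and assumes the bits to drop are the high-order bits that no observed value uses. None of these choices, nor the function names, come from the application.

```python
from typing import List, Sequence


def bit_length_histogram(values: Sequence[int], total_bits: int) -> List[int]:
    """Bin counters indexed by bit length: counts[k] = number of values needing k bits."""
    counts = [0] * (total_bits + 1)
    for v in values:
        if v < 0 or v.bit_length() > total_bits:
            raise ValueError(f"{v} does not fit in {total_bits} unsigned bits")
        counts[v.bit_length()] += 1
    return counts


def droppable_high_bits(counts: Sequence[int]) -> int:
    """Number of high-order bits never used by any observed value."""
    total_bits = len(counts) - 1
    used = max((k for k, c in enumerate(counts) if c > 0), default=0)
    return total_bits - used


observed = [3, 12, 25, 7]                        # all fit in 5 bits
counts = bit_length_histogram(observed, total_bits=8)
n_drop = droppable_high_bits(counts)             # bits that can be omitted from storage
```

Under these assumptions the observed data could be stored in 8 - n_drop bits per value without loss, which is the kind of memory saving the claim attributes to the modified value representation.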

As to Claim 10, which depends on Claim 9, Claim 10 recites the following additional abstract idea:
generating a histogram
Generating a histogram recites observation and evaluation, which fall within the Mental Processes group of abstract ideas.
The additional element of input data does not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrates the abstract idea into a practical application under Step 2A Prong Two nor contributes an inventive concept under Step 2B. Claim 10 is rejected under the same rationale as Claim 9.

As to Claim 11, which depends on Claim 9, Claim 11 recites the following additional abstract idea:
calculating a … error
Calculating an error recites a mathematical calculation and falls within the Mathematical Concepts group of abstract ideas.
The additional elements of quantization error, weight and input data do not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrate the abstract idea into a practical application under Step 2A Prong Two nor contribute an inventive concept under Step 2B. Claim 11 is rejected under the same rationale as Claim 9.

As to Claim 12, which depends on Claim 9, Claim 12 recites the additional element of “number of bits dropped from the value representation of data minimizing quantization error for an observed distribution of the data”. However, this additional element recites an intended result that does not bear patentable weight. Thus, Claim 12 neither integrates the abstract idea into a practical application under Step 2A Prong Two nor contributes an inventive concept under Step 2B. Claim 12 is rejected under the same rationale as Claim 9.

As to Claim 13, which depends on Claim 9, Claim 13 recites the following additional abstract idea:
assigning … value … linear or nonlinear manner … to reduce … error
Assigning values in a particular manner recites a mathematical relationship, which falls within the Mathematical Concepts group of abstract ideas.
The additional elements of bit values, range of quantized weights and input data do not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrate the abstract idea into a practical application under Step 2A Prong Two nor contribute an inventive concept under Step 2B. Claim 13 is rejected under the same rationale as Claim 9.

As to Claim 14, which depends on Claim 9, Claim 14 recites the following additional abstract idea:
generating … scaling and bias parameter
Generating a parameter is a mathematical calculation and falls within the Mathematical Concepts group of abstract ideas.
The additional elements of the scaling and bias parameters and quantization level do not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrate the abstract idea into a practical application under Step 2A Prong Two nor contribute an inventive concept under Step 2B. Claim 14 is rejected under the same rationale as Claim 9.

As to Claim 15, which depends on Claim 9, Claim 15 recites the following additional abstract idea:
input data value below a lower bound are set to a minimum value representation and above an upper bound are set to maximum value representation
This abstract idea recites a mathematical relationship, which falls within the Mathematical Concepts group of abstract ideas.
No additional elements are recited in this claim.

As to Claim 16, under Step 1 of the Subject Matter Eligibility Test, the claimed method in Claim 16 as a whole falls within one or more statutory categories. However, Claim 16 as a whole does not provide sufficient evidence of an improvement to technology or functionality, and thus requires further analysis at Step 2A to determine whether the claim is directed to a judicial exception.
In the Subject Matter Eligibility Test Step 2A Prong One, the claimed method in Claim 16 recites the abstract ideas in the following limitations:
a. collecting histogram statistics during inference mode of operation by counting … an activity level at a plurality of neurons in one or more layers of said ANN
b. based on said statistics collected, determining at least one histogram thereof and calculate a scale factor r and shift B as well as determining a number of bits to drop from a value representation of data to yield a modified value representation;
c. scale factor r indicate a custom range including an upper and lower bound whereby a full range of available quantization values are compressed and spread either linearly or nonlinearly throughout said custom range between said upper and lower bound
d. shift indicates a quantization value where said custom range begins
e. an original number of quantization values are reassigned … to a narrower region compare to an original assignment to allow data between said upper and lower bounds to be represented by a larger number of values
f. quantizing data … utilizing said reassigned quantization values
The steps of collecting histogram statistics and determining a histogram recite observation and evaluation, which fall within the Mental Processes group of abstract ideas and can practically be performed in the human mind with or without a physical aid. The steps of calculating, determining and quantizing recite mathematical calculations and fall within the Mathematical Concepts group of abstract ideas. Limitations c., d. and e. recite mathematical relationships and likewise fall within the Mathematical Concepts group of abstract ideas. Thus, the claim recites a judicial exception in the form of abstract ideas and requires further analysis under Step 2A Prong Two.
In the Subject Matter Eligibility Test Step 2A Prong Two, Claim 16 recites the following additional elements along with the abstract ideas:
a. via a plurality of bin counters placed in one or more neurons in said ANN
b. said ANN implemented in a first circuit in a neural network processor integrated circuit (IC)
c. via a control circuit in said neural network processor IC
d. configuring a memory system within said first circuit in said neural network processor IC implementing said ANN to operate with said modified value representation thereby improving memory utilization
e. thereby reducing quantization error in said ANN by adjusting quantization to optimize the representation of the data actually observed by said ANN
The recited additional elements “via a plurality of bin counters” and “via a control circuit” are highly generic and are shown merely as black boxes in the disclosed drawing, which amounts to no more than a recitation of the words “apply it” (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a generic computer component. The additional elements of implementing the ANN in a neural network processor IC and placing the control circuit in the neural network processor IC are also highly generic and add only insignificant extra-solution activity to the judicial exception. The additional element of configuring the memory system to operate within said ANN recites the operation of memory for storing and retrieving information, is recited at a high level of generality, and adds only insignificant extra-solution activity to the judicial exception. Additional limitation e. recites an intended result and does not bear patentable weight. Thus, the additional elements in Claim 16 do not integrate the abstract idea into a practical application, and the claim as a whole is directed to the judicial exception and requires further analysis under Step 2B.
In the Subject Matter Eligibility Test Step 2B, the recited additional elements “via a plurality of bin counters” and “via a control circuit” are highly generic and are shown merely as black boxes in the disclosed drawing, which amounts to no more than a recitation of the words “apply it” (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a generic computer component. The additional elements of implementing the ANN in a neural network processor IC and placing the control circuit in the neural network processor IC are also highly generic and add only insignificant extra-solution activity to the judicial exception. The additional element of configuring the memory system to operate within said ANN recites the operation of memory for storing and retrieving information, which is well-understood, routine, and conventional activity recognized by the courts (storing and retrieving information in memory, MPEP 2106.05(d)). Additional limitation e. recites an intended result and does not bear patentable weight. Thus, the additional elements in Claim 16 do not contribute an inventive concept, and Claim 16 is not eligible subject matter under 35 U.S.C. 101, as pointed out in the Step 2B analysis.

As to Claim 17, which depends on Claim 16, Claim 17 recites the following additional abstract idea:
generating a histogram
Generating a histogram recites observation and evaluation, which fall within the Mental Processes group of abstract ideas.
The additional element of input data does not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrates the abstract idea into a practical application under Step 2A Prong Two nor contributes an inventive concept under Step 2B. Claim 17 is rejected under the same rationale as Claim 16.

As to Claim 18, which depends on Claim 16, Claim 18 recites the following additional abstract ideas:
analyzing
selecting … level that minimizes … error
The analyzing and selecting can be performed in the human mind and fall within the Mental Processes group of abstract ideas.
The additional elements of input data, quantization level and quantization error do not add a meaningful limitation beyond generally linking the use of the judicial exception to a particular technological environment or field of use at a high level of generality, and thus neither integrate the abstract idea into a practical application under Step 2A Prong Two nor contribute an inventive concept under Step 2B. Claim 18 is rejected under the same rationale as Claim 16.

As to Claim 19, which depends on Claim 16, Claim 19 recites the following additional abstract idea:
dropping one or more bits used to represent quantized weights and/or input data
The step of dropping recites a mathematical calculation and falls within the Mathematical Concepts group of abstract ideas.
The additional element “in an attempt to improve memory utilization” recites an intended result and does not bear patentable weight.

As to Claim 20, which depends on Claim 16, Claim 20 recites the following additional abstract idea:
input data value below said lower bound are set to a minimum value representation and above said upper bound are set to maximum value representation
This abstract idea recites a mathematical relationship, which falls within the Mathematical Concepts group of abstract ideas.
No additional elements are recited in this claim.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.



Claims 1 – 20 are rejected under 35 U.S.C. 103 as being unpatentable over Pan et al., US 10,789,734, Method and Device for Data Quantization, 2017, in view of Lee, Energy-Efficient Hybrid Stochastic-Binary Neural Networks for Near-Sensor Computing, Design, Automation & Test in Europe Conference & Exhibition, March 2017, and further in view of Lin, Fixed Point Quantization of Deep Convolutional Networks, arXiv, June 2016.

Regarding Claim 1, Pan discloses: a method of optimum quantization in an artificial neural network (ANN) (Pan, abs. ln. 1, where method for data quantization; fig. 5, where in a neural network), comprising: 
Collecting histogram statistics (Pan. fig. 3, step. S310, where the distribution [statistics] of data received in step S305 is collected and calculated) during an inference mode of operation (Pan. Col 5, para. 2, where quantization is not performed in the training or verification mode. Instead, using the actual input data to determine quantization, i.e., in the inference mode) by counting … an activity level at a plurality of neurons in one or more layers of said ANN (Pan. Col. 5, ln. 45 – 45 where the mean u is calculated based on the number of data [an activity level] output from neuron of prior layer [plurality of neurons in one or more layers]); 
based on said statistics collected, determining at least one histogram thereof (Pan, col 5, ln 41 – 42, where in some embodiment, the distribution of the data to be quantized [one histogram] may be like a Gaussian distribution) and calculating a scale factor r (Pan. eq. 1, where ∆0 [scale factor]; ∆ is calculated based on the statistics σ’ and a and is used to scale input x into the quantized value as shown in eq. 10) and shift B (Pan. eq. 10, where $\mathrm{floor}\left(\frac{x_i}{\Delta} + 0.5\right) = \mathrm{floor}\left(\frac{x_i + 0.5\Delta}{\Delta}\right)$, 0.5∆ [shift B] shifts the input xi before calculating the floor function); 
wherein said scale factor r indicates a custom range including an upper and lower bound whereby a full range of available quantization values (Pan, fig. 4, step. 420 and 425, where the output of prior layer is quantized by all of the quantization steps [full range of available quantization values] of the prior layer and is quantized at step 425 for the next layer) are compressed and spread either linearly or nonlinearly throughout said custom range between said upper and lower bound (Pan. eq. 10 and col. 6, ln. 48 – 52, where full range of input xi is compressed within a range [custom range] between an upper bound and a lower bound; the upper bound can be derived from eq. 10, the bound value of xi is when xi>=0 and $\mathrm{floor}\left(\frac{x_i}{\Delta} + 0.5\right) = \frac{M-1}{2}$, any xi value larger than this will be set to this value by the min function; the lower bound can also be derived from eq. 10, the bound value of xi is when xi<0 and $\mathrm{floor}\left(\frac{-x_i}{\Delta} + 0.5\right) = \frac{M-1}{2}$, any xi value smaller than this value will be set to this value by the min function; the uniform quantization method spreads quantization steps by equal step size, i.e., spreads linearly); 
wherein said shift indicates a quantization value where said custom range begins (Pan, eq. 10 where the minimum bound value of xi [beginning of custom range] is calculated when xi<0 and $\mathrm{floor}\left(\frac{-x_i + 0.5\Delta}{\Delta}\right) = \frac{M-1}{2}$; i.e., by the shift of 0.5∆ [shift]); 
wherein an original number of quantization values are reassigned to a narrower region compared to an original assignment to allow data between said upper and lower bounds to be represented by a larger number of values (Pan, fig. 4, where in steps 420 and 425, the second optimization is performed. In the instance that the optimized second step size is smaller [narrower region] than the original step size, the input within the same bounds is represented by more [larger number] quantization steps [quantization values]) thereby improving efficiency and reducing quantization error (Pan, eq. 6, the target of optimization is to reduce the quantization error and thus improve the inference efficiency); 
and quantizing data in said ANN utilizing said reassigned quantization values (Pan. fig 4, step 430 where perform the fixed-point processing [quantizing data] in the neural network using the optimized quantization steps [reassigned quantization values]), thereby reducing quantization error in said ANN by adjusting quantization to optimize the representation of the data actually observed by said ANN (Pan, col. 5, ln. 61 – 62, where the optimal quantization step size can be determined when a quantization error function is minimized so that the data representation is optimized).
Pan does not explicitly disclose: 
Collecting histogram statistics … via a plurality of bin counters placed in one or more neurons in said ANN,
said ANN implemented in a first circuit in a neural network processor integrated circuit (IC)
calculating a scale factor r and shift B via a control circuit in said neural network processor IC
compressed and spread … via said control circuit in said neural network processor IC
reassigned via said control circuit in said neural network processor IC
quantizing data in said ANN via said first circuit in said neural network processor IC 
Lee explicitly discloses: 
Collecting histogram statistics … via a plurality of bin counters placed in one or more neurons in said ANN (Lee, fig. 3, where in the Microarchitecture of the neurons, the binary counters [bin counter] are at the output of each neuron; the statistics of the output of neurons are collected through the bin counters of each neuron),
said ANN implemented in a first circuit in a neural network processor integrated circuit (IC) (Lee, fig. 3, where the Microarchitecture of the Stochastic Convolution Engine Array [ANN; neural network processor integrated circuit])
Pan and Lee both disclose neural network implementation and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Pan’s teaching of an optimization method for quantization with Lee’s teaching of a physical realization of an energy-efficient neural network architecture to reach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to gain energy efficiency savings and application level accuracies (Lee, intro, ln. 18 – 21). 
Pan in view of Lee does not explicitly disclose:
 calculating a scale factor r and shift B via a control circuit in said neural network processor IC
compressed and spread … via said control circuit in said neural network processor IC
reassigned via said control circuit in said neural network processor IC
quantizing data in said ANN via said first circuit in said neural network processor IC 
Lin explicitly discloses:
calculating a scale factor r and shift B via a control circuit in said neural network processor IC; compressed and spread … via said control circuit in said neural network processor IC ; reassigned via said control circuit in said neural network processor IC ; quantizing data in said ANN via said first circuit in said neural network processor IC (Lin, para. 0089, where the various illustrative logical blocks, modules and circuits described … may be implemented or performed with … an application specific integrated circuit ASIC [control circuit in said neural network processor IC] … discrete hardware component or any combination)
Pan (in view of Lee) and Lin both disclose quantization implementation in machine learning applications and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Pan (in view of Lee)’s teaching of optimized quantization and a neural network processor with Lin’s teaching of a hardware implementation of a quantization controller within an application processor to reach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to optimize for the overall design constraints for a particular application (Lin, para. 0095, ln. 20 – 23). 
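
For illustration only, the reading of Pan eq. 10 applied to claim 1 above can be sketched in Python. This is a minimal sketch under stated assumptions and is not Pan's implementation: the function name quantize_uniform, the symmetric form sign(x) · ∆ · min(floor(|x|/∆ + 0.5), (M − 1)/2), and the sample values of ∆ and M are hypothetical. The sample inputs are chosen so that the arithmetic is exact in binary floating point.

```python
import math

def quantize_uniform(x, delta, M):
    """Uniform quantizer sketch: round |x|/delta to the nearest level, clip at (M-1)/2.

    Assumed form (not quoted from Pan):
    sign(x) * delta * min(floor(|x|/delta + 0.5), (M-1)/2)
    """
    max_level = (M - 1) // 2                   # largest level index on each side of zero
    level = math.floor(abs(x) / delta + 0.5)   # round to nearest quantization step
    level = min(level, max_level)              # the min function clips out-of-range inputs
    return math.copysign(level * delta, x)

delta, M = 0.25, 7                             # hypothetical step size and level count

# The 0.5*delta term acts as a shift applied to the input before the floor:
for x in (0.0, 0.0625, 0.125, 0.3125, 0.5, 1.0):
    assert math.floor(x / delta + 0.5) == math.floor((x + 0.5 * delta) / delta)

print(quantize_uniform(0.3, delta, M))    # 0.25
print(quantize_uniform(5.0, delta, M))    # 0.75 (clipped to the upper bound)
print(quantize_uniform(-5.0, delta, M))   # -0.75 (clipped to the lower bound)
```

With these assumed values, every input above 0.75 or below -0.75 maps to the respective bound, which is the clipping behavior the rejection attributes to the min function.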

Regarding Claim 2, depending on Claim 1. Pan further discloses: wherein collecting statistics comprises generating a distribution of at least one of data input to said one or more layers layer, weight distribution of said one or more layers (Pan, col 8, ln. 28 – 35, where according to the distribution of the original data [data input] which includes original weight data), and data output of said one or more layers (Pan, eq. 4, where output Q(xi) [data output] is observed in the determining of the quantization step size).

Regarding Claim 3, depending on Claim 1. Pan further discloses: calculating a quantization error after available quantization values are compressed, whereby said quantization error is utilized in periodically determining an updated quantization level (Pan, eq. 4 – 6 & col. 6, ln 3 – 5, where the optimal quantization step size is obtained by iteratively using the initial quantization step size (image: media_image1.png) and the quantization error function. The quantization step size determines the available quantization steps [compress available quantization value] to map the input to bit values. Examiner does not find a specific definition of quantization level in the original disclosure. In light of para. 0096 of the specification, examiner interprets the quantization levels as the input value steps, i.e., 1/16, 2/16, … 16/16 in the example of 16 steps of normalized input value between 0 and 1, that map to each bit value. The input value steps are determined by the quantization step size. Optimizing the quantization step size also optimizes the input value steps [quantization level]).
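
The iterative selection of a step size from an initial step size and an error function, as discussed for claim 3, can be illustrated with a generic search. This is only a sketch under assumptions: Pan's eq. 4 – 6 are not reproduced here, mean squared error is assumed as the error function, and the names quantize, quantization_error, and best_step_size, the search schedule, and the sample data are all hypothetical.

```python
import math

def quantize(x, delta, M):
    # Assumed uniform quantizer: round to the nearest step, clip at (M-1)/2 levels.
    level = min(math.floor(abs(x) / delta + 0.5), (M - 1) // 2)
    return math.copysign(level * delta, x)

def quantization_error(data, delta, M):
    # Mean squared error between inputs and their quantized values (assumed error function).
    return sum((x - quantize(x, delta, M)) ** 2 for x in data) / len(data)

def best_step_size(data, delta0, M, num_iters=30):
    """Start from delta0, try nearby step sizes, and keep whichever lowers the error."""
    best, best_err = delta0, quantization_error(data, delta0, M)
    rel = 0.5                                  # relative size of the trial perturbation
    for _ in range(num_iters):
        improved = False
        for cand in (best * (1 - rel), best * (1 + rel)):
            err = quantization_error(data, cand, M)
            if err < best_err:
                best, best_err, improved = cand, err, True
        if not improved:
            rel *= 0.5                         # narrow the search when neither trial helps
    return best, best_err

data = [i / 10 - 1 for i in range(21)]         # hypothetical observed values in [-1, 1]
delta, err = best_step_size(data, delta0=1.0, M=15)
print(delta, err)
```

The returned error is never larger than the error at the initial step size, since a candidate replaces the current best only when it strictly lowers the error.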

Regarding Claim 4, depending on Claim 1. Pan further discloses: analyzing at least one of data input to said at least one layer, data output from said at least one layer, and weights in said at least one layer (Pan, col 8, ln. 28 – 35, where the optimal quantization step is determined according to the analysis of the distribution of original data [data input] which may comprise original weight data; eq. 4, where data output Q(xi) is used in analyzing the quantization error in order to determine the quantization step size) and selecting a quantization level that minimizes a quantization error of output from said at least one layer (Pan, eq. 4 – 6, col 6, ln. 19 – 23, where the optimal quantization step size is determined by minimizing the quantization error function of output; Examiner does not find a specific definition of quantization level in the original disclosure. In light of para. 0096 of the specification, examiner interprets the quantization levels as the input value steps, i.e., 1/16, 2/16, … 16/16 in the example of 16 steps of normalized input value between 0 and 1, that map to each bit value. The input value steps are determined by the quantization step size. Optimizing the quantization step size also optimizes the input value steps [quantization level]).

Regarding Claim 5, depending on Claim 1, Pan further discloses: applying said compressed quantization values comprises assigning bit values over a smaller range compared to an original assignment of quantized weights and/or input data (Pan, fig. 4, where in steps 420 and 425, the second optimization is performed over the original quantized output. In the instance that the optimized second step size ∆ is smaller than the original step size, each bit value is assigned to a smaller range ∆ of input values than the original assignment) in an attempt to reduce quantization error (Pan, eq. 6, the target of optimization is to reduce the quantization error).
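
The step-size reasoning for claim 5 is simple arithmetic and can be checked with a short sketch. The bounds, step sizes, and the helper name levels_in_range below are hypothetical and are not taken from Pan; the point is only that, for fixed bounds, a smaller step size places more quantization values between them.

```python
def levels_in_range(lower, upper, delta):
    """Count the uniformly spaced values lower, lower + delta, ... that fit in [lower, upper]."""
    return int((upper - lower) // delta) + 1

lower, upper = -1.0, 1.0        # hypothetical lower and upper bounds
original_delta = 0.5            # hypothetical original step size
optimized_delta = 0.25          # hypothetical smaller, optimized step size

print(levels_in_range(lower, upper, original_delta))    # 5
print(levels_in_range(lower, upper, optimized_delta))   # 9
```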

Regarding Claim 6, Pan further discloses: generating at least one of a scaling and bias parameter which is used in determining a quantization level (Pan, eq. 1 & 2, where Cscaling [scaling] and u [bias] are used in the optimization process; Examiner does not find a specific definition of quantization level in the original disclosure. In light of para. 0096 of the specification, examiner interprets the quantization levels as the input value steps, i.e., 1/16, 2/16, … 16/16 in the example of 16 steps of normalized input value between 0 and 1, that map to each bit value. The input value steps are determined by the quantization step size. Optimizing the quantization step size also optimizes the input value steps [quantization level]).

Regarding Claim 7, Pan further discloses: dropping one or more bits used to represent quantized (Pan, eq. 11 & eq. 12 where the number of bits representing quantization steps [quantization value] is m+n+1, which is derived by the quantization step size ∆; in the instance that the number of bits is lower than the original, the system drops one or more bits) weights and/or input data (Pan, fig. 4, where in step 415, original data is quantized; col 8, ln. 28 – 35, where original data including original image data [input data] … may further comprise original weight data) in an attempt to improve memory utilization (Pan, col. 5, ln. 5, where storage [memory] capacity is reduced so that storage utilization improves). 

Regarding Claim 8, Pan further discloses: input data values below said lower bound are set to a minimum value representation (Pan, eq. 10, for xi<0, the minimum bound value of xi is when $\mathrm{floor}\left(\frac{x_i}{\Delta} + 0.5\right) = \frac{M-1}{2}$; any xi value smaller than this will be set to this value by the min function, i.e., set to the minimum bit value [minimum value representation]) and above said upper bound are set to a maximum value representation (Pan, eq. 10, for xi>=0, the maximum bound value of xi is when $\mathrm{floor}\left(\frac{x_i}{\Delta} + 0.5\right) = \frac{M-1}{2}$; any xi value larger than this will be set to this value by the min function, i.e., set to the maximum bit value [maximum value representation]).

Regarding Claim 9, Pan discloses: a method of optimum quantization in an artificial neural network (ANN) (Pan, abs. ln. 1, where method for data quantization; fig. 5, where in a neural network), comprising: 
collecting histogram statistics (Pan. fig. 3, step. S310, where the distribution [statistics] of data received in step S305 is collected and calculated) during an inference mode of operation (Pan. Col 5, para. 2, where quantization is not performed in the training or verification mode. Instead, using the actual input data to determine quantization, i.e., in the inference mode) by counting an activity level at a plurality of neurons in one or more layers of said ANN (Pan. Col. 5, ln. 45 – 45 where the mean u is calculated based on the number of data [an activity level] output from neuron of prior layer [plurality of neurons in one or more layers]); 
based on said statistics collected, determining at least one histogram thereof (Pan, col 5, ln 41 – 42, where in some embodiment, the distribution of the data to be quantized [one histogram] may be like a Gaussian distribution) and … a number of bits to drop from a value representation of data to yield a modified value representation (Pan, eq. 11 & eq. 12 where the number of bits representing quantization steps [value representation] is m+n+1, which is derived by the quantization step size ∆ optimized [determined] in the process; in the instance that the optimized number of bits is lower than the original, the system determines a number of bits to be dropped);
to operate with said modified value representation thereby improving memory utilization (Pan, col 5, ln. 3 – 8, where to improve calculation speed and reduce storage capacity [improve memory utilization])
quantizing data in said ANN … utilizing said modified value representation (Pan. fig 4, step 430 where perform the fixed-point processing [quantizing data] in the neural network using the optimized quantization steps [modified value representation]), thereby reducing quantization error in said ANN by adjusting quantization to optimize the representation of the data actually observed by said ANN (Pan, col. 5, ln. 61 – 62, where the optimal quantization step size [adjusting quantization; optimize the representation of the data observed] can be determined when a quantization error function is minimized so that the data representation is optimized).
Pan does not explicitly disclose: 
Collecting histogram statistics … via a plurality of bin counters placed in one or more neurons in said ANN,
said ANN implemented in a first circuit in a neural network processor integrated circuit (IC)
via a control circuit in said neural network processor IC, a number of bits to drop from a value representation of data to yield a modified value representation
configuring a memory system within said first circuit in said neural network processor IC implementing said ANN to operate with said modified value representation thereby improving memory utilization
quantizing data in said ANN via said first circuit in said neural network processor IC 
Lee explicitly discloses: 
Collecting histogram statistics … via a plurality of bin counters placed in one or more neurons in said ANN (Lee, fig. 3, where in the Microarchitecture of the neurons, the binary counters [bin counter] are at the output of each neuron; the statistics of the output of neurons are collected through the bin counters of each neuron),
said ANN implemented in a first circuit in a neural network processor integrated circuit (IC) (Lee, fig. 3, where the Microarchitecture of the Stochastic Convolution Engine Array [ANN; neural network processor integrated circuit])
Pan and Lee both disclose neural network implementation and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Pan’s teaching of an optimization method for quantization with Lee’s teaching of a physical realization of an energy-efficient neural network architecture to reach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to gain energy efficiency savings and application level accuracies (Lee, intro, ln. 18 – 21). 
Pan in view of Lee does not explicitly disclose:
via a control circuit in said neural network processor IC, a number of bits to drop from a value representation of data to yield a modified value representation,
configuring a memory system within said first circuit in said neural network processor IC implementing said ANN to operate with said modified value representation thereby improving memory utilization
quantizing data in said ANN via said first circuit in said neural network processor IC 
Lin explicitly discloses:
via a control circuit in said neural network processor IC, a number of bits to drop from a value representation of data to yield a modified value representation; configuring a memory system within said first circuit in said neural network processor IC implementing said ANN to operate with said modified value representation thereby improving memory utilization; quantizing data in said ANN via said first circuit in said neural network processor IC (Lin, fig. 2, where configuration process 214 configure memory 206, 204 and 212; para. 0089, where the various illustrative logical blocks, modules and circuits described … may be implemented or performed with … an application specific integrated circuit ASIC [control circuit in said neural network processor IC] … discrete hardware component or any combination)
Pan (in view of Lee) and Lin both disclose quantization implementation in machine learning applications and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Pan (in view of Lee)’s teaching of optimized quantization and a neural network processor with Lin’s teaching of a hardware implementation of a quantization controller within an application processor to reach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to optimize for the overall design constraints for a particular application (Lin, para. 0095, ln. 20 – 23).

Regarding Claim 10, depending on Claim 9, Lin further discloses: wherein collecting statistics comprises generating a histogram of said input data and/or output data (Lin, fig. 4 & para. 0063, where distribution graph [histogram] of input; fig. 5B where distribution graph [histogram] of activation output).
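
The claimed histogram generation can be pictured as a set of bin counters that are incremented as values are observed. The following is a minimal software sketch under assumptions, not a description of Lin's figures or of Lee's hardware counters: the function name collect_histogram, the equal-width binning, the handling of out-of-range values, and the sample activations are hypothetical.

```python
def collect_histogram(values, lower, upper, num_bins):
    """Count how many values fall into each of num_bins equal-width bins over [lower, upper)."""
    counters = [0] * num_bins                   # one counter per bin
    width = (upper - lower) / num_bins
    for v in values:
        idx = int((v - lower) // width)
        idx = max(0, min(idx, num_bins - 1))    # out-of-range values land in the edge bins
        counters[idx] += 1
    return counters

activations = [0.05, 0.1, 0.3, 0.55, 0.6, 0.9, 1.2, -0.2]   # hypothetical neuron outputs
print(collect_histogram(activations, 0.0, 1.0, 4))          # [3, 1, 2, 2]
```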

Regarding Claim 11, depending on Claim 9, Pan further discloses: calculating a quantization error after quantization of weights, said data, or both said weights and data (Pan, eq. 4 – 6 & col. 6, ln 3 – 5, where the quantization error is calculated after applying the initial quantization step size (image: media_image1.png)).

Regarding Claim 12, depending on Claim 9, Pan further discloses: wherein the number of bits dropped from the value representation of data minimizes quantization error for an observed distribution of the data (Pan, eq. 6 & col 6, ln. 3 – 4, where the error is minimized so that the optimal quantization step size ∆ is obtained; eq. 11 – 12, where the optimum bit size is derived from the optimized quantization step size ∆, and thus the number of bits dropped from the original bit size is the optimum for the observed data).

Regarding Claim 13, depending on Claim 9, Pan further discloses: assigning bit values over a smaller range of quantized weights and/or input data compared to an original assignment in either a linear or nonlinear manner (Pan, fig. 4, where in steps 420 and 425, the second optimization is performed over the original quantized output. In the instance that the optimized second step size ∆ is smaller than the original step size, each bit value is assigned to a smaller range ∆ of input values than the original assignment) in an attempt to reduce quantization error (Pan, eq. 6, the target of optimization is to reduce the quantization error).

Regarding Claim 14, depending on Claim 9, Pan further discloses: wherein said quantization comprises generating at least one of a scaling and bias parameter which is used in determining said quantization (Pan, eq. 1 & 2, where Cscaling [scaling] and u [bias] are used in the optimization process).

Regarding Claim 15, depending on Claim 9, Pan further discloses: wherein input data values below a lower bound are set to a minimum value representation (Pan, eq. 10, for xi<0, the minimum bound value of xi is when $\mathrm{floor}\left(\frac{x_i}{\Delta} + 0.5\right) = \frac{M-1}{2}$; any xi value smaller than this will be set to this value by the min function, i.e., set to the minimum bit value [minimum value representation]) and above an upper bound are set to a maximum value representation (Pan, eq. 10, for xi>=0, the maximum bound value of xi is when $\mathrm{floor}\left(\frac{x_i}{\Delta} + 0.5\right) = \frac{M-1}{2}$; any xi value larger than this will be set to this value by the min function, i.e., set to the maximum bit value [maximum value representation]).

Regarding Claim 16, Pan discloses: a method of optimum quantization in an artificial neural network (ANN) (Pan, abs. ln. 1, where method for data quantization; fig. 5, where in a neural network), comprising: 
collecting histogram statistics (Pan, fig. 3, step S310, where the distribution [statistics] of the data received in step S305 is collected and calculated) during an inference mode of operation (Pan, col. 5, para. 2, where quantization is not performed in the training or verification mode; instead, the actual input data is used to determine quantization, i.e., in the inference mode) by counting an activity level at a plurality of neurons in one or more layers of said ANN (Pan, col. 5, ln. 45 – 45, where the mean u is calculated based on the number of data [an activity level] output from the neurons of the prior layer [plurality of neurons in one or more layers]); 
based on said statistics collected, determining at least one histogram thereof and calculating a scale factor r (Pan, eq. 1, where ∆0 [scale factor]; ∆ is calculated based on the statistics σ’ and a and is used to scale input x into the quantized value as shown in eq. 10) and shift B (Pan, eq. 10, where $\operatorname{floor}\left(\frac{x_i}{\Delta} + 0.5\right) = \operatorname{floor}\left(\frac{x_i + 0.5\Delta}{\Delta}\right)$, and $0.5\Delta$ [shift B] shifts the input xi before the floor function is calculated) as well as determining a number of bits to drop from a value representation of data to yield a modified value representation (Pan, eq. 11 & eq. 12, where the number of bits representing quantization steps [value representation] is m+n+1, which is derived from the quantization step size $\Delta$ optimized [determined] in the process; in the instance that the optimized number of bits is lower than the original, the system determines a number of bits to be dropped); 
wherein said scale factor r indicates a custom range including an upper and lower bound whereby a full range of available quantization values (Pan, fig. 4, steps 420 and 425, where the output of the prior layer is quantized by all of the quantization steps [full range of available quantization values] of the prior layer and is quantized at step 425 for the next layer) are compressed and spread either linearly or nonlinearly throughout said custom range between said upper and lower bound (Pan, eq. 10 and col. 6, ln. 48 – 52, where the full range of input xi is compressed within a range [custom range] between an upper bound and a lower bound; the upper bound can be derived from eq. 10, where the bound value of xi is reached when xi>=0 and $\operatorname{floor}\left(\frac{x_i}{\Delta} + 0.5\right) = \frac{M-1}{2}$, and any xi value larger than this will be set to this value by the min function; the lower bound can also be derived from eq. 10, where the bound value of xi is reached when xi<0 and $\operatorname{floor}\left(\frac{-x_i}{\Delta} + 0.5\right) = \frac{M-1}{2}$, and any xi value smaller than this value will be set to this value by the min function; the uniform quantization method spreads the quantization steps by an equal step size, i.e., spreads them linearly); 
wherein said shift indicates a quantization value where said custom range begins (Pan, eq. 10, where the minimum bound value of xi [custom range begins] is calculated when xi<0 and $\operatorname{floor}\left(\frac{-x_i + 0.5\Delta}{\Delta}\right) = \frac{M-1}{2}$, i.e., based on the shift $0.5\Delta$); 
to operate with said modified value representation thereby improving memory utilization (Pan, col. 5, ln. 3 – 8, where calculation speed is improved and storage capacity is reduced [improve memory utilization]);
wherein an original number of quantization values are reassigned … to a narrower region compared to an original assignment to allow data between said upper and lower bounds to be represented by a larger number of values thereby improving efficiency and reducing quantization error (Pan, fig. 4, where the second optimization is performed in steps 420 and 425; in the instance that the optimized second step size is smaller [narrower region] than the original step size, the input within the same bounds is represented by more [larger number] quantization steps [quantization values]; Pan, eq. 6, where the target of the optimization is to reduce the quantization error and thus improve the inference efficiency); 
and quantizing data in said ANN … utilizing said modified value representation (Pan, fig. 4, step 430, where the fixed-point processing [quantizing data] is performed in the neural network using the optimized quantization steps [modified value representation]), thereby reducing quantization error in said ANN by adjusting quantization to optimize the representation of the data actually observed by said ANN (Pan, col. 5, ln. 61 – 62, where the optimal quantization step size [adjusting quantization; optimize the representation of the data observed] can be determined when a quantization error function is minimized, so that the data representation is optimized).
Pan does not explicitly disclose: 
Collecting histogram statistics … via a plurality of bin counters placed in one or more neurons in said ANN,
said ANN implemented in a first circuit in a neural network processor integrated circuit (IC)
calculating … as well as determining … to yield a modified value representation via a control circuit in said neural network processor IC
configuring a memory system within said first circuit in said neural network processor IC implementing said ANN to operate with said modified value representation thereby improving memory utilization
the original number of quantization values are reassigned via said first circuit in said neural network processor IC
quantizing data in said ANN via said first circuit in said neural network processor IC 
Lee explicitly discloses: 
Collecting histogram statistics … via a plurality of bin counters placed in one or more neurons in said ANN (Lee, fig. 3, where in the Microarchitecture of the neurons, the binary counters [bin counter] are at the output of each neuron; the statistics of the output of neurons are collected through the bin counters of each neuron),
said ANN implemented in a first circuit in a neural network processor integrated circuit (IC) (Lee, fig. 3, where the Microarchitecture of the Stochastic Convolution Engine Array [ANN; neural network processor integrated circuit])
Pan and Lee both disclose neural network implementation and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Pan’s teaching of an optimization method for quantization with Lee’s teaching of a physical realization of an energy efficient neural network architecture to reach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to gain energy efficiency savings and application level accuracies (Lee, intro, ln. 18 – 21). 
Pan in view of Lee does not explicitly disclose:
calculating … as well as determining … to yield a modified value representation via a control circuit in said neural network processor IC
configuring a memory system within said first circuit in said neural network processor IC implementing said ANN to operate with said modified value representation thereby improving memory utilization
the original number of quantization values are reassigned via said first circuit in said neural network processor IC
quantizing data in said ANN via said first circuit in said neural network processor IC 
Lin explicitly discloses: 
calculating … as well as determining … to yield a modified value representation via a control circuit in said neural network processor IC; configuring a memory system within said first circuit in said neural network processor IC implementing said ANN to operate with said modified value representation thereby improving memory utilization; the original number of quantization values are reassigned via said first circuit in said neural network processor IC; quantizing data in said ANN via said first circuit in said neural network processor IC (Lin, fig. 2, where configuration process 214 configures memory 206, 204 and 212; para. 0089, where the various illustrative logical blocks, modules and circuits described … may be implemented or performed with … an application specific integrated circuit ASIC [control circuit in said neural network processor IC] … discrete hardware component or any combination).
Pan (in view of Lee) and Lin both disclose quantization implementation in machine learning applications and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Pan (in view of Lee)’s teaching of optimized quantization and a neural network processor with Lin’s teaching of a hardware implementation of a quantization controller within an application processor to reach the claimed invention. One of ordinary skill in the art would have been motivated to make this modification in order to optimize for the overall design constraints of a particular application (Lin, para. 0095, ln. 20 – 23).
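
For illustration only, the relationship between the rounding term and the shift relied on in the mapping of eq. 10 above can be written out as a short derivation. This is a sketch and is not quoted from Pan: it assumes a step size $\Delta > 0$ and an odd number of quantization levels M, so that (M-1)/2 is an integer. Neither assumption is stated in this action.

```latex
% Assumptions (not taken from Pan): \Delta > 0 and M is odd,
% so that (M-1)/2 is an integer.
\begin{align*}
\operatorname{floor}\!\left(\frac{x_i}{\Delta} + 0.5\right)
  &= \operatorname{floor}\!\left(\frac{x_i + 0.5\Delta}{\Delta}\right), \\
\operatorname{floor}\!\left(\frac{x_i}{\Delta} + 0.5\right) \ge \frac{M-1}{2}
  &\iff \frac{x_i}{\Delta} + 0.5 \ge \frac{M-1}{2}
  \iff x_i \ge \frac{M-2}{2}\,\Delta .
\end{align*}
```

Under these assumptions, adding $0.5\Delta$ to the input before dividing by $\Delta$ is the same operation as adding 0.5 after dividing, and the min function saturates for every non-negative input at or above $\frac{M-2}{2}\Delta$.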

Regarding Claim 17, depending on Claim 16, Lin further discloses: collecting statistics comprises generating one or more histograms of input data and/or output data (Lin, fig. 4 & para. 0063, where distribution graph [histogram] of input; fig. 5B, where distribution graph [histogram] of activation output).

Regarding Claim 18, depending on Claim 16, Pan further discloses: analyzing the data and selecting a quantization level that minimizes a quantization error (Pan, fig. 4, eq. 4 – 6 and col. 6, ln. 19 – 23, where step 410 analyzes the original data [input data] and determines the optimal quantization step size and optimal quantization steps [quantization level] by minimizing the quantization error function).

Regarding Claim 19, depending on Claim 16, Pan further discloses:
dropping one or more bits used to represent quantized (Pan, eq. 11 & eq. 12, where the number of bits representing quantization steps [quantization value] is m+n+1, which is derived from the quantization step size $\Delta$; in the instance that the number of bits is lower than the original, the system drops one or more bits) weights and/or input data (Pan, fig. 4, where in step 415 the original data is quantized; col. 8, ln. 28 – 35, where original data including original image data [input data] … may further comprise original weight data) in an attempt to improve memory utilization (Pan, col. 5, ln. 5, where storage [memory] capacity is reduced so that storage utilization is improved).

Regarding Claim 20, Pan further discloses: input data values below said lower bound are set to a minimum value representation (Pan, eq. 10, for xi<0, the minimum bound value of xi is when $\operatorname{floor}\left(\frac{x_i}{\Delta} + 0.5\right) = \frac{M-1}{2}$; any xi value smaller than this will be set to this value by the min function, i.e., set to the minimum bit value [minimum value representation]) and above said upper bound are set to a maximum value representation (Pan, eq. 10, for xi>=0, the maximum bound value of xi is when $\operatorname{floor}\left(\frac{x_i}{\Delta} + 0.5\right) = \frac{M-1}{2}$; any xi value larger than this will be set to this value by the min function, i.e., set to the maximum bit value [maximum value representation]).
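
For illustration only, the saturation behavior mapped to eq. 10 above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not Pan's implementation: the function name `uniform_quantize`, its parameters, and the symmetric sign-magnitude form are hypothetical, and M (`m_levels`) is assumed to be odd so that (M-1)/2 is an integer.

```python
import math


def uniform_quantize(x: float, delta: float, m_levels: int) -> float:
    """Symmetric uniform quantizer with saturation (assumed form, not quoted from Pan).

    x        -- input value xi
    delta    -- quantization step size (must be > 0)
    m_levels -- total number of quantization levels M (assumed odd)
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    # Largest step index allowed on either side of zero: (M-1)/2.
    max_index = (m_levels - 1) // 2
    # Round |x| to the nearest step; the min function saturates the index
    # at the bound, so out-of-range inputs map to the extreme value.
    index = min(math.floor(abs(x) / delta + 0.5), max_index)
    # Restore the sign so negative inputs saturate at the lower bound.
    return math.copysign(index * delta, x)


if __name__ == "__main__":
    for value in (0.3, -0.3, 10.0, -10.0):
        print(value, "->", uniform_quantize(value, delta=0.25, m_levels=7))
```

With a step size of 0.25 and seven levels, inputs inside the range round to the nearest step, while inputs beyond the bounds are set to the maximum or minimum representable value, which is the behavior the rejection attributes to the min function.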

Response to Amendment
Applicant's remarks filed on 10/14/2021 have been fully considered but are not persuasive. 
Regarding 101, applicant states that the amended claims, especially claims 1, 9 and 16, provide a technical improvement over the prior art and overcome the 101 rejection. Examiner respectfully disagrees. Claims 1, 9 and 16 recite steps to determine quantization parameters and to apply quantization to an ANN. However, the recited steps, such as collecting statistics to determine bits to drop and reassigning quantization values, are improvements to the abstract idea, not to the technology. Implementing an abstract idea in a hardware processor amounts to no more than a recitation of the words “apply it” (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (MPEP 2106.05(f)). 
Regarding the prior art rejection, applicant states that Pan fails to teach generating histograms from the data collected to determine a scale factor r and shift B. Examiner respectfully disagrees. The term histogram in the specification of the instant application seems interchangeable with the term distribution (specification, para. 0018; para. 00101). Based on the input data distribution, the scale factor r and the shift or bias B parameter are calculated (specification, para. 00105). Pan discloses: the computing device calculates a quantization parameter according to a distribution of the data to be quantized (Pan, col. 5, ln. 30 – 32). One skilled in the art would reasonably believe that Pan discloses the recited limitation as pointed out in the prior art rejection section. 
Applicant’s other arguments regarding prior art rejection have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIEN MING CHOU whose telephone number is (571)272-9354.  The examiner can normally be reached on Monday- Friday 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CHAKI KAKALI can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/S.C./Examiner, Art Unit 2122                                                                                                                                                                                                        
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122