DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. KR10-2018-0123927, filed on 10/17/2018.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 05/30/2019 and 06/05/2020 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 16 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 16 recites the limitation “wherein the certain value is based on a number of parameters”. There is insufficient antecedent basis for this limitation as there is no “certain 
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Claim 1 recites the limitations “calculating, for each of the parameters, a bit shift value”, “updating the fixed-point format based on calculated bit shift values”, and “quantizing parameters”. All limitations listed above are mathematical calculations in combination with mental processes, capable of being performed with the aid of pen and paper or a mathematical function. For example, “calculating”, in the context of this claim, encompasses the user calculating the bit shift value by performing the calculations with pen and paper. “Updating”, in the context of this claim, encompasses the user updating the fixed-point format based on their calculations, either all in the mind or with the aid of pen and paper. “Quantizing parameters”, in the context of this claim, encompasses the user approximating the parameter values, all in the mind. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.

Finally, claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of updating in a learning or inference process according to the updated fixed-point format amounts to no more than mere instructions to apply the exception using a generic computer component, which cannot provide an inventive concept. Further, the additional element was considered to be insignificant extra-solution activity in Step 2A Prong 2, and thus it is re-evaluated in Step 2B to determine if it is more than what is well-understood, routine, conventional activity in the field. The court decisions cited in MPEP 2106.05(h) indicate that generally linking the use of the judicial exception to a particular technological environment or field of use does not amount to an inventive concept. Therefore, claim 1 is not patent eligible.
Claim 2 recites the limitations “detecting, for each of the parameters, a most significant bit having a value ‘1’” and “determining, for each of the parameters, a difference in a number of bits between the detected most significant bit and a most significant bit of the integer part of the fixed-point format as the bit shift value”. These limitations are mental processes, capable of being performed with the aid of pen and paper. For example, “detecting”, in context of this claim 
Claim 2 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
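The bit shift computation recited in claim 2 can be illustrated with a minimal Python sketch. It is not taken from the application: the function name `bit_shift_value`, the parameters `bit_width` and `frac_len`, and the assumed layout (one sign bit, then integer bits, then `frac_len` fraction bits) are assumptions made only for illustration.

```python
import math


def bit_shift_value(x: float, bit_width: int, frac_len: int) -> int:
    """Hypothetical helper: number of bits between the leading '1' of |x|
    and the most significant integer bit of an assumed fixed-point format."""
    if x == 0:
        raise ValueError("zero has no bit with value '1'")
    # math.frexp returns (m, e) with 0.5 <= |m| < 1 and x == m * 2**e,
    # so the leading '1' bit of |x| has weight 2**(e - 1).
    msb_position = math.frexp(x)[1] - 1
    # Assumed layout: 1 sign bit, integer bits, then frac_len fraction bits,
    # so the top integer bit has weight 2**(bit_width - 2 - frac_len).
    top_integer_position = bit_width - 2 - frac_len
    return msb_position - top_integer_position


# An 8-bit format with 4 fraction bits has integer bits of weight 4, 2, 1.
print(bit_shift_value(5.0, 8, 4))   # leading '1' sits on the top integer bit
print(bit_shift_value(20.0, 8, 4))  # leading '1' lies above the integer part
```

Under these assumptions, a positive result means the leading ‘1’ bit of the value lies above the integer part of the format.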
Claim 3 recites the following limitations: “searching for bits within a certain range based on the most significant bit of the integer part of fixed-point format” and “detecting the most significant bit having value ‘1’”. Both of these limitations are mental processes, capable of being performed with the aid of pen and paper.
Claim 3 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
Claim 4 recites the limitations “determining a number of occurrences of overflow from the calculated bit shift values and a maximum bit shift value from the calculated bit shift values” and “updating the fixed-point format”. These limitations are mental processes, capable of being performed with the aid of pen and paper. For example, “determining”, in context of this claim, encompasses the user discerning occurrences of overflow and a maximum bit shift value from the calculated bit shift values, all in the mind. “Updating”, in context of this claim, encompasses the user making changes to the fixed-point format based on the occurrences of overflow observed by the user. Thus the claim recites an abstract idea.
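The two quantities named in claim 4 can be sketched in Python as follows. This is an illustration only: the function name `overflow_stats` is hypothetical, and the rule that a bit shift value greater than 0 counts as one occurrence of overflow is borrowed from the language quoted for claim 21 in this action.

```python
from typing import Iterable, Tuple


def overflow_stats(bit_shifts: Iterable[int]) -> Tuple[int, int]:
    """Hypothetical helper: return (number of overflows, maximum bit shift)."""
    shifts = list(bit_shifts)
    if not shifts:
        raise ValueError("at least one bit shift value is required")
    # Assumption from the claim 21 language: a bit shift value greater
    # than 0 counts as one occurrence of overflow.
    overflow_count = sum(1 for s in shifts if s > 0)
    # The maximum is found by comparing the bit shift values with each other.
    max_shift = max(shifts)
    return overflow_count, max_shift


print(overflow_stats([0, 2, -4, 1]))
```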

Claim 5 recites the limitation “in a case in which the number of occurrences of overflow is greater than a certain value, updating the fixed-point format by reducing a fraction length of fixed-point format by the maximum bit shift value”. This limitation is a mental process. For example, in context of this claim, “updating” encompasses the user reducing the fraction length of the fixed-point format after observing that the number of occurrences of overflow is above a certain value, all in the mind. Thus the claim recites an abstract idea.
Claim 5 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
Claim 6 recites the limitation, “wherein the certain value is based on a number of parameters”. This limitation is a mental process. The context of this claim encompasses the user associating a value with the number of parameters observed, all in the mind. Thus the claim recites an abstract idea. 
Claim 6 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
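The update rule quoted for claims 5 and 6 can be sketched in Python as follows. The function name `update_fraction_length` and the `ratio` parameter are hypothetical. The claims state only that the certain value is based on a number of parameters; taking it as a fixed fraction of the parameter count is an assumption made for illustration.

```python
def update_fraction_length(frac_len: int, overflow_count: int, max_shift: int,
                           num_params: int, ratio: float = 0.01) -> int:
    """Hypothetical helper: shrink the fraction length when overflow is frequent."""
    # Assumption: the "certain value" is a fixed fraction of the parameter count.
    threshold = ratio * num_params
    if overflow_count > threshold:
        # Reduce the fraction length by the maximum bit shift value, which
        # leaves more bits for the integer part of the format.
        return frac_len - max_shift
    return frac_len


print(update_fraction_length(4, overflow_count=3, max_shift=2, num_params=100))
print(update_fraction_length(4, overflow_count=1, max_shift=2, num_params=100))
```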
Claim 7 recites the limitations, “wherein updated parameters are parameters updated in a t+1th learning or inference process”, “wherein the parameters are parameters updated in a t-th learning or inference process”, “fixed-point format updated based on the parameter updated in the t-1th learning or inference process”, “t is a natural number greater than or equal to 2”. These limitations are mathematical processes. Thus the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. The claim does not include any additional elements. Accordingly, the claim does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Finally, claim 7 does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  Therefore, Claim 7 is not patent eligible.
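The timing recited in claim 7 can be read as a loop in which the format derived from the (t-1)-th parameters quantizes the t-th parameters, and the bit shift values observed at step t set the format used at step t+1. The Python sketch below illustrates that reading only. All function names, the signed format layout, and the saturating rounding are assumptions and are not taken from the application.

```python
import math
from typing import List, Optional, Tuple


def bit_shift(x: float, bit_width: int, frac_len: int) -> Optional[int]:
    """Hypothetical helper: bits between the leading '1' of |x| and the top integer bit."""
    if x == 0:
        return None  # zero has no '1' bit to detect
    return (math.frexp(x)[1] - 1) - (bit_width - 2 - frac_len)


def quantize(x: float, bit_width: int, frac_len: int) -> float:
    """Round to the nearest representable value, saturating at the format limits."""
    scale = 2.0 ** frac_len
    q_max = 2 ** (bit_width - 1) - 1
    q = max(-q_max - 1, min(q_max, round(x * scale)))
    return q / scale


def run_iterations(steps: List[List[float]], bit_width: int, frac_len: int,
                   threshold: int) -> List[Tuple[int, List[float]]]:
    """Return (fraction length used, quantized parameters) for each iteration."""
    history = []
    for params in steps:
        # Step t: quantize with the format produced by the previous iteration.
        quantized = [quantize(p, bit_width, frac_len) for p in params]
        history.append((frac_len, quantized))
        shifts = [s for s in (bit_shift(p, bit_width, frac_len) for p in params)
                  if s is not None]
        overflow_count = sum(1 for s in shifts if s > 0)
        # The format updated here is the one applied at step t+1.
        if shifts and overflow_count > threshold:
            frac_len -= max(shifts)
    return history


for used_frac_len, values in run_iterations(
        [[0.5, 0.25], [20.0, 9.0], [20.0, 9.0]], bit_width=8, frac_len=4, threshold=0):
    print(used_frac_len, values)
```

In this example the second iteration saturates because its format was fixed before the larger values appeared, and the third iteration uses the shortened fraction length.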

Claim 8 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
Claim 9 recites the limitation “wherein the parameters are weights or activations on a same layer in the neural network”. This limitation is a mental process. This limitation encompasses the user assigning the parameters to be weights or activations on a same layer in the neural network. Thus, the claim recites an abstract idea.
Claim 9 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
Claim 10 recites the limitation “a computer-readable recording medium storing a program for causing a computer to execute the method in claim 1”. This limitation is a mental process. In context of this claim, this limitation encompasses the user performing the method of claim 1, which is also a mental process – see explanation above. Therefore, the method is already capable of being performed by a human, all in the mind or with the aid of pen and paper. Thus, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, claim 10 recites the additional element, “a computer-readable recording medium”. The computer-readable recording medium is recited at a high-level of generality such that it amounts to no 
Finally, claim 10 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of a computer-readable recording medium amounts to no more than linking the exception to a technological environment or field of use. Further, the additional element was considered to be generally linking the judicial exception to a particular technological environment or field of use in Step 2A Prong 2, and thus it is re-evaluated in Step 2B to determine if it is more than generally linking the use of a judicial exception to a particular technological environment or field of use. The court decisions cited in MPEP 2106.05(h) indicate that merely indicating a field of use or technological environment does not amount to an inventive concept. Therefore, claim 10 is not patent eligible.
Claim 11 recites the limitations “calculate, for each of the parameters, a bit shift value indicating a degree outside a bit range of a fixed-point format for quantizing the parameters”, “update the fixed-point format using the calculated bit shift values of the parameters”, and “quantizing parameters”. These limitations were similarly recited in claim 1. The same reasoning given for claim 1 applies equally to claim 11.
This judicial exception is not integrated into a practical application. In particular, claim 11 recites the additional element, “updated in a learning or inference process according to the updated fixed-point format”. The updating in a learning or inference process is recited at a high-level of generality such that it amounts to no more than mere instructions to apply the exception 
Finally, claim 11 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of updating in a learning or inference process according to the updated fixed-point format amounts to no more than mere instructions to apply the exception using a generic computer component, which cannot provide an inventive concept. Further, the additional element was considered to be insignificant extra-solution activity in Step 2A Prong 2, and thus it is re-evaluated in Step 2B to determine if it is more than what is well-understood, routine, conventional activity in the field. The court decisions cited in MPEP 2106.05(h) indicate that generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. Therefore, claim 11 is not patent eligible.
Claim 12 recites the limitations “detect, for each of the parameters, a most significant bit having a value '1'” and “determine, for each of the parameters, a difference in a number of bits between the detected most significant bit and a most significant bit of an integer part of the fixed-point format as the bit shift value”. These limitations were similarly recited in claim 2. The same reasoning given for claim 2 applies equally to claim 12.
Claim 12 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.

Claim 13 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
Claim 14 recites the limitations “determine a number of occurrences of overflow and a maximum bit shift value from the calculated bit shift values” and “update the fixed-point format using the number of occurrences of overflow and the maximum bit shift value”. These limitations were similarly recited in claim 4. The same reasoning given for claim 4 applies equally to claim 14.
Claim 14 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
Claim 15 recites the limitation “in a case in which the number of occurrences of overflow is greater than a predetermined value, update the fixed-point format by reducing a fraction length of the fixed-point format by the maximum bit shift value”. This limitation was similarly recited in claim 5. The same reasoning given for claim 5 applies equally to claim 15.
Claim 15 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.

Claim 16 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
Claim 17 recites the limitations, “wherein updated parameters are parameters updated in a t+1th learning or inference process”, “wherein the parameters are parameters updated in a t-th learning or inference process”, “fixed-point format updated based on the parameter updated in the t-1th learning or inference process”, “t is a natural number greater than or equal to 2”. These limitations are mathematical processes. Thus the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. The claim does not include any additional elements. Accordingly, the claim does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Finally, claim 17 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Therefore, claim 17 is not patent eligible.
Claim 18 recites the limitation “calculate the bit shift value of each of the parameters in a process of quantizing the parameters according to the fixed-point format”. This limitation was similarly recited in claim 8. The same reasoning given for claim 8 applies equally to claim 18.
Claim 18 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
Claim 19 recites the limitation “wherein the parameters are weights or activations on a same layer in the neural network”. This limitation was similarly recited in claim 9. The same reasoning given for claim 9, also equally applies to claim 19.
Claim 19 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
Claim 20 recites the limitations “calculating, for each of parameters updated in a t-th learning or inference process of a neural network, a bit shift value based on a fixed-point format for quantizing the parameters”, “updating the fixed-point format based on the number of 
Claim 20 recites the additional element “the parameters in a t+1th learning or inference process of the neural network based on the updated fixed-point format”. The updating in a learning or inference process is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Selecting a particular data source or type of data to be manipulated is a form of insignificant extra-solution activity. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Finally, claim 20 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of updating parameters in a t+1th learning or inference process according to the updated fixed-point format amounts to no more than mere instructions to apply the exception using a generic computer component, which cannot provide an inventive concept. Further, the additional element was considered to be insignificant extra-solution activity in Step 2A Prong 2, and thus it is re-evaluated in Step 2B to determine if it is more than what is well-understood, routine, conventional activity in the field. The court decisions cited in MPEP 2106.05(h) indicate that generally linking the use of the judicial 
Claim 21 recites the limitation “determining the number of occurrences of overflow includes determining whether the bit shift value of each of the parameter is greater than 0 and increasing the number of occurrences of overflow by 1 for each bit shift value that is greater than 0”. This limitation is a mental process, capable of being performed with the aid of pen and paper. For example, “determining”, in context of this claim, encompasses the user counting the number of occurrences of overflow, incrementing by 1 for each occurrence of overflow observed, all in the mind.
Claim 21 does not recite any additional elements beyond the generic computer components already discussed, which could integrate the abstract idea into a practical application or provide an inventive concept.
Claim 22 recites the limitation, “determining the maximum bit shift value includes comparing the calculated bit shift values of the parameters with each other updated in the t-th learning or inference process and determining a maximum value among the bit shift values updated in the t-th learning or inference process as the maximum bit shift value”. This limitation is a mental process, capable of being performed with the aid of pen and paper. For example, “determining”, in context of this claim, encompasses the user discerning the largest bit shift value out of the calculated bit shift values as the maximum bit shift value, all in the mind. Thus the claim recites an abstract idea.
Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention recites “a computer-readable recording medium storing a program for causing a computer to execute the method”. The computer-readable recording medium recited is only mentioned twice in In re Nuijten, 500 F.3d 1346, 84 USPQ2d 1495 (Fed. Cir. 2007)). 
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 7-11, and 17-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Xu et al. (WO 2018140294 A1).
Regarding claim 1, Xu et al. teaches a processor-implemented method of quantizing parameters of a neural network (Xu et al. [0022] “The computing device 100 or the special-purpose processing device 106 can perform the training of the neural networks in the implementations of the subject matter described herein.”), the method comprising: 
calculating, for each of the parameters (Xu et al. [0049] “The method for updating the scaling factor will now be explained with reference to weights. However, it would be appreciated that the method can also be applied for other parameters”), a bit shift value (Xu et al. [0050] “For example, the scaling factor may be multiplied with the cardinal number (e.g., 2).”,  where together, the scaling factor and cardinal number make up the bit shift value.) indicating a degree outside a bit range of a fixed-point format for quantizing the parameters (Xu et al. [0050] “It may be determined whether the overflow rate of the weights exceeds the predefined threshold. If the overflow rate exceeds the predefined threshold, the range of the fixed-point number is too small and the scaling factor should be increased accordingly.”);
updating the fixed-point format based on the calculated bit shift values of the parameters (Xu et al. [0049; 0050] “the scaling factor may be updated based on the data range… For example, the radix point may be shifted right by one bit. If the overflow rate does not exceed the predefined threshold and, after the weight is multiplied with 2, the overflow rate is still below the predefined threshold, the range of the fixed-point number is too large. Therefore, the scaling factor may be reduced, for example, by dividing the scaling factor by the cardinal number (e.g., 2). For example, the radix point may be shifted left by one bit.”, where the radix point in the fixed-point format is shifted based on the calculated bit shift values.); and 
quantizing parameters updated in a learning or inference process according to the updated fixed-point format (Xu et al. [0048] “It is known that magnitudes of weight, activation, and gradient will fluctuate during training, where the gradient fluctuation is most apparent. To match the fluctuations, different bit-widths and scaling factors are assigned to the parameters, activations, and gradients in different layers and the scaling factors of the parameters are updated accordingly during iteration. Moreover, different scaling factors can also be assigned to weights and biases among the parameters.”).
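The scaling factor rule quoted above from Xu et al. [0050] can be sketched in Python as follows. This is an illustration of the quoted passage, not the reference's implementation. The function names are hypothetical, and the definition of overflow (a magnitude above the scaling factor times the largest signed integer of the given bit width) is an assumption, since the quoted text does not define it.

```python
from typing import List


def overflow_rate(weights: List[float], scale: float, bit_width: int) -> float:
    """Fraction of weights whose magnitude exceeds the assumed representable range."""
    # Assumption: the largest representable magnitude is scale * (2**(bit_width-1) - 1).
    limit = scale * (2 ** (bit_width - 1) - 1)
    return sum(1 for w in weights if abs(w) > limit) / len(weights)


def update_scale(weights: List[float], scale: float, bit_width: int,
                 threshold: float, base: float = 2.0) -> float:
    """Hypothetical helper following the two cases in the quoted passage."""
    if overflow_rate(weights, scale, bit_width) > threshold:
        # Range too small: multiply the scaling factor by the cardinal number.
        return scale * base
    doubled = [w * base for w in weights]
    if overflow_rate(doubled, scale, bit_width) < threshold:
        # Range too large even for doubled weights: divide the scaling factor.
        return scale / base
    return scale


print(update_scale([10.0, 40.0, 50.0, 1.0], scale=0.25, bit_width=8, threshold=0.1))
print(update_scale([1.0, 2.0, 3.0, 4.0], scale=0.25, bit_width=8, threshold=0.1))
```

Unlike the per-parameter bit shift values recited in claim 1, this rule changes the scaling factor by one factor of the cardinal number per update.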

Regarding claim 7, Xu et al. teaches the method wherein the updated parameters are parameters updated in a t+1th learning or inference process (Xu et al. [0048; 0075] “To match the fluctuations, different bit-widths and scaling factors are assigned to the parameters, activations, and gradients in different layers and the scaling factors of the parameters are updated accordingly during iteration… each layer is updated layer by layer. Then, the backward output is provided to the input layer 202, to finally finish updating all parameters of the neural network 200, thereby completing an iteration of a mini-batch. Iteratively completing iterations of all mini-batches in the training set may be referred to as finishing a full iteration of the data set, which is also known as epoch.”, where the parameters are updated in a future learning process, mini-batch, as it passes through the neural network.), wherein the parameters are parameters updated in a t-th learning or inference process (Xu et al. [0048] “To match the fluctuations, different bit-widths and scaling factors are assigned to the parameters, activations, and gradients in different layers and the scaling factors of the parameters are updated accordingly during iteration…”, where the parameters within a mini-batch iteration are also updated within the current mini-batch iteration.), wherein the fixed-point format is a fixed-point format updated based on the parameters updated in the t-1th learning or inference process, and t is a natural number greater than or equal to 2 (Xu et al. [Abstract; 0048; 0075] “In this solution, parameters of the neural network are stored in a fixed-point format…To match the fluctuations, different bit-widths and scaling factors are assigned to the parameters, activations, and gradients in different layers and the scaling factors of the parameters are updated accordingly during iteration… each layer is updated layer by layer. Then, the backward output is provided to the input layer 202, to finally finish updating all parameters of the neural network 200, thereby completing an iteration of a mini-batch. Iteratively completing iterations of all mini-batches in the training set may be referred to as finishing a full iteration of the data set, which is also known as epoch.”, where the parameters are updated in a future learning process, mini-batch, as it passes through the neural network, therefore the updated parameter is based on the update to the parameter in the previous layer of the neural network, where t is iterative, and the process is on at least the second iteration of updating the parameters in the model.).

	Regarding claim 8, Xu et al. teaches the method wherein calculating the bit shift value comprises: calculating the bit shift value of each of the parameters in a process of quantizing the parameters according to the fixed-point format (Xu et al. [0045; 0049; 0050] “In some implementations, the following equation (6) may be used to convert data x (such as, a floating-point number) into an /-bit fixed-point number with the scaling factor…The method for updating the scaling factor will now be explained with reference to weights. However, it would be appreciated that the method can also be applied for other parameters…In the case of the current scaling factor, it may be determined whether the overflow rate of the weights exceeds the predefined threshold. If the overflow rate exceeds the predefined threshold, the range of the fixed-point number is too small and the scaling factor should be increased accordingly. For example, the scaling factor may be multiplied with the cardinal number (e.g., 2). For example, the radix point may be shifted right by one bit. If the overflow rate does not exceed the predefined threshold and, after the weight is multiplied with 2, the overflow rate is still below the predefined threshold, the range of the fixed-point number is too large. Therefore, the scaling factor may be reduced, for example, by dividing the scaling factor by the cardinal number (e.g., 2). For example, the radix point may be shifted left by one bit.”, where the multiplication or division of the scaling factor by the cardinal number is the calculation of the bit shift value.).

	Regarding claim 9, Xu et al. teaches the method wherein the parameters are weights or activations on a same layer in the neural network (Xu et al. [0002] “In accordance with implementations of the subject matter described herein, there is provided a solution for training a neural network. In the solution, a fixed-point format is used to store parameters of the neural networks, such as weights and biases. The parameters are also known as primal parameters to be updated for each iteration. Parameters in the fixed-point format have a predefined bit-width and can be stored in a memory unit of a special-purpose processing device. The special-purpose processing device, when executing the solution, receives an input to a layer of a neural network, reads parameters of the layer from the memory unit, and computes an output of the layer based on the input of the layer and the read parameters.”).
	Regarding claim 10, Xu et al. teaches a computer-readable recording medium storing a program for causing a computer to execute the method (Xu et al. [Claim 1; Claim 8] “A special-purpose processing device, comprising: a memory unit configured to store parameters of a layer of a neural network in a first fixed-point format, the parameters in the first fixed-point format having a predefined bit-width; a processing unit coupled to the memory unit and configured to perform acts including: receiving an input to the layer; reading the parameters of the layer from the memory unit; and computing, based on the input of the layer and the read parameters, an output of the layer through a fixed-point operation…The special-purpose processing device of claim 1, wherein the special-purpose processing device is a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a processor having a customized processing unit, or a graphics processing unit (GPU).”).
Regarding claim 11, Xu et al. teaches an apparatus for quantizing parameters of a neural network, the apparatus comprising (Xu et al. [0022] “The computing device 100 or the special-purpose processing device 106 can perform the training of the neural networks in the implementations of the subject matter described herein.”): a memory storing at least one program and a processor configured to, by executing the at least one program, calculate, for each of the parameters (Xu et al. [Claim 1; Claim 8] “A special-purpose processing device, comprising: a memory unit configured to store parameters of a layer of a neural network in a first fixed-point format, the parameters in the first fixed-point format having a predefined bit-width; a processing unit coupled to the memory unit and configured to perform acts…The special-purpose processing device of claim 1, wherein the special-purpose processing device is a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a processor having a customized processing unit, or a graphics processing unit (GPU).”), a bit shift value indicating a degree outside a bit range of a fixed-point format for quantizing the parameters (Xu et al. [0050] “It may be determined whether the overflow rate of the weights exceeds the predefined threshold. If the overflow rate exceeds the predefined threshold, the range of the fixed-point number is too small and the scaling factor should be increased accordingly.”), update the fixed-point format using the calculated bit shift values of the parameters (Xu et al. [0049; 0050] “the scaling factor may be updated based on the data range… For example, the radix point may be shifted right by one bit. If the overflow rate does not exceed the predefined threshold and, after the weight is multiplied with 2, the overflow rate is still below the predefined threshold, the range of the fixed-point number is too large. Therefore, the scaling factor may be reduced, for example, by dividing the scaling factor by the cardinal number (e.g., 2). For example, the radix point may be shifted left by one bit.”, where the radix point in the fixed-point format is shifted based on the calculated bit shift values.); and 
quantize parameters updated in a learning or inference process according to the updated fixed-point format (Xu et al. [0050] “For example, the radix point may be shifted right by one bit. If the overflow rate does not exceed the predefined threshold and, after the weight is multiplied with 2, the overflow rate is still below the predefined threshold, the range of the fixed-point number is too large. Therefore, the scaling factor may be reduced, for example, by dividing the scaling factor by the cardinal number (e.g., 2). For example, the radix point may be shifted left by one bit.”).
Regarding claim 17, Xu et al. teaches the apparatus wherein the updated parameters are parameters updated in a t+1th learning or inference process (Xu et al. [0048; 0075] “To match the fluctuations, different bit-widths and scaling factors are assigned to the parameters, activations, and gradients in different layers and the scaling factors of the parameters are updated accordingly during iteration… each layer is updated layer by layer. Then, the backward output is provided to the input layer 202, to finally finish updating all parameters of the neural network 200, thereby completing an iteration of a mini-batch. Iteratively completing iterations of all mini-batches in the training set may be referred to as finishing a full iteration of the data set, which is also known as epoch.”, where the parameters are updated in a future learning process, mini-batch, as it passes through the neural network.), wherein the parameters are parameters updated in a t-th learning or inference process (Xu et al. [0048] “To match the fluctuations, different bit-widths and scaling factors are assigned to the parameters, activations, and gradients in different layers and the scaling factors of the parameters are updated accordingly during iteration…”, where the parameters within a mini-batch iteration are also updated within the current mini-batch iteration.), wherein the fixed-point format is a fixed-point format updated based on the parameters updated in the t-1th learning or inference process, and t is a natural number greater than or equal to 2 (Xu et al. [Abstract; 0048; 0075] “In this solution, parameters of the neural network are stored in a fixed-point format…To match the fluctuations, different bit-widths and scaling factors are assigned to the parameters, activations, and gradients in different layers and the scaling factors of the parameters are updated accordingly during iteration… each layer is updated layer by layer. Then, the backward output is provided to the input layer 202, to finally finish updating all parameters of the neural network 200, thereby completing an iteration of a mini-batch. Iteratively completing iterations of all mini-batches in the training set may be referred to as finishing a full iteration of the data set, which is also known as epoch.”, where the parameters are updated in a future learning process, mini-batch, as it passes through the neural network, therefore the updated parameter is based on the update to the parameter in the previous layer of the neural network, where t is iterative, and the process is on at least the second iteration of updating the parameters in the model.).
	Regarding claim 18, Xu et al. teaches the apparatus wherein the processor is configured to calculate the bit shift value of each of the parameters in a process of quantizing the parameters according to the fixed-point format (Xu et al. [0022; 0049; 0050] “The computing device 100 or the special-purpose processing device 106 can perform the training of the neural networks in the implementations of the subject matter described herein… The method for updating the scaling factor will now be explained with reference to weights. However, it would be appreciated that the method can also be applied for other parameters…For example, the scaling factor may be multiplied with the cardinal number (e.g., 2). If the overflow rate does not exceed the predefined threshold and, after the weight is multiplied with 2, the overflow rate is still below the predefined threshold, the range of the fixed-point number is too large. Therefore, the scaling factor may be reduced, for example, by dividing the scaling factor by the cardinal number (e.g., 2). For example, the radix point may be shifted left by one bit.”, where together, the scaling factor and cardinal number make up the bit shift value, and moving the radix point left by one bit is quantizing the parameter.).
	Regarding claim 19, Xu et al. teaches the apparatus wherein the parameters are weights or activations on a same layer in the neural network (Xu et al. [0002] “In accordance with implementations of the subject matter described herein, there is provided a solution for training a neural network. In the solution, a fixed-point format is used to store parameters of the neural networks, such as weights and biases. The parameters are also known as primal parameters to be updated for each iteration. Parameters in the fixed-point format have a predefined bit-width and can be stored in a memory unit of a special-purpose processing device. The special-purpose processing device, when executing the solution, receives an input to a layer of a neural network, reads parameters of the layer from the memory unit, and computes an output of the layer based on the input of the layer and the read parameters.”).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-3, and 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (WO 2018140294 A1) in view of Arthurs et al. ("Overflow Detection and Correction in a Fixed-Point Multiplier").
Regarding claim 2, Xu et al. teaches the method wherein calculating the bit shift value for each of the parameters (Xu et al. [0022; 0049; 0050] “The computing device 100 or the special -purpose processing device 106 can perform the training of the neural networks in the implementations of the subject matter described herein… The method for updating the scaling factor will now be explained with reference to weights. However, it would be appreciated that the method can also be applied for other parameters…“For example, the scaling factor may be multiplied with the cardinal number (e.g., 2). If the overflow rate does not exceed the predefined threshold and, after the weight is multiplied with 2, the overflow rate is still below the predefined threshold, the range of the fixed-point number is too large. Therefore, the scaling factor may be reduced, for example, by dividing the scaling factor by the cardinal number (e.g., 2).”, where together, the scaling factor and cardinal number make up the bit shift value).  
Xu et al. does teach determining a bit shift value based on the fixed-point format for each of the parameters (Xu et al. [0049; 0050] “In some implementations of the subject matter described herein, the scaling factor may be updated based on the data range. Specifically, it may be determined, based on overflow of the data (e.g., overflow rate and/or overflow amount), whether to update the scaling factor and how to update the scaling factor. The method for updating the scaling factor will now be explained with reference to weights. However, it would be appreciated that the method can also be applied for other parameters. For example, the scaling factor may be multiplied with the cardinal number (e.g., 2). For example, the radix point may be shifted right by one bit.”, where the data range is the fixed-point format and together the scaling factor and the cardinal number make up the bit shift value.)
Xu et al. does not teach detecting a most significant bit having a value '1' and determining a difference in a number of bits between the detected most significant bit and a most significant bit of an integer part of the fixed-point format.
Arthurs et al. teaches detecting a most significant bit having a value '1' (Arthurs et al. [Section 3, Page 83] “The second step in the overflow detection is the comparison between the sign bit of the final product and the least significant bit from the overflow field of the intermediate product.”, where the sign bit of a fixed-point format number is the most significant bit, where a bit can only have two values ‘0’ or ‘1’. A sign bit of value ‘1’ would indicate detecting the most significant bit having a value ‘1’ of a negative number.) and determining a difference in a number of bits between the detected most significant bit and a most significant bit of an integer part of the fixed-point format (Arthurs et al. [Section 3, Page 83] “If either the preliminary overflow or the two-bit sign difference is detected, the fixed-point multiplier raises the overflow flag and corrects the format of the final product”).
Arthurs et al. and Xu et al. are analogous art because they are from the same field of fixed-point format numbers in neural networks. Before the effective filing date of the invention, 
Regarding claim 3, Xu et al. in view of Arthurs et al. teaches all of the elements of the claim. Arthurs et al. further teaches wherein detecting the most significant bit comprises: for each of the parameters, searching for bits within a certain range based on the most significant bit of the integer part of the fixed-point format (Arthurs et al. [Page 82, Section 3, Paragraph 4; Table 1] “The first and most complex step is the preliminary overflow detection. Preliminary overflow detection catches all but one cases of overflow…Because overflow depends solely on the integer part of the fixed-point number…For instance, if the V- 8-Bit PreOV operand format is Q8.24, then the bit-width of the overflow field from the intermediate result will be eight bits.”, where the fixed-point format and the range of bits that are searched are listed in Table 1.), and detecting the most significant bit having the value '1' (Arthurs et al. [Section 3, Page 83] “The second step in the overflow detection is the comparison between the sign bit of the final product and the least significant bit from the overflow field of the intermediate product.”, where the sign bit of a fixed-point format number is a most significant bit.).
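For illustration only, the limitation recited in claims 2 and 3 (detecting the most significant bit having a value '1' and taking its distance from the most significant bit of the integer part) can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application or the cited references: the function name, the use of the magnitude of the value, and the Q-format parameters `frac_len` and `int_bits` are hypothetical.

```python
def bit_shift_value(x, frac_len, int_bits):
    """Distance between the highest '1' bit of |x| and the integer-part MSB.

    Assumptions: the format has `int_bits` integer bits and `frac_len`
    fraction bits (sign handled separately), and the magnitude is first
    scaled to an integer with 2**frac_len. A positive result means the
    value does not fit; zero or negative means it fits.
    """
    raw = int(abs(x) * (1 << frac_len))   # magnitude as a raw integer
    highest_one = raw.bit_length() - 1    # index of the highest '1' bit (-1 if none)
    int_msb = frac_len + int_bits - 1     # index of the integer-part MSB
    return highest_one - int_msb
```

For a format with 3 integer bits and 4 fraction bits, the value 20.0 yields 2 (two bits beyond the format), 5.0 yields 0, and 0.5 yields -3.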

(Xu et al. [0049; 0050] “In some implementations of the subject matter described herein, the scaling factor may be updated based on the data range. Specifically, it may be determined, based on overflow of the data (e.g., overflow rate and/or overflow amount), whether to update the scaling factor and how to update the scaling factor. The method for updating the scaling factor will now be explained with reference to weights. However, it would be appreciated that the method can also be applied for other parameters. For example, the scaling factor may be multiplied with the cardinal number (e.g., 2). For example, the radix point may be shifted right by one bit.”, where the data range is the fixed-point format and together the scaling factor and the cardinal number make up the bit shift value.)
	Xu et al. does not teach the apparatus wherein the processor is configured to detect a most significant bit having a value '1'; and determine a difference in a number of bits between the detected most significant bit and a most significant bit of an integer part of the fixed-point format.
Arthurs et al. teaches detecting a most significant bit having a value '1' (Arthurs et al. [Section 3, Page 83] “The second step in the overflow detection is the comparison between the sign bit of the final product and the least significant bit from the overflow field of the intermediate product.”, where the sign bit of a fixed-point format number is the most significant bit, where a bit can only have two values ‘0’ or ‘1’. A sign bit of value ‘1’ would indicate detecting the most significant bit having a value ‘1’ of a negative number.) and determining a difference in a number of bits between the detected most significant bit and a most significant bit of an integer part of the fixed-point format (Arthurs et al. [Section 3, Page 83] “If either the preliminary overflow or the two-bit sign difference is detected, the fixed-point multiplier raises the overflow flag and corrects the format of the final product”). The same motivation utilized for combining Arthurs et al. and Xu et al. in claim 2 is equally applicable to claim 12.
	Regarding claim 13, Xu et al. in view of Arthurs et al. teaches all the elements of the claim. Arthurs et al. further teaches the apparatus wherein the processor is configured to search for bits within a certain range based on the most significant bit of the integer part of the fixed-point format (Arthurs et al. [Page 82, Section 3, Paragraph 4; Table 1] “The first and most complex step is the preliminary overflow detection. Preliminary overflow detection catches all but one cases of overflow…Because overflow depends solely on the integer part of the fixed-point number…For instance, if the V- 8-Bit PreOV operand format is Q8.24, then the bit-width of the overflow field from the intermediate result will be eight bits.”, where the fixed-point format and the range of bits that are searched are listed in Table 1.) and detect the most significant bit having the value '1' (Arthurs et al. [Section 3, Page 83] “The second step in the overflow detection is the comparison between the sign bit of the final product and the least significant bit from the overflow field of the intermediate product.”, where the sign bit of a fixed-point format number is a most significant bit.).

Claims 4-6, 14-16, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (WO 2018140294 A1) in view of Migacz et al. (US 20180211152 A1).
Regarding claim 4, Xu et al. teaches all the elements of the claim except wherein updating the fixed-point format comprises determining a number of occurrences of overflow and a maximum 
Migacz et al. teaches wherein updating the fixed-point format comprises determining a number of occurrences of overflow and a maximum bit shift value from the calculated bit shift values (Migacz et al. [Abstract; 0012; Claim 8] “Aspects of the present invention are directed to computer-implemented techniques for performing data compression and conversion between data formats of varying degrees of precision, and more particularly for improving the inferencing (application) of artificial neural networks using a reduced precision (e.g., INT8) data format…In one or more embodiments, generating the reduced-precision distributions (candidate conversions)… determining a number of candidate conversions for the plurality of candidate conversions; iteratively selecting a particular threshold from a plurality of saturation thresholds to correspond to a particular candidate conversion of the plurality of candidate conversions; merging data values from a consecutive sequence of bins from the histogram until a remaining number of bins in the histogram corresponds to a highest absolute value of the lower precision data format; and collecting the plurality of candidate conversions.”, where the INT8 data format is a fixed point format number, and where generating reduced-precision distributions or candidate conversions comprises calculating bit shift values, in which all reduced-precision distributions are merged until the remaining corresponds to same lower accuracy data format, where the reduced-precision distribution with the lowest accuracy corresponds to the maximum bit shift value.); and updating the fixed-point format based on the number of occurrences of overflow and the maximum bit shift value (Migacz et al. [0012] “For each given candidate conversion, the values in the bins of the histogram above the saturation level corresponding to the candidate conversion are clamped to the saturation level. Subsequently, the bins of the histogram for the set of activated data values for a layer are then merged proportionally for all bins below the saturation threshold corresponding to the candidate conversion until the remaining number of bins corresponds to the maximum positive value for a selected lower precision data format.”, where the bins of the histogram are the fixed-point format that were updated to match the data point with the lowest accuracy, using the maximum bit shift value.).
Migacz et al. and Xu et al. are analogous art because they are from the same field of computer arrangements based on biological models. Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Xu et al. and Migacz et al., to determine a maximum bit shift value from the calculated bit shift values. The suggestion and/or motivation for doing so is to gain the advantage of quantizing the model parameters to the same lower accuracy fixed-point format, as suggested by Migacz et al.
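For illustration only, the saturation step quoted from Migacz et al. [0012] (values above the saturation level are clamped to it before conversion to a lower precision format) can be sketched as follows. This is a simplified sketch, not the method of Migacz et al.: it omits the histogram construction, bin merging, and candidate selection described in the quotation, and the function name and symmetric mapping to the range -127 to 127 are assumptions.

```python
def quantize_int8_saturating(values, threshold):
    """Clamp each value to [-threshold, threshold], then map to INT8.

    Assumption: a symmetric linear mapping in which `threshold`
    corresponds to the integer 127.
    """
    scale = 127.0 / threshold
    out = []
    for v in values:
        clamped = max(-threshold, min(threshold, v))  # saturate outliers
        out.append(int(round(clamped * scale)))       # map to integer grid
    return out
```

A smaller threshold saturates more values but gives finer resolution to the values that remain in range, which is the trade-off a choice among saturation thresholds addresses.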

Regarding claim 5, Xu et al. in view of Migacz et al. teaches all the elements of the claim. Xu et al. further teaches the method wherein updating the fixed-point format comprises in a case in which the number of occurrences of overflow is greater than a certain value (Xu et al. [0050] “In the case of the current scaling factor, it may be determined whether the overflow rate of the weights exceeds the predefined threshold. If the overflow rate exceeds the predefined threshold, the range of the fixed-point number is too small and the scaling factor should be increased accordingly.”, where the overflow rate is the number of occurrences of overflow.), updating the fixed-point format by reducing a fraction length of the fixed-point format by the maximum bit shift value (Xu et al. [0050] “Therefore, the scaling factor may be reduced, for example, by dividing the scaling factor by the cardinal number (e.g., 2). For example, the radix point may be shifted left by one bit.”, where the radix point is shifted by the maximum bit shift value, 1, therefore reducing a fraction length of the fixed-point format.).
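For illustration only, the update recited in claim 5 (reducing the fraction length by the maximum bit shift value when the number of occurrences of overflow exceeds a certain value) can be sketched as follows. This is a minimal sketch and not the applicant's implementation. The function name is hypothetical, and setting the certain value to a fixed ratio of the number of parameters is an assumption chosen only to mirror the language of claims 6 and 16.

```python
def update_fraction_length(frac_len, overflow_count, max_bit_shift,
                           num_params, ratio=0.01):
    """Return the fraction length after the claim-5-style update.

    Assumption: the "certain value" is ratio * num_params.
    """
    certain_value = ratio * num_params
    if overflow_count > certain_value:
        # Too many parameters overflowed: give up fraction bits so the
        # integer part gains max_bit_shift bits of range.
        return frac_len - max_bit_shift
    return frac_len
```

With 100 parameters and a ratio of 0.01, three overflows with a maximum bit shift of 2 reduce a fraction length of 4 to 2, while a single overflow leaves it unchanged.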
Regarding claim 6, Xu et al. in view of Migacz et al. teaches all the elements of the claim. Xu et al. further teaches the method wherein the certain value is based on a number of parameters (Xu et al. [0075] “For example, the threshold condition can be a predefined number of epochs or a predefined accuracy”, where the threshold condition is based on a number of parameters, the parameters being a predefined number of epochs or a predefined accuracy.).

Regarding claim 14, Xu et al. teaches all the elements of the claim except wherein the processor is configured to determine a number of occurrences of overflow and a maximum bit shift value from the calculated bit shift values.
Migacz et al. teaches wherein the processor is configured to determine a number of occurrences of overflow and a maximum bit shift value from the calculated bit shift values (Migacz et al. [Abstract; 0012; Claim 8] “Aspects of the present invention are directed to computer-implemented techniques for performing data compression and conversion between data formats of varying degrees of precision, and more particularly for improving the inferencing (application) of artificial neural networks using a reduced precision (e.g., INT8) data format…In one or more embodiments, generating the reduced-precision distributions (candidate conversions)… determining a number of candidate conversions for the plurality of candidate conversions; iteratively selecting a particular threshold from a plurality of saturation thresholds to correspond to a particular candidate conversion of the plurality of candidate conversions; merging data values from a consecutive sequence of bins from the histogram until a remaining number of bins in the histogram corresponds to a highest absolute value of the lower precision data format; and collecting the plurality of candidate conversions.”, where the INT8 data format is a fixed point format number, and where generating reduced-precision distributions or candidate conversions comprises calculating bit shift values, in which all reduced-precision distributions are merged until the remaining corresponds to same lower accuracy data format, where the reduced-precision distribution with the lowest accuracy corresponds to the maximum bit shift value.); 
and update the fixed-point format using the number of occurrences of overflow and the maximum bit shift value (Migacz et al. [0012] “For each given candidate conversion, the values in the bins of the histogram above the saturation level corresponding to the candidate conversion are clamped to the saturation level. Subsequently, the bins of the histogram for the set of activated data values for a layer are then merged proportionally for all bins below the saturation threshold corresponding to the candidate conversion until the remaining number of bins corresponds to the maximum positive value for a selected lower precision data format.”, where the bins of the histogram are the fixed-point format that were updated to match the data point with the lowest accuracy, using the maximum bit shift value.). The same motivation utilized for combining Xu et al. with Migacz et al. as set forth in claim 4 is equally applicable to claim 14.
	
(Xu et al. [0050] “In the case of the current scaling factor, it may be determined whether the overflow rate of the weights exceeds the predefined threshold. If the overflow rate exceeds the predefined threshold, the range of the fixed-point number is too small and the scaling factor should be increased accordingly.”, where the overflow rate is the number of occurrences of overflow.), update the fixed-point format by reducing a fraction length of the fixed-point format by the maximum bit shift value (Xu et al. [0050] “Therefore, the scaling factor may be reduced, for example, by dividing the scaling factor by the cardinal number (e.g., 2). For example, the radix point may be shifted left by one bit.”, where the radix point is shifted by the maximum bit shift value, 1, therefore reducing a fraction length of the fixed-point format.).
	
Regarding claim 16, Xu et al. in view of Migacz et al. teaches all the elements of the claim. Xu et al. further teaches the apparatus wherein the certain value is based on a number of parameters (Xu et al. [0075] “For example, the threshold condition can be a predefined number of epochs or a predefined accuracy”, where the threshold condition is based on a number of parameters, the parameters being a predefined number of epochs or a predefined accuracy.).

Regarding claim 20, Xu et al. teaches a processor-implemented method comprising: calculating, for each of parameters updated in a t-th learning or inference process of a neural network (Xu et al. [0049] “The method for updating the scaling factor will now be explained with reference to weights. However, it would be appreciated that the method can also be applied for other parameters”), a bit shift value based on a fixed-point format for quantizing the parameters (Xu et al. [0050] “For example, the scaling factor may be multiplied with the cardinal number (e.g., 2).”, where together, the scaling factor and cardinal number make up the bit shift value.);
updating the fixed-point format based on the number of occurrences of overflow and the maximum bit shift value (Xu et al. [0050] “If the overflow rate does not exceed the predefined threshold and, after the weight is multiplied with 2, the overflow rate is still below the predefined threshold, the range of the fixed-point number is too large. Therefore, the scaling factor may be reduced, for example, by dividing the scaling factor by the cardinal number (e.g., 2). For example, the radix point may be shifted left by one bit.”); and quantizing the parameters in a t+1th learning or inference process of the neural network based on the updated fixed-point format, wherein t is a natural number greater than or equal to 2 (Xu et al. [Abstract; 0048; 0075] “In this solution, parameters of the neural network are stored in a fixed-point format…To match the fluctuations, different bit-widths and scaling factors are assigned to the parameters, activations, and gradients in different layers and the scaling factors of the parameters are updated accordingly during iteration… each layer is updated layer by layer. Then, the backward output is provided to the input layer 202, to finally finish updating all parameters of the neural network 200, thereby completing an iteration of a mini- batch. Iteratively completing iterations of all mini-batches in the training set may be referred to as finishing a full iteration of the data set, which is also known as epoch.”, where the parameters are updated in a future learning process, mini-batch, as it passes through the neural network, where the parameter is quantized when the parameter updates. Where t is iterative and the process is on at least the second iteration of updating the parameters in the model.). 
Xu et al. does not teach determining a number of occurrences of overflow and a maximum bit shift value from the calculated bit shift values.
Migacz et al. teaches determining a number of occurrences of overflow and a maximum bit shift value from the calculated bit shift values (Migacz et al. [Abstract; 0012; Claim 8] “Aspects of the present invention are directed to computer-implemented techniques for performing data compression and conversion between data formats of varying degrees of precision, and more particularly for improving the inferencing (application) of artificial neural networks using a reduced precision (e.g., INT8) data format…In one or more embodiments, generating the reduced-precision distributions (candidate conversions)… determining a number of candidate conversions for the plurality of candidate conversions; iteratively selecting a particular threshold from a plurality of saturation thresholds to correspond to a particular candidate conversion of the plurality of candidate conversions; merging data values from a consecutive sequence of bins from the histogram until a remaining number of bins in the histogram corresponds to a highest absolute value of the lower precision data format; and collecting the plurality of candidate conversions.”, where the INT8 data format is a fixed point format number, and where generating reduced-precision distributions or candidate conversions comprises calculating bit shift values, in which all reduced-precision distributions are merged until the remaining corresponds to same lower accuracy data format, where the reduced-precision distribution with the lowest accuracy corresponds to the maximum bit shift value.). The same motivation utilized for combining Xu et al. and Migacz et al. set forth in claim 4, is equally applicable to claim 20.
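For illustration only, the final step recited in claim 20 (quantizing parameters based on the updated fixed-point format) can be sketched as a round-and-saturate operation. This is a minimal sketch under stated assumptions and is not taken from the application or the cited references: the function name, the signed 8-bit container, and the use of Python's built-in rounding are all hypothetical choices.

```python
def quantize(x, frac_len, bit_width=8):
    """Round x onto a signed fixed-point grid and saturate to its range.

    Assumption: a signed `bit_width`-bit container with `frac_len`
    fraction bits, so the grid step is 2**-frac_len.
    """
    scale = 2.0 ** frac_len
    q_max = 2 ** (bit_width - 1) - 1
    q_min = -(2 ** (bit_width - 1))
    q = round(x * scale)             # nearest grid point
    q = max(q_min, min(q_max, q))    # saturate on overflow
    return q / scale                 # back to a real value
```

With 4 fraction bits, the value 100.0 saturates to 7.9375. After the fraction length is reduced to 0, the same value is represented exactly as 100.0, which shows why a format updated after overflow is applied in the following iteration.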
	
Regarding claim 21, Xu et al. in view of Migacz et al. teaches all of the elements of the claim. Xu et al. further teaches wherein determining the number of occurrences of overflow (Xu et al. [0049] “In some implementations of the subject matter described herein, the scaling factor may be updated based on the data range. Specifically, it may be determined, based on overflow of the data (e.g., overflow rate and/or overflow amount)…”) includes determining whether the bit shift value of each of the parameters is greater than 0 and increasing the number of occurrences of overflow by 1 for each bit shift value that is greater than 0 (Xu et al. [0050] “In the case of the current scaling factor, it may be determined whether the overflow rate of the weights exceeds the predefined threshold. If the overflow rate exceeds the predefined threshold, the range of the fixed-point number is too small and the scaling factor should be increased accordingly. For example, the scaling factor may be multiplied with the cardinal number (e.g., 2). For example, the radix point may be shifted right by one bit. If the overflow rate does not exceed the predefined threshold and, after the weight is multiplied with 2, the overflow rate is still below the predefined threshold, the range of the fixed-point number is too large. Therefore, the scaling factor may be reduced, for example, by dividing the scaling factor by the cardinal number (e.g., 2). For example, the radix point may be shifted left by one bit.”, where the number of occurrences, overflow rate, is compared to a threshold. If the fixed-point number is too large, the radix moves to the left, and where comparing the bit shift value to a threshold is determining if a parameter is greater than 0, since the threshold is a non-zero number, and the number of occurrences of overflow is accordingly adjusted based on the comparison of the bit shift value to the non-zero threshold.).

Regarding claim 22, Xu et al. in view of Migacz et al. teaches the elements of the claim. Migacz et al. further teaches the method wherein determining the maximum bit shift value includes comparing the calculated bit shift values of the parameters with each other updated in the t-th learning or inference process and determining a maximum value among the bit shift values updated in the t-th learning or inference process as the maximum bit shift value (Migacz et al. [Abstract; 0012; Claim 8] “Aspects of the present invention are directed to computer-implemented techniques for performing data compression and conversion between data formats of varying degrees of precision, and more particularly for improving the inferencing (application) of artificial neural networks using a reduced precision (e.g., INT8) data format…In one or more embodiments, generating the reduced-precision distributions (candidate conversions)… determining a number of candidate conversions for the plurality of candidate conversions; iteratively selecting a particular threshold from a plurality of saturation thresholds to correspond to a particular candidate conversion of the plurality of candidate conversions; merging data values from a consecutive sequence of bins from the histogram until a remaining number of bins in the histogram corresponds to a highest absolute value of the lower precision data format; and collecting the plurality of candidate conversions.”, where the INT8 data format is a fixed point format number, and where generating reduced-precision distributions or candidate conversions comprises calculating bit shift values, in which all reduced-precision distributions are merged until the remaining corresponds to same lower accuracy data format, where the reduced-precision distribution with the lowest accuracy corresponds to the maximum bit shift value.).
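For illustration only, the determinations recited in claims 21 and 22 (incrementing an overflow count by 1 for each bit shift value greater than 0, and comparing the bit shift values with each other to find the maximum) can be sketched as a single pass over the values. This is a minimal sketch and not the applicant's implementation; the function name and the requirement of a non-empty input list are assumptions.

```python
def summarize_bit_shifts(bit_shifts):
    """Return (overflow_count, max_bit_shift) for one iteration's values.

    Assumption: `bit_shifts` is a non-empty list of integers, one per
    parameter, where a positive value indicates overflow.
    """
    overflow_count = 0
    max_shift = bit_shifts[0]
    for s in bit_shifts:
        if s > 0:
            overflow_count += 1   # one more occurrence of overflow
        if s > max_shift:
            max_shift = s         # running maximum by pairwise comparison
    return overflow_count, max_shift
```

For the bit shift values 0, 2, -3, and 1, the result is an overflow count of 2 and a maximum bit shift value of 2.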

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Lin et al. (US 20160328645 A1) teaches a method for reducing the computational complexity for a fixed-point neural network involving balancing the amount of quantization error and overflow error when computing activations in the fixed-point neural network.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to IAN K ALLEYNE whose telephone number is (571)272-1327. The examiner can normally be reached 8:30 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached on 571-270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: 




/IAN K ALLEYNE/Examiner, Art Unit 2127

/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127