Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office action is in response to the request for continued examination filed 6/14/2022. Claims 1-20 are pending.

Response to Arguments
Applicant's arguments filed 6/14/2022 have been fully considered but they are not persuasive.
Regarding the support for the claim amendment to claims 1 and 10, in particular, “an amount of calculation”:
In response: Paragraph 26 of the specification includes the term “calculation amount” but does not provide a further definition. The term is so broad that it reads on any computational activity (e.g., loss, storage). In the event of an amendment, claims 1 and 10 may recite the calculation as described in paragraph 33 (i.e., loss with forward and backward propagation).
Claim Objections
(1) Claims 1 and 10 are objected to. The antecedent basis of the term “data” in “fixed-point numbers of data” is not clear. The claims may be amended to recite data comprising the weights and biases.
(2) Claims 4, 6, 9, 13, 15, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 3, 5, 7, 8, 10, 11, 12, 14, 16, 17, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (US 20190050710), hereinafter Wang, in view of Gupta et al. (“Deep Learning with Limited Numerical Precision”, ML, 2015, pages: 10, https://arxiv.org/pdf/1502.02551.pdf), hereinafter Gupta.

1. A method of accelerating deep learning (Wang: e.g., [0069], speeding up the calculation in model training), comprising: 

reducing a number of bits of fixed-point numbers of data in a plurality of layers in the deep neural network that have an amount of calculation greater than a predetermined threshold from n bits to m bits, where m and n are integers and m<n (Wang: e.g., [0072], computing the information loss for each layer, expressed with different degrees of quantization corresponding to different reduced bit-width, where [0072], computing with the reduced bit-width at two or more layers interprets reducing a number of bits of fixed-point numbers of data in a plurality of layers in the deep neural network, [0071], an original bit-width interprets n bits, [0072], bit width reduction interprets from n to m bits with n and m being integers and m<n, [0072], calculation of information loss “until a predefined information loss being met” interprets having an amount of calculation greater than a predetermined threshold, and where [0071], the bit-width of a floating-point with a given length, such as, 32-bit, is an example of a fixed point number), and 
maintaining data in remaining layers among the plurality of layers as n-bit fixed-point numbers (Wang: e.g., [0072], for two layers (or, [0067], L being equal to 2), the condition that in the first layer the predefined information loss threshold is not met and in the second layer the predefined information loss threshold is met interprets reducing the bit-width of a layer and maintaining data in remaining layers among the plurality of layers as n-bit fixed-point numbers); and 
training the deep neural network after the reducing and the maintaining, until convergence (Wang: e.g., [0004], training comprising the initial training, and quantization with validation, test and calibration data interprets training the deep neural network after the reducing and the maintaining, where “until a suitable combination of reduced bit-widths that produces an acceptable level of model accuracy is identified” interprets “until convergence”). 
Wang does not expressly disclose “randomly initializing” in “randomly initializing weights and biases of a deep neural network as n-bit fixed-point numbers”. However, in the same field of endeavor, Gupta discloses this limitation (Gupta: e.g., page 4, sec 4, par 1, parameters and variables represented in the fixed-point format with random initialization). Random initialization is a common practice in the training of neural networks, and Wang does not impose constraints on initialization for bit-width adaptation. It would have been obvious for one of ordinary skill in the art, having Wang and Gupta before the effective filing date, to combine Gupta with Wang to incorporate such a commonly adopted parameter initialization in Wang.
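
For illustration only, the limitations of claim 1 as read above can be sketched in Python. This is a minimal sketch under stated assumptions, not the applicant's implementation and not code from Wang or Gupta: the function names (to_fixed_point, assign_bit_widths), the per-layer calculation amounts, the threshold, and the choice of n = 16 and m = 8 are all hypothetical.

```python
import random

def to_fixed_point(value, total_bits, frac_bits):
    """Round value onto a signed fixed-point grid and saturate at the range limits."""
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    q = max(q_min, min(q_max, round(value * scale)))
    return q / scale

def assign_bit_widths(calc_amounts, threshold, n_bits, m_bits):
    """Layers whose calculation amount exceeds the threshold get m bits; the rest keep n bits."""
    if not m_bits < n_bits:
        raise ValueError("m must be smaller than n")
    return [m_bits if amount > threshold else n_bits for amount in calc_amounts]

# Hypothetical per-layer calculation amounts (assumed units, e.g. operation counts).
calc_amounts = [1_000, 50_000, 80_000, 2_000]
widths = assign_bit_widths(calc_amounts, threshold=10_000, n_bits=16, m_bits=8)

# Random initialization as n-bit fixed-point numbers (16 bits, 12 fractional bits assumed).
rng = random.Random(0)
weights = [[to_fixed_point(rng.uniform(-1, 1), 16, 12) for _ in range(4)]
           for _ in calc_amounts]

# Reduce only the selected layers; 4 integer bits (including sign) are kept in both formats.
reduced = [[to_fixed_point(w, bits, bits - 4) for w in layer]
           for layer, bits in zip(weights, widths)]
```

In this sketch the second and third layers are reduced to 8 bits while the first and fourth remain at 16 bits; training after the reduction is not shown.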
 
10. The claim is substantially the same as claim 1 and is therefore rejected for the same reason. In addition, Wang discloses an apparatus (e.g., [0006], processing apparatus).

2. and 11., further comprising: performing a preliminary training for the deep neural network (Wang: e.g., [0018], Fig 1, training the deep learning model, [0021], via iterative training to reach a predefined convergence, interprets performing a preliminary training for the deep neural network for a predetermined number of rounds, where [0023], Fig 1, bit-width reduction being performed after the training process interprets the reducing is performed with respect to the deep neural network after the performing of the preliminary training). Gupta discloses “after the initializing” (Gupta: e.g., page 4, sec 4, par 1, training after random initialization).

3. and 12., wherein in the preliminary training and the training, a fixed-point number format of data is automatically adjusted according to a size of the data when the data overflows (Gupta: e.g., page 8, truncation of excess MSB bits after detection of overflows, where the MSB bits interpret a size of data).  
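
For illustration only, one possible reading of the limitation “a fixed-point number format of data is automatically adjusted according to a size of the data when the data overflows” is sketched below. The sketch assumes the total bit width stays fixed and that fractional bits are traded for integer bits until the data fits; this adjustment rule and the names fits and adjust_frac_bits are hypothetical and are not taken from Gupta or from the specification.

```python
def fits(value, total_bits, frac_bits):
    """Return True if value is representable without overflow in the given signed format."""
    q = round(value * (1 << frac_bits))
    return -(1 << (total_bits - 1)) <= q <= (1 << (total_bits - 1)) - 1

def adjust_frac_bits(values, total_bits, frac_bits):
    """Lower the fractional-bit count until every value fits; return the new count."""
    while frac_bits > 0 and not all(fits(v, total_bits, frac_bits) for v in values):
        frac_bits -= 1
    return frac_bits

# 3.2 overflows an 8-bit format with 6 fractional bits, so one fractional bit is given up.
new_frac_bits = adjust_frac_bits([0.5, 3.2], total_bits=8, frac_bits=6)
```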

5. and 14., wherein during the training, an extreme value of a loss term which is calculated in a forward propagation is calculated by gradient descent method in a backward propagation (Wang: e.g., [0018], the training including forward propagation through a plurality of layers of the deep learning model and backward propagation through the plurality of layers of the deep learning model, [0060], with the search for the optimal solution interprets during the training, an extreme value of a loss term which is calculated in a forward propagation is calculated by gradient descent method in a backward propagation).  

7. and 16., wherein in the training, before an operation is performed between data having bit numbers of decimal parts that are different, a data precision conversion is performed for the data according to a difference between the bit numbers of the decimal parts of the data (Wang: e.g., [0051], with regularization to the weights of layers, “the training will take the decimals of the weights as penalty, and this can push all the full-precision (e.g., FP32) weights in the network toward their corresponding integer values after the training” interprets before an operation is performed between data having bit numbers of decimal parts that are different, a data precision calculation is performed for the data according to a difference between the bit numbers of the decimal parts of the data). Gupta discloses “conversion” (Gupta: e.g., page 3, col 2, “2. Convert”).
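
For illustration only, a precision conversion “according to a difference between the bit numbers of the decimal parts” can be sketched as aligning two raw fixed-point integers to a common fractional-bit count before adding them. The helper names align and fixed_add, and the choice to convert to the larger fractional-bit count, are assumptions made for this sketch and are not drawn from Wang, Gupta, or the specification.

```python
def align(q_a, frac_a, q_b, frac_b):
    """Shift the operand with fewer fractional bits left by the difference."""
    diff = frac_a - frac_b
    if diff > 0:
        return q_a, q_b << diff, frac_a
    return q_a << -diff, q_b, frac_b

def fixed_add(q_a, frac_a, q_b, frac_b):
    """Add two fixed-point values after converting them to a common format."""
    a, b, frac = align(q_a, frac_a, q_b, frac_b)
    return a + b, frac

# 1.5 with 4 fractional bits is the raw integer 24; 0.25 with 8 fractional bits is 64.
raw_sum, frac = fixed_add(24, 4, 64, 8)
value = raw_sum / (1 << frac)
```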

Serial No.: 16/251,471

8. and 17., wherein in the initializing, a corresponding fixed-point number format is set for the data according to types of parameters in the deep neural network to which the data belongs (Wang: e.g., [0071], the bit-width representation of the parameters, including the weights and biases interprets a corresponding fixed-point number format is set for the data according to types of parameters in the deep neural network to which the data belongs).

19. A deep neural network, comprising: an input layer, to receive data to be processed by the deep neural network; an output layer, to output a result after processing by the deep neural network; multiple hidden layers coupled between the input layer and the output layer, the multiple hidden layers being designed according to functions to be implemented by the deep neural network (Wang: e.g., [0018], a deep learning model, such as, CNN comprising multiple hidden layers and parameters, [0044], Fig 4, with training). With respect to “wherein the deep neural network is trained by the method according to claim 1”, the reasons for rejection given for claim 1 are incorporated herein.

20. The claim is directed to a non-transitory computer readable storage medium storing a program which, when being executed, causes a computer to implement the method according to claim 1. The reasons for rejection given for claim 1 are incorporated herein by reference. In addition, Wang discloses a non-transitory computer readable storage medium (e.g., [0006], a non-transitory computer readable storage medium).


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure, e.g.,  
Reference Ginsburg et al. teaches reducing precision data to prevent overflow values and thus, the concept of reducing data with respect to a threshold in claims. Reference Tomono teaches adjusting the bit width for the integer and decimal portion when overflow rate exceeds a specified value and thus, the concept of updating fixed-point numbers with respect to decimals in the claims.



Any inquiry concerning this communication or earlier communications from the examiner should be directed to LiWu Chang whose telephone number is (571)270-3809, email: li-wu.chang@uspto.gov. The examiner can normally be reached M-F. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda M Huang can be reached on (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LI WU CHANG/
Primary Examiner, Art Unit 2124
September 1, 2022