DETAILED ACTION
This action is written in response to the application filed 9/24/18. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter
Claims 2-21 are allowable over the prior art, but are rejected on the grounds of double patenting. Below are the closest cited references, each of which discloses various aspects of the claimed invention:
Ovtcharov et al. (“Accelerating deep convolutional neural networks using specialized hardware”, cited by Applicant in IDS dated 1/14/19) discloses, inter alia, a hardware convolutional neural network system using a data buffering scheme which re-circulates computed output layer values to the input buffers for the next round of layer computation.
Seide et al. (US 2014/0142929 A1, cited by Applicant in IDS dated 1/14/19) discloses, inter alia, a deep neural network system using dynamic batch sizing.
However, none of the prior art references of record—alone or in combination—disclose or suggest the combined features recited in the independent claims, including specifically (for claim 2):
obtaining weights for the layer, wherein the weights have an associated reuse value that defines an amount of reuse of the weights for the layer;
selecting, based on the batch size and the reuse value of the weights, a particular quantity of inputs in the first batch to be processed at the layer using the weights, wherein the particular quantity of inputs is selected so as to not exceed: i) the batch size for the layer, or ii) an amount that the weights are permitted to be reused based on the reuse value.
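The selecting limitation quoted above can be read as choosing a count bounded by two limits. The following is only an illustrative sketch of that reading, not the claimed implementation; the function name and the available_inputs parameter are hypothetical and do not appear in the claims.

```python
def select_input_count(available_inputs: int, batch_size: int, reuse_value: int) -> int:
    """Pick how many inputs of the first batch to process with the current weights.

    Hypothetical helper: the result never exceeds the layer's batch size or the
    number of times the weights are permitted to be reused.
    """
    if min(available_inputs, batch_size, reuse_value) < 0:
        raise ValueError("all arguments must be non-negative")
    # The smallest of the three bounds satisfies both claimed limits at once.
    return min(available_inputs, batch_size, reuse_value)
```

Under this reading, a layer with a batch size of 8 whose weights may be reused 4 times would process at most 4 inputs per weight load.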

Any comments considered necessary by Applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled "Comments on Statement of Reasons for Allowance."

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 2-7, 9-15, and 17-21 are rejected on the ground of nonstatutory double patenting as being unpatentable over the claims of U.S. Patent No. 9,842,293. Although the claims at issue are not identical, they are not patentably distinct from each other for the reasons identified in the table below. Dependent claims 8 and 16 are not rejected on the grounds of double patenting, but are objected to as being dependent upon a rejected parent claim.

Comparison of 16/139258 (this application) with US 9,842,293 (15/389345):

This application:
2. A method for performing neural network computations using a hardware circuit, the method comprising:
‘293 patent:
1. A method for generating a respective neural network output for each of a plurality of inputs, wherein the generating comprises processing each input through each of a plurality of neural network layers to generate the respective neural network output for the input, wherein the neural network layers are arranged in a directed graph structure, and wherein each neural network layer has a respective batch size, the method comprising, for each of the neural network layers:

This application:
obtaining a first batch of inputs to be processed at a layer of a neural network, wherein the layer has an associated batch size;
‘293 patent:
receiving a plurality of inputs to be processed at the neural network layer;

This application:
obtaining weights for the layer, wherein the weights have an associated reuse value that defines an amount of reuse of the weights for the layer;
‘293 patent:
forming one or more batches of inputs from the plurality of inputs, each batch having a number of inputs equal to the respective batch size for the neural network layer, where the respective batch size is based at least on a weight reuse value, the weight reuse value representing a number of times that weight inputs need to be reused for a compute time of output values using the weight inputs at a hardware matrix computation unit of a neural network hardware circuit to be longer than a load time of the weight inputs from memory;

This application:
selecting, based on the batch size and the reuse value of the weights, a particular quantity of inputs in the first batch to be processed at the layer using the weights, wherein the particular quantity of inputs is selected so as to not exceed: i) the batch size for the layer, or ii) an amount that the weights are permitted to be reused based on the reuse value; and
‘293 patent:
selecting a number of the one or more batches of inputs to process, where a count of the inputs in the number of the one or more batches is greater than, less than, or equal to the respective associated batch size of a subsequent layer in the directed graph structure; and

This application:
processing, using the weights, the particular quantity of inputs to generate a layer output.
‘293 patent:
processing, at the neural network hardware circuit and using the hardware matrix computation unit, the number of the one or more batches of inputs to generate the respective neural network layer output.

Each limitation in claim 2 of this application has a corresponding limitation in claim 1 in the ‘293 patent, as illustrated in the table above. Independent claims 10 and 18 recite analogous limitations to those in claim 2.
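The weight reuse value recited in claim 1 of the ‘293 patent (the number of reuses needed for compute time to be longer than the load time of the weights) can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: times are expressed as whole clock cycles, each reuse costs the same number of cycles, and the function and parameter names are hypothetical rather than taken from either document.

```python
def min_weight_reuse(load_cycles: int, compute_cycles_per_use: int) -> int:
    """Smallest reuse count n such that n * compute_cycles_per_use > load_cycles.

    Assumes integer cycle counts and a constant compute cost per reuse;
    both are simplifications made for illustration only.
    """
    if load_cycles < 0 or compute_cycles_per_use <= 0:
        raise ValueError("load_cycles must be >= 0 and compute_cycles_per_use > 0")
    # Integer division gives the largest n with n * compute <= load;
    # one more reuse makes the compute time strictly longer than the load time.
    return load_cycles // compute_cycles_per_use + 1
```

For instance, if loading weights takes 1000 cycles and each use takes 100 cycles, 10 reuses only match the load time, so 11 are needed for compute time to be longer.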


The correspondence in dependent claims is illustrated in the table below.
Comparison of 16/139258 (this application) with US 9,842,293 (15/389345):

This application:
3. The method of claim 2, further comprising:
 based on a threshold fetch time of accessing memory of the hardware circuit to obtain new weights for the layer.
‘293 patent:
weight reuse value, the weight reuse value representing a number of times that weight inputs need to be reused for a compute time of a load time of the weight inputs from memory;
[The Examiner interprets “fetch time” in claim 3 as being equivalent to the “load time... from memory” in claim 1 of ‘293.]

This application:
determining a number of times the hardware circuit is permitted to reuse weights for the layer before a compute time of reusing the weights the number of times with distinct activation inputs is at least equal to a fetch time of accessing new weight inputs for the layer.
‘293 patent:
[From claim 1]
where the respective batch size is based at least on a weight reuse value, the weight reuse value representing a number of times that weight inputs need to be reused for a compute time of output values using the weight inputs at a hardware matrix computation unit of a neural network hardware circuit...

This application:
5. The method of claim 4, wherein selecting the particular quantity of inputs comprises:
selecting the particular quantity of inputs such that the compute time of reusing the weights does not exceed the fetch time of accessing new weight inputs for the layer.
‘293 patent:
[From claim 1] where the respective batch size is based at least on a weight reuse value, the weight reuse value representing a number of times that weight inputs need to be reused for a compute time of output values using the weight inputs at a hardware matrix computation unit of a neural network hardware circuit to be longer than a load time of the weight inputs from memory;
[The Examiner interprets “fetch time” in claim 5 as being equivalent to the “load time... from memory” in claim 1 of ‘293.]

This application:
6. The method of claim 2, wherein the hardware circuit comprises an array of compute cells and processing the particular quantity of inputs to generate the layer output comprises:
reusing, by two or more respective compute cells in the array, the weights for the layer and an activation input in the particular quantity of inputs over a first processor clock cycle and a second subsequent processor clock cycle.
‘293 patent:
[From claim 1] where the respective batch size is based at least on a weight reuse value, the weight reuse value representing a number of times that weight inputs need to be reused for a compute time of output values using the weight inputs at a hardware matrix computation unit of a neural network hardware circuit to be longer than a load time of the weight inputs from memory;
[The Examiner notes that claim 1 of ‘293 recites iterative reuse of weights, and this means their reuse over subsequent processor clock cycles. Thus, this feature is inherent in claim 1 of ‘293.]

This application:
7. The method of claim 6, wherein processing the particular quantity of inputs to generate the layer output comprises:
processing multiple independent activation inputs while reusing weights that are loaded in the array for
‘293 patent:
receiving a plurality of inputs to be processed at the neural network layer;
weight reuse value, the weight reuse value representing a number of times that weight inputs need to be reused for a compute time of output values using the weight inputs at a hardware matrix computation unit of a neural network hardware circuit...
[The Examiner notes that the hardware matrix computation unit of claim 1 of ‘293 is equivalent to the recited array of weight values in claim 2.]

This application:
a number of arithmetic units included in the hardware circuit; or a number of channels included in a memory of the hardware circuit that is used to store multiple batches of inputs to be processed at one or more layers of the neural network.
‘293 patent:
[From claim 4] where processing the number of the one or more batches of inputs comprises computing accumulated values for each input using the hardware matrix computation unit.
[The Examiner notes that any element which computes accumulated values, as in claim 4 of ‘293, is equivalent to an arithmetic unit. See also ‘293 specification, col. 10, line 9 et seq.: “in some implementations, the circuit calculates, e.g., using arithmetic circuitry, a least common multiple of batch sizes across all layers in the neural network.”]

Each of claims 3-7 and 9 in this application has a corresponding claim in the ‘293 patent, as illustrated in the table above. Dependent claims 11-15, 17 and 19-21 each correspond directly to one of dependent claims 3-7 and 9.
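The ‘293 specification passage quoted above refers to calculating a least common multiple of batch sizes across all layers. A minimal sketch of that arithmetic follows, for illustration only; the function name and the example batch sizes are hypothetical and are not drawn from the ‘293 patent.

```python
from functools import reduce
from math import gcd


def lcm_of_batch_sizes(batch_sizes: list[int]) -> int:
    """Least common multiple of the per-layer batch sizes (all must be positive)."""
    if not batch_sizes or any(b <= 0 for b in batch_sizes):
        raise ValueError("batch_sizes must be a non-empty list of positive integers")
    # lcm(a, b) = a * b // gcd(a, b), folded pairwise across every layer.
    return reduce(lambda a, b: a * b // gcd(a, b), batch_sizes)
```

With hypothetical per-layer batch sizes of 2, 3, and 4, the least common multiple is 12, a count of inputs that divides evenly into whole batches at every layer.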


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Vincent Gonzales whose telephone number is (571) 270-3837. The examiner can normally be reached on Monday-Friday 7 a.m. to 4 p.m. MT.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571) 270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained 

/Vincent Gonzales/
Primary Examiner, Art Unit 2124