DETAILED ACTION	

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 5, 6, 8, 9, 14, 18, and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Merrill et al., U.S. Patent Application Publication No. 2016/0335119 (hereinafter Merrill).

Regarding claims 1 and 14, taking claim 1 as exemplary, Merrill teaches a neural network processor [Batch neural network processor (BNNP). Merrill at paragraph 20; FIG. 2], comprising: 
a plurality of neurons [Inner product units (IPUs) 22. FIG. 2]; and 
a group partitioner and scheduler unit [D&A control logic 25. FIG. 2] configured to 
divide a workload for the neural network processor into a plurality of partitions [Control logic 25 issues commands to the job control logic 21 to control a plurality of jobs, thereby dividing the workload into a plurality of jobs (i.e. partitions). Merrill at paragraphs 21 and 36; FIGS. 2 and 5], and 
assign a group of the neurons to each of the plurality of partitions [Control logic 25 commands the job control logic 21 and thereby assigns a group of N IPUs to each of the jobs. Merrill at paragraph 21; FIG. 2]; and wherein the neurons within each group of neurons are configured to 
process the workload in an assigned partition to generate a partial output value [The IPUs within each group of N IPUs process the job and generate various results. Merrill at paragraphs 22-28; FIG. 2], and 
sum partial output values generated by the neurons in each group of neurons to generate an output value for the workload [The IPUs within each group of N IPUs “add the product to a result”. Merrill at paragraphs 25-30; FIG. 2].
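
For illustration only, the partition, assign, process, and sum flow recited in claim 1 can be sketched in software as follows. This is a minimal sketch under stated assumptions: the workload is modeled as paired input and weight sequences, each group of neurons is modeled as a single multiply-accumulate over its partition, and every function name is hypothetical. None of it is drawn from the application or from Merrill.

```python
# Hypothetical illustration only: names and data layout are assumptions,
# not taken from the claims or from Merrill.

def partition_workload(inputs, weights, num_partitions):
    """Divide paired input/weight sequences into contiguous partitions."""
    size = -(-len(inputs) // num_partitions)  # ceiling division
    return [
        (inputs[i:i + size], weights[i:i + size])
        for i in range(0, len(inputs), size)
    ]


def group_partial_output(partition):
    """Model one group of neurons: multiply-accumulate over its partition."""
    xs, ws = partition
    return sum(x * w for x, w in zip(xs, ws))


def process_workload(inputs, weights, num_partitions):
    """Return (output value, list of partial output values)."""
    partitions = partition_workload(inputs, weights, num_partitions)
    # One group is assigned to each partition; each yields a partial output.
    partials = [group_partial_output(p) for p in partitions]
    # Summing the partial outputs gives the output value for the workload.
    return sum(partials), partials
```

Under these assumptions, the summed partial outputs equal the result of processing the undivided workload, which is the property the final limitation relies on.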

Regarding claim 8, Merrill teaches a neural network processor [Batch neural network processor (BNNP). Merrill at paragraph 20; FIG. 2], comprising: 
a buffer storing an input volume and a weight volume [Merrill at paragraphs 23-24]; 
a plurality of neurons [Inner product units (IPUs) 22. FIG. 2]; and 
a group partitioner and scheduler [D&A control logic 25. FIG. 2] configured to 
partition the input volume and the weight volume into a plurality of partitions [Control logic 25 issues commands to the job control logic 21 to control a plurality of jobs, wherein the jobs comprise image input and weights, thereby partitioning the input volume and weights into a plurality of jobs (i.e. partitions). Merrill at paragraphs 21 and 36; FIGS. 2 and 5], and 
assign a group of the neurons to each of the plurality of partitions [Control logic 25 commands the job control logic 21 and thereby assigns a group of N IPUs to each of the jobs. Merrill at paragraph 21; FIG. 2]; and wherein the neurons within each group of neurons are configured to 
[The IPUs within each group of N IPUs process the job and generate various results. Merrill at paragraphs 22-28; FIG. 2], and 
sum partial output values generated by the neurons in each group of neurons to generate an output value for the workload [The IPUs within each group of N IPUs “add the product to a result”. Merrill at paragraphs 25-30; FIG. 2].

Regarding claims 5, 9, and 18, taking claim 5 as exemplary, Merrill teaches the neural network processor of claim 1, wherein the workload is divided into a plurality of partitions such that the number of neurons that can simultaneously process the workload is maximized [The workload is divided into M jobs such that the N IPUs can simultaneously process each of the M jobs (Merrill at paragraph 21); therefore, the number of IPUs that can simultaneously process the workload is maximized.].

Regarding claims 6 and 19, taking claim 6 as exemplary, Merrill teaches the neural network processor of claim 1, wherein processing the workload comprises performing a convolution operation on a portion of an input volume and a portion of a weight operation in the partition [Processing the jobs/workload comprises the IPUs (i.e. neurons) performing inner product, max pooling, average pooling, and/or local normalization (i.e. a convolution operation) on a portion of the input image and weights. Merrill at paragraph 32; see also paragraphs 22-30].

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
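A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.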

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 2-4, 10-12, and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Merrill in view of Shen et al., “Maximizing CNN Accelerator Efficiency Through Resource Partitioning” (hereinafter Shen).
Regarding claims 2, 10, and 15, taking claim 2 as exemplary, Merrill teaches the neural network processor of claim 1, wherein the workload comprises an input volume and a weight volume having height, width, and depth dimensions [The workload comprises input images (i.e. an input volume) and weights, which are inherently three dimensional (i.e. have height, width, and depth dimensions). See Merrill at paragraphs 23-24]. Merrill does not teach that the workload is partitioned along the depth dimension. In the same field of neural network processing, Shen teaches partitioning a workload in a convolutional layer processor, wherein the workload comprises an input volume and a weight volume having height, width, and depth dimensions [Shen at Fig. 3 and Section II], and wherein the workload is partitioned along the depth dimension [The workload is partitioned along the layer dimensions, including the depth dimension. Shen at Section IV, 1st paragraph; Fig. 1]. Shen teaches that partitioning the workload along the layer dimensions increases computational efficiency and overall throughput [Shen at Abstract]. It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the partitioning of Merrill to be along the layer dimensions, including the depth dimension, as taught by Shen because doing so would increase computational efficiency and overall throughput.

Regarding claims 3, 11, and 16, taking claim 3 as exemplary, Merrill teaches the neural network processor of claim 1, wherein the workload comprises an input volume and a weight volume having height, width, and depth dimensions [The workload comprises input images (i.e. an input volume) and weights, which are inherently three dimensional (i.e. have height, width, and depth dimensions). See Merrill at paragraphs 23-24]. Merrill does not teach that the workload is partitioned along the height dimension. In the same field of neural network processing, Shen teaches partitioning a workload in a convolutional layer processor, wherein the workload comprises an input volume and a weight volume having height, width, and depth dimensions [Shen at Fig. 3 and Section II], and wherein the workload is partitioned along the height dimension [The workload is partitioned along the layer dimensions, including the height dimension. Shen at Section IV, 1st paragraph; Fig. 1]. Shen teaches that partitioning the workload along the layer dimensions increases computational efficiency and overall throughput [Shen at Abstract]. It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the partitioning of Merrill to be along the layer dimensions, including the height dimension, as taught by Shen because doing so would increase computational efficiency and overall throughput.

Regarding claims 4, 12, and 17, taking claim 4 as exemplary, Merrill teaches the neural network processor of claim 1, wherein the workload comprises an input volume and a weight volume having height, width, and depth dimensions [The workload comprises input images (i.e. an input volume) and weights, which are inherently three dimensional (i.e. have height, width, and depth dimensions). See Merrill at paragraphs 23-24]. Merrill does not teach that the workload is partitioned along the width dimension. In the same field of neural network processing, Shen teaches partitioning a workload in a convolutional layer processor, wherein the workload comprises an input volume and a weight volume having height, width, and depth dimensions [Shen at Fig. 3 and Section II], and wherein the workload is partitioned along the width dimension [The workload is partitioned along the layer dimensions, including the width dimension. Shen at Section IV, 1st paragraph; Fig. 1]. Shen teaches that partitioning the workload along the layer dimensions increases computational efficiency and overall throughput [Shen at Abstract]. It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the partitioning of Merrill to be along the layer dimensions, including the width dimension, as taught by Shen because doing so would increase computational efficiency and overall throughput.
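
For illustration only, partitioning a volume along its height, width, or depth dimension, as discussed for claims 2-4, can be sketched as follows. This is a minimal sketch that assumes the volume is a nested list indexed as volume[height][width][depth]. The function name and the indexing order are hypothetical and are not drawn from the application, Merrill, or Shen.

```python
# Hypothetical illustration only: the [height][width][depth] layout and the
# function name are assumptions, not taken from the cited references.

def partition_along(volume, axis, parts):
    """Split a 3-D nested-list volume into slabs along the named axis."""
    height = len(volume)
    width = len(volume[0])
    depth = len(volume[0][0])
    extent = {"height": height, "width": width, "depth": depth}[axis]
    size = -(-extent // parts)  # ceiling division: slab thickness
    slabs = []
    for start in range(0, extent, size):
        stop = min(start + size, extent)
        if axis == "height":
            slab = [[cell[:] for cell in row] for row in volume[start:stop]]
        elif axis == "width":
            slab = [[cell[:] for cell in row[start:stop]] for row in volume]
        else:  # depth
            slab = [[cell[start:stop] for cell in row] for row in volume]
        slabs.append(slab)
    return slabs
```

Each slab keeps the full extent of the two dimensions that are not partitioned, so under these assumptions a depth partition yields slabs of the full height and width but reduced depth.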

Claims 7, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Merrill in view of Dhong et al., U.S. Patent No. 7,137,021 (hereinafter Dhong).

Regarding claims 7, 13, and 20, taking claim 7 as exemplary, Merrill teaches the neural network processor of claim 1. Merrill does not teach that the plurality of neurons are powered down following generation of the output values for the workload. In an analogous field of processing, Dhong teaches a processor comprising functional units that are powered down when there are no instructions for the respective functional units [Dhong at column 2, lines 15-38; column 5, lines 11-22; FIGS. 1 and 4]. Dhong teaches that powering down the functional units when not needed reduces power consumption, reduces cooling demand, and improves reliability [Dhong at column 1, lines 24-29]. It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Merrill’s neural network processor so that the IPUs (i.e. neurons), which are functional units, are powered down when there are no operations/instructions to execute (i.e. following generation of the output values for the workload) because it would reduce power consumption, reduce cooling demand, and improve reliability.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN P GEIB whose telephone number is (571)272-8628.  The examiner can normally be reached on Monday - Friday 8:30 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ALEXEY SHMATOV, can be reached on (571)270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/BENJAMIN P GEIB/Primary Examiner, Art Unit 2123