DETAILED ACTION
Claims 1-2, 4-5, 7-12, 15-20, and 22-24 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections/Notes
Claim 1 is objected to because of the following informalities:
Given the amendments to lines 2-3, these lines should be condensed to --a memory to store one or more channels…--.
In line 9, delete “a” before “circuitry”.
Re-insert --and-- before the 2nd to last paragraph, which sets forth the last component of the integrated circuit.
Claim 2 is objected to because of the following informalities:
In line 1, replace “claim” with --Claim-- to match the other dependent claims.
Claim 8 is objected to because of the following informalities:
In line 4, it appears that “same” should be deleted.
Claim 11 is objected to because of the following informalities:
The examiner notes the selecting, accessing, and convolving steps are performed “in a processor coupled to accelerator circuitry” (line 3).  However, do FIGs.4A-5 not show that these steps are performed by the accelerator?  What FIGs/paragraphs establish the basis for the processor and accelerator circuitry in this claim?  For instance, in claim 18, these same steps are performed by the accelerator circuitry, not a processor.  Claim 11 also claims distributing being performed by a processor, whereas in claim 18, the accelerator circuitry is performing the distribution.  Please review these claims and clean up any inconsistencies, if they exist.  If not, the examiner would appreciate applicant pointing to FIG(s) showing the processor coupled to the accelerator circuitry, with the relevant tasks being performed by the respective component.
Re-insert --and-- before the 2nd to last paragraph (prior to the last step of the method).
In the 2nd to last paragraph, spell out MAC before using the abbreviation for the first time.
Claim 18 is objected to because of the following informalities:
In line 8, “an input buffer…” does not agree, grammatically, with “the accelerator circuitry configured to:” in line 3.  Perhaps line 9 could start with --receive, by an input buffer, one or more channels…--, assuming the input buffer is part of the accelerator circuitry.
Re-insert --and-- before the 2nd to last paragraph.
In the 2nd to last paragraph, applicant claims convolving in a MAC array coupled to the accelerator circuitry.  If the MAC array is coupled to the accelerator circuitry, then the MAC array is not the accelerator circuitry.  However, line 3 sets forth that the accelerator circuitry is configured to perform the subsequent steps, including the convolving.  Thus, the claim contradicts itself.
In the last paragraph, spell out MAC before using the abbreviation for the first time.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 12 and 15 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Referring to claim 12, applicant claims that the circuitry (which per claim 1 does not include the input buffer and memory) and the at least one MAC array comprise a CNM circuit block.  However, from FIG.4A, the circuitry and MAC array alone do not comprise a CNM circuit block.  Instead, these two components plus the memory and the input buffer make up the CNM circuit block.  While it would be supported to say a CNM block comprises the circuitry and MAC array (since the CNM is open-ended and can comprise other things, i.e., the memory and input buffer), it is not supported that the circuitry and a MAC array form/comprise a CNM circuit block.  Alone, they do not.  They form a portion of a CNM circuit block, as originally disclosed.
Referring to claim 15, applicant claims each CNM block integrates the memory with the input buffer and the at least one array of MAC units.  From FIGs.4A-5, it appears that each CNM block 404 has its own memory, input buffer and MAC array.  Thus, it appears to be new matter to now claim that each block includes the same memory, buffer, and MAC array (such that the blocks share components).

The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-2, 4-5, 7-12, 15-20, and 22-24 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
In the last paragraph of each of claims 1, 11, and 18, applicant claims enabling “efficient access”.  The term “efficient” is a relative term which renders the claim indefinite.  “Efficient access” is not defined by the claim, nor does the specification appear to provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
The claims recite the following limitations for which there is a lack of antecedent basis:
In claim 2, last line, “the weight”.  There is a weight in claim 1, and other weights in claim 2, line 5.  The examiner again recommends claiming this weight as a first weight.
In claim 7, “the respective arrays of MAC units”.  Per claim 1, there may be just one array.  It appears that applicant would need to establish that there are multiple CNM circuit blocks, each with a respective array of MAC units.
In claim 12, last line, “the weight” for similar reasons as above.
In claim 15, “the at least one array of MAC units of each CNM circuit block”.  Lines 4-6 do not necessarily set forth at least one array of MAC units of each CNM block.
In claim 18, 3rd to last paragraph, “the input activations from the input buffer”.  Are these the activations of line 7 or those streamed to the input buffer in lines 8-9?  The examiner recommends replacing “from the input buffer” with --streamed to the input buffer--.
In claim 19, “the input activations from the input buffer” for similar reasons, but additionally because claim 18 introduces even more input activations streamed to the input buffer.
In claim 20, last line, “the weight” for similar reasons as above.
In claim 22, “the at least one array of MAC units of each CNM circuit block”.  Applicant never previously sets forth that each CNM block includes such an array.
In claim 24, “the input activations from the input buffer” for similar reasons set forth above for claim 18.
All dependent claims are rejected due to their dependence on an indefinite claim.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 11-12, 16, 18-20, and 23 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Nair, U.S. Patent No. 11,120,328.
Referring to claim 11, Nair has taught a computer-implemented method for accelerating a convolutional neural network (CNN) comprising:
a) in a processor (see FIG.2, 130) coupled to accelerator circuitry (see FIG.2B, 140 (any portion thereof being accelerator circuitry) and/or examples 16 and 20 in columns 36-37, which set forth the processor executing code to cause the steps to occur):
a1) distributing, in row-wise order, one or more channels of a same filter row of a filter to a memory coupled to the accelerator circuitry (see FIG.4, each row stores a filter vector of the same weight for each channel (e.g. see column 11, line 62, to column 12, line 5).  So, basically, eight channels of a same filter row (row 1 including w1,1 (see FIG.4, 402)) are distributed to memory 212), each channel of the same filter row to be stored contiguously in the memory, row by row, in a channel-wise order (again, each row stores a vector; hence the channels are stored contiguously.  And, as can be seen in FIG.4, a single weight (w1,1) in all channels of all filters is stored.  Subsequently, in FIG.11, a next single weight in order (w1,2) in all channels of all filters is stored.  Thus, the storage will occur row by row.  After all three in row 1 are stored, the next three in row 2 are stored, and so on), the filter containing weights for convolving input activations of a convolution layer of the CNN (the filter 402 includes weights for convolving in a CNN (see various locations throughout Nair, including the abstract));
a2) selecting, from the memory coupled to the accelerator circuitry, a weight from a stored filter row (from FIG.6, weight W1,1 is selected);
a3) accessing, from an input buffer (FIG.6, 214) coupled to the accelerator circuitry, input activations (e.g. FIG.6, X1,1 to X1,8) of input activation rows of the convolution layer streamed to the input buffer in channel-wise order (again, from FIG.4, a row of input activations (e.g. x1,1, x1,2, and x1,3 408) is streamed to buffer 214 where the activations are stored in channel-wise order (the bottom row of buffer 214 includes channels 1 to 8 of x1,1, the next row includes channels 1 to 8 of x1,2, and so on)), the accessing of the input activations based on a stride input (see FIGs.6-8 and 14-16 and note that when the stride changes, the selected activation changes) and a weight position of the selected weight (as long as there are weights in inherent positions to be multiplied, an input activation is selected);
a4) convolving, in at least one array of MAC units coupled to the accelerator circuitry, the accessed input activations with the selected weight to generate a partial sum for the convolution layer of the CNN (see FIGs.6 and 11.  The dot product (multiplication-accumulation) for convolution is performed; note that column 230 shows examples of partial sums that are calculated and added to other sums); and
a5) wherein the accelerator circuitry enables efficient access to the memory and the input buffer by the at least one array of MAC units to accelerate the convolving of input activations of the convolution layer of the CNN (from FIG.2, the hardware forms an accelerator, and the access to the components may be considered efficient access).
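For illustration only, the channel-wise contiguous storage and the multiply-accumulate step discussed above for claim 11 can be sketched as follows.  This is a minimal Python sketch under stated assumptions, not code taken from Nair or from the application: the function names flatten_filter_row and mac_partial_sum, the three-weight filter row, and the two-channel sizes are all hypothetical.

```python
def flatten_filter_row(filter_row):
    """Store one filter row contiguously, weight position by weight position.

    filter_row[pos][ch] is the (hypothetical) weight at position `pos` of the
    row for channel `ch`; all channels of one position end up adjacent.
    """
    memory = []
    for weights_at_pos in filter_row:
        memory.extend(weights_at_pos)
    return memory


def mac_partial_sum(memory, activations, pos, num_channels):
    """Multiply-accumulate the selected weight position across all channels."""
    base = pos * num_channels  # start of the contiguous channel run
    acc = 0
    for ch in range(num_channels):
        acc += memory[base + ch] * activations[ch]
    return acc


# Three weight positions, two channels each (illustrative values only).
filter_row = [[1, 2], [3, 4], [5, 6]]
memory = flatten_filter_row(filter_row)
# Partial sum for weight position 1 against one two-channel activation vector.
partial = mac_partial_sum(memory, [10, 20], pos=1, num_channels=2)
```

In this sketch, the partial sum for one weight position would later be added to the partial sums of the other positions to form one output value.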
Referring to claim 12, Nair has taught the computer-implemented method of Claim 11, further comprising:
a) receiving the stride input in a stride control circuit of the accelerator circuitry, the stride input representing a stride by amount applied in the stride control circuit to shift access to an input row vector of the input activations streamed to the input buffer by buffer positions of the input buffer equal to the stride by amount (see FIGs.6-8 and 14-16.  The stride is input to shift access to an input row (e.g. to the X1 row in the example shown)); and
b) accessing the input activations from the input buffer based on the stride by amount and the weight position of the selected weight as applied to the input buffer by the stride control circuit (see FIGs.6-8 and 14-16 and note that when the stride changes, the selected activation changes.  Also, as is known with dot product (multiply accumulation), when a given weight is selected, so too will inputs be selected), the weight position of the selected weight relative to weight positions of neighboring weights of the stored filter row from which the weight was selected (this is inherent.  Each weight has a position relative to other weights.  W1,1 is the top left weight which is 1 above and 1 left of neighboring weights).
Referring to claim 16, Nair has taught the computer-implemented method of Claim 12, wherein the memory includes any of a static random access memory (SRAM) and a register file (RF) (see FIG.2A, which shows the near memory including a register file (with registers 212, 214, 220)).
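For illustration only, the dependence of the accessed buffer position on the stride and the weight position can be sketched as follows.  This is a minimal Python sketch assuming the common convolution indexing rule (buffer position = output index x stride + weight position); the function name select_activation and the five-entry input row are hypothetical, and neither Nair nor the application is asserted to use this exact formula.

```python
def select_activation(input_row, out_index, stride, weight_pos):
    """Return the buffered activation that pairs with the selected weight.

    Assumed rule: the stride shifts access by `stride` buffer positions per
    output, and the weight position adds a fixed offset within the window.
    """
    buffer_pos = out_index * stride + weight_pos
    return input_row[buffer_pos]


# Labels stand in for channel vectors held in the input buffer.
input_row = ["x1,1", "x1,2", "x1,3", "x1,4", "x1,5"]

# Same weight position (0), different strides: the selected activations change.
stride1 = [select_activation(input_row, o, 1, 0) for o in range(3)]
stride2 = [select_activation(input_row, o, 2, 0) for o in range(3)]

# Same stride (2), next weight position (1): access shifts by one position.
stride2_pos1 = [select_activation(input_row, o, 2, 1) for o in range(2)]
```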
Claim 18 is rejected for similar reasons as claim 11.
Claim 19 is rejected for some of the reasoning set forth in the rejection of claim 12.
Claim 20 is rejected for some of the reasoning set forth in the rejection of claim 12.
Claim 23 is rejected for similar reasons as claim 16.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 15 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Nair in view of the examiner’s taking of Official Notice.
Referring to claim 15, Nair has taught the computer-implemented method of claim 11, but has not taught arranging the accelerator circuitry into a systolic array of compute near memory (CNM) circuit blocks, each CNM circuit block integrating the memory to which the one or more channels of the same filter row was distributed with the input buffer and the at least one array of MAC units; and assembling partial sums generated in the at least one array of MAC units coupled to each module into an output feature map for the convolution layer.  However, using a systolic array for convolution is well known and accepted in the art.  That is, one partial sum could be pumped into the next element to create a next accumulation (partial sum), which is then pumped into the next element to create an even further partial sum, and so on, until the final result is achieved.  A systolic array is generally faster and more scalable than a processor performing the same function.  As an accelerator (not a processor) is doing the convolution in Nair, a systolic array is a logical choice for carrying out the convolution for increased speed.  Further, implementing compute near memory is known in the art so as to limit the transfer time of data for computation.  As a result, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Nair for arranging the modules of the accelerator circuitry into a systolic array of compute near memory (CNM) circuit blocks, each CNM circuit block integrating the memory to which the one or more channels of the same filter row was distributed with the input buffer and the at least one array of MAC units; and assembling partial sums generated in the at least one array of MAC units coupled to each module into an output feature map for the convolution layer.  Note that Nair’s output is an output feature map as is known with convolution for a CNN.
Claim 22 is rejected for similar reasons as claim 15.
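For illustration only, the systolic behavior referred to in the rejection of claim 15, in which each element adds its own product to an incoming partial sum and pumps the result to the next element, can be sketched as follows.  This is a minimal Python sketch of a one-dimensional chain with hypothetical names and values; it models only the data flow and is not asserted to reflect any particular hardware in Nair or the application.

```python
def systolic_chain(weights, activations):
    """Pass a running partial sum through a chain of processing elements.

    Each (weight, activation) pair models one element holding its operands
    locally; `trace` records the partial sum leaving each element.
    """
    partial = 0
    trace = []
    for w, x in zip(weights, activations):
        partial = partial + w * x  # element adds its product, forwards the sum
        trace.append(partial)
    return trace


# Three elements; the last entry of the trace is the final result.
trace = systolic_chain([1, 2, 3], [4, 5, 6])
```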

Allowable Subject Matter
Claims 1-2, 4-5, and 7-10 are allowable over the prior art.
Claims 17 and 24 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Response to Arguments
On pages 13-14 of applicant’s response, applicant argues that “near memory” is not indefinite based on the disclosure.
The examiner respectfully disagrees.  The nearness of a “near memory”, as in original claim 1 for instance, is unclear based on the specification.  However, deletion of this language from the independent claims renders this rejection moot.  Applicant has added a “compute near memory (CNM) circuit block” to at least claim 7.  While “near memory” is still present, the examiner notes a different context.  Here, applicant is claiming the name (“CNM”) of a circuit block whose structure, as shown in FIG.4A, is definite. Thus, no 112 rejection is applied here. 

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to David J. Huisman whose telephone number is 571-272-4168.  The examiner can normally be reached on Monday-Friday, 9:00 am-5:30 pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at 571-270-3995.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/David J. Huisman/Primary Examiner, Art Unit 2183