DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings
Figures 1-4 should be designated by a legend such as --Prior Art-- because only that which is old is illustrated. See specification [0026-0029], which describes these figures as prior art.  See MPEP § 608.02(g).  Corrected drawings in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. The replacement sheet(s) should be labeled “Replacement Sheet” in the page header (as per 37 CFR 1.84(c)) so as not to obstruct any portion of the drawing figures. If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
The abstract of the disclosure is objected to because it includes language that is not clear and concise.  The objectionable language is “the present invention relates to”, the “operation circuit may perform”, and “the operation accelerator may be configured” (emphasis added). Correction is required.  See MPEP § 608.01(b).I.C.

Claim Objections
Claim 11 is objected to because of the following informalities.

Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
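As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;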
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitations are: “operation groups”, “operation blocks”, “operation units”, and “vector calculation unit”.  The term “unit” has been interpreted to be a generic placeholder.  See MPEP § 2181, subsection I.A.  The terms “group” and “block” are being interpreted in a manner similar to “unit”, as generic placeholders.
Because this/these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
The limitation “operation unit” is being interpreted as in figures 14 and 15, and specification [0015], [0017] to comprise a storage unit, a multiplying circuit connected to the storage unit and input and further input and output connections as in figure 14, or a plurality of storage units, a multiplying circuit, a first section circuit connected to the 
The limitation “operation group” is being interpreted to comprise NxK operation units arranged as in figure 6 and including input and output connections. See [0058].
The limitation “operation block” is being interpreted to comprise N operation units arranged as in figure 6 and including input and output connections. See [0058].
The limitation “vector calculation unit” is being interpreted to comprise M*K operation units. See [0110].
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


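The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.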



Claims 2, 14, and 16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Claim 2 lines 2-3 recite “the adder tree”.  It is unclear what adder tree this refers to.  The same lines recite “the adder circuit comprises M*K adder trees, one adder tree is corresponding to one operation block”.  It is unclear whether “the adder tree” refers to the “M*K adder trees” collectively or to one of the adder trees that corresponds to one operation block, and if the latter, which one of those adder trees it is.
Claim 14 line 8 recites “the weight matrix”.  This limitation lacks antecedent basis. It is unclear to what matrix “the weight matrix” refers.  Claims 15-18 inherit the same deficiency as claim 14 by reason of dependence.
	Claim 16 line 3 recites “a weight matrix”.  It is unclear whether this weight matrix is the same weight matrix recited in claim 14 or a different weight matrix.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 14-18 are rejected under 35 U.S.C. 103 as being unpatentable over W.M Jose et al., Algorithm-oriented design of efficient many-core architectures applied to dense matrix multiplication, Analog Integr Circ Sig Process (2015), (hereinafter “Jose”) in view of J. Zhang et al., Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network, FPGA ’17, ACM, Feb 2017 (hereinafter “Zhang”).

Regarding claim 14, Jose teaches the following:
 the operation circuit reads a first matrix from a memory and respectively sends the M row vectors in the first matrix to the M operation groups, wherein the first matrix is an M*N matrix (Fig 2, buffer in DMA for memory, section 3, A for first matrix, Fig 4, section 4 for read/send from memory);
the operation circuit reads a second matrix from a memory and respectively writes the K column vectors of the weight matrix into the K operation blocks of each operation group, wherein the second matrix is an N*K matrix (Fig 2, buffer in DMA for memory, section 3, B for second matrix, Fig 4, section 4 for read/write a second matrix from memory); and
the operation circuit performs a matrix multiplication calculation of the first matrix and the second matrix within one clock cycle (Section 3 figure 1, section 4 second paragraph FPMA result every cycle for a matrix multiplication calculation 
Jose discloses reading a first and second matrix from memory but does not explicitly disclose distinct first and second memories.  However, in the same field of endeavor, Zhang discloses an accelerator for a convolutional neural network that performs matrix multiplication (abstract, section 4.1 third para.).  Zhang further discloses first and second memories storing first and second matrix data (Fig 4).  It would have been obvious to one of ordinary skill in the art before the effective filing date to configure Jose’s DMA to comprise first and second memories within the DMA as disclosed by Zhang.  As recognized by Zhang, using on-chip memories allows opportunities to reuse data from each on-chip memory (Section 4.1 third column).

Regarding claim 15, in addition to the teachings addressed in the claim 14 analysis, Jose teaches the following:
adding, by the adder circuit, calculation results of operation units belonging to a same operation block to obtain a calculation result of each operation block (fig 3, section 3, section 4 second paragraph).
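
For illustration only, the adding step recited above can be modeled in software as a pairwise reduction. This is a minimal sketch under stated assumptions, not a description of the claimed adder circuit or of Jose's hardware: the function name `adder_tree_sum`, the balanced tree shape, and the zero padding of odd-sized levels are all hypothetical choices made for the example.

```python
def adder_tree_sum(values):
    """Sum values pairwise, level by level, as a balanced adder tree might.

    Assumption: the tree shape and zero padding are illustrative only.
    """
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        if len(level) % 2:
            # Pad an odd-sized level with a zero input so every adder has two operands.
            level.append(0)
        # One tree level: each adder combines two neighbouring partial sums.
        level = [level[k] + level[k + 1] for k in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    # Three hypothetical operation-unit results belonging to one operation block.
    print(adder_tree_sum([7, 18, 33]))
```

In this model, one call corresponds to the reduction for a single operation block; a circuit with several blocks would use one such reduction per block.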

Regarding claim 16, Jose teaches the claim 14 limitations.  Jose further discloses application to compute-intensive applications such as scientific computing, but does not explicitly disclose application to a convolutional neural network.  However, in the same field of endeavor, Zhang discloses an accelerator that performs matrix multiplication for a convolutional neural network with multiplication of weights by input feature maps 

Regarding claim 17, in addition to the teachings addressed in the claim 14 analysis, Jose teaches the following:
wherein M = N = K (figure 1, section 3 second paragraph).

Regarding claim 18, in addition to the teachings addressed in the claim 14 analysis, Jose teaches the following:
	wherein any two parameters M, N, and K are equal (figure 1).


Allowable Subject Matter
Claims 1, 3-10, and 12-13 are allowed.  Claim 2 would be allowable if rewritten to overcome the respective rejection under 35 USC 112(b).  Claim 11 would be allowable if rewritten to overcome the claim objection.  The following is a statement of reasons for the indication of allowable subject matter.  
Applicant claims apparatus and methods for matrix multiplication.  The apparatus as in claim 1 comprises a first memory, a second memory, an operation circuit, and a controller, the controller being configured to: respectively write K column vectors of the second matrix into the K operation blocks of each operation group, wherein a jth piece of data in a gth column of vectors of the second matrix is written into a jth operation unit in a gth operation block in the K operation blocks; respectively send M row vectors of the first matrix to the M operation groups, wherein an ith row vector of the first matrix is sent to an ith operation group in the M operation groups, and a jth operation unit in each operation block in the ith operation group receives a jth piece of data in the ith row vector; and so that the adder circuit adds calculation results of operation units in each operation block to obtain a third matrix that is a product of the first matrix and the second matrix.
The primary reason for the indication of allowable subject matter is the specific structure of the matrix multiplying circuit as interpreted under 35 USC 112(f), and the specific steps performed by the controller in combination with the remaining limitations, specifically: 

the controller configured to respectively write K column vectors of the second matrix into the K operation blocks of each operation group, wherein a jth piece of data in a gth column of vectors of the second matrix is written into a jth operation unit in a gth operation block in the K operation blocks; respectively send M row vectors of the first matrix to the M operation groups, wherein an ith row vector of the first matrix is sent to an ith operation group in the M operation groups, and a jth operation unit in each operation block in the ith operation group receives a jth piece of data in the ith row vector.
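
For illustration only, the data placement recited above can be modeled in software as follows. This is a minimal sketch under stated assumptions, not the applicant's circuit: the function name `multiply_by_placement` and the nested-list layout are hypothetical, indices are zero-based, and sequential loops stand in for hardware that would operate in parallel.

```python
def multiply_by_placement(first, second):
    """Model the recited placement for an M x N first matrix and an N x K second matrix.

    Both matrices are lists of rows. All names here are hypothetical.
    """
    M, N = len(first), len(first[0])
    K = len(second[0])
    if len(second) != N:
        raise ValueError("inner dimensions must match")

    # Write step: in every operation group i, operation unit j of operation
    # block g stores the jth piece of data of the gth column of the second matrix.
    stored = [[[second[j][g] for j in range(N)] for g in range(K)]
              for _ in range(M)]

    third = [[0] * K for _ in range(M)]
    for i in range(M):        # the ith row vector is sent to the ith operation group
        row = first[i]
        for g in range(K):    # every operation block in the group receives that row
            # Unit j multiplies the jth piece of the row by its stored value.
            products = [row[j] * stored[i][g][j] for j in range(N)]
            # The adder circuit adds the unit results of one operation block.
            third[i][g] = sum(products)
    return third


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]          # M = 2, N = 3
    B = [[7, 8], [9, 10], [11, 12]]     # N = 3, K = 2
    print(multiply_by_placement(A, B))  # [[58, 64], [139, 154]]
```

Under this model, the sum produced by block g of group i is the entry in row i, column g of the third matrix, which is why the placement yields the product of the first and second matrices.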

Jose is the closest prior art found.  Jose discloses claimed subject matter in accordance with the claim 14-18 mappings above.  Jose does not, however, disclose the specific structure of the operation units as interpreted under 35 USC 112(f) as in figure 14 or figure 15.  Furthermore, Jose discloses fetching data and sending it to processors in a pre-determined order specified by the DMA and control logic, but does not explicitly disclose the specific order claimed, i.e., K column vectors of the second matrix into the K operation blocks of each operation group, wherein a jth piece of data in a gth column of vectors of the second matrix is written into a jth operation unit in a gth operation block in the K operation blocks; respectively send M row vectors of the first matrix to the M operation groups, wherein an ith row vector of the first matrix is sent to an ith operation group in the M operation groups, and a jth operation unit in each operation block in the ith operation group receives a jth piece of data in the ith row vector.
Zhang discloses an accelerator for a convolutional neural network that performs matrix multiplication (abstract, section 4.1 third para.).  Zhang further discloses an array of compute units, each comprising an array of processing elements that perform matrix multiplications in parallel (section 2.2, figure 2).  Zhang does not, however, disclose the specific structure of compute units or processing elements as interpreted under 35 USC 112(f) as in figure 14 or figure 15.  Furthermore, Zhang does not explicitly disclose the specific order claimed, i.e., K column vectors of the second matrix into the K operation blocks of each operation group, wherein a jth piece of data in a gth column of vectors of the second matrix is written into a jth operation unit in a gth operation block in the K operation blocks; respectively send M row vectors of the first matrix to the M operation groups, wherein an ith row vector of the first matrix is sent to an ith operation group in the M operation groups, and a jth operation unit in each operation block in the ith operation group receives a jth piece of data in the ith row vector.
US 20180173676 A1 Tsai et al. (hereinafter “Tsai”) discloses an execution engine for computing convolution, which performs matrix multiplications (abstract).  Tsai further discloses a matrix engine that includes first and second buffers that store the input image and filter matrix respectively (Fig 3, Fig 7, [0035-0037]).  Tsai does not, however, disclose the specific structure of compute units or processing elements as interpreted under 35 USC 112(f) as in figure 14 or figure 15.  Furthermore, Tsai does not explicitly disclose the specific order claimed, i.e., K column vectors of the second matrix into the K operation blocks of each operation group, wherein a jth piece of data in a gth column of vectors of the second matrix is written into a jth operation unit in a gth operation block in the K operation blocks; respectively send M row vectors of the first matrix to the M operation groups, wherein an ith row vector of the first matrix is sent to an ith operation group in the M operation groups, and a jth operation unit in each operation block in the ith operation group receives a jth piece of data in the ith row vector.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EMILY E LAROCQUE whose telephone number is (469)295-9289.  The examiner can normally be reached on 7:30 am - 5:00 pm, CST, every other Friday off.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor Aimee Li can be reached on 571-272-4169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  



/EMILY E LAROCQUE/Examiner, Art Unit 2182