DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Applicant’s amendment filed on June 03, 2021 has been considered and entered.
Accordingly, claims 1-10 are pending in this application. Claims 1 and 6 are currently amended; claims 2-5 and 7-10 are previously presented.
Specification
The following document is incorporated by reference: Song Han et al., “EIE: Efficient Inference Engine on Compressed Deep Neural Network,” ISCA 2016:243-254. Pursuant to MPEP 608.01(p), the examiner is requiring the applicant to provide a copy of the incorporated reference. If the applicant wants this reference to be printed on the front page of a potential patent publication, a submission of an information disclosure statement is required.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitations are: 
Convolution and pooling unit for performing a convolution and pooling operation […] in claim 1.
Full connection unit for performing a full connection calculation […] in claim 1.
Convolution unit for performing a multiplication operation […] in claim 2.
Adder tree unit for accumulating output results […] in claim 2.
Nonlinear unit performing a nonlinear processing […] in claim 2. 
Pooling unit for performing a pooling operation […] in claim 2. 
Input vector buffer unit for buffering […] in claim 3. 
Pointer information buffer unit for buffering […] in claim 3. 
Weight information buffer unit for buffering […] in claim 3. 
Output buffer unit for buffering […] in claim 3. 
Activation function unit for performing an activation function operation […] in claim 3. 

The following is the interpretation of the 112(f) limitations:
a.	Convolution and pooling unit (claim 1): See Fig. 4 and paragraph [0046-0052].
b.	Full connection unit (claim 1): See Fig. 5 and paragraph [0054-0061].
c.	Convolution unit (claim 2) See Fig. 4 reference symbol convolver and paragraph [0049].
d.	Adder tree unit (claim 2) See Fig. 4 reference symbol adder tree and paragraph [0050].
e.	Nonlinear unit (claim 2) See Fig. 4 reference symbol Nonlinear and paragraph [0051].
f.	Pooling unit (claim 2): See Fig. 4 reference symbol pooling and paragraph [0052].
g.	Input vector buffer unit (claim 3): See Fig. 5 reference symbol ActQueue and paragraph [0056].
h.	Pointer information buffer unit (claim 3): See Fig. 5 reference symbol PtrRead and paragraph [0057].
i.	Weight information buffer unit (claim 3): See Fig. 5 reference symbol SpmatRead and paragraph [0058].
j.	Output buffer unit (claim 3): See Fig. 5 reference symbol ActBuffer and paragraph [0060].
k.	Activation function unit (claim 3): See Fig. 5 reference symbol Function and paragraph [0061].
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitations to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations 
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-5 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites “their state machines” in lines 17-18. It is unclear what the term “their” refers to. It is unclear whether “their” refers to the plurality of convolution and pooling units or whether “their” refers to both the plurality of convolution and pooling units and full connection unit. Perhaps applicant may want to recite “corresponding state machines” instead. Claims 2-5 inherit the same deficiency as claim 1 by reason of dependence.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-10 are rejected under 35 U.S.C. 103 as being unpatentable over Guo et al.¹ (NPL - "From Model to FPGA: Software-Hardware Co-Design for Efficient Neural Network Acceleration"), hereinafter Guo, based on the combination of features from different embodiments in the publication, and in view of Dally et al. (US 20180046906 A1), hereinafter Dally, and Kato et al. (US-PGPUB 20110239032 A1), hereinafter Kato.
Regarding claim 1, Guo teaches a convolutional neural network (CNN) accelerator comprising:
a controller (Guo page 15, controller); a convolution and pooling unit (Guo page 15, PE; page 16 shows the structure of one of the PEs of page 15);
	Although the Aristotle architecture for CNN acceleration of Guo does not teach a full connection unit, the Descartes architecture of Guo teaches a different processing element for a full connection operation (Guo page 19 figure on the right).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to modify the Aristotle architecture of Guo using the Descartes architecture and configure the processing elements by cascading the Aristotle processing element and the Descartes processing element. The output buffer of the Aristotle processing element shown in page 16 may serve as the input buffer to the Descartes processing element. By doing so, the control unit will also be configured to control the operation and functionality of the different processing elements. Further, the accelerator will also be configured to perform compression on the weight matrix information input to the Descartes processing element since it is designed for sparse neural network acceleration.

Therefore, the combination of the two neural network accelerators of Guo teaches a full connection unit.
Accordingly, Guo as modified teaches a neural network accelerator comprising:
a […] convolution and pooling unit, […] for performing a convolution and pooling operation, for a first iteration number of times, on a first plurality of sub-blocks of input data in parallel in accordance with convolution parameter information to obtain an input vector of a sparse neural network;
a full connection unit for performing a full connection calculation, for a second iteration number of times, on the input vector in accordance with weight matrix position information of a full connection layer to finally obtain a calculation result of the sparse convolutional neural network, wherein each input vector comprises a second plurality of sub-blocks, and wherein the full connection unit performs a full connection operation on the second plurality of sub-blocks in parallel; and
a controller for determining and sending the convolution parameter information and the weight matrix position information of the full connection layer to the convolution and pooling unit and the full connection unit respectively, and controlling reading of the input vectors on respective iterative levels in plurality of convolution and pooling units and the full connection unit and their state machines.
Guo as modified does not explicitly teach a plurality of convolution and pooling units, each for performing a convolution and pooling operation. Further, Guo as modified does not explicitly teach the controller synchronizing signals between each of the plurality of convolution and pooling units and the full connection unit.
However, in the same field of endeavor, Dally teaches an accelerator for sparse convolutional neural networks comprising an array of processing elements, each of which is configured to perform convolution operations. Further, Dally teaches that each processing element comprises a multiplier array, an accumulator array, and a post-processing unit that performs a non-linear activation function and a pooling function (Dally Fig. 2A and 3A and paragraphs [0038, 0044-0045, and 0076]).
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to modify the accelerator of Guo using Dally and configure multiple processing elements of the Aristotle architecture shown in Guo page 15 to have the same structure as the processing element shown in page 16 of Guo, which comprises a convolver unit, an adder tree unit, a non-linear processing (ReLU) unit, and a pooling unit, consistent with the teaching of Dally where multiple processing elements are configured to include the same structure for performing the same functionality.
The motivation to do so is to increase parallelism beyond a single processing element (PE) by using multiple PEs that are operated in parallel working on different tiles/sub-blocks of input activations/data (Dally paragraph [0104]).
Therefore, the combination of Guo as modified in view of Dally teaches a plurality of convolution and pooling units, each for performing a convolution and pooling operation.
Guo as modified in view of Dally does not explicitly teach the controller synchronizing signals between each of the plurality of convolution and pooling units and the full connection unit.
However, in the same field of endeavor, Kato teaches a convolutional neural network (CNN) processing unit that includes a control unit for controlling the operations of the CNN processing unit 
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to modify the accelerator of Guo in view of Dally using Kato and configure the controller to synchronize the operations of the convolution and pooling units and the full connection unit.
The motivation to do so is that, by operating in synchronization with each other, the product-sum operations and loading of the data can be pipelined (Kato paragraph [0058]).
Therefore, the combination of Guo as modified in view of Dally and Kato teaches the controller synchronizing signals between each of the plurality of convolution and pooling units and the full connection unit.
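For illustration only, the arrangement addressed in claim 1 can be sketched in software: a controller dispatches sub-blocks to several convolution and pooling units operating in parallel and passes the assembled input vector to the full connection stage only after every unit has finished. This is a software analogy under stated assumptions, not the hardware of Guo, Dally, or Kato. The names `conv_pool_unit`, `full_connection_unit`, and `controller` are hypothetical, and a thread pool stands in for the claimed synchronizing signals.

```python
from concurrent.futures import ThreadPoolExecutor


def conv_pool_unit(sub_block):
    # Hypothetical stand-in for one convolution and pooling unit:
    # apply ReLU to each value, then max-pool the whole sub-block.
    return max(max(0, x) for x in sub_block)


def full_connection_unit(input_vector, weights):
    # Dense matrix-vector product; sparsity is ignored in this sketch.
    return [sum(w * x for w, x in zip(row, input_vector)) for row in weights]


def controller(sub_blocks, fc_weights):
    # Run one unit per sub-block in parallel. Collecting map() into a list
    # blocks until every unit is done and preserves input order, which acts
    # as the synchronization point before the full connection stage.
    with ThreadPoolExecutor(max_workers=max(1, len(sub_blocks))) as pool:
        input_vector = list(pool.map(conv_pool_unit, sub_blocks))
    return full_connection_unit(input_vector, fc_weights)


result = controller([[1, -2], [3, 4], [-5, -6]], [[1, 1, 1], [2, 0, 1]])
```

In this sketch the full connection stage never observes a partially assembled input vector, which is the property the synchronization limitation is directed to.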

Regarding claim 2, Guo as modified in view of Dally and Kato teaches all the limitations of claim 1 as stated above. Further, Guo as modified in view of Dally and Kato teaches 
wherein each of the convolution and pooling units comprises:
a convolution unit for performing a multiplication operation of the first plurality of sub-blocks of input data and a convolution parameter (Guo page 16 convolution unit – Convolvers, input data – Data, convolution parameter – weights);
an adder tree unit for accumulating output results of the convolution to complete a convolution operation (Guo page 16 adder tree – Adder Tree);
a nonlinear unit for performing a nonlinear processing on the output results of the convolution unit (Guo page 16 nonlinear unit – ReLu); and
a pooling unit for performing a pooling operation on the output results of the convolution unit after the nonlinear processing to obtain the input data on the next iterative level or finally obtain the input vector of the sparse neural network (Guo page 16 pooling unit – Pool; obtain the input data on the next iterative level – intermediate data being fed back to the adder tree for next iterative level, finally obtain the input vector of the sparse neural network – output of the Output Buffer). The motivation to combine is the same as claim 1.
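The order of operations mapped above for claim 2 (multiplication, accumulation, nonlinear processing, pooling) can be illustrated with a minimal sketch. It assumes one-dimensional data, a ReLU nonlinearity, and non-overlapping max pooling. All function names are hypothetical, and the feedback path for intermediate data shown in Guo page 16 is not modeled.

```python
def convolve_products(window, weights):
    # Convolution unit: element-wise products of one input window and the kernel.
    return [x * w for x, w in zip(window, weights)]


def adder_tree(values):
    # Adder tree unit: pairwise reduction of the products down to a single sum.
    values = list(values)
    while len(values) > 1:
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            paired.append(values[-1])  # odd element passes through to the next level
        values = paired
    return values[0]


def relu(x):
    # Nonlinear unit.
    return max(0, x)


def max_pool(values, size):
    # Pooling unit: non-overlapping max pooling.
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def conv_pool_1d(data, weights, pool_size):
    # Chain the four stages in the order recited in the claim.
    k = len(weights)
    activated = [
        relu(adder_tree(convolve_products(data[i:i + k], weights)))
        for i in range(len(data) - k + 1)
    ]
    return max_pool(activated, pool_size)


output = conv_pool_1d([1, 3, 2, 5, 4], [-1, 1], 2)
```

The returned list corresponds to the input data for the next iterative level or, on the last level, to the input vector passed to the full connection unit.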

Regarding claim 3, Guo as modified by Dally and Kato teaches all the limitations of claim 1 as stated above. Further, Guo as modified in view of Dally and Kato teaches wherein the full connection unit further comprises:
an input vector buffer unit for buffering the input vector of the sparse neural network (Guo page 19 input vector buffer unit - Act_0);
a plurality of process elements, each process element comprising (Guo page 19 shows a plurality of process elements. The figure on the right in page 19 shows M PEs comprising the same structure):
a pointer information buffer unit for buffering compressed pointer information of the sparse neural network in accordance with the weight matrix position information of the full connection layer (Guo page 19 pointer information buffer unit – PtrRead_0);
a weight information buffer unit for buffering compressed weight information of the sparse neural network in accordance with the compressed pointer information of the sparse neural network (Guo page 19 weight information buffer – SpmatRead_0);
an arithmetic logic unit (ALU) for performing a multiplication-accumulation calculation in accordance with the compressed weight information and the input vector of the sparse neural network (Guo page 19 arithmetic logic unit – Mac unit is used for multiplication-addition calculation in accordance with compressed weight information from SpmatRead_0 buffer and the input vector from the Act_0 buffer); and
an output buffer unit for buffering an intermediate calculation result and a final calculation result of the ALU (Guo page 19 output buffer unit – ActBuf_0); and
an activation function unit for performing an activation function operation on the final calculation result in the output buffer unit to obtain the calculation result of the sparse convolutional neural network (Guo page 19 activation function unit – sigmoid).
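To make the relationship between the pointer information buffer and the weight information buffer concrete, the following minimal sketch assumes a compressed-column layout in which pointer entries delimit, per column, a run of (position index, weight value) pairs. This layout is an assumption for illustration only. The encoding actually used in Guo or in the application may differ, and the function name `compress_columns` is hypothetical.

```python
def compress_columns(dense):
    """Compress a dense weight matrix (list of rows) column by column.

    Returns (pointers, entries), where entries holds (row_index, weight)
    pairs for the nonzero weights and pointers[j]:pointers[j + 1] delimits
    the entries that belong to column j.
    """
    n_rows = len(dense)
    n_cols = len(dense[0]) if dense else 0
    pointers = [0]
    entries = []
    for j in range(n_cols):
        for i in range(n_rows):
            if dense[i][j] != 0:
                entries.append((i, dense[i][j]))
        pointers.append(len(entries))  # end of column j, start of column j + 1
    return pointers, entries


pointers, entries = compress_columns([[0, 2, 0],
                                      [1, 0, 0],
                                      [0, 3, 4]])
```

Under this assumption, the pointer array plays the role of the compressed pointer information and the entry list plays the role of the compressed weight information that is fetched in accordance with it.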

Regarding claim 4, Guo as modified by Dally and Kato teaches all the limitations of claim 2 as stated above. Further, Guo as modified in view of Dally and Kato teaches wherein the adder tree unit further adds a bias in accordance with the convolution parameter information, in addition to accumulating output results of the convolution unit (Guo page 16 adder tree takes in a bias as input).

Regarding claim 5, Guo as modified by Dally and Kato teaches all the limitations of claim 3 as stated above. Further, Guo as modified in view of Dally and Kato teaches wherein the compressed weight information of the sparse neural network comprises a position index value and a weight value, and (compressed weight information comprises weight value and index value)
the ALU is further configured to:
perform a multiplication operation of the weight value and a corresponding element of the input vector (Mac unit is used for multiplication calculation in accordance with compressed weight information from SpmatRead_0 buffer which comprises a weight value and a corresponding element of the input vector from the Act_0 buffer),
read data in a corresponding position in the output buffer unit in accordance with the position index value, and add the data to the calculation result of the multiplication operation above (Guo page 19 data from the ActBuf_0 is read by the Mac unit as shown by the data line coming from the ActBuf_0 going to the Mac unit specified by the index), and
write the calculation result of the addition into the corresponding position in the output buffer unit in accordance with the position index value (Guo page 19 result of the Mac unit calculation is written to the ActBuf_0 specified by the index). 
Further, Dally also teaches compressed weight information comprising a position index and a weight value (Fig. 3A 305 and 315 and paragraphs [0066 and 0068]); an ALU configured to perform a multiplication operation of the weight value and a corresponding element of the input vector (Fig. 2A ALU – FxI multiplier array 325 and accumulator array 340, which perform a MAC operation similar to the MAC unit of Guo); reading data in a corresponding position in the output buffer unit in accordance with the position index value, adding the data to the calculation result of the multiplication operation above, and writing the calculation result of the addition into the corresponding position in the output buffer unit in accordance with the position index value (Dally Fig. 3D and paragraph [0093]). The motivation to combine is to exploit weight and/or activation sparsity to reduce energy consumption and improve processing throughput (Dally paragraph [0030]).
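The three ALU steps recited in claim 5 (multiply, read the output buffer at the position index and add, write the sum back to the same position) can be illustrated with the following minimal sketch. It assumes a compressed-column layout of pointers and (position index, weight value) pairs. The layout, the sample data, and the name `sparse_full_connection` are assumptions for illustration and are not taken from Guo or Dally.

```python
def sparse_full_connection(pointers, entries, input_vector, n_outputs):
    # Output buffer holding intermediate results, one slot per output position.
    output_buffer = [0.0] * n_outputs
    for j, element in enumerate(input_vector):
        # The pointer information selects the weights paired with element j.
        for index, weight in entries[pointers[j]:pointers[j + 1]]:
            product = weight * element                 # multiplication operation
            partial = output_buffer[index]             # read at the position index
            output_buffer[index] = partial + product   # add and write back
    return output_buffer


# Hypothetical compressed form of the matrix [[0, 2, 0], [1, 0, 0], [0, 3, 4]].
pointers = [0, 1, 3, 4]
entries = [(1, 1), (0, 2), (2, 3), (2, 4)]
result = sparse_full_connection(pointers, entries, [1, 2, 3], 3)
```

After the last input element has been processed, the output buffer holds the final calculation result on which an activation function would operate.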

Claims 6-10 are directed to a method practiced by the apparatus of claims 1-5. All steps performed by the method of claims 6-10 would be practiced by the corresponding apparatus of claims 1-5. Analysis of claims 1-5 applies equally to claims 6-10 respectively.

Response to Arguments
In response to applicant’s arguments with respect to the objection to the specification, applicant stated that the incorporated reference was included as an attachment. However, a copy of the incorporated reference has not been submitted, and there is no copy of the reference on file.
In response to applicant’s arguments with respect to the 35 U.S.C. 112(b) rejection of claims 1-5, applicant amended claim 1 to recite the plurality of convolution and pooling units and the full 
Applicant’s arguments, see remarks page 9, filed 06/03/2021, with respect to the rejection of claims 1-10 under 35 U.S.C. 103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground of rejection is made in view of the amendments made and a newly found prior art reference.
In response to applicant’s arguments with respect to the 35 U.S.C. 103 rejection of claims 1-10, applicant amended claim 1 to include the feature of “a controller for synchronizing signals between each of the plurality of convolution and pooling units and the full connection unit”. Applicant argued that neither Guo nor Dally teaches the added feature of a controller synchronizing signals between each of the plurality of convolution and pooling units and the full connection unit. Examiner agrees. However, in the same field of endeavor, Kato discloses a convolutional neural network (CNN) processing unit that includes a control unit, as shown in Fig. 2, for controlling the operation of the CNN processing unit, including synchronizing loading or supplying data to the multipliers and adders. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the controller of Guo in view of Dally using the teaching of Kato and configure the controller to synchronize operation of the convolution and pooling units and full connection unit of Guo, such as synchronizing loading the input data to the convolvers and adder tree of each of the convolution and pooling units and the MAC unit of the full connection unit, for the purpose of pipelining the product-sum operations as disclosed by Kato.
In response to applicant’s arguments with respect to the 35 U.S.C. 103 rejection of claims 2-10, applicant amended independent method claim 6 to include a feature similar to that of amended claim 1. Applicant relied on the claim 1 arguments for claims 2-10. The rejections of claims 2-10 have been modified for the same reason as claim 1.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CARLO C WAJE whose telephone number is (571)272-5767.  The examiner can normally be reached on 7:30-4:30 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li, can be reached at (571) 272-4169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/C.W./
Carlo Waje
Examiner, Art Unit 2182
(571)272-5767


/Aimee Li/
Supervisory Patent Examiner, Art Unit 2183

1 The publication being relied upon as the ground of rejection is a PowerPoint presentation, and the presentation mostly contains figures without providing sufficient detail on every structure shown in the figures. A different publication by the same author with a similar title, published after the filing date of the application, provides a more detailed explanation of the structure and operation of the neural network accelerators presented in the reference being relied upon.