Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

EXAMINER'S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

The application has been amended as follows: 

	In the Title:
Replace Title with “Accelerator Comprising Input and Output Controllers for Feeding Back Intermediate Data Between Processing Elements via Cache Module”.

Allowable Subject Matter
Claims 1-3 and 5-15 are allowed.

The following is an examiner’s statement of reasons for allowance:

	None of the cited prior art of record appears to teach or suggest, in combination with the other recited features, wherein the control module is configured to control at least one output control unit of at least one processing module and one or more input control units of one or more processing modules to feed back data processed by the at least one processing module to the one or more processing modules via the cache module.
	After consideration of Applicant’s Remarks, pp. 5-7, filed 5/23/2022, the presented arguments are found to be persuasive.
The closest prior art of record, Zejda, discloses an accelerator comprising a plurality of processing elements 341, where each kernel 340 of the accelerator comprises an interconnect that couples an input data path and an output data path. In particular, interconnect 340 is coupled to the input of a read circuit 346 and the output of a cache 348, where the contents of cache 348 may originate from the write circuit 352 or RAM 226 [C6, L66 to C7, L42].
Control logic 342 controls the various circuits of the processing elements 341, including 344, 346, 356, 350, and 352 [C7, L43 to C8, L3].
Zejda further discloses an embodiment in which 210 and 116 are integral, but does not teach or suggest a control module wherein the control module is configured to control at least one output control unit of at least one processing module and one or more input control units of one or more processing modules to feed back data processed by the at least one processing module to the one or more processing modules via the cache module as currently recited in claim 1. 
Dally discloses an accumulator buffer which corresponds to a cache module for storing output data for reuse as input data. However, Dally does not appear to cure the deficiency of Zejda because Dally is similarly silent as to a control module that controls the one or more input control units to receive data from one or more output control units of the processing modules via a cache module.
	Henry US 2018/0189639 [Fig. 22] discloses a common memory acting as a cache module which is configured to store data output from a plurality of processing units of an accelerator, and to feed back such data to one or more processing units of the accelerator. See also C-nodes feeding back output of Z-nodes [Fig. 40]. However, Henry similarly appears silent as to a control module which is configured to control an input unit and an output unit of respective processing modules to perform such feedback between one or more source processing modules and one or more destination processing modules via the cache module.
	Barik US 2018/0307980 discloses control logic on SoC 1300 for controlling data flow to and from buffers for the purpose of organizing processing of a convolution operation into multiple subunits which can be performed by corresponding compute units in a GPGPU. An instruction to perform a computation employing specialized hardware in the accelerator includes input addresses and output addresses of the buffers to specify filter and/or kernel data [0185-0188]. Such data may comprise intermediate data which is stored in the buffers for transmission between processing modules [0058]. However, Barik is similarly silent as to a control module that controls the processing modules to feed back data using an input control unit and an output control unit.
	Guo’s “Survey of FPGA-Based Neural Network Inference Accelerator” discloses that processing modules in a neural network may transfer data in a daisy-chained manner [11:13, S5.2.2]. A daisy chain refers to the output of one element being connected locally to the input of the next element. Guo further discloses that a controller on the FPGA handles communication with the host and controls all other modules on the FPGA [S5.3; Fig. 4]. However, Guo similarly does not specifically disclose the controller controlling an input control unit and an output control unit to perform this function using a cache module.
	Yu US 2018/0046913 discloses an accelerator system comprising a cache module [Figs. 4-6] used to feed output data from a Nonlinear layer of a compute array back to an Add layer. The DDR memory may also perform such feedback for the Conv and Adder Tree layer. The buffer is managed by a controller for the Computation and Buffer blocks. However, Yu appears silent as to each processing module of the compute array comprising a respective input control unit and output control unit, and as to the control module specifically feeding back data between an input and output unit of the processing modules via the cache module.
	Hence, although using output data as input data for another layer or processing element (i.e., feeding back output) was known, and individual components of a neural network accelerator were known in the art, the cited prior art does not appear to teach or suggest the combination of a control module, cache module, and I/O control units arranged to feed back intermediate data as presented in the claims.
Accordingly, claim 1 is allowed. Claims 2-3 and 5-15 recite similar subject matter and are allowed on similar grounds.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HEWY H LI whose telephone number is (571)272-8714. The examiner can normally be reached Mon-Fri 10-6.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Rones, can be reached at (571)272-4085. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/HEWY H LI/Examiner, Art Unit 2136
/CHARLES RONES/Supervisory Patent Examiner, Art Unit 2136