DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

	The instant application having Application No. 16/833,610 has a total of 25 preliminary amended claims pending in the application; there are 3 independent claims and 22 dependent claims, all of which are ready for examination by the examiner.

INFORMATION CONCERNING OATH/DECLARATION
Oath/Declaration
The applicant’s oath/declaration has been reviewed by the examiner and is found to conform to the requirements prescribed in 37 C.F.R. 1.63.

INFORMATION CONCERNING DRAWINGS
Drawings
The applicant’s drawings submitted are acceptable for examination purposes.

ACKNOWLEDGEMENT OF REFERENCES CITED BY APPLICANT
As required by M.P.E.P. 609(C), the applicant’s submission of the Information Disclosure Statement dated 09/29/2022 is acknowledged by the examiner and the cited references have been considered in the examination of the claims now pending. As required by M.P.E.P. 609 C(2), a copy of the PTOL-1449 initialed and dated by the examiner is attached to the instant office action.
	
REJECTIONS BASED ON PRIOR ART

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


1.         Claims 1-24 and 33 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Culurciello et al. (US pub. 2018/0341495), hereinafter, “Culurciello”.

At the outset, Applicant is reminded that claims subject to examination will be given their broadest reasonable interpretation in light of the supporting disclosure. In re Morris, 127 F.3d 1048, 1054-55, 44 USPQ2d 1023,1027-28 (Fed. Cir. 1997). With this in mind, the discussion will focus on how the terms and relationships between the terms in the claims are met by the references.

2.         As per claim 1, Culurciello discloses an acceleration circuit (accelerator 100/200 of figs. 1 and 2) comprising: a first buffer circuit (maps cache 132 of fig. 2) storing maps data [see paragraph 0036, which discloses “the maps cache 132 and the weights cache 136 are implemented using one or more static random access memory (SRAM) buffers that are integrated into the compute unit 124”]; a first data network (see bidirectional arrows between maps cache 132 and vMAC 128) coupled to the first buffer circuit (see crossbar in fig. 3 and paragraph 0051); a second buffer circuit (weights cache 136) storing kernel data (see paragraph 0036); a second, serial data network (see arrows between maps cache 132 and vMAC 128) coupled to the second buffer circuit (see fig. 2); a first plurality of multiply and accumulate circuits arranged in a first array and coupled through the first data network to the first buffer circuit (see vMAC 128D comprising a plurality of MACs 238) and coupled through the second, serial data network to the second buffer circuit (see fig. 2 and paragraphs 0036 and 0043), each multiply and accumulate circuit of the first plurality of multiply and accumulate circuits comprising: a multiplier circuit (multiplier 240) to multiply a maps datum and a kernel datum to generate a multiplicative product (see paragraph 0043, which discloses part of a trace stored in the maps cache 132, hence a map datum; the other input is a weight value from the weight trace cache 136, hence a kernel datum); and a first adder circuit (adder 242) coupled to the multiplier (see fig. 2 and paragraph 0043); a shift register (shift register 252) coupled to the first plurality of multiply and accumulate circuits (see fig. 2 and paragraph 0045); a first control multiplexer adapted to provide a selected output in response to a first mode control word, the selected output comprising a bias parameter or a first or next accumulation sum (see fig. 2 and paragraphs 0045 and 0055, which disclose the second input of the gather adder 258 to be a previously accumulated sum or a bias term B, depending on the selected mode; the selection of this second input is implicitly performed by a multiplexer under the control of the selection of ‘cooperative operating mode’ or ‘independent operation mode’ by the control core 104); and a second adder circuit (gather adder 258) coupled to the shift register and to the first control multiplexer, the second adder circuit adapted to add the multiplicative product to the bias parameter or to the first accumulation sum to generate a second or next accumulation sum (see fig. 2 and paragraph 0045).
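
By way of illustration only, the data path recited in claim 1 (multiplier, first adder, control multiplexer and second adder) can be sketched behaviorally in Python. This is a minimal sketch under stated assumptions, not a description of the circuitry of Culurciello or of the application; the function names (mac_products, control_mux, second_adder) and the boolean select_bias flag standing in for the mode control word are hypothetical.

```python
from typing import Sequence


def mac_products(maps: Sequence[float], kernel: Sequence[float]) -> float:
    """Multiply each maps datum by its kernel datum and sum the products.

    Models the multiplier followed by the first adder (assumed behavior).
    """
    total = 0.0
    for maps_datum, kernel_datum in zip(maps, kernel):
        total += maps_datum * kernel_datum
    return total


def control_mux(select_bias: bool, bias: float, acc_sum: float) -> float:
    """Hypothetical control multiplexer: output the bias or the accumulation sum."""
    return bias if select_bias else acc_sum


def second_adder(product_sum: float, select_bias: bool,
                 bias: float, acc_sum: float) -> float:
    """Add the product sum to whichever operand the multiplexer selects."""
    return product_sum + control_mux(select_bias, bias, acc_sum)
```

In this sketch, second_adder(mac_products([1, 2, 3], [4, 5, 6]), True, 0.5, 100.0) adds the bias term, while passing False adds the previously accumulated sum instead.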

3.         As per claims 2 and 14, Culurciello discloses “The acceleration circuit of claim 1” [See rejection to claim 1 above], wherein in response to a first mode control word designating an independent mode, the first control multiplexer provides the bias parameter as the selected output (see paragraphs 0050 and 0062).

4.         As per claims 3 and 15, Culurciello discloses wherein in response to a first mode control word designating a cooperative mode, the first control multiplexer provides the bias parameter as the selected output for a first cycle and provides the first or next accumulation sum as the selected output for a plurality of subsequent cycles (see paragraphs 0045, 0050 and 0062).

5.         As per claims 4 and 16, Culurciello discloses wherein in response to a first mode control word designating a combined independent and cooperative mode, the first control multiplexer provides the bias parameter as the selected output for a first cycle and provides the first or next accumulation sum as the selected output for a first plurality of subsequent cycles; and following the first plurality of subsequent cycles, the first control multiplexer provides the bias parameter as the selected output for a next cycle and provides the first or next accumulation sum as the selected output for a second plurality of subsequent cycles (see paragraphs 0045, 0050 and 0062).
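
For illustration only, the cycle-by-cycle multiplexer selections recited in claims 2 to 4 (and 14 to 16) can be sketched as follows. The mode names, the selects_bias and run helpers, and the group_len parameter (the assumed number of cycles between bias selections in the combined mode) are hypothetical and are not taken from Culurciello or from the application.

```python
from typing import List, Sequence


def selects_bias(mode: str, cycle: int, group_len: int = 1) -> bool:
    """Return True when the multiplexer would select the bias on this cycle."""
    if mode == "independent":
        return True                      # bias on every cycle
    if mode == "cooperative":
        return cycle == 0                # bias on the first cycle only
    if mode == "combined":
        return cycle % group_len == 0    # bias at the start of each group
    raise ValueError(f"unknown mode: {mode}")


def run(mode: str, products: Sequence[float], bias: float,
        group_len: int = 1) -> List[float]:
    """Feed one product per cycle through the second adder and collect the sums."""
    outputs: List[float] = []
    acc = 0.0
    for cycle, product in enumerate(products):
        operand = bias if selects_bias(mode, cycle, group_len) else acc
        acc = product + operand
        outputs.append(acc)
    return outputs
```

With products 1, 2, 3, 4 and a bias of 10, the independent mode adds the bias to every product, the cooperative mode accumulates across all four cycles, and the combined mode with group_len=2 restarts the accumulation from the bias on the third cycle.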

6.         As per claims 5 and 17, Culurciello discloses wherein the second, serial data network provides first kernel data to a first plurality of multiply and accumulate circuits, followed by sequentially providing the first kernel data to a second plurality of multiply and accumulate circuits (see paragraph 0060).
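
For illustration only, the sequential delivery of kernel data recited in claims 5 and 17 can be modeled as a chain of stages through which each kernel word advances by one group of multiply and accumulate circuits per cycle. The one-stage-per-cycle timing and the serial_network helper are assumptions made for this sketch; neither the claims nor the cited paragraph is relied upon for them.

```python
from typing import List, Optional, Sequence


def serial_network(words: Sequence[str],
                   num_groups: int) -> List[List[Optional[str]]]:
    """Return, per cycle, the kernel word seen by each group (None if idle).

    Each word enters at group 0 and is passed on to the next group one
    cycle later (assumed timing).
    """
    stages: List[Optional[str]] = [None] * num_groups
    history: List[List[Optional[str]]] = []
    for cycle in range(len(words) + num_groups - 1):
        incoming = words[cycle] if cycle < len(words) else None
        stages = [incoming] + stages[:-1]   # shift every word one stage onward
        history.append(list(stages))
    return history
```

Under these assumptions the first group receives a given kernel word one cycle before the second group receives the same word.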

7.         As per claims 6 and 19, Culurciello discloses, further comprising: a maps buffer arbiter circuit coupled to the first buffer circuit and to the first data network, the maps buffer arbiter circuit adapted to determine a conflict in accessing the first buffer circuit and in response to the conflict, to implement a priority protocol for access to the first buffer circuit (see fig. 2 and paragraph 0045).
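
For illustration only, conflict detection followed by a priority protocol, as recited in claims 6 and 19, can be sketched as below. The claim language does not specify a particular protocol; the fixed priority rule (lowest requester number wins), the notion of a bank, and the arbitrate helper are assumptions of this sketch.

```python
from typing import Dict


def arbitrate(requests: Dict[int, int]) -> Dict[int, int]:
    """Resolve conflicting buffer accesses with an assumed fixed priority.

    requests maps a requester id to the bank it wants to access.
    The result maps each requested bank to the requester that is granted it.
    """
    grants: Dict[int, int] = {}
    for requester in sorted(requests):       # lower id means higher priority
        bank = requests[requester]
        grants.setdefault(bank, requester)   # a conflicting later request loses
    return grants
```

Here requesters 0 and 2 both targeting bank 0 constitutes a conflict, and requester 0 is granted access under the assumed rule.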

8.         As per claims 7 and 20, Culurciello discloses wherein the maps buffer arbiter circuit is further adapted to receive an address for selected maps data in the first buffer circuit and, in response, to obtain and provide the selected maps data (see fig. 2 and paragraph 0045).

9.         As per claims 8 and 21, Culurciello discloses, further comprising: a tensor decoder circuit coupled to the first data network, the tensor decoder circuit adapted to generate and output a sequence of addresses starting with a base address and incrementing the base address until the output address is equal to the base address plus a tensor length (see fig. 2 and paragraphs 0040 to 0042).
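
For illustration only, the address sequence recited in claims 8 and 21 can be sketched as a generator. The sketch assumes a unit increment and assumes that the final address, equal to the base address plus the tensor length, is itself output; the tensor_addresses name is hypothetical.

```python
from typing import Iterator


def tensor_addresses(base: int, length: int) -> Iterator[int]:
    """Yield addresses from base, incrementing by one, up to base + length."""
    if length < 0:
        raise ValueError("tensor length must be non-negative")
    addr = base
    while True:
        yield addr
        if addr == base + length:   # stop once the end address has been output
            break
        addr += 1
```

For a base address of 100 and a tensor length of 3, this sketch outputs 100, 101, 102 and 103.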

10.         As per claims 9 and 22, Culurciello discloses wherein the tensor decoder circuit further comprises: an operand collector coupled to the first data network, the operand collector adapted to transfer the output addresses to a maps buffer arbiter circuit or to the second buffer circuit, to obtain data corresponding to the output addresses, and to transfer the obtained data to the first plurality of multiply and accumulate circuits (see fig. 2 and paragraphs 0040 to 0042).

11.         As per claims 10 and 24, Culurciello discloses, further comprising: a mode control circuit adapted to provide or generate the first mode control word (see paragraph 0055).

12.         As per claim 11, Culurciello discloses, further comprising: a MAX circuit comprising a plurality of comparators, the plurality of comparators adapted to determine a maximum value of a plurality of second or next accumulation sums (see fig. 2 and paragraphs 0047 to 0048 teaching vMAX 140).
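
For illustration only, a MAX circuit built from a plurality of comparators, as recited in claim 11, can be modeled as a pairwise reduction in which each two-input comparator forwards the larger of its inputs. The tree arrangement and the comparator and max_tree names are assumptions of this sketch and are not attributed to vMAX 140 of Culurciello.

```python
from typing import List, Sequence


def comparator(a: float, b: float) -> float:
    """Two-input comparator that forwards the larger input."""
    return a if a >= b else b


def max_tree(values: Sequence[float]) -> float:
    """Reduce the accumulation sums to their maximum using levels of comparators."""
    if not values:
        raise ValueError("at least one accumulation sum is required")
    level: List[float] = list(values)
    while len(level) > 1:
        nxt = [comparator(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an unpaired value passes to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]
```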

13.         As per claim 12, Culurciello discloses, further comprising: a plurality of second control multiplexers, each second control multiplexer of the plurality of second control multiplexers coupled to a first adder circuit of a multiply and accumulate circuit of the first plurality of multiply and accumulate circuits, each second control multiplexer of the plurality of second control multiplexers adapted to provide a selected output in response to a second mode control word, the selected output comprising a bias parameter or a first accumulation sum (see fig. 2 which shows selection of the second input of adder 242, between M2 or the accumulated sum, performed by multiplexer).

14.         As per claim 13, Culurciello discloses an acceleration circuit (accelerator 100/200 of figs. 1 and 2) comprising: a first buffer circuit (maps cache 132 of fig. 2) storing maps data [see paragraph 0036, which discloses “the maps cache 132 and the weights cache 136 are implemented using one or more static random access memory (SRAM) buffers that are integrated into the compute unit 124”]; a first data network (see bidirectional arrows between maps cache 132 and vMAC 128) coupled to the first buffer circuit (see crossbar in fig. 3 and paragraph 0051); a plurality of second buffer circuits (weights cache 136) storing kernel data, each second buffer circuit of the plurality of second buffer circuits storing different kernel data than another second buffer circuit of the plurality of second buffer circuits [see paragraph 0036, which discloses “the maps cache 132 and the weights cache 136 are implemented using one or more static random access memory (SRAM) buffers that are integrated into the compute unit 124” and paragraphs 0057 and 0060 which disclose different kernel data are stored in the plurality of weights caches]; a plurality of second, serial data networks, each second, serial data network of the plurality of second, serial data networks coupled to a corresponding second buffer circuit of the plurality of second buffer circuits (see fig. 4 and paragraph 0053); and a plurality of vector-vector acceleration circuits arranged in a plurality of arrays, each vector-vector acceleration circuit of the plurality of vector-vector acceleration circuits coupled through the first data network to the first buffer circuit (see vMAC 128D comprising a plurality of MACs 238), and each vector-vector acceleration circuit of the plurality of vector-vector acceleration circuits of a selected array of the plurality of arrays coupled through a corresponding second, serial data network of the plurality of second, serial data networks to a second buffer circuit of the plurality of second buffer circuits, each vector-vector acceleration circuit of the plurality of vector-vector acceleration circuits comprising: a plurality of multiply and accumulate circuits, each multiply and accumulate circuit of the plurality of multiply and accumulate circuits comprising (see fig. 2 and paragraphs 0036 and 0043): a multiplier circuit (multiplier 240) to multiply a maps datum and a kernel datum to generate a multiplicative product (see paragraph 0043, which discloses part of a trace stored in the maps cache 132, hence a map datum; the other input is a weight value from the weight trace cache 136, hence a kernel datum); and a first adder circuit (adder 242) coupled to the multiplier (see fig. 2 and paragraph 0043); a shift register (shift register 252) coupled to the plurality of multiply and accumulate circuits (see fig. 2 and paragraph 0045); a first control multiplexer adapted to provide a selected output in response to a first mode control word, the selected output comprising a bias parameter or a first or next accumulation sum (see fig. 2 and paragraphs 0045 and 0055, which disclose the second input of the gather adder 258 to be a previously accumulated sum or a bias term B, depending on the selected mode; the selection of this second input is implicitly performed by a multiplexer under the control of the selection of ‘cooperative operating mode’ or ‘independent operation mode’ by the control core 104); and a second adder circuit (gather adder 258) coupled to the shift register and to the first control multiplexer, the second adder circuit adapted to add the multiplicative product to the bias parameter or to the first accumulation sum to generate a second or next accumulation sum (see fig. 2 and paragraph 0045).

15.         As per claim 18, Culurciello discloses, wherein a first serial data network of the plurality of second, serial data networks provides first kernel data to a first array of vector-vector acceleration circuits and a second serial data network of the plurality of second, serial data networks provides second kernel data to a second array of vector-vector acceleration circuits, the first kernel data different than the second kernel data (see paragraphs 0057 and 0060 teaching that different kernel data are stored in the plurality of weight caches).

16.         As per claim 23, Culurciello discloses, further comprising: a tensor buffer coupled to the tensor decoder circuit; and a control core coupled to the tensor decoder circuit and to the first data network, the control core adapted to receive and decode a plurality of instructions, and to transfer a tensor instruction to the tensor buffer for execution by the tensor decoder circuit (see fig. 2 and paragraphs 0040 to 0042).

17.         As per claim 33, Culurciello discloses an acceleration circuit (accelerator 100/200 of figs. 1 and 2) comprising: a memory interface circuit (memory interface 102); a first buffer circuit (maps cache 132 of fig. 2) storing maps data [see paragraph 0036, which discloses “the maps cache 132 and the weights cache 136 are implemented using one or more static random access memory (SRAM) buffers that are integrated into the compute unit 124”]; a first data network (see bidirectional arrows between maps cache 132 and vMAC 128) coupled to the first buffer circuit and to the memory interface circuit (see crossbar in fig. 3 and paragraph 0051); a plurality of second buffer circuits (weights cache 136) storing kernel data, each second buffer circuit of the plurality of second buffer circuits storing different kernel data than another second buffer circuit of the plurality of second buffer circuits [see paragraph 0036, which discloses “the maps cache 132 and the weights cache 136 are implemented using one or more static random access memory (SRAM) buffers that are integrated into the compute unit 124” and paragraphs 0057 and 0060 which disclose different kernel data are stored in the plurality of weights caches]; a plurality of second, serial data networks, each second, serial data network of the plurality of second, serial data networks coupled to a corresponding second buffer circuit of the plurality of second buffer circuits (see fig. 4 and paragraph 0053); and at least one matrix-matrix acceleration circuit having a plurality of operating modes, the plurality of operating modes comprising an independent mode, a cooperative mode, and a plurality of combined independent and cooperative modes (see paragraphs 0045, 0050 and 0062), the at least one matrix-matrix acceleration circuit comprising: a plurality of matrix-vector acceleration circuits (see vMAC 128D comprising a plurality of MACs 238), each matrix-vector acceleration circuit of the plurality of matrix-vector acceleration circuits comprising an array of a plurality of vector-vector acceleration circuits, each matrix-vector acceleration circuit of the plurality of matrix-vector acceleration circuits coupled through the first data network to the first buffer circuit, and each matrix-vector acceleration circuit of the plurality of matrix-vector acceleration circuits coupled through a corresponding second, serial data network of the plurality of second, serial data networks to a different second buffer circuit of the plurality of second buffer circuits (see fig. 2 and paragraphs 0036 and 0043), each vector-vector acceleration circuit of the plurality of vector-vector acceleration circuits comprising: a plurality of multiply and accumulate circuits, each multiply and accumulate circuit of the plurality of multiply and accumulate circuits comprising: a multiplier circuit (multiplier 240) to multiply a maps datum and a kernel datum to generate a multiplicative product (see paragraph 0043, which discloses part of a trace stored in the maps cache 132, hence a map datum; the other input is a weight value from the weight trace cache 136, hence a kernel datum); and a first adder circuit (adder 242) coupled to the multiplier (see fig. 2 and paragraph 0043); a shift register (shift register 252) coupled to the plurality of multiply and accumulate circuits (see fig. 2 and paragraph 0045); a control multiplexer adapted to provide a selected output in response to a mode control word corresponding to a selected operating mode of the plurality of operating modes, the selected output comprising a bias parameter or a first or next accumulation sum (see fig. 2 and paragraphs 0045 and 0055, which disclose the second input of the gather adder 258 to be a previously accumulated sum or a bias term B, depending on the selected mode; the selection of this second input is implicitly performed by a multiplexer under the control of the selection of ‘cooperative operating mode’ or ‘independent operation mode’ by the control core 104); and a second adder circuit (gather adder 258) coupled to the shift register and to the first control multiplexer, the second adder circuit adapted to add the multiplicative product to the bias parameter or to the first accumulation sum to generate a second or next accumulation sum and to provide the second or next accumulation sum as an output corresponding to the selected operating mode (see fig. 2 and paragraph 0045); and a MAX circuit (vMAX 140) comprising a plurality of comparators, the plurality of comparators adapted to determine a maximum value of a plurality of second or next accumulation sums (see fig. 2 and paragraphs 0047 to 0048).

CLOSING COMMENTS
CONCLUSION

a. STATUS OF CLAIMS IN THE APPLICATION 

            The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. 707.07(i):

a (1) CLAIMS REJECTED IN THE APPLICATION 

            Per the instant office action, claims 1-24 and 33 have received a first action on the merits and are the subject of a first, non-final action.
b. DIRECTION OF FUTURE CORRESPONDENCES

            Any inquiry concerning this communication or earlier communications from the Examiner should be directed to Ernest Unelus whose telephone number is (571) 272-8596. The examiner can normally be reached on Monday to Friday 9:00 AM to 5:00 PM.


IMPORTANT NOTE

            If attempts to reach the above noted Examiner by telephone are unsuccessful, the Examiner's supervisor, Mr. Idriss Alrobaye, can be reached at the following telephone number: Area Code (571) 270-1023.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Ernest Unelus/
Primary Examiner
Art Unit 2181