DETAILED ACTION
Claims 1-20 are pending.
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 7/13/2021 has been entered. 
The office acknowledges the following papers:
Power of Attorney filed on 6/18/2021
Claims, specification, and remarks filed on 6/11/2021.

	Withdrawn objections and rejections
The drawing objections have been withdrawn due to amendment of the specification.
The rejections of claims 13-14 under 35 U.S.C. 112(a) and 112(b) have been withdrawn due to amendment.

New Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 13-15 and 18-20 are rejected under 35 U.S.C. 102(a)(1 & 2) as being anticipated by Symes et al. (U.S. 2011/0106871).
As per claim 13:
Symes disclosed a method for performing multiply and accumulate ("MAC") operations comprising:
receiving, at an execution unit of a MAC unit, a set of values from a vector scalar register file (Symes: Figures 1-2 elements 110, 140, 217, and 220, paragraphs 54-55 and 63)(The broadest reasonable interpretation of a vector scalar register file is a register file containing vector registers (see paragraph 79). A set of vector register values are received by the SIMD MAC unit.), wherein the execution unit includes a multiplier and an adder (Symes: Figure 2 elements 130, 217, and 219-220, paragraphs 54-55 and 63), and wherein the MAC unit further comprises a respective one-write one-read ("1W/1R") ported register file having at least one accumulator, and wherein the execution unit is independently connected to the respective 1W/1R ported register file (Symes: Figures 1-2 elements 110, 130, and 215-220, paragraphs 54-55 and 63)(The accumulator register bank includes a single read port accessed from the controller and a single write port accessed from the MAC circuit. The accumulator register bank is directly 
calculating, using the multiplier, a product of the received set of values (Symes: Figures 2 and 5 elements 217 and 400-420, paragraphs 63 and 67); 
reading a current content of the accumulator of the MAC unit (Symes: Figure 2 element 130, paragraphs 54 and 63); 
calculating, using the adder, a sum of the read current content of the accumulator and the calculated product of the received set of values (Symes: Figures 2 and 5 elements 219 and 440, paragraphs 63 and 69); and 
writing the calculated sum to the accumulator of the MAC unit (Symes: Figures 2 and 5 elements 130 and 440, paragraphs 63 and 69).
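For illustration only, the four steps recited in claim 13 (multiply, read the accumulator, add, write back) can be sketched as a small software model. This is a minimal sketch, not the circuit of Symes or of the application: the class name `MacUnit`, the method `mac`, and the choice of four accumulators are hypothetical and appear in neither document.

```python
from dataclasses import dataclass, field


@dataclass
class MacUnit:
    """Toy model of a MAC unit with its own accumulator register file.

    The number of accumulators (4) is an assumption made for this sketch.
    """
    accumulators: list = field(default_factory=lambda: [0] * 4)

    def mac(self, acc_index, a, b):
        product = a * b                          # multiplier: product of the received values
        current = self.accumulators[acc_index]   # single read port: read current accumulator content
        total = current + product                # adder: sum of accumulator content and product
        self.accumulators[acc_index] = total     # single write port: write the sum back
        return total


unit = MacUnit()
unit.mac(0, 3, 4)   # accumulator 0 holds 12
unit.mac(0, 2, 5)   # accumulator 0 holds 22
```

Each call performs exactly one read and one write of the accumulator register file, which is the access pattern a one-write one-read ported register file supports.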
As per claim 14:
Claim 14 essentially recites the same limitations of claim 13. Claim 14 additionally recites the following limitations:
one or more computer-readable storage media and program instructions collectively stored on the one or more computer-readable storage media (Symes: Figure 1 element 165, paragraphs 57 and 59).
As per claim 15:
Symes disclosed a multiply and accumulate ("MAC") unit comprising: 
a respective execution unit (Symes: Figures 1-2 elements 110 and 220, paragraphs 54-55 and 63); and 
a respective one-write one-read ("1W/1R") ported register file independently connected to the execution unit, the respective 1W/1R ported register file including at 
As per claim 18:
Symes disclosed the MAC unit of claim 15, wherein the at least one accumulator further comprises:
a plurality of accumulators, wherein the MAC unit is configured to perform a plurality of MAC operations in parallel using a respective accumulator of the plurality of accumulators (Symes: Figures 2 and 5 elements 130, 215-220, and 420-440, paragraph 54)(The SIMD MAC circuit performs a plurality of MAC operations in parallel on each data element lane.).
As per claim 19:
Symes disclosed the MAC unit of claim 15, further comprising: 
at least one multiplier for performing the computation of the product (Symes: Figure 2 elements 215, 217, and 220, paragraph 63); and 
at least one adder for performing an addition of the product (Symes: Figure 2 elements 215, 219, and 220, paragraph 63).
As per claim 20:


New Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Symes et al. (U.S. 2011/0106871).
As per claim 13:
Symes disclosed a method for performing multiply and accumulate ("MAC") operations comprising:
receiving, at an execution unit of a MAC unit, a set of values from a vector scalar register 
calculating, using the multiplier, a product of the received set of values (Symes: Figures 2 and 5 elements 217 and 400-420, paragraphs 63 and 67); 
reading a current content of the accumulator of the MAC unit (Symes: Figure 2 element 130, paragraphs 54 and 63); 
calculating, using the adder, a sum of the read current content of the accumulator and the calculated product of the received set of values (Symes: Figures 2 and 5 
writing the calculated sum to the accumulator of the MAC unit (Symes: Figures 2 and 5 elements 130 and 440, paragraphs 63 and 69).
As per claim 14:
Claim 14 essentially recites the same limitations of claim 13. Claim 14 additionally recites the following limitations:
one or more computer-readable storage media and program instructions collectively stored on the one or more computer-readable storage media (Symes: Figure 1 element 165, paragraphs 57 and 59).

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Symes et al. (U.S. 2011/0106871), in view of Mansell (U.S. 2020/0218538).
As per claim 16:
Symes disclosed the MAC unit of claim 15.
Symes failed to teach an architected accumulator register index, wherein the MAC unit is configured to perform the MAC operation by executing a processor instruction referencing the architected accumulator register index.
However, Mansell combined with Symes disclosed an architected accumulator register index, wherein the MAC unit is configured to perform the MAC operation by executing a processor instruction referencing the architected accumulator register index (Mansell: Figure 7 elements 220 and 232, paragraph 59)(Symes: Figures 1-2 element 130 and 215-220, paragraphs 54 and 63)(Symes disclosed SIMD MAC operations selecting vector and accumulator registers according to the controller. Symes doesn’t 
Symes disclosed a SIMD MAC operation using vector and accumulator registers, but didn’t disclose an encoding format to specify specific registers to be used. One of ordinary skill in the art would have been motivated by this lack of teaching to find the Mansell reference that shows an instruction encoding of a SIMD MAC operation that uses a vector and accumulator register. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the SIMD MAC encoding scheme of Mansell into the processor of Symes for the advantage of selecting specific source registers for executing SIMD MAC operations.
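For illustration only, an instruction encoding that carries an accumulator register index alongside source register indices might look like the sketch below. The 16-bit layout (4-bit opcode, 4-bit accumulator index, two 4-bit source indices) and the names `encode_mac` and `decode_mac` are assumptions made for this example; they are not the encoding of Mansell Figure 7 and are not taken from Symes.

```python
FIELD_BITS = 4                      # assumed width of every field in this sketch
FIELD_MASK = (1 << FIELD_BITS) - 1  # 0xF


def encode_mac(opcode, acc, src_a, src_b):
    """Pack a hypothetical MAC instruction into a 16-bit word."""
    for value in (opcode, acc, src_a, src_b):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError("field does not fit in 4 bits")
    return (opcode << 12) | (acc << 8) | (src_a << 4) | src_b


def decode_mac(word):
    """Unpack the hypothetical 16-bit word into its register index fields."""
    return {
        "opcode": (word >> 12) & FIELD_MASK,
        "acc": (word >> 8) & FIELD_MASK,    # accumulator register index
        "src_a": (word >> 4) & FIELD_MASK,  # first source register index
        "src_b": word & FIELD_MASK,         # second source register index
    }


word = encode_mac(0x9, 2, 5, 7)
fields = decode_mac(word)
```

A decoder that extracts the `acc` field in this way can select a specific accumulator register for the MAC operation, which is the role the rejection assigns to the register index identifiers.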

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Symes et al. (U.S. 2011/0106871), further in view of Hokenek et al. (U.S. 2006/0041610).
As per claim 17:
Symes disclosed the MAC unit of claim 15.
Symes failed to teach wherein the execution unit of the MAC unit is configured to consecutively perform a plurality of MAC operations using a same accumulator for accumulating the product of each MAC operation of the plurality of MAC operations.
However, Hokenek combined with Symes disclosed wherein the execution unit of 
The advantage of issuing consecutive SIMD MAC operations using the same accumulator is that larger data sets can be reduced. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the instruction issuing method of Hokenek into Symes for the advantage of reducing larger data sets.
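For illustration only, the stated advantage (reducing a data set larger than the SIMD width by issuing consecutive MAC operations against the same accumulator) can be sketched as follows. The function name `simd_mac_reduce` and the lane count of four are hypothetical; the sketch models the general technique and is not the instruction issuing method of Hokenek.

```python
def simd_mac_reduce(xs, ys, lanes=4):
    """Dot product computed by consecutive lane-wise MAC operations.

    The same per-lane accumulators are reused by every consecutive MAC
    operation; the lane count is an assumption made for this sketch.
    """
    if len(xs) != len(ys):
        raise ValueError("operand vectors must have the same length")
    acc = [0] * lanes
    for start in range(0, len(xs), lanes):
        chunk_x = xs[start:start + lanes]
        chunk_y = ys[start:start + lanes]
        # One SIMD MAC operation: each lane multiplies and accumulates.
        for lane, (x, y) in enumerate(zip(chunk_x, chunk_y)):
            acc[lane] += x * y
    # Final reduction across lanes yields the scalar result.
    return sum(acc)


result = simd_mac_reduce(list(range(1, 11)), [2] * 10)
```

Here ten element pairs are processed by three consecutive four-lane MAC operations that all target the same accumulator.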

Claims 1 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Symes et al. (U.S. 2011/0106871), in view of Ramchandran et al. (U.S. 2006/0015703), in view of Official Notice.
As per claim 1:
Symes and Ramchandran disclosed a processor unit for multiply and accumulate ("MAC") operations, the processor unit comprising: 
a plurality of MAC units for performing a respective subset of MAC operations of a set of MAC operations (Ramchandran: Figure 4 elements 406-408, paragraphs 104-105)(Symes: Figures 1-2 elements 110, 215, and 220)(Ramchandran disclosed a processor with a scalar and vector MAC unit. The combination allows for the processor of 
another register file, wherein the respective execution unit of each MAC unit is configured to perform the respective subset of MAC operations of the set of MAC operations by computing a product of a set of values received from the another register file and adding the computed product to a content of the at least one accumulator of the MAC unit (Ramchandran: Figure 4 elements 406-408, paragraphs 104-105)(Symes: Figures 1-2 elements 140 and 145, and 215-220)(The vector register bank supplies operands to be multiplied by the SIMD MAC unit. The resulting product is accumulated 
The advantage of performing scalar MAC instructions is that code footprints of programs can be reduced. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement a scalar MAC unit within Symes for the advantages of reduced code sizes.
As per claim 11:
Symes and Ramchandran disclosed the processor unit of claim 1, wherein each MAC unit further comprises at least one multiplier for computing the product and at least one adder for performing the addition of the computed product (Ramchandran: Figure 4 elements 406-408, paragraphs 104-105)(Symes: Figure 2 elements 217 and 219, paragraph 63).

Claims 2-10 are rejected under 35 U.S.C. 103 as being unpatentable over Symes et al. (U.S. 2011/0106871), in view of Ramchandran et al. (U.S. 2006/0015703), in view of Official Notice, further in view of Mansell (U.S. 2020/0218538).
As per claim 2:
Symes and Ramchandran disclosed the processor unit of claim 1.
Symes and Ramchandran failed to teach wherein each MAC unit of the plurality of MAC units includes an associated index, wherein each MAC unit is configured to perform the respective subset of MAC operations by executing a processor instruction referencing the associated index.
However, Mansell combined with Symes and Ramchandran disclosed wherein each MAC unit of the plurality of MAC units includes an associated index, wherein each MAC unit is configured to perform the respective subset of MAC operations by executing a processor instruction referencing the associated index (Mansell: Figure 7 elements 220 and 232, paragraph 59)(Ramchandran: Figure 4 elements 406-408, paragraphs 104-105)(Symes: Figures 1-2 element 130 and 215-220, paragraphs 54 and 63)(Symes disclosed SIMD MAC operations selecting vector and accumulator registers according to the controller. Symes doesn’t explicitly state what mechanism is used for selecting a given register for an instruction. Mansell disclosed an instruction encoding format that includes register index identifiers to select a given register for reading from a register file. The combination implements the instruction encoding format of Mansell with the accumulator register identifier (i.e. associated index) into Symes for encoding and processing SIMD MAC instructions and scalar MAC instructions. Each of the SIMD and scalar MAC units perform a subset of instructions by respectively referencing a vector 
Symes disclosed a SIMD MAC operation using vector and accumulator registers, but didn’t disclose an encoding format to specify specific registers to be used. One of ordinary skill in the art would have been motivated by this lack of teaching to find the Mansell reference that shows an instruction encoding of a SIMD MAC operation that uses a vector and accumulator register. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the SIMD MAC encoding scheme of Mansell into the processor of Symes for the advantage of selecting specific source registers for executing SIMD MAC operations.
As per claim 3:
Symes, Ramchandran, and Mansell disclosed the processor unit of claim 2, wherein the associated index includes an architected accumulator register index of the at least one accumulator of each MAC unit (Mansell: Figure 7 elements 220 and 232, paragraph 59)(Ramchandran: Figure 4 elements 406-408, paragraphs 104-105)(Symes: Figures 1-2 element 130 and 215-220, paragraphs 54 and 63)(The combination implements the instruction encoding format of Mansell with the accumulator register identifier (i.e. architected accumulator register index) into Symes for encoding and processing SIMD MAC instructions and scalar MAC instructions.).
As per claim 4:
Symes, Ramchandran, and Mansell disclosed the processor unit of claim 2, wherein the at least one accumulator includes a respective accumulator element (Symes: Figures 1-2 element 130, paragraph 54), wherein the computed product is added to a content of the respective accumulator element (Symes: Figure 2 elements 
As per claim 5:
Symes, Ramchandran, and Mansell disclosed the processor unit of claim 2, further comprising a dispatch/issue unit, the dispatch/issue unit being configured to process a plurality of processor instructions, select a MAC unit using the associated index, and send a respective set of processor instructions to the selected MAC unit for performing the set of MAC operations (Mansell: Figure 7 elements 220-222 and 232, paragraph 59)(Symes: Figures 1-2 elements 110, 145, 160, and 215-220, paragraphs 57-59)(The controller decodes incoming instructions and issues them to scalar and SIMD units for processing. The combination implements the instruction encoding format of Mansell with the accumulator register identifier (i.e. architected accumulator register index) into Symes for encoding and processing SIMD MAC instructions and scalar MAC instructions. Both the opcode and the accumulation register index identify the functional unit the instruction is to be sent to. Thus, it would have been obvious to one of ordinary skill in the art for the controller to use either field to obtain the same issuing result.).
As per claim 6:
Symes, Ramchandran, and Mansell disclosed the processor unit of claim 5, wherein the respective set of processor instructions further comprises at least one 
As per claim 7:
Symes, Ramchandran, and Mansell disclosed the processor unit of claim 5, wherein the dispatch/issue unit is associated with the MAC unit (Symes: Figure 2 elements 160, 215, and 220, paragraph 63).
As per claim 8:
Symes, Ramchandran, and Mansell disclosed the processor unit of claim 5, wherein the dispatch/issue unit is configured to dispatch the plurality of processor instructions in accordance with a single threaded ("ST") mode such that the selected MAC unit receives the respective set of processor instructions from a single thread (Symes: Figure 2 element 160, paragraphs 57-59)(Symes makes no mention of multithreading. As such, the controller issues instructions to the SIMD MAC and scalar MAC units in a single threaded mode by default.).
As per claim 9:

As per claim 10:
Symes, Ramchandran, and Mansell disclosed the processor unit of claim 5, wherein the dispatch/issue unit is configured to dispatch the plurality of processor instructions in accordance with a four-way simultaneous multithreading ("SMT4") mode such that each MAC unit of the plurality of MAC units receives the respective set of processor instructions from respective two threads (Symes: Figure 2 element 160, paragraphs 57-59)(Symes makes no mention of multithreading. Official notice is given that simultaneous multithreading (SMT) using four threads can be implemented for the advantage of increased performance by filling in latency stalls and pipeline bubbles with ready to execute instructions. Thus, it would have been obvious to one of ordinary skill in the art to implement SMT using four threads in Symes.).

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Symes et al. .
As per claim 12:
Symes and Ramchandran disclosed the processor unit of claim 1.
Symes and Ramchandran failed to teach being configured to perform further sets of MAC operations, wherein all the sets of MAC operations provide all elements of an output matrix, the output matrix being a result of a matrix convolution on an input matrix.
However, Ross combined with Symes and Ramchandran disclosed being configured to perform further sets of MAC operations, wherein all the sets of MAC operations provide all elements of an output matrix, the output matrix being a result of a matrix convolution on an input matrix (Ross: Figures 1-2 elements 140, 202, and 210-214, paragraphs 3, 82, 84-85, and 87-89)(Symes: Figure 2 elements 215-220, paragraph 63)(Ross disclosed convolution operations using large matrix data sets. The combination allows for convolution operation processing using the SIMD MAC unit of Symes.).
The advantage of performing convolution operations is that filtering operations can be performed on large data sets. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement convolution processing in Symes for the advantage of quickly performing filtering operations.
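For illustration only, the relationship between MAC operations and a matrix convolution can be sketched as below, where each output element is produced by one set of MAC operations into a single accumulator. The function name `conv2d_valid`, the "valid" (no padding) output size, and the list-of-lists matrix representation are assumptions made for this example and are not taken from Ross or Symes.

```python
def conv2d_valid(image, kernel):
    """2D convolution without padding, built entirely from MAC steps."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            acc = 0  # one accumulator per output element
            for i in range(kh):
                for j in range(kw):
                    # MAC step: the kernel is flipped in both axes,
                    # which distinguishes convolution from correlation.
                    acc += image[r + i][c + j] * kernel[kh - 1 - i][kw - 1 - j]
            out[r][c] = acc
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, -1]]
output = conv2d_valid(image, kernel)
```

Taken together, the sets of MAC operations for all positions `(r, c)` provide every element of the output matrix.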

Response to Arguments
The arguments presented by Applicant in the response received on 6/11/2021 are not considered persuasive.
Applicant argues for claims 1 and 13-15:


This argument is not found to be persuasive for the following reason. The accumulator register bank is directly connected to the adder of the SIMD MAC unit, which reads upon the independently connected language. The accumulator registers within the accumulator register bank store accumulated values and read upon the claimed accumulator. The combination of the accumulator and SIMD MAC unit reads upon the claimed MAC unit. In addition, moving the accumulator register bank closer and within the SIMD MAC “box” would have been obvious to one of ordinary skill in the art. Shifting the location of this element by itself isn’t a patentably distinct feature.

	Conclusion
The following is text cited from 37 CFR 1.111(c): In amending in reply to a rejection of claims in an application or patent under reexamination, the applicant or patent owner must clearly point out the patentable novelty which he or she thinks the claims present in view of the state of the art disclosed by the references cited or the 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACOB A. PETRANEK whose telephone number is (571)272-5988.  The examiner can normally be reached on M-F 8:00-4:30.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li, can be reached on (571) 272-4169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JACOB PETRANEK/Primary Examiner, Art Unit 2183