DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Maiyuran (patent application publication No. 2019/0324746).
As to claim 1, Maiyuran taught the invention substantially as claimed, including: An apparatus comprising: a plurality of registers (register file 106, general register file 624, registers 1525) (e.g., see figs. 1, 6B); and one or more processing elements (FPUs in graphics execution unit 608) (e.g., see paragraphs 0070, 0084-0085) (e.g., see figs. 5, 6A) communicably coupled to the plurality of registers (e.g., see fig. 1), the one or more processing elements (e.g., see paragraph 0080) comprising: a systolic array circuit (1608) (e.g., see figs. 17A, 17B) to perform cross-channel operations on source data received from a single source register (1701, 1702) (SRC, SRC1, SRC2) of the plurality of registers, wherein the systolic array circuit is modified to: receive inputs from the single source register (1701) at different stages of the systolic array circuit (e.g., see fig. 17A) [data is input from the register 1701 to different rows of the array, where the rows comprise different stages of the systolic array];
Maiyuran taught perform cross-channel operations at channels of the systolic array circuit (e.g., see paragraph 0182) and disabling channels (e.g., see paragraphs 0165, 0071), but did not expressly detail bypass disabled channels of the systolic array circuit, the disabled channels not used to compute the cross-channel operations. However, note (e.g., see paragraphs 0165, 0071) that the predicate mask is used to enable or disable a SIMD execution channel and that, in some embodiments, a disabled channel may bypass execution. As this is part of the system that performs cross-channel operations, one of ordinary skill would have been motivated to apply the bypassing to the cross-channel operations at least to reduce the time spent performing the cross-channel operations by eliminating access to disabled channels. Maiyuran also taught broadcast a final result of a final stage of the systolic array circuit to all channels of a destination register (e.g., see paragraphs 0168-0169).
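For illustration only, the three limitations discussed above (staged input from a single source register, bypass of disabled channels, and broadcast of the final result) can be modeled in software. The following is a minimal sketch under stated assumptions and is not the circuit of Maiyuran or of the instant application: the function name cross_channel_reduce, the list-based registers, and the Boolean predicate mask are hypothetical, and each loop iteration merely stands in for one stage of a systolic array.

```python
from typing import Callable, List, Sequence


def cross_channel_reduce(
    src: Sequence[int],
    mask: Sequence[bool],
    op: Callable[[int, int], int] = max,
) -> List[int]:
    """Software model of a staged cross-channel reduction (hypothetical).

    src  -- elements of a single source register, one per channel
    mask -- predicate mask; False marks a disabled channel
    op   -- the cross-channel operation (for example max or min)
    """
    if len(src) != len(mask):
        raise ValueError("src and mask must have the same number of channels")

    acc = None
    # Each iteration models one stage receiving a different element
    # of the same source register.
    for element, enabled in zip(src, mask):
        if not enabled:
            # Disabled channel: bypassed, so it never enters the computation.
            continue
        acc = element if acc is None else op(acc, element)

    if acc is None:
        raise ValueError("at least one channel must be enabled")

    # Broadcast the final-stage result to every channel of the destination.
    return [acc] * len(src)


if __name__ == "__main__":
    # Channel 1 (value 9) is disabled, so the maximum is taken over 3, 4, 7.
    print(cross_channel_reduce([3, 9, 4, 7], [True, False, True, True]))
```

With the inputs shown, the disabled channel holding 9 is skipped and the value 7 is written to all four destination channels. The sequential loop is a simplification chosen for readability; it says nothing about how the stages are timed or wired in hardware.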

As to claim 2, Maiyuran taught The apparatus of claim 1, wherein the systolic array circuit is modified to broadcast the final result of the final stage further comprises the systolic array circuitry to broadcast the final result from a final row of the systolic array circuit to all elements of a destination register of the plurality of registers (e.g., see paragraphs 0168-0169) [the result value is a final result of the final stage: in the processing of the data in the channels of the array, when the last operation is performed the data is configured to travel to and exit from the final stage, which makes the result value the final result of the final stage (e.g., see fig. 17A)].
As to claim 3, Maiyuran taught The apparatus of claim 1, wherein the one or more processing elements are comprised in a graphics processing unit (GPU) (FPU(s) in graphics execution unit 608) (e.g., see paragraphs 0070, 0084-0085).

As to claim 4, Maiyuran taught The apparatus of claim 1, wherein the systolic array circuit is modified to perform the cross-channel operations by modifying data processing units (DPUs) of the systolic array circuit to perform the cross-channel operations on the source data and modifying routing of the DPUs of the systolic array circuit to receive input from different channels of the single source register at different stages of the systolic array circuit (e.g., see figs. 17A, 17B, and paragraphs 0088, 0157-0162, 0171).

As to claim 5, Maiyuran taught The apparatus of claim 4, wherein the different stages (173A, 173B, 173C, 173D) of the systolic array circuit each receive a different element of the single source register (1701, 1702) on which to perform the cross-channel operations (e.g., see figs. 17A, 17B).

As to claim 6, Maiyuran taught The apparatus of claim 1, wherein a subset of channels of the systolic array circuit perform the cross-channel operations and wherein other channels of the systolic array circuit that are not comprised in the subset of channels are disabled (e.g., see paragraphs 0140, 0157-0159 and 0164-0165).

As to claim 7, Maiyuran taught The apparatus of claim 1, wherein the cross-channel operations comprise at least one of a maximum operation, a minimum operation, or an are equal operation (e.g., see paragraphs 0166-0167).

As to claim 8, Maiyuran taught The apparatus of claim 1, wherein a first channel of a final stage of the systolic array circuit is modified to receive inputs from more than one channel of a previous stage of the systolic array circuit (e.g., see figs. 17A, 17B) (e.g., see paragraphs 0168-0169).

As to claim 9, Maiyuran taught The apparatus of claim 1, wherein the apparatus is a single instruction multiple data (SIMD) machine (e.g., see paragraph 0084).

As to claim 10, Maiyuran taught The apparatus of claim 1, but did not expressly detail wherein the apparatus is a single instruction multiple thread (SIMT) machine. However, Maiyuran taught single instruction multiple data (SIMD) operation and also a combination of simultaneous multithreading (SMT) and fine-grained interleaved multithreading (IMT) (e.g., see paragraph 0081). Therefore, one of ordinary skill would have been motivated to implement the Maiyuran system using single instruction multiple thread (SIMT) operation at least to optimize the system for complex parallel operations on large amounts of data, using the SIMD and multithreading operation of Maiyuran to increase throughput.

As to claim 11, Maiyuran taught A computer-generated method comprising: receiving, at systolic array hardware circuit modified for cross-channel operations, inputs from a single source register at different stages of the systolic array hardware circuit (1701, 1702) (SRC1, SRC2) (e.g., see figs. 17A, 17B and paragraph 0158); performing cross-channel operations at channels of the systolic array hardware circuit; and broadcasting a final result of a final stage of the systolic array hardware circuit to all channels of a destination register (e.g., see figs. 17A, 17B and paragraphs 0140, 0157-0165, and 0168).
Maiyuran taught perform cross-channel operations at channels of the systolic array circuit (e.g., see paragraph 0182) and disabling channels (e.g., see paragraphs 0165, 0071), but did not expressly detail bypass disabled channels of the systolic array circuit, the disabled channels not used to compute the cross-channel operations. However, note (e.g., see paragraphs 0165, 0071) that the predicate mask is used to enable or disable a SIMD execution channel and that, in some embodiments, a disabled channel may bypass execution. As this is part of the system that performs cross-channel operations, one of ordinary skill would have been motivated to apply the bypassing to the cross-channel operations at least to reduce the time spent performing the cross-channel operations by eliminating access to disabled channels. Maiyuran also taught broadcast a final result of a final stage of the systolic array circuit to all channels of a destination register (e.g., see paragraphs 0168-0169).
Due to the similarities between claims 11 and 17, claim 17 is rejected for the same reasons as claim 11 above. As to the limitation of a non-transitory computer-readable medium in claim 17, Maiyuran taught this limitation (e.g., see paragraph 0036).

As to claims 12 and 18, Maiyuran taught The method of claim 11, wherein subsequent stages of the systolic array hardware circuit receive a different element of the single source register (1701, 1702) on which to perform operations (e.g., see figs. 17A, 17B).

As to claims 13 and 19, Maiyuran taught The method of claim 11, wherein other channels of the systolic array hardware circuit that are not comprised in the subset of channels are disabled (e.g., see paragraphs 0140, 0157-0159 and 0164-0165).

As to claim 14, Maiyuran taught The method of claim 11, wherein the systolic array hardware circuit is part of a graphics processing unit (GPU) (e.g., see figs. 4A, 5, 10, 15, 22, and paragraphs 0061, 0082, 0149-0151).

As to claim 15, Maiyuran taught The method of claim 11, wherein a first channel of a final stage of the systolic array circuit is modified to receive inputs from more than one channel of a previous stage of the systolic array circuit (e.g., see figs. 17A, 17B) (e.g., see paragraphs 0168-0169).


As to claims 16 and 20, Maiyuran taught The method of claim 11, wherein the systolic array hardware circuit is modified for the cross-channel operations by modifying data processing units (DPUs) of the systolic array hardware circuit to perform the cross-channel operations on the source data and modifying routing of the DPUs of the systolic array hardware circuit to receive input from different channels of the single source register at different stages of the systolic array hardware circuit (e.g., see figs. 17A, 17B, and paragraphs 0088, 0157-0162, 0171).

As to claim 19, Maiyuran taught The non-transitory computer-readable medium of claim 17, wherein other channels of the systolic array hardware circuit that are not comprised in the subset of channels are disabled (e.g., see paragraphs 0140, 0157-0159 and 0164-0165).

As to claim 20, Maiyuran taught The non-transitory computer-readable medium of claim 17, wherein the systolic array hardware circuit is modified for the cross-channel operations by modifying data processing units (DPUs) of the systolic array hardware circuit to perform the cross-channel operations on the source data and modifying routing of the DPUs of the systolic array hardware circuit to receive input from different channels of the single source register at different stages of the systolic array hardware circuit (e.g., see figs. 17A, 17B, and paragraphs 0088, 0157-0162, 0171).


Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of U.S. Patent No. 11,182,337. Although the claims at issue are not identical, they are not patentably distinct from each other because the side-by-side showing of representative claims of the patent and the instant application shows that both are directed to common subject matter.
Instant Application (SN 17/518,202):

1. An apparatus comprising: a plurality of registers; and one or more processing elements communicably coupled to the plurality of registers, the one or more processing elements comprising: a systolic array circuit to perform cross-channel operations on source data received from a single source register of the plurality of registers, wherein the systolic array circuit is modified to: receive inputs from the single source register at different stages of the systolic array circuit; perform cross-channel operations at channels of the systolic array circuit; bypass disabled channels of the systolic array circuit, the disabled channels not used to compute the cross-channel operations; and broadcast a final result of a final stage of the systolic array circuit to all channels of a destination register.

Patent No. 11,182,337:

1. An apparatus comprising: a plurality of registers; and one or more processing elements communicably coupled to the plurality of registers, the one or more processing elements comprising: a systolic array circuit to perform cross-channel operations on source data received from a single source register of the plurality of registers, the systolic array circuit modified to receive inputs from the single source register and route elements of the single source register to multiple channels in the systolic array circuit, wherein routing circuitry of stages of the systolic array circuit is modified to receive input from different channels of the single source register at different stages of the systolic array circuit.

6. The apparatus of claim 1, wherein a subset of channels of the systolic array circuit perform the cross-channel operations and wherein other channels of the systolic array circuit that are not comprised in the subset of channels are disabled.

2. The apparatus of claim 1, wherein the systolic array circuit is modified to broadcast a result value from a final row of the systolic array circuit to all elements of a destination register of the plurality of registers.

11. A computer-generated method comprising: receiving, at the systolic array hardware circuit modified for cross-channel operations, source data from a single source register; performing the cross-channel operations on the source data at a subset of channels of the systolic array hardware circuit, wherein routing circuitry of stages of the systolic array circuit is modified to receive input from different channels of the single source register at different stages of the systolic array circuit; passing results of the cross-channel operations to subsequent stages of the systolic array hardware circuit; and broadcasting a result of a last stage of the systolic array hardware circuit to each channel of a destination register.


Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Wu (patent application publication No. 2019/0114548) disclosed static block scheduling in massively parallel software defined hardware systems (e.g., see abstract).
Peterson (patent No. 5,168,499) disclosed fault detection and bypass in a sequence information signal processor (e.g., see abstract).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC COLEMAN whose telephone number is (571)272-4163. The examiner can normally be reached M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta can be reached on 0-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

ERIC . COLEMAN
Primary Examiner
Art Unit 2183



EC
/ERIC COLEMAN/Primary Examiner, Art Unit 2183