DETAILED ACTION
Claims 1-20 are pending.
The Office acknowledges the following papers:
Oath filed on 9/24/2020 and 8/21/2020.

Priority
No claim for priority has been made in this application.

Drawings
The Examiner contends that the drawings submitted on 8/13/2020 are acceptable for examination proceedings. 

Specification
The disclosure is objected to because of the following informalities:
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. The Applicant’s cooperation is requested in correcting any errors of which the Applicant may become aware.
Appropriate correction is required.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-2, 5, 9-10, 13, and 17-18 are rejected under 35 U.S.C. 102(a)(1) and 102(a)(2) as being anticipated by Reinhardt et al. (U.S. 2016/0352598).
As per claim 1:
Reinhardt disclosed a processing unit comprising: 
a first thread configured to, in response to determining that a next network operation for the first thread can be combined with a next network operation for a second thread: 
generate a network command packet that implements the next network operation for the first thread and the next network operation for the second thread (Reinhardt: Figures 4 and 8 elements 254 and 704-710, paragraphs 38, 51-52, 74-75, and 92)(A packed network message is generated when multiple network messages include multiple threads performing a redundant operation on same data in a same destination node.), and 
cause the network command packet to be issued (Reinhardt: Figure 6 element 512, paragraph 85).
As per claim 2:
Reinhardt disclosed the processing unit of Claim 1, wherein the first thread is further configured to:
receive, from the second thread, data that specifies one or more attributes of the next network operation for the second thread (Reinhardt: Figures 4-6 elements 252, 282, 406, and 508, paragraphs 77-78, 81, 84-85)(The monitoring logic receives data about stored network messages to determine if messages can be combined, compressed, or aggregated. The information within the messages includes at least the operation type, destination node, memory addresses, and TIDs (i.e. attributes).), and 
compare the data that specifies one or more attributes of the next network operation for the second thread to attributes of the next network operation for the first thread to determine whether the next operation for the first thread can be combined with the next network operation for the second thread (Reinhardt: Figure 4 element 254, paragraphs 51-53)(Detecting network messages that can be combined, compressed, or aggregated involves comparing at least the operation type, destination node, memory addresses, and TIDs (i.e. attributes) between multiple network messages. For example, a combining network messages operation requires separate network messages having matching destination nodes and overlapping data.).
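For illustration only, the attribute comparison mapped to claim 2 above can be sketched as follows. This sketch is not taken from Reinhardt or from the application. The names NetOp and can_combine are hypothetical, and the matching rule (same operation type, same destination node, overlapping address ranges) is an assumption drawn from the example given in the mapping above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetOp:
    """Hypothetical attributes of one thread's next network operation."""
    tid: int        # thread identifier (TID)
    op: str         # operation type, e.g. "put" or "get"
    dest_node: int  # destination node
    addr: int       # starting memory address
    size: int       # number of bytes accessed


def can_combine(a: NetOp, b: NetOp) -> bool:
    """Assumed rule: same operation type and destination node, overlapping byte ranges."""
    same_kind = a.op == b.op and a.dest_node == b.dest_node
    overlap = a.addr < b.addr + b.size and b.addr < a.addr + a.size
    return same_kind and overlap


# The first thread compares its own attributes with those received from a second thread.
t0 = NetOp(tid=0, op="put", dest_node=3, addr=0x100, size=8)
t1 = NetOp(tid=1, op="put", dest_node=3, addr=0x104, size=8)
t2 = NetOp(tid=2, op="put", dest_node=4, addr=0x104, size=8)
```

Under these assumptions, t0 and t1 qualify for combination, while t0 and t2 do not because their destination nodes differ.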
As per claim 5:
Reinhardt disclosed the processing unit of Claim 1, wherein: 
the processing unit further comprises a plurality of bits, wherein the value of each bit, from the plurality of bits, specifies whether a next network operation of a corresponding thread is combinable with the next network operation of an adjacent 
the first thread is further configured to examine the plurality of bits to identify the second thread (Reinhardt: Figure 6 element 508, paragraphs 74 and 84)(The inserted flag bits are examined to determine if network messages for threads qualify for merging operations into packed network messages.).
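For illustration only, one way a thread could examine per-thread flag bits to identify combinable neighbors is sketched below. The function name combinable_run and the bit layout are hypothetical: the sketch assumes that bits[i] being set means the next network operation of thread i can be combined with that of thread i+1, so a group of n threads has n-1 bits. Neither Reinhardt nor the application is asserted to use this layout.

```python
def combinable_run(bits, first):
    """Return the thread indices, starting at `first`, linked by consecutive set bits.

    Assumption: bits[i] == 1 means thread i's next operation can be combined
    with thread i+1's, so len(bits) is one less than the number of threads.
    """
    run = [first]
    i = first
    # Walk toward higher thread indices until a cleared bit ends the run.
    while i < len(bits) and bits[i]:
        i += 1
        run.append(i)
    return run


# Five threads (0..4): threads 0-2 form one combinable run, threads 3-4 another.
flags = [1, 1, 0, 1]
```

With these flags, thread 0 would identify threads 1 and 2 as combinable with it, and the cleared bit at index 2 marks where the run ends.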
As per claim 9:
Claim 9 essentially recites the same limitations of claim 1. Claim 9 additionally recites the following limitations:
a plurality of threads executing on a graphical processing unit (Reinhardt: Figure 4 elements 210 and 220a-c, paragraphs 6, 38, 51, 59-61, 80, and 99).
As per claim 10:
The additional limitation(s) of claim 10 basically recite the additional limitation(s) of claim 2. Therefore, claim 10 is rejected for the same reason(s) as claim 2.
As per claim 13:
The additional limitation(s) of claim 13 basically recite the additional limitation(s) of claim 5. Therefore, claim 13 is rejected for the same reason(s) as claim 5.
As per claim 17:
Claim 17 essentially recites the same limitations of claim 1. Therefore, claim 17 is rejected for the same reasons as claim 1.
As per claim 18:
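The additional limitation(s) of claim 18 basically recite the additional limitation(s) of claim 2. Therefore, claim 18 is rejected for the same reason(s) as claim 2.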

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 3-4, 6-8, 11-12, 14-16, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Reinhardt et al. (U.S. 2016/0352598), in view of Danilak (U.S. 7,746,350).
As per claim 3:
Reinhardt disclosed the processing unit of Claim 1.
Reinhardt failed to teach wherein the first thread is further configured to: determine whether the next network operation of the first thread is combinable with a next network operation of a third thread, and provide, to the second thread, data that specifies whether the next network operation of the first thread is combinable with the next network operation of the third thread.
However, Danilak combined with Reinhardt disclosed wherein the first thread is further configured to:
determine whether the next network operation of the first thread is combinable with a next network operation of a third thread (Danilak: Column 3 lines 45-65 and column 4 lines 45-65)(Reinhardt: Figures 4 and 8 elements 254 and 704-710, 
provide, to the second thread, data that specifies whether the next network operation of the first thread is combinable with the next network operation of the third thread (Reinhardt: Figures 4-6 elements 252, 282, 406, and 508, paragraphs 77-78, 81, 84-85)(The monitoring logic receives data about stored network messages to determine if messages can be combined, compressed, or aggregated. The information within the messages includes at least the operation type, destination node, memory addresses, and TIDs.).
The advantage of coalescing thread group operations into a single operation is that the single operation is more effectively performed. Thus, it would have been obvious to one of ordinary skill in the art to implement coalescing three or more network messages from three or more threads for the advantages of improved operations and reducing network traffic.
As per claim 4:
Reinhardt disclosed the processing unit of Claim 1.
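Reinhardt failed to teach wherein the first thread is further configured to determine that the next network operation for the first thread can be combined with the next network operation for the second thread in response to the next network operation for the first thread and the next network operation for the second thread specifying at least the same operation, for contiguous memory addresses, and for the same size data on the same processing element.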
However, Danilak combined with Reinhardt disclosed wherein the first thread is further configured to determine that the next network operation for the first thread can be combined with the next network operation for the second thread in response to the next network operation for the first thread and the next network operation for the second thread specifying at least the same operation, for contiguous memory addresses, and for the same size data on the same processing element (Danilak: Column 3 lines 45-65 and column 4 lines 45-65)(Reinhardt: Figures 4 and 8 elements 254 and 704-710, paragraphs 38, 51-52, 74-75, and 92)(Reinhardt disclosed generating a packed network message when multiple network messages include multiple threads performing a redundant operation on same/overlapping data in a same destination node. Danilak disclosed coalescing up to 16 threads in a thread group performing reads of contiguous memory addresses. The combination allows for generating packed network messages that perform a redundant operation on contiguous memory addresses of data in a same destination node.).
The advantage of coalescing thread group operations into a single operation is that the single operation is more effectively performed. Thus, it would have been obvious to one of ordinary skill in the art to implement coalescing network messages from multiple threads for the advantages of improved operations and reducing network traffic.
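For illustration only, the combinability test recited in claim 4 (same operation, contiguous memory addresses, same size data, same processing element) can be sketched as follows. The names RemoteOp, coalescible, and coalesce are hypothetical and do not appear in Reinhardt, Danilak, or the application. The merge rule, which keeps the first address and multiplies the per-thread size by the number of merged operations, is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteOp:
    """Hypothetical description of one thread's next network operation."""
    op: str    # operation type, e.g. "get"
    pe: int    # target processing element
    addr: int  # starting memory address
    size: int  # bytes accessed


def coalescible(a: RemoteOp, b: RemoteOp) -> bool:
    """True when b performs the same operation, with the same size, on the same
    processing element, at the address immediately following a's data."""
    return (a.op == b.op and a.pe == b.pe and a.size == b.size
            and b.addr == a.addr + a.size)


def coalesce(ops):
    """Merge a run of pairwise-coalescible operations into one, or return None."""
    if not ops:
        return None
    for a, b in zip(ops, ops[1:]):
        if not coalescible(a, b):
            return None
    first = ops[0]
    # The merged size is a multiple of the per-thread size.
    return RemoteOp(first.op, first.pe, first.addr, first.size * len(ops))


# Sixteen threads each reading 4 contiguous bytes from processing element 2.
group = [RemoteOp("get", 2, 0x1000 + 4 * i, 4) for i in range(16)]
merged = coalesce(group)
```

Under these assumptions, the sixteen 4-byte reads merge into a single 64-byte read starting at 0x1000, whereas operations separated by an address gap are not merged.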
As per claim 6:
Reinhardt disclosed the processing unit of Claim 5.
Reinhardt failed to teach wherein the second thread is an adjacent thread to the first thread in the plurality of bits.
However, Danilak combined with Reinhardt disclosed wherein the second thread is an adjacent thread to the first thread in the plurality of bits (Danilak: Figure 2 element 200, column 3 lines 45-65 and column 4 lines 45-65)(Reinhardt: Figure 4 elements 220a-c and 222, paragraphs 59-61, 64, and 74)(Danilak disclosed a GPU executing thread groups. The combination allows for the lanes of the compute units of the GPU parallel processor of Reinhardt to execute thread groups. Adjacent threads executing on the lanes are configured to produce network messages.).
The advantage of coalescing thread group operations into a single operation is that the single operation is more effectively performed. Thus, it would have been obvious to one of ordinary skill in the art to implement coalescing network messages from multiple threads for the advantages of improved operations and reducing network traffic.
As per claim 7:
Reinhardt and Danilak disclosed the processing unit of Claim 6, wherein one or more adjacent neighbor threads to the first thread are between the first thread and another thread that is not combinable with other threads in the plurality of bits (Danilak: Figure 2 element 200, column 3 lines 45-65 and column 4 lines 45-65)(Reinhardt: Figure 4 elements 220a-c and 222, paragraphs 59-61, 64, 74, 78, and 80-81)(Danilak disclosed a GPU executing thread groups. The combination allows for the lanes of the compute units of the GPU parallel processor of Reinhardt to execute thread groups. Adjacent threads 
As per claim 8:
Reinhardt disclosed the processing unit of Claim 1.
Reinhardt failed to teach wherein the network command packet includes a size parameter value that is a multiple of a size parameter value from the next network operation of the first thread.
However, Danilak combined with Reinhardt disclosed wherein the network command packet includes a size parameter value that is a multiple of a size parameter value from the next network operation of the first thread (Danilak: Column 3 lines 45-65 and column 4 lines 45-65)(Reinhardt: Figures 4 and 8 elements 254 and 704-710, paragraphs 38, 51-52, 74-75, and 92)(Reinhardt disclosed generating a packed network message when multiple network messages include multiple threads performing a redundant operation on same/overlapping data in a same destination node. Danilak disclosed coalescing up to 16 threads in a thread group performing reads of contiguous memory addresses. The combination allows for generating packed network messages that perform a redundant operation on contiguous memory addresses of data in a same destination node. The combination implements a size field in order for the packed network message to properly indicate the size of data to be accessed for the contiguous memory address.).
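The advantage of coalescing thread group operations into a single operation is that the single operation is more effectively performed. Thus, it would have been obvious to one of ordinary skill in the art to implement coalescing network messages from multiple threads for the advantages of improved operations and reducing network traffic.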
As per claim 11:
The additional limitation(s) of claim 11 basically recite the additional limitation(s) of claim 3. Therefore, claim 11 is rejected for the same reason(s) as claim 3.
As per claim 12:
The additional limitation(s) of claim 12 basically recite the additional limitation(s) of claim 4. Therefore, claim 12 is rejected for the same reason(s) as claim 4.
As per claim 14:
The additional limitation(s) of claim 14 basically recite the additional limitation(s) of claim 6. Therefore, claim 14 is rejected for the same reason(s) as claim 6.
As per claim 15:
The additional limitation(s) of claim 15 basically recite the additional limitation(s) of claim 7. Therefore, claim 15 is rejected for the same reason(s) as claim 7.
As per claim 16:
The additional limitation(s) of claim 16 basically recite the additional limitation(s) of claim 8. Therefore, claim 16 is rejected for the same reason(s) as claim 8.
As per claim 19:
The additional limitation(s) of claim 19 basically recite the additional limitation(s) of claim 3. Therefore, claim 19 is rejected for the same reason(s) as claim 3.
As per claim 20:
The additional limitation(s) of claim 20 basically recite the additional limitation(s) of claim 4. Therefore, claim 20 is rejected for the same reason(s) as claim 4.

Conclusion
The following is text cited from 37 CFR 1.111(c): In amending in reply to a rejection of claims in an application or patent under reexamination, the applicant or patent owner must clearly point out the patentable novelty which he or she thinks the claims present in view of the state of the art disclosed by the references cited or the objections made. The applicant or patent owner must also show how the amendments avoid such references or objections.
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.  
Steinmacher-Burow (U.S. 2017/0064051) taught coalescing network messages using a NIC.
Chakrabarti et al. (U.S. 2019/0042478) taught write combining for remote writes.
Thakker et al. (U.S. 2020/0004550) taught combining load and store operations.
Makineni et al. (U.S. 8,718,096) taught packet coalescing.
Nyland et al. (U.S. 8,392,669) taught coalescing memory accesses of parallel threads.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACOB A. PETRANEK whose telephone number is (571)272-5988.  The examiner can normally be reached on M-F 8:00-4:30.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li, can be reached on (571) 272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JACOB PETRANEK/Primary Examiner, Art Unit 2183