Remarks
This office action is in response to the amendment filed on 1/31/2022.
Claims 14 and 17 have been cancelled.
Claim 21 has been added.
Claims 9-13, 15-16, and 18-20 have been amended.
Objection to claim 9 is withdrawn in view of Applicant’s amendment.
The 35 U.S.C. 101 rejection of claims 15-20 is withdrawn in view of Applicant’s amendment.
As indicated in the previous office action mailed on 11/02/2021, claim 1 is interpreted under 35 U.S.C. 112(f).
Claims 1-13, 15-16 and 18-21 are allowed upon entry of the Examiner’s Amendment set forth below.
Allowed claims 1-13, 15-16 and 18-21 are renumbered as 1-19.

Information Disclosure Statement
The information disclosure statements filed 12/20/2021 and 3/25/2022 have been placed in the application file and the information referred to therein has been considered.

Examiner’s Amendment
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given after an interview with Mr. Charles A. Mirho (Reg. No. 41,199) on 4/22/2022 to obviate potential 35 U.S.C. 112 issues, and to put the application in condition for allowance.
The application has been amended as follows: 

IN THE CLAIMS
Please amend claims 10, 15-16 and 21, as listed below:
10. (currently amended) A method for promoting thread convergence comprising: 
defining and inserting a convergence point of an execution barrier in a common section of code executed by a branch of the code; 
predicting that a plurality of threads will arrive at the convergence point to execute the common section of the code; 
canceling the predicted arrival of a first one or more of the threads at the convergence point in [[a]] the branch of the code that will not execute to the convergence point; and 
delaying execution of a second one or more of the threads at the convergence point until a number of the threads that have not canceled their predicted arrival at the convergence point arrive at the convergence point or cancel their arrival at the convergence point.

15. (currently amended) An apparatus comprising:  
one or more processors; 
a memory comprising instructions that when executed by the one or more processors result in: 
analyzing program code and determining a common code segment to execute in parallel as a plurality of threads, and a divergent code segment; 
identifying synchronization point of the plurality of threads; 
inserting, into the common code segment, an instruction to set a convergence barrier based on the synchronization point;
inserting, into [[a]] the common code segment, an instruction to join [[a]] the convergence barrier; 
inserting, into [[a]] the divergent code segment of the threads, an instruction to cancel the join to the convergence barrier, the instruction to cancel the join to the convergence barrier being independent from other instruction to join a different convergence barrier; and 
executing the program code with the inserted instructions to promote thread convergence by cooperating with a thread scheduler. 

16. (currently amended) The apparatus of claim 15, the convergence barrier located in a function called from a branch of the common code segment.


21. (currently amended) The apparatus of claim 15, further comprising: [[a]] the thread scheduler configured to delay execution of those threads within the threads that reach the convergence barrier until a number of remaining threads within the threads that have not canceled their join to the convergence barrier arrive at the convergence barrier or cancel their arrival at the convergence barrier.

Allowable Subject Matter
The following is an examiner’s statement of reasons for the identified allowable subject matter:
Claims 1-9 were allowed in the previous office action mailed on 10/04/2021. As indicated in the previous office action (p.19-25), and based on the further search performed for the claimed invention and consideration of the Applicant’s IDS, the closest prior art of record does not teach or suggest, either alone or in combination, the claimed limitations. 
Therefore, the recited system/components, together with the other limitations recited therewith in their entirety in claim 1, present subject matter that is novel and non-obvious over the prior art. 
Consequently, claim 1 is allowed. Claims 2-9 are also allowed due to their dependency on allowable independent claim 1.

For claims 10-13, the closest prior art, Sahu (Sahu et al., US2007/0143755A1), discloses predicting that a plurality of threads will arrive at an execution convergence point, and executing a second one or more of the threads at the convergence point until a number of the threads that have cancelled their predicted arrival at the convergence point arrive at the convergence point.
However, Sahu does not explicitly disclose the method for promoting thread convergence comprising: defining and inserting a convergence point of an execution barrier in a common section of code executed by a branch of the code; predicting that a plurality of threads will arrive at the convergence point to execute the common section of the code; canceling the predicted arrival of a first one or more of the threads at the convergence point in the branch of the code that will not execute to the convergence point; and delaying execution of a second one or more of the threads at the convergence point until a number of the threads that have not canceled their predicted arrival at the convergence point arrive at the convergence point or cancel their arrival at the convergence point.
Mazumdar (Subhra Mazumdar, US 2017/0315806A1) discloses delaying threads that reach the convergence point by inserting and executing a wait instruction.
However, Sahu as modified by Mazumdar does not explicitly disclose defining and inserting a convergence point of an execution barrier in a common section of code executed by a branch of the code; predicting that a plurality of threads will arrive at the convergence point to execute the common section of the code; canceling the predicted arrival of a first one or more of the threads at the convergence point in the branch of the code that will not execute to the convergence point.
Therefore, the recited method for promoting thread convergence comprising: “defining and inserting a convergence point of an execution barrier in a common section of code executed by a branch of the code; predicting that a plurality of threads will arrive at the convergence point to execute the common section of the code; canceling the predicted arrival of a first one or more of the threads at the convergence point in the branch of the code that will not execute to the convergence point; and delaying execution of a second one or more of the threads at the convergence point until a number of the threads that have not canceled their predicted arrival at the convergence point arrive at the convergence point or cancel their arrival at the convergence point” in claim 10, together with the other limitations recited therewith in their entirety, presents subject matter that is novel and non-obvious over the prior art. 
Consequently, claim 10 is allowed. Claims 11-13 are also allowed due to their dependency on allowable independent claim 10.
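For illustration only, the barrier behavior recited in claim 10 can be sketched as a small single-process model. This is a minimal sketch, not the Applicant's implementation: the class name ConvergenceBarrier and the methods join, cancel and arrive are hypothetical, and real hardware threads are replaced by integer thread ids.

```python
class ConvergenceBarrier:
    """Toy model of a convergence barrier (all names are hypothetical)."""

    def __init__(self):
        self.expected = set()  # threads predicted to arrive at the convergence point
        self.waiting = set()   # threads currently delayed at the convergence point

    def join(self, tid):
        # Record the prediction that thread `tid` will arrive.
        self.expected.add(tid)

    def cancel(self, tid):
        # The thread took a branch that will not reach the convergence point.
        self.expected.discard(tid)
        self.waiting.discard(tid)
        return self._release_if_ready()

    def arrive(self, tid):
        # The thread reached the convergence point and is delayed there.
        if tid not in self.expected:
            raise ValueError(f"thread {tid} was never joined or already canceled")
        self.waiting.add(tid)
        return self._release_if_ready()

    def _release_if_ready(self):
        # Release only when every thread that has not canceled has arrived.
        if self.expected and self.waiting == self.expected:
            released = sorted(self.waiting)
            self.expected.clear()
            self.waiting.clear()
            return released
        return []


barrier = ConvergenceBarrier()
for tid in range(4):
    barrier.join(tid)
barrier.arrive(0)             # thread 0 is delayed
barrier.cancel(1)             # thread 1 will never arrive
barrier.arrive(2)             # thread 2 is delayed
released = barrier.arrive(3)  # last expected thread arrives, all are released
```

In this model, the delayed threads resume together either when the last expected thread arrives or when the last outstanding thread cancels, which mirrors the "arrive ... or cancel" condition in the final limitation of claim 10.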

For claims 15-16 and 18-21, the closest prior art, Jiao (Yang Jiao, US2012/0096474A1), discloses an apparatus comprising: one or more processors; a memory comprising instructions that when executed by the one or more processors result in: inserting, into a code segment, at least one instruction, inserting, into at least one first branch of the code segment, an instruction to cancel the join, and inserting, into the code segment, an instruction to wait.
Jiao does not explicitly disclose a convergence barrier. However, Houston (Houston et al., US9,424,099B2) discloses creating synchronization points and a convergence barrier. 
The combination of Jiao and Houston does not explicitly disclose analyzing program code and determining a common code segment to execute in parallel as a plurality of threads, and a divergent code segment; identifying synchronization point of the plurality of threads; inserting, into the common code segment, an instruction to set a convergence barrier based on the synchronization point; and executing the program code with the inserted instructions to promote thread convergence by cooperating with a thread scheduler.
Therefore, the recited apparatus with instructions that when executed by processors result in: “analyzing program code and determining a common code segment to execute in parallel as a plurality of threads, and a divergent code segment; identifying synchronization point of the plurality of threads; inserting, into the common code segment, an instruction to set a convergence barrier based on the synchronization point; and executing the program code with the inserted instructions to promote thread convergence by cooperating with a thread scheduler” in claim 15, together with the other limitations recited therewith in their entirety, presents subject matter that is novel and non-obvious over the prior art. 
Consequently, claim 15 is allowed. Claims 16 and 18-21 are also allowed due to their dependency on allowable independent claim 15.
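For illustration only, the instruction-insertion steps recited in claim 15 can be sketched as a function over two lists of pseudo-instructions. This is a minimal sketch under stated assumptions: the function name, the mnemonics SET_BARRIER, JOIN_BARRIER and CANCEL_JOIN, and the choice to place the inserted instructions at the start of each segment are all hypothetical and are not taken from the application.

```python
def insert_barrier_instructions(common, divergent, sync_label, barrier="B0"):
    """Return copies of both segments with hypothetical barrier pseudo-instructions.

    common     -- instructions of the common code segment
    divergent  -- instructions of the divergent code segment
    sync_label -- label of the identified synchronization point (assumed to be known)
    """
    new_common = [
        # Set the convergence barrier based on the synchronization point.
        f"SET_BARRIER {barrier}, {sync_label}",
        # Threads entering the common segment join the barrier.
        f"JOIN_BARRIER {barrier}",
        *common,
    ]
    new_divergent = [
        # Threads on the divergent path cancel their join to this barrier only.
        f"CANCEL_JOIN {barrier}",
        *divergent,
    ]
    return new_common, new_divergent


common_out, divergent_out = insert_barrier_instructions(
    ["LOAD r1", "sync: ADD r2"], ["EXIT"], "sync"
)
```

The cancel instruction names a single barrier, which loosely reflects the limitation that canceling the join is independent of any instruction to join a different convergence barrier.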

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Brunie et al., “Simultaneous Branch and Warp Interweaving for Sustained GPU Performance”, discloses a method for simultaneous branch and warp interweaving of multiple threads to improve performance.
Diamos et al., “SIMD Re-Convergence At Thread Frontiers”, discloses a method for re-converging threads at thread frontiers under branch divergence, wherein the thread frontiers include all threads that have branched away from the current warp.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZHENG WEI whose telephone number is (571)270-1059 and Fax number is (571) 270-2059.  The examiner can normally be reached on M-F 9:00AM-5:00PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hyung S. Sough, can be reached on 571-272-6799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to the TC 2100 Group receptionist whose telephone number is 571-272-1000.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 
 
/Z. W./
Examiner, Art Unit 2192

/S. SOUGH/
SPE, AU 2192