DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 2-21 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-5, 7-14, 16-17, and 19-23 of U.S. Patent No. 10909741. Although the claims at issue are not identical, they are not patentably distinct from each other. The following table illustrates the claim correspondence between the instant application and U.S. Patent No. 10909741.
Instant Application, claim:    2   3   4   5   6   7   8   9   10
Patent No. 10909741, claim:    1   2   3   4   5   7   8   9   10

Instant Application, claim:    11  12  13  14  15  16  17  18  19
Patent No. 10909741, claim:    11  12  13  14  16  17  19  20  21

Instant Application, claim:    20  21
Patent No. 10909741, claim:    22  23


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-21 are rejected under 35 U.S.C. 103 as being unpatentable over Clothset et al. (US 2012/0139926 A1, hereinafter Clothset) in view of Afra et al. (US 2017/0178398 A1, hereinafter Afra).
Regarding claim 2, Clothset teaches:
An apparatus comprising: 
execution circuitry to execute shaders; ([0038], “In a more particular example, a system for multiprocessing comprises a plurality of processing units, each comprising a Single Instruction Multiple Data (SIMD) execution unit.” [0109], “Processing resources 255 can execute shader code in execution cores 263a-263n, and in this particular example, shader instances 270a and 270b are depicted, which are differences instances of the same shader. Shader instance 270n is also depicted as an instance of shader code for a different shader. The above is a specific example of grouping for program code instances relating to shading of intersections between rays and scene geometry, during ray tracing. However, the example applies in general SIMD processing of program code instances.”) and 
ray tracing circuitry (FIG. 5 and 6) to execute a ray traversal thread, ([0114], “FIG. 6 depicts aspects of a method that includes ray traversal,”) the ray tracing circuitry is to: 
responsive to the ray traversal thread, traverse a single ray through an acceleration data structure comprising a plurality of hierarchically arranged nodes and intersect the single ray with a primitive contained within at least one of the plurality of hierarchically arranged nodes; ([0114], “FIG. 6 depicts that rays can be traversed and 
defer and aggregate multiple shader invocations resulting from the ray traversal thread traversing the single ray until a particular triggering event is detected, (FIG. 5 and 6, [0118]-[0120], “In either case, sorting (307) of intersections (or possible intersections) into object-associated buffers can be made based on the intersected information then-available (actual and/or possible intersections), and can be implemented by a sorter. Buffers 308, 310, and 312 are depicted as example buffers for receiving intersection information sorted by object; such buffers can be implemented as FIFOs, ring buffers, linked lists, and so on. Other implementations can sort rays into buffers based on association with a particular code segment, such as a shader. In some implementations, sorting 307 of rays into buffers associated with a particular shader or a particular object can be implemented using ray tracing deferral aspects described above. In some cases, primitives can each be given a unique number, some portion of which identifies a scene object to which the primitive belongs, and the sorting of the rays into various of the buffers can be based on a primitive identifier associated with the ray, or the scene object-identifying portion thereof. A buffer selection 318 can control from which buffer ray intersection information is obtained for conducting shading operations. Buffer selection 318 can operate by selecting a fuller or fullest buffer from among buffers 308, 310, and 312. In some cases, buffer selection 318 can select any buffer having more than a minimum 
 wherein the aggregated multiple shader invocations resulting from the ray traversal thread traversing the single ray cause a single shader …to be dispatched (FIG. 6, [0119] “A buffer selection 318 can control from which buffer ray intersection information is obtained for conducting shading operations. Buffer selection 318 can operate by selecting a fuller or fullest buffer from among buffers 308, 310, and 312. In some cases, buffer selection 318 can select any buffer having more than a minimum number of rays collected therein (collecting rays preferably refers to collecting identifiers for the rays, but also can include collecting definition data for the rays in the buffers). In some examples, a ray result lookup function can be provided where buffers 308-312 store ray identifiers, but less than all data that would be used to identify a particular intersection, such as a primitive identifier. A mux 316 can be controlled by buffer selector 318, so that a selected buffer from buffers 308-312 can be outputted. Ray definition data 311 can be used as a source of ray definition information, where  
However, Clothset does not explicitly teach:
cause a single shader batch to be dispatched to the execution circuitry.
On the other hand, Afra teaches:
cause a single shader batch to be dispatched to the execution circuitry. ([0125], “In one embodiment, the shaders 1350 are then dispatched with one unique material ID per SIMD batch, to minimize divergence and external bandwidth. Consequently, the shaders may operate concurrently on multiple different rays with the same material properties, thereby reducing computations and memory access operations, which results in higher performance. In one embodiment, the path state associated with the rays is loaded into SIMD registers using gather instructions (e.g., such as those available in AVX2).”)
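For illustration only, the per-material batching described in the quoted passage of Afra can be sketched as follows. This is a minimal sketch and is not taken from Afra: the function name `batches_by_material`, the `(ray_id, material_id)` tuple layout, and the `simd_width` parameter are assumptions introduced here.

```python
from itertools import groupby


def batches_by_material(hits, simd_width):
    """Group ray hits so that each batch carries exactly one material ID.

    hits: iterable of (ray_id, material_id) pairs (hypothetical layout).
    simd_width: maximum number of rays per batch (hypothetical parameter).
    Returns a list of (material_id, [ray_ids]) batches.
    """
    # sorted() is stable, so rays keep their arrival order within a material.
    ordered = sorted(hits, key=lambda hit: hit[1])
    batches = []
    for material_id, group in groupby(ordered, key=lambda hit: hit[1]):
        ray_ids = [ray_id for ray_id, _ in group]
        # Split each material's rays into chunks no wider than one SIMD batch.
        for start in range(0, len(ray_ids), simd_width):
            batches.append((material_id, ray_ids[start:start + simd_width]))
    return batches
```

Because every batch holds rays of a single material, all lanes of a batch would run the same shader code, which is the divergence reduction the passage refers to.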
Clothset teaches collecting shaders before dispatching them and also teaches parallel processing. Afra teaches that collected shaders are dispatched to the execution units in a single batch.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of Clothset with the specific teachings of Afra so that the collected shaders are dispatched in a single batch, allowing the shaders to execute in parallel and thereby reducing system processing time.
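For illustration only, the combined behavior relied upon above (deferring and aggregating shader invocations until a triggering event, then dispatching them as a single batch) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from either reference: the class name `DeferredShaderQueue`, its methods, and the use of a minimum batch size as the triggering event are hypothetical.

```python
from collections import defaultdict


class DeferredShaderQueue:
    """Hypothetical model of deferred, aggregated shader dispatch."""

    def __init__(self, min_batch_size):
        # Assumed triggering event: a bucket reaches min_batch_size entries.
        self.min_batch_size = min_batch_size
        self.pending = defaultdict(list)  # shader_id -> deferred invocations
        self.dispatched = []              # (shader_id, batch) in dispatch order

    def defer(self, shader_id, invocation):
        """Record an invocation instead of running its shader immediately."""
        bucket = self.pending[shader_id]
        bucket.append(invocation)
        if len(bucket) >= self.min_batch_size:
            self.dispatch(shader_id)

    def dispatch(self, shader_id):
        """Hand all deferred invocations of one shader over as a single batch."""
        batch = self.pending.pop(shader_id, [])
        if batch:
            self.dispatched.append((shader_id, batch))


queue = DeferredShaderQueue(min_batch_size=3)
for shader_id, invocation in [("hit", "r0"), ("miss", "r1"), ("hit", "r2"), ("hit", "r3")]:
    queue.defer(shader_id, invocation)
```

In this sketch nothing is dispatched until the third "hit" invocation arrives, at which point the three deferred invocations leave together as one batch while the "miss" invocation stays pending.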

Regarding claim 3, Clothset and Afra teach:
The apparatus of claim 2, wherein the particular triggering event comprises a particular temporal event or processing event. (Clothset, [0119], “A buffer selection 318 can control from which buffer ray intersection information is obtained for conducting shading operations. Buffer selection 318 can operate by selecting a fuller or fullest buffer from among buffers 308, 310, and 312. In some cases, buffer selection 318 can select any buffer having more than a minimum number of rays collected therein (collecting rays preferably refers to collecting identifiers for the rays, but also can include collecting definition data for the rays in the buffers).”)
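For illustration only, the buffer selection described in the quoted paragraph [0119] of Clothset (choosing the fullest buffer, or a buffer holding more than a minimum number of rays) can be sketched as follows. This is a minimal sketch and is not Clothset's implementation: the function name `select_buffer`, the dictionary layout, and the `min_count` parameter are assumptions.

```python
def select_buffer(buffers, min_count=1):
    """Return the key of the fullest buffer holding at least min_count rays.

    buffers: dict mapping a buffer identifier to a list of ray identifiers
             (hypothetical layout).
    Returns None when no buffer has reached min_count, i.e. the processing
    event that would trigger shading has not yet occurred.
    """
    eligible = {key: rays for key, rays in buffers.items() if len(rays) >= min_count}
    if not eligible:
        return None
    return max(eligible, key=lambda key: len(eligible[key]))
```

Under this reading, a buffer reaching the minimum fill level acts as a processing event that triggers shading for the rays collected in it.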

Regarding claim 4, Clothset and Afra teach:
The apparatus of claim 2, further comprising: a scheduler to dispatch the single shader batch on the execution circuitry responsive to the particular triggering event. (Clothset teaches a scheduler as shown in FIG. 5. Afra further teaches at [0125], “In one embodiment, the shaders 1350 are then dispatched with one unique material ID per SIMD batch, to minimize divergence and external bandwidth. Consequently, the shaders may operate concurrently on multiple different rays with the same material properties, thereby reducing computations and memory access operations, which results in higher performance. In one embodiment, the path state associated with the rays is loaded into SIMD registers using gather instructions (e.g., such as those available in AVX2).” The rationale for the combination set forth for claim 2 is incorporated here.)

Regarding claim 5, Clothset and Afra teach:
The apparatus of claim 2, wherein the ray traversal thread is to be suspended pending execution results of shader batch executed on the execution circuitry, wherein a first traversal context of the ray traversal thread is to be maintained while the ray traversal thread is suspended. (Clothset, [0110], “Scheduler 260 can create points of aggregation at which rays can be collected to defer their shading in favor of shading collections of other rays. Collection point 272 shows that a scheduler can aggregate rays (or more generally computation instances) to await execution of the two depicted shader instances 270a and 270b (depicts an entrance point of such shader code). Thus, as rays are deferred, they are collected into a collection associated with collection point 272.”)
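For illustration only, the claimed behavior of suspending a ray traversal thread while its traversal context is maintained can be sketched with a Python generator. This sketch illustrates the claim language only and is not taken from Clothset or Afra: the names `traversal_thread` and `run`, the `context` dictionary, and the `(primitive_id, t)` hit layout are assumptions.

```python
def traversal_thread(ray_id, candidate_hits):
    """Generator standing in for a ray traversal thread (hypothetical).

    candidate_hits: iterable of (primitive_id, t) pairs found during traversal.
    """
    # Traversal context that must survive while the thread is suspended.
    context = {"ray_id": ray_id, "closest_t": float("inf"), "closest_prim": None}
    for primitive_id, t in candidate_hits:
        # Suspend here until the shader result comes back; 'context' is retained.
        accepted = yield (primitive_id, t)
        if accepted and t < context["closest_t"]:
            context["closest_t"] = t
            context["closest_prim"] = primitive_id
    return context


def run(thread, shader):
    """Drive a traversal thread, resuming it with each shader result."""
    try:
        request = next(thread)
        while True:
            request = thread.send(shader(*request))
    except StopIteration as finished:
        return finished.value
```

Each `yield` marks a point where the traversal thread waits for a shader result, and the generator frame keeps the context alive until `send` resumes it.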

Regarding claim 6, Clothset and Afra teach:
The apparatus of claim 5, wherein the ray tracing circuitry is to aggregate the multiple shader invocations based on the multiple shader invocations being associated with the first traversal context. (Clothset, [0118]-[0119], “In either case, sorting (307) of intersections (or possible intersections) into object-associated buffers can be made based on the intersected information then-available (actual and/or possible intersections), and can be implemented by a sorter. Buffers 308, 310, and 312 are depicted as example buffers for receiving intersection information sorted by object; such buffers can be implemented as FIFOs, ring buffers, linked lists, and so on. Other implementations can sort rays into buffers based on association with a particular code segment, such as a shader. In some implementations, sorting 307 of rays into buffers associated with a particular shader or a particular object can be implemented using ray tracing deferral aspects described above. In some cases, primitives can each be given a unique number, some portion of which identifies a scene object to which the primitive belongs, and 

Regarding claim 7, Clothset and Afra teach:
The apparatus of claim 2, wherein a primary ray shader thread executed on the execution circuitry is to spawn the ray traversal thread. (Afra, [0067], “During execution, the graphics and media pipelines send thread initiation requests to thread execution logic 600 via thread spawning and dispatch logic. In some embodiments, thread execution logic 600 includes a local thread dispatcher 604 that arbitrates thread initiation requests from the graphics and media pipelines and instantiates the requested threads on one or more execution units 608A-608N. For example, the geometry pipeline (e.g., 536 of FIG. 5) dispatches vertex processing, tessellation, or geometry processing threads to thread execution logic 600 (FIG. 6). In some embodiments, thread dispatcher 604 can also process runtime thread spawning requests from the executing shader programs.” In the case described in FIG. 13, the ray shader thread spawns a traversal thread to implement the intersection test. Clothset teaches multiple threads and parallel execution units. Afra explicitly teaches that, in an execution pipeline, a parent thread can spawn a child thread to implement individual functions in the pipeline. It would have been obvious before the effective 

Regarding claim 8, Clothset and Afra teach:
The apparatus of claim 2, further comprising: sorting circuitry to regroup data associated with the single shader batch to increase occupancy for operations performed by the execution circuitry. (Clothset, [0118], “sorting (307) of intersections (or possible intersections) into object-associated buffers can be made based on the intersected information then-available (actual and/or possible intersections), and can be implemented by a sorter. Buffers 308, 310, and 312 are depicted as example buffers for receiving intersection information sorted by object; such buffers can be implemented as FIFOs, ring buffers, linked lists, and so on. Other implementations can sort rays into buffers based on association with a particular code segment, such as a shader. In some implementations, sorting 307 of rays into buffers associated with a particular shader or a particular object can be implemented using ray tracing deferral aspects described above. In some cases, primitives can each be given a unique number, some portion of which identifies a scene object to which the primitive belongs, and the sorting of the rays into various of the buffers can be based on a primitive identifier associated with the ray, or the scene object-identifying portion thereof.”)
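For illustration only, the sorting described in the quoted paragraph [0118] of Clothset (sorting rays into object-associated buffers using the object-identifying portion of a primitive identifier) can be sketched as follows. This is a minimal sketch and is not Clothset's implementation: the bit layout given by `OBJECT_SHIFT` and the names `object_id` and `sort_into_buffers` are assumptions.

```python
from collections import defaultdict

# Assumption: the upper bits of a primitive identifier name the scene object.
OBJECT_SHIFT = 16


def object_id(primitive_id):
    """Extract the (hypothetical) object-identifying portion of a primitive ID."""
    return primitive_id >> OBJECT_SHIFT


def sort_into_buffers(intersections):
    """Sort (ray_id, primitive_id) pairs into one buffer per scene object."""
    buffers = defaultdict(list)
    for ray_id, primitive_id in intersections:
        buffers[object_id(primitive_id)].append(ray_id)
    return dict(buffers)
```

Regrouping rays in this way places rays that would invoke the same shading code next to each other, which is how such a sort could raise occupancy when a batch is later dispatched.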


Regarding claim 9, Clothset and Afra teach:
The apparatus of claim 2, wherein deferring and aggregating multiple shader invocations comprises storing a data entry in a data structure in a memory, the data structure comprising at least one entry for each shader, each entry usable to identify shader information required to execute a corresponding shader. (Clothset, [0135]-[0136], “FIG. 9 depicts that a cluster controller 455 can maintain a plurality of program counters 456-458. Each program counter can be used to reference sequence of program instructions available from an instruction memory hierarchy 460.” … “a sequence of instructions, e.g. instruction 462 and instruction 464 can be provided from instruction memory hierarchy 462 a plurality of ALUs 471-473. As depicted in FIG. 9, each ALU executes the same instruction concurrently.”)

Regarding claim 10, Clothset and Afra teach:
A method comprising: executing shaders on execution circuitry; (Clothset [0038], “In a more particular example, a system for multiprocessing comprises a plurality of processing units, each comprising a Single Instruction Multiple Data (SIMD) execution unit.” [0109], “Processing resources 255 can execute shader code in execution cores 263a-263n, and in this particular example, shader instances 270a and 270b are depicted, which are differences instances of the same shader. Shader instance 270n is also depicted as an instance of shader code for a different shader. The above is a specific example of grouping for program code instances relating to shading of intersections between rays and scene geometry, during ray tracing. However, the example applies in general SIMD processing of program code instances.”) executing a ray traversal thread on a ray tracing engine; (Clothset [0114], “FIG. 6 depicts aspects of a method that includes ray traversal,” (FIGS. 5 and 6))


Claims 11-16 recite limitations similar to those of claims 3-8, respectively, and are therefore rejected under the same rationale as claims 3-8, respectively.

Regarding claim 17, Clothset and Afra teach:
A non-transitory machine-readable medium having program code stored thereon which, when executed by a machine, capable of causing the machine to perform: (Clothset, [0225], “can comprise intersection testing resources including particular fixed-purpose testing cells, and/or general purpose computers configured with computer readable instructions from a computer readable medium to perform the particular intersection tests described and interpret the results of the tests.”)
operations executing shaders on execution circuitry; (Clothset [0038], “In a more particular example, a system for multiprocessing comprises a plurality of processing units, each comprising a Single Instruction Multiple Data (SIMD) execution unit.” [0109], “Processing resources 255 can execute shader code in execution cores 263a-263n, and in this particular example, shader instances 270a and 270b are depicted, which are differences instances of the same shader. Shader instance 270n is also depicted as an instance of shader code for a different shader. The above is a specific example of grouping for program code instances relating to shading of intersections between rays and scene geometry, during ray tracing. However, the example applies in general SIMD processing of program code instances.”) executing a ray traversal thread on a ray tracing engine; (Clothset [0114], “FIG. 6 depicts aspects of a method that includes ray traversal,” (FIGS. 5 and 6))
The remaining limitations of claims 10 and 17 are similar to those of claim 2 and are therefore rejected under the same rationale as claim 2.

Claims 18-19 recite limitations similar to those of claims 3-4, respectively, and are therefore rejected under the same rationale as claims 3-4, respectively.

Regarding claim 20, Clothset and Afra teach:
The non-transitory machine-readable medium of claim 19, wherein the ray traversal thread is to be suspended pending execution results of shader batch executed on the execution circuitry, wherein a first traversal context of the ray traversal thread is to be maintained while the ray traversal thread is suspended. (Clothset, [0110], “Scheduler 260 can create points of aggregation at which rays can be collected to defer their shading in favor of shading collections of other rays. Collection point 272 shows that a scheduler can aggregate rays (or more generally computation instances) to await execution of the two depicted shader instances 270a and 270b (depicts an entrance point of such shader code). Thus, as rays are deferred, they are collected into a collection associated with collection point 272.”)

Regarding claim 21, Clothset and Afra teach:
The non-transitory machine-readable medium of claim 20, wherein the multiple shader invocations are to be aggregated based on the multiple shader invocations being associated with the first traversal context. (Clothset, [0118]-[0119], “In either case, sorting 



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YANNA WU whose telephone number is (571) 270-0725. The examiner can normally be reached Monday-Thursday 8:00-5:30 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.