Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Stanard (U.S. PGPUB 20190156550) in view of Saleh (U.S. Patent No. 10,930,050) and further in view of Muthler et al. (U.S. PGPUB 20210390757).
With respect to claim 1, Stanard discloses an apparatus (paragraph 42, FIG. 1 illustrates a generalized example of a suitable computer system (100) in which several of the described innovations may be implemented), comprising:
shader circuitry configured to execute a ray intersect instruction for a first SIMD group (paragraph 68, To test for intersections of a ray with geometric objects (such as triangles) in a computer-represented environment, the ray can be tested against a BVH that encloses the geometric objects, paragraph 69, For a GPU architecture, BVH traversal can be performed in parallel for a group of n rays, using n threads of a processing unit. (More specifically, the processing can be performed using n threads of a processing unit such as a SIMD unit.), paragraph 70, Each of the n threads traces a single ray in the group of n rays), wherein the instruction indicates coordinate Intersection points (distances, coordinates, etc.) will typically be different even if the n rays intersect the same geometric object); and
ray intersect circuitry configured to implement, in hardware, the ray intersect instruction executed by the shader circuitry (paragraph 57, The innovations can be described in the general context of computer-executable instructions, such as those included in program modules, being executed in a computer system on a target real or virtual processor. The HLSL code in the code listings can be compiled to run in threads of a processing unit (e.g., a SIMD unit of a GPU)), including to traverse multiple nodes in a spatially organized acceleration data structure (paragraph 68, If there is an intersection between the ray and the bounding volume for the root node of the BVH, the ray can be tested against the bounding volumes for the respective child nodes of the root node, and so on, paragraph 73, FIG. 4 shows the behavior of multiple threads executing code in parallel, including thread 1 to thread n. The same set of instructions (e.g., for BVH traversal, for leaf processing) is executable on each of the threads), wherein the nodes include nodes that indicate coordinates of bounding regions and nodes that indicate primitives in the graphics scene (paragraph 68, If there is an intersection between the ray and the bounding volume for the root node of the BVH, the ray can be tested against the bounding volumes for the respective child nodes of the root node, and so on. In this way, the ray can be tested against successively smaller, enclosed bounding volumes of the BVH. When there is an intersection between the ray and the bounding volume of a leaf node of the BVH, the ray can be tested for intersections with the geometric objects (e.g., triangles) enclosed by the bounding volume of the leaf node, paragraph 74, Each of the threads receives (410, 41n) one or more parameters for a given ray among the multiple rays. 
For example, the parameter(s) for a given ray include an origin of the given ray, a direction of the given ray, and a distance to a leading intersection for the given ray, which is the closest intersection point (closest to the ray origin) found so far for the given ray). However, Stanard does not expressly disclose:
the ray intersect circuitry includes multiple node tester circuits configured to perform arithmetic operations for intersection tests between rays and bounding regions corresponding to nodes;
wherein the ray intersect circuitry is configured to invoke formation of, in response to reaching a node of the acceleration data structure that indicates one or more primitives, a second SIMD group to be processed by the shader circuitry, wherein the second SIMD group operates on a second set of rays that only partially overlaps with the first set of rays, wherein the second SIMD group includes one or more instructions to determine whether rays in the second set of rays intersect the one or more primitives; and wherein the shader circuitry is configured to shade one or more primitives that are indicated as intersected based on results of execution of the second SIMD group.
Saleh, who also deals with ray tracing, discloses a method wherein the ray intersect circuitry is configured to invoke formation of, in response to reaching a node of the acceleration data structure that indicates one or more primitives, a second SIMD group to be processed by the shader circuitry, wherein the second SIMD group operates on a second set of rays that only partially overlaps with the first set of rays (Rays 5 and 6 originate from other waves, but were determined to intersect a triangle with the same material that rays 1 and 2 intersected, and thus are grouped together into wave 2), wherein the second SIMD group includes one or more instructions to determine whether rays in the second set of rays intersect the one or more primitives; and wherein the shader circuitry is configured to shade one or more primitives (column 11, lines 29-31, Each of rays 1, 2, 5, and 6 execute the hit and/or intersection shaders specified for the material of triangle O4); and wherein the shader circuitry is configured to shade one or more primitives that are indicated as intersected based on results of execution of the second SIMD group (column 11, lines 39-44, After this, lanes 1 (ray 1), 3 (ray 5), and 4 (ray 6) execute the closest hit shaders for the material of triangle O4, which may, for example, result in a color being generated for that ray (e.g., the texture of O4 would be sampled and the color that is sampled would be recorded as the color for the ray)).
	Stanard and Saleh are in the same field of endeavor, namely computer graphics.
Before the effective filing date of the claimed invention, it would have been obvious to apply the method wherein the ray intersect circuitry is configured to invoke formation of, in response to reaching a node of the acceleration data structure that indicates one or more primitives, a second SIMD group to be processed by the shader circuitry, wherein the second SIMD group operates on a second set of rays that only partially overlaps with the first set of rays, wherein the second SIMD group includes one or more instructions to determine whether rays in the second set of rays intersect the one or more primitives; and wherein the shader circuitry is configured to shade one or more primitives that are indicated as intersected based on results of execution of the 
Muthler et al., who also deal with ray tracing, disclose a method wherein the ray intersect circuitry includes multiple node tester circuits (paragraph 118, the GPU 730 includes a plurality of programmable high performance processors that can be referred to as “streaming multiprocessors” (“SMs”) 732, paragraph 120, To enable the GPU 730 to perform ray tracing in real time in an efficient manner, the GPU provides one or more “TTUs” 738 coupled to one or more SMs 732. The TTU 738 includes hardware components configured to perform (or accelerate) operations commonly utilized in ray tracing algorithms) configured to perform arithmetic operations for intersection tests between rays and bounding regions corresponding to nodes (paragraph 121, SMs 732 and the TTU 738 may cooperate to cast rays into a 3D model and determine whether and where that ray intersects the model's geometry, paragraph 140, The intersections for the bounding volumes are determined in the ray-complet test path of the TTU 738 including one or more ray-complet test blocks 1010 and one or more traversal logic blocks 1012. A complet specifies root or interior nodes of a bounding volume. Thus, a complet may define one or more bounding volumes for the ray-complet test. In example embodiments herein, a complet may define a plurality of “child” bounding volumes that (whether or not they represent leaf nodes) that don't necessarily each have descendants but which the TTU will test in parallel for ray-bounding volume intersection to determine whether geometric primitives associated with the plurality of bounding volumes need to be tested for intersection).
Stanard, Saleh, and Muthler et al. are in the same field of endeavor, namely computer graphics.
Before the effective filing date of the claimed invention, it would have been obvious to apply the method wherein the ray intersect circuitry includes multiple node tester circuits configured to perform arithmetic operations for intersection tests between rays and bounding regions corresponding to nodes, as taught by Muthler et al., to the system of Stanard as modified by Saleh, because ray tracing techniques are even more computationally intensive than rasterization, due in part to the large number of rays that need to be traced, and the TTU 738 is capable of accelerating in hardware certain of the more computationally-intensive aspects of that process (paragraph 121 of Muthler et al.).
	With respect to claim 2, Stanard as modified by Saleh and Muthler et al. disclose the apparatus of claim 1, wherein the shader circuitry is configured to indicate to the ray intersect circuitry whether to continue traversal for a ray based on intersect results of executing the second SIMD group (Saleh: column 10, lines 22-33, Depending on the results of the any hit shader and the intersection shader in any particular lane, a lane either executes a closest hit shader for the material associated with the currently running wave, or traverses the acceleration structure to determine a new closest intersection and then writes out a material data item for the ray into a material data structure associated with the material of the closest intersection triangle, column 11, lines 33-39, For ray 2, the follow-on shader traverses the acceleration structure and determines the next closest intersection, which is with triangle O2, and writes out the material data item for ray 2 and triangle O2 into the material data structure for O2. The other lanes are idle during this acceleration structure traversal and material data item write-out).
With respect to claim 3, Stanard as modified by Saleh and Muthler et al. disclose the apparatus of claim 1, wherein the first SIMD group and the second SIMD group operate on a first data structure that stores information for a first ray (Saleh: column 2, lines 61-66, The processor 102 maintains, in system memory 104, one or more control logic modules for execution by the processor 102. The control logic modules include an operating system 120, a driver 122, and applications 126. These control logic modules control various features of the operation of the processor 102 and the APD 116, column 4, lines 12-17, Much of the work involved in ray tracing is performed by programmable shader programs, executed on the SIMD units 138 in the compute units 132), wherein the first data structure is stored in a shader memory space that is also accessible to the ray intersect circuitry (Saleh: column 3, lines 45-47, each of the compute units 132 can have a local L1 cache. In an implementation, multiple compute units 132 share a L2 cache).
With respect to claim 4, Stanard as modified by Saleh and Muthler et al. disclose the apparatus of claim 3, wherein the second SIMD group is configured to access thread data from a buffer in the shader memory space identified by the ray intersect circuitry for the second SIMD group (Stanard: paragraph 44, Each element of the SIMD unit can be considered a separate thread of the SIMD unit. A group of n threads for a SIMD unit can also be called a wave or warp. Threads of a given SIMD unit execute the same code in lockstep on (potentially) different data).
	With respect to claim 5, Stanard as modified by Saleh and Muthler et al. disclose the apparatus of claim 1, wherein the second SIMD group includes different threads configured to test a first ray against multiple different primitives (Stanard: paragraph 70, Each of the n threads traces a single ray in the group of n rays. The n rays have different ray directions and can potentially intersect different geometric objects).
With respect to claim 6, Stanard as modified by Saleh and Muthler et al. disclose the apparatus of claim 5, wherein the second SIMD group includes a SIMD reduction instruction that performs an operation based on input values from multiple threads that operate on the first ray (Stanard: paragraph 79, the thread tests (550) whether its given ray intersects those of the multiple triangles that are in the bounding volume for the given node, paragraph 80, the thread schedules (560) multiple child nodes of the given node for subsequent traversal, as the given node, in later ones of the multiple iterations).
With respect to claim 7, Stanard as modified by Saleh and Muthler et al. disclose the apparatus of claim 1, further comprising bounding region test circuitry configured to test in parallel, during the traversal, whether a ray intersects multiple different ones of the bounding regions indicated by a node of the acceleration data structure (Stanard: paragraph 69, the n threads can all execute code for leaf processing operations in parallel for the n rays (e.g., intersection tests for triangles)).
the thread pushes node indices for left and right child nodes of the given node on a stack, paragraph 81, The thread checks (570) whether to continue in another iteration of BVH traversal. If so, the thread loads (510) the bounding volume for the next (scheduled) node in the BVH traversal).
	With respect to claim 9, Stanard as modified by Saleh and Muthler et al. disclose the apparatus of claim 1, wherein the apparatus is a computing device that includes:
a graphics unit that includes the shader circuitry and the ray intersect circuitry (Stanard: paragraph 44, The computer system (100) also includes processing units (130 . . . 13x) and local memory (138) of a GPU);
a central processing unit (Stanard: paragraph 43, the computer system (100) includes processing units (110 . . . 11x) and local memory (118) of a central processing unit ("CPU")); and
network interface circuitry (Stanard: paragraph 47, The computer system (100) includes one or more network interface devices (140)).
	With respect to claim 10, Stanard as modified by Saleh and Muthler et al. disclose a method as executed by the system of claim 1; see rationale for rejection of claim 1.
With respect to claim 11, Stanard as modified by Saleh and Muthler et al. disclose the method of claim 10 as executed by the system of claim 2; see rationale for rejection of claim 2.

With respect to claim 13, Stanard as modified by Saleh and Muthler et al. disclose the method of claim 10 as executed by the system of claim 7; see rationale for rejection of claim 7.
With respect to claim 14, Stanard as modified by Saleh and Muthler et al. disclose the method of claim 10 as executed by the system of claim 8; see rationale for rejection of claim 8.
	With respect to claim 15, Stanard as modified by Saleh and Muthler et al. disclose a non-transitory computer readable storage medium having stored thereon design information that specifies a design of at least a portion of a hardware integrated circuit in a format recognized by a semiconductor fabrication system that is configured to use the design information to produce the circuit according to the design (Stanard: paragraph 56, The innovations can be described in the general context of computer-readable media. Computer-readable media are any available tangible media that can be accessed within a computing environment, paragraph 57, The innovations can be described in the general context of computer-executable instructions, such as those included in program modules, being executed in a computer system on a target real or virtual processor), wherein the design information specifies that the circuit is implemented as in claim 1; see rationale for rejection of claim 1.

With respect to claim 17, Stanard as modified by Saleh and Muthler et al. disclose the non-transitory computer readable storage medium of claim 15 for implementing the system of claim 3; see rationale for rejection of claim 3.
With respect to claim 18, Stanard as modified by Saleh and Muthler et al. disclose the non-transitory computer readable storage medium of claim 17, wherein the shader memory space includes:
a memory region for buffers to store thread data for dynamically-formed SIMD groups (Stanard: paragraph 44, Each element of the SIMD unit can be considered a separate thread of the SIMD unit, The local memory (138) may be volatile memory (e.g., registers, cache, RAM), non-volatile memory (e.g., ROM, EEPROM, flash memory, etc.), or some combination of the two, accessible by the respective processing unit(s) (130 . . . 13x));
a memory region for ray data (Stanard: paragraph 45, The computer system (100) includes shared memory (120), which may be volatile memory (e.g., registers, cache, RAM), non-volatile memory (e.g., ROM, EEPROM, flash memory, etc.), or some combination of the two, accessible by the processing unit(s) (110 . . . 11x) of the CPU and the processing units (130 . . . 13x) of the GPU. The memory (120) stores software (180) implementing one or more innovations for ray-triangle intersection testing with tetrahedral planes, at least for high-level control of operations performed by threads of the processing units (130 . . . 13x), in the form of computer-executable instructions); and the thread pushes node indices for left and right child nodes of the given node on a stack, stack implementation is stored in a memory).
	With respect to claim 19, Stanard as modified by Saleh and Muthler et al. disclose the non-transitory computer readable storage medium of claim 15 for implementing the system of claim 7; see rationale for rejection of claim 7.
With respect to claim 20, Stanard as modified by Saleh and Muthler et al. disclose the non-transitory computer readable storage medium of claim 15 for implementing the system of claim 8; see rationale for rejection of claim 8.
Response to Arguments
Applicant’s arguments with respect to claim 1 have been considered but are moot in view of the new ground(s) of rejection.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
U.S. PGPUB 20210295463 to Mandal et al. for a method of including a plurality of traversal/intersection circuits.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANDREW GUS YANG whose telephone number is (571)272-5514. The examiner can normally be reached M-F 9 AM - 5:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang, can be reached at (571)272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and 





/ANDREW G YANG/Primary Examiner, Art Unit 2619
2/16/22