Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Allowable Subject Matter


Claims 6, 10, and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
The prior art of record does not teach, suggest, or disclose the claim limitation “… alternating between executing local data share (LDS) reduction operations and wavefront reduction operations when computing texels at a first mipmap level, second mipmap level, and third mipmap level” in combination with the other recited claim limitations.
Claim Rejections - 35 USC § 103





The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 8, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Reshetov et al. (U.S. PG-PUB 2018/0158227, 'RESHETOV') in view of Harris et al. (U.S. PG-PUB 2020/0027260, 'HARRIS') and Alcorn (U.S. Patent 5,870,509; 'ALCORN').
The Examiner notes that the instant specification, regarding the claim limitation ‘execution units’, discloses “Graphics processing units (GPUs) and other multithreaded processing units typically include multiple processing elements (which are also referred to as processor cores, compute units, or execution units) that concurrently execute multiple instances of a single program on multiple data sets. The instances are referred to as threads or work-items, and groups of threads or work-items are created (or spawned) and then dispatched to each processing element” (¶ [0002]).
Regarding claim 8, RESHETOV discloses a method comprising:
executing, by a plurality of execution units, a plurality of thread groups (RESHETOV; FIG. 2; ¶ 0025; “FIG. 2 illustrates a parallel processing unit (PPU) 200 … the PPU 200 is a multi-threaded processor that is implemented on … integrated circuit devices. The PPU 200 is a latency hiding architecture designed to process a large number of threads in parallel. A thread (i.e., a thread of execution) is an instantiation of a set of instructions configured to be executed by the PPU 200. … the PPU 200 is a graphics processing unit (GPU) configured to implement a graphics rendering pipeline for processing … (3D) graphics data in order to generate … (2D) image data for display on a display device such as a liquid crystal display (LCD) device” ¶ 0026; “… the PPU 200 includes … general processing clusters (GPCs) 250 [‘execution units’]; ¶ 0034; “ … a host processor executes a driver kernel that implements an application programming interface (API) that enables … applications executing on the host processor to schedule operations for execution on the PPU 200. An application may generate instructions (i.e., API calls) that cause the driver kernel to generate … tasks for execution by the PPU 200. The driver kernel outputs tasks to … streams being processed by the PPU 200. Each task may comprise … groups of related threads, referred to herein as a warp. A thread block may refer to a plurality of groups of threads including instructions to perform the task.”) to downsample a plurality of patches of a source image texture to generate one or more higher mipmap levels for the texture (RESHETOV; FIG. 1; ¶ 0022; “At step 104, the processor generates an infinite resolution texture (IRT) acceleration data structure [which] includes a texture map … [which] is a … (2D) array of color values sampled from the image [‘source image texture’] at an appropriate resolution. … the texture map may be identical to a raster image, or resampled at a different resolution. 
… the texture map may be a MIP map that includes a plurality of down-sampled versions of the image [‘downsample a plurality of patches’], each down-sampled version of the image associated with a different level of detail (LOD) [‘higher mipmap levels’].”); 
 
rendering pixels to be driven to a display based on one or more mipmap level texels (RESHETOV; ¶ 0056; “The PPU 200 … processes the graphics primitives to generate a frame buffer (i.e., pixel data for each of the pixels of the display).” ¶ 0057; “… the contents of the frame buffer are transmitted to a display controller for display on a display device.” ¶ 0104; “… the infinite resolution texture … implements MIP mapping [which] is a technique for storing a hierarchy of down-sampled versions of the raster image corresponding to different levels of detail [which] may be calculated based on the ratio of rendered pixel size to texel size (i.e., comparing the size of a pixel in the raster image to a size of the pixel in the image being rendered) during run time. … each [LOD] corresponds to a raster image at a resolution of ½ the level of detail below it (i.e., LOD 1 corresponds to half the resolution of LOD 0).”).
RESHETOV does not explicitly disclose executing a last active thread group, which HARRIS discloses (HARRIS; FIG. 7; ¶ 0262; “… the fragment shader 12 executes a respective execution thread for each fragment that it is shading. … it processes the fragments that it receives from the rasterizer 11 as respective 2×2 fragment quads … This is so as to facilitate … the derivation of appropriate … derivatives across each 2×2 fragment quad (which derivatives may then be used, for example, for texture mipmap selection …). … the fragment shader core 12 will process the fragments that it receives from the rasterizer, as respective groups (quads) of 2×2 fragments.” ¶ 0263-264; “… the fragment shader core 12 is operable to process respective execution threads (with each execution thread corresponding to a respective fragment) in the form of execution thread groups (“warps”), where the execution threads of the group are executed in lockstep, one instruction at a time. The fragment shader core 12 could … execute thread groups comprising four threads (i.e. to have a “warp” width of 4), in which case the fragment shader core 12 will execute a 2×2 fragment quad in lockstep, or it could be the case that the fragment shader core 12 has a warp width that is greater than 4 (e.g. 8 or 16), and so will process as a given thread group (warp) a plurality of 2×2 fragment quads” FIG. 9; ¶ 0317-318; “This is repeated for all the quads in the thread group (warp) in question, and once all the quads in the warp have been considered, the weighted texture instruction is completed (step 909) … This process is repeated for each group of quads (thread group (warp)) [‘executing a last active thread group’] that the shader program in question is executed for” [The Examiner notes the disclosure of repeatedly processing thread groups or warps impliedly teaches that there is an execution of a ‘last active’ thread group, since repetition implies iteration until a logical or reasonable stopping point.]).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method of RESHETOV to include executing a last active thread group, as taught by HARRIS. The motivation for this modification could have been to provide techniques for concurrently sharing a limited resource, such as a graphics processing unit, using threading techniques.


RESHETOV-HARRIS do not explicitly disclose computing remaining mipmap levels for the texture, which ALCORN discloses (ALCORN; FIG. 1; Col. 2, Lines 30-54; “… a scheme has been developed that involves the creation of a series of MIP (multum in parvo, or many things in a small place) maps for each texture, and storing the MIP maps of the texture associated with the object being rendered in the local memory of the texture mapping hardware. A MIP map for a texture includes a base map that corresponds directly to the texture map, as well as a series of filtered maps [‘remaining mipmap levels for the texture’], wherein each successive map is reduced in size by a factor of two in each of the two texture map dimensions. … The MIP maps include a base map 2 that is eight-by-eight texels in size, as well as a series of maps 4, 6 and 8 that are respectively four-by-four texels, two-by-two texels, and one texel in size. (13) The four-by-four map 4 is generated by box filtering (downsampling) the base map 2, such that each texel in the map 4 corresponds to an equally weighted average of four texels in the base map 2. … the texel 4a in map 4 equals the average of the texels 2a, 2b, 2c, and 2d in map 2, and texels 4b and 4c in map 4 respectively equal the averages of texels 2e-2h and 2i-2l in map 2. The two-by-two map 6 is similarly generated by box filtering map 4, such that texel 6a in map 6 equals the average of texels 4a-4d in map 4. The single texel in map 8 is generated by averaging the four texels in map 6.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method of RESHETOV-HARRIS to include computing the remaining mipmap levels for the texture, as taught by ALCORN. The motivation for this modification could have been to obviate the need for the texture mapping hardware to read a large number of texels that map to a pixel from a local memory to generate an average value, which would necessitate a large number of memory reads and the averaging of many texel values, would be time consuming, and would degrade system performance (ALCORN; Col. 2, Lines 23-29).
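For illustration only, the box filtering described in the cited ALCORN passage (an eight-by-eight base map reduced to four-by-four, two-by-two, and one-texel maps, each texel being an equally weighted average of four texels in the map below it) can be sketched as follows. This is a minimal Python sketch and is not code from ALCORN or from the instant application; the function names box_downsample and build_mip_chain and the sample texel values are hypothetical.

```python
def box_downsample(level):
    """Halve a square 2-D texel grid by averaging each 2x2 block of texels."""
    n = len(level) // 2
    return [
        [
            (level[2 * y][2 * x] + level[2 * y][2 * x + 1]
             + level[2 * y + 1][2 * x] + level[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(n)
        ]
        for y in range(n)
    ]


def build_mip_chain(base):
    """Return [base, half-size map, quarter-size map, ...] down to one texel."""
    chain = [base]
    while len(chain[-1]) > 1:
        chain.append(box_downsample(chain[-1]))
    return chain


# Hypothetical 8x8 base map whose texel values are simply 0..63 in raster order.
base = [[float(8 * y + x) for x in range(8)] for y in range(8)]
chain = build_mip_chain(base)
sizes = [len(level) for level in chain]  # side length of each map in the chain
```

Under these assumptions the chain holds maps with side lengths 8, 4, 2, and 1, which corresponds to maps 2, 4, 6, and 8 of ALCORN FIG. 1.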
Independent claim 1 recites essentially similar limitations and exhibits a parallel scope when compared to independent claim 8; therefore, the same motivation(s) to combine references are maintained.
Regarding claim 1, RESHETOV-HARRIS-ALCORN disclose an apparatus (RESHETOV; FIG. 2; ¶ 0025; “… the PPU 200 is a graphics processing unit (GPU) …”) comprising: an interface configured to access a memory storing source image texture data; and a plurality of execution units configured to (RESHETOV; ¶ 0035; “FIG. 3A illustrates a GPC 250 [‘execution unit’] of the PPU 200 … each GPC 250 includes a number of hardware units for processing tasks. … each GPC 250 includes … a memory management unit (MMU) 390” ¶ 0040; “… the texture units 345 [‘interface configured to access a memory’] are configured to load texture maps (e.g., a 2D array of texels) from the memory 204 [‘memory storing source image texture data’] and sample the texture maps to produce sampled texture values for use in shader programs executed by the SM 340. The texture units 345 implement texture operations such as filtering operations using mip-maps (i.e., texture maps of varying levels of detail). The texture unit 345 is also used as the Load/Store path for SM 340 to MMU 390.”):



Independent claim 15 recites essentially similar limitations and exhibits a parallel scope when compared to independent claim 8; therefore, the same motivation(s) to combine references are maintained.
Regarding claim 15, RESHETOV-HARRIS-ALCORN disclose a system comprising: 
a first processor; and a second processor (RESHETOV; FIG. 2; ¶ 0025; “… the PPU 200 is a graphics processing unit (GPU) [‘second processor’]” ¶ 0060; “FIG. 5 illustrates a System-on-Chip (SoC) 500 including the PPU 200 … the SoC 500 includes a CPU 550 [‘first processor’] and a PPU 200 [‘second processor’] …”) configured to: receive a single-pass downsampling kernel from the first processor, wherein the kernel comprises a plurality of thread groups (RESHETOV; ¶ 0034; “… host processor executes a driver kernel that implements an … (API) that enables … applications executing on the host processor to schedule operations for execution on the PPU 200. An application … generates instructions … that cause the driver kernel to generate … tasks for execution by … PPU 200. The driver kernel outputs tasks to … streams being processed by the PPU 200. Each task may comprise … groups of related threads, referred to … as a warp. A thread block … refers to a plurality of groups of threads including instructions to perform the task. Threads in the same group of threads … exchange data through shared memory.”);
Claims 2, 9, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over RESHETOV in view of HARRIS and ALCORN as applied to claims 1 and 8 above, respectively, and further in view of Schmalstieg et al. (U.S. PG-PUB 2019/0066370, 'SCHMALSTIEG').
Regarding claim 2 and claim 9, RESHETOV-HARRIS-ALCORN disclose the apparatus as recited in claim 1 and the method as recited in claim 8, further comprising a scheduling unit (RESHETOV; FIG. 2, element ‘Scheduler Unit 220’; ¶ 0030-31, 0033); however, RESHETOV-HARRIS-ALCORN do not explicitly disclose that the apparatus as recited in claim 1 and the method as recited in claim 8 further comprise launching the plurality of thread groups on the plurality of execution units responsive to receiving a single-pass downsampling kernel, wherein the single-pass downsampling kernel is a compute shader kernel, which SCHMALSTIEG discloses (SCHMALSTIEG; ¶ 0051; “In a first pass for texel shading, a visibility pass, GPU 160 may mark shading work directly in the texture atlas. … GPU 160 may shade 8×8 texels, e.g., using conservative rasterization and/or slight over-shading. GPU 160 may select basic mip-map levels using standard screen size measurements. GPU 160 may select mip-map level bias based on vertex normal variance, e.g., where relatively flat surfaces are given relatively less shading, and more contoured surfaces are given relatively more shading. In a second pass for texel shading, GPU 160 may execute a compute shader that bulk-executes [‘launch the plurality of thread groups’] fragment shading work. In this second pass, GPU 160 may perform spatial sub-sampling [‘downsampling’] via mip-map bias …”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the apparatus as recited in claim 1 and the method as recited in claim 8 of RESHETOV-HARRIS-ALCORN to include launching the plurality of thread groups on the plurality of execution units responsive to receiving a single-pass downsampling kernel, wherein the single-pass downsampling kernel is a compute shader kernel, as taught by SCHMALSTIEG. The motivation for this modification could have been to take advantage of the highly parallelized structure of a graphics processing unit to simultaneously process a plurality of texels within a texture for mip-mapping.
Regarding claim 11, RESHETOV-HARRIS-ALCORN-SCHMALSTIEG disclose the method as recited in claim 9, wherein the second processor is further configured to execute a subset of threads of each thread group to compute texels at a third mipmap level (ALCORN; FIG. 1; Col. 2, Lines 30-54; “The single texel in map 8 is generated by averaging the four texels in map 6” [The Examiner notes that map 8 is the third iterative downsampling from the original texture 2 in FIG. 1.]).
Claims 3-4, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over RESHETOV in view of HARRIS and ALCORN as applied to claims 1 and 15 above, respectively, and further in view of Coon et al. (U.S. PG-PUB 2009/0240931, 'COON').
Regarding claim 3 and claim 16, RESHETOV-HARRIS-ALCORN disclose the apparatus as recited in claim 1 and the system as recited in claim 15, wherein the second processor is further configured to execute the plurality of thread groups to downsample the plurality of patches of the texture down to a single texel (ALCORN; FIG. 1; Col. 2, Lines 30-54; “The single texel in map 8 is generated by averaging the four texels in map 6.”); however, RESHETOV-HARRIS-ALCORN do not explicitly disclose that each thread group executes independently of other thread groups, which COON discloses (COON; ¶ 0090; “FIG. 5B is a flow diagram of method steps for unwinding CRS stack 425 to complete step 525 of FIG. 5A … CRS stack 425 includes an execution stack 455 for each of the G thread groups that may be executed concurrently by processing engines 302, so that each thread group … progresses independently of the other thread groups.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the apparatus as recited in claim 1 and the system as recited in claim 15 of RESHETOV-HARRIS-ALCORN so that each thread group executes independently of other thread groups, as taught by COON. The motivation for this modification could have been to implement instruction issue techniques that are used to support parallel execution of a large number of threads without providing multiple independent instruction units (COON; ¶ [0038]).
Regarding claim 4, RESHETOV-HARRIS-ALCORN-COON disclose the apparatus as recited in claim 3, wherein each thread group of the plurality of thread groups comprises a plurality of threads, and wherein the plurality of execution units are further configured to:
execute each thread group to fetch a plurality of texels in a corresponding patch of the texture; and 
downsample, by each thread of each thread group, a sub-patch of texels to compute texels at a first mipmap level and a second mipmap level (ALCORN; Col. 2, Lines 39-53; “… a set of MIP maps is shown in FIG. 1. The MIP maps include a base map 2 that is eight-by-eight texels in size [‘a plurality of texels in a corresponding patch of the texture’], as well as a series of maps 4, 6 and 8 that are respectively four-by-four texels, two-by-two texels, and one texel in size. The four-by-four map 4 [‘first mipmap level’] is generated by box filtering (downsampling) the base map 2, such that each texel in the map 4 corresponds to an equally weighted average of four texels in the base map 2. For example, the texel 4a in map 4 equals the average of the texels 2a, 2b, 2c, and 2d in map 2 [‘downsample … a sub-patch of texels’], and texels 4b and 4c in map 4 respectively equal the averages of texels 2e-2h and 2i-2l in map 2. The two-by-two map 6 [‘second mipmap level’] is similarly generated by box filtering map 4, such that texel 6a in map 6 equals the average of texels 4a-4d in map 4.”).
Regarding claim 18, RESHETOV-HARRIS-ALCORN-COON disclose the system as recited in claim 16, wherein the second processor is further configured to execute a subset of threads of each thread group to compute texels at a third mipmap level (ALCORN; FIG. 1; Col. 2, Lines 30-54; “The single texel in map 8 is generated by averaging the four texels in map 6.”).
Claims 5 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over RESHETOV in view of HARRIS, ALCORN, and COON as applied to claim 4 above, and further in view of McKellar (US PGPUB 2007/0052718, 'MCKELLAR').
Regarding claim 5, RESHETOV-HARRIS-ALCORN-COON disclose the apparatus as recited in claim 4; however, RESHETOV-HARRIS-ALCORN-COON do not explicitly disclose that the plurality of execution units [is] further configured to: 
store, by each thread, one or more second mipmap level texels as a result of downsampling the sub-patch of texels; and
execute the last active thread group to compute the remaining mipmap levels using the plurality of downsampled single texels generated from the plurality of patches of the source image texture, both of which MCKELLAR discloses (MCKELLAR; ¶ 0032; “This tile buffer can be utilized as the local storage buffer 2 of FIG. 2 for the auto-generation of mipmaps. In a first pass the original image (up to one tile in size) [‘source image texture’] is stored in this buffer. The box filter 2 then filters the image to 1/2 those dimensions [‘downsampling the sub-patch of texels’] and stores the result in the tile buffer. This takes up only 1/4 of the buffer is the original data was one tile in size. The new data is then filtered again with the results being stored in the tile buffer now at 1/16 of the size of the original data. The process repeats until the final mipmap level (generally one pixel by one pixel) [‘downsampled single texels’] is generated. After each step, the mipmap level generation is stored temporarily in the tile buffer [‘store … second mipmap level texels’] and then in the main system memory.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the apparatus as recited in claim 4 of RESHETOV-HARRIS-ALCORN-COON to include storing, by each thread, one or more second mipmap level texels as a result of downsampling the sub-patch of texels, and executing the last active thread group to compute the remaining mipmap levels using the plurality of downsampled single texels generated from the plurality of patches of the source image texture, as taught by MCKELLAR. The motivation for this modification could have been to provide a mipmap chain that is automatically created from a texture map (MCKELLAR; ¶ [0007]).
Regarding claim 7, RESHETOV-HARRIS-ALCORN-COON-MCKELLAR disclose the apparatus as recited in claim 5, wherein the plurality of execution units [is] further configured to execute a subset of threads of each thread group to compute texels at a third mipmap level (ALCORN; FIG. 1; Col. 2, Lines 30-54; “The single texel in map 8 is generated by averaging the four texels in map 6.”).
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over RESHETOV in view of HARRIS, ALCORN, and SCHMALSTIEG as applied to claim 11 above, and further in view of Winser (US 5,495,563; 'WINSER') and Sander ("AMD GCN Assembly: Cross-Lane Operations", 'https://gpuopen.com/learn/amd-gcn-assembly-cross-lane-operations/', published 10 Aug. 2016, 'SANDER').
Regarding claim 12, RESHETOV-HARRIS-ALCORN-SCHMALSTIEG disclose the method as recited in claim 11, further comprising executing the subset of threads (RESHETOV; FIG. 2; ¶ 0025); however, RESHETOV-HARRIS-ALCORN-SCHMALSTIEG do not explicitly disclose storing third mipmap level texels to a subset, which WINSER discloses (WINSER; Col. 2, Lines 26-33; “The known hardware implementation adopts the "multum in parvo" or "MIP map" method of storage as described by Williams, whereby three color components (R, G and B) of a single texture are stored in a square, 2-D addressable memory. The MIP map allows a compact storage and very efficient retrieval of pyramidal texture maps using a 2-D memory …” Col. 4, Lines 11-23; “The display apparatus may … generate a 2-D array of texel values for storage in the texture memory as part of a pyramidal array by causing systematic readout and 2-D interpolation of the texel values in a previously stored 2-D array [‘loading second mip level texels’] forming a first level of the pyramidal array, and feedback means for storing the resulting interpolated texel values in the memory [‘storing third mipmap level texels’] (in linear form) to form a 2-D array of texel values to form a second, lower-resolution level of the said pyramidal array. … by selecting certain texture coordinate pairs, the interpolated texel values can be made equal to corresponding stored texel values of the next lower resolution array in the pyramid” FIG. 1, element 24 ‘MRAM’; Col. 14, Lines 63-64; “Entire pyramidal texture arrays may be stored in a database … in the main memory 24 …”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method as recited in claim 11 of RESHETOV-HARRIS-ALCORN-SCHMALSTIEG to include storing third mipmap level texels to a subset, as taught by WINSER. The motivation for this modification could have been to open the possibility of the mapping hardware generating within itself at high speed a pyramidal or part-pyramidal array from just one externally generated high-resolution 2-D array (WINSER; [Col. 4, Lines 23-26]).
RESHETOV-HARRIS-ALCORN-SCHMALSTIEG-WINSER do not explicitly disclose LDS entries, which SANDER discloses (SANDER; p. 1; § “Why Not Just Use LDS?”; “Local data share (LDS) was introduced exactly for that reason: to allow efficient communication and data sharing between threads in the same compute unit. LDS is a low-latency RAM physically located on chip in each compute unit (CU).”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method as recited in claim 11 of RESHETOV-HARRIS-ALCORN-SCHMALSTIEG-WINSER to include the LDS entries of SANDER. The motivation for this modification could have been to take advantage of the on-chip architecture of the compute unit within the GPU to provide a low-latency memory access from a random-access memory (RAM).
Claims 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over RESHETOV in view of HARRIS and ALCORN as applied to claims 8 and 15 above, respectively, and further in view of Vanco et al. (U.S. PG-PUB 2020/0034214, 'VANCO').
Regarding claim 13 and claim 20, RESHETOV-HARRIS-ALCORN disclose the method as recited in claim 8 and the system as recited in claim 15; however, RESHETOV-HARRIS-ALCORN do not explicitly disclose that the method as recited in claim 8 and the system as recited in claim 15 further comprise:
maintaining an atomic counter; and
incrementing the atomic counter, both of which VANCO discloses (VANCO; ¶ 0056; “In response to receiving a request from a worker thread, the bus is locked in a block 1002 of flowchart 1000. The atomic counter is incremented by 1 in a block 1004, and the counter value is saved on the stack. The bus is then unlocked in a block 1006.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method as recited in claim 8 and the system as recited in claim 15 of RESHETOV-HARRIS-ALCORN to include maintaining an atomic counter and incrementing the atomic counter, as taught by VANCO. The motivation for this modification could have been to perform atomic memory operations using an atomic counter, where each read/modify/write operation will be completed in its entirety before other operations are permitted.
RESHETOV-HARRIS-ALCORN-VANCO disclose incrementing the atomic counter each time a patch has been processed by a corresponding thread group (ALCORN; FIG. 1; Col. 2, Lines 30-54); and
executing the last active thread group (HARRIS; FIG. 7; ¶ 0262-264) to compute the remaining mipmap levels (ALCORN; FIG. 1; Col. 2, Lines 30-54) responsive to the atomic counter reaching a threshold (VANCO; ¶ 0056; “In a decision block 1008, a determination is made to whether the counter is greater than a threshold, such as the number of slots on the request ring. If the counter is less than the threshold, the answer to decision block 1008 is YES, and the worker thread is provided with the index generated by the arbiter and allowed to add its descriptor to the request ring.”).
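For illustration only, the combination discussed above (an atomic counter incremented once per processed patch, with the thread group that brings the counter to a threshold computing the remaining levels) can be sketched as follows. This is a minimal CPU-side Python sketch in which a lock stands in for the bus lock of VANCO; it is not code from any cited reference or from the instant application. The names AtomicCounter and run_groups, the use of the patch count as the threshold, and the per-patch averaging are assumptions made for the example.

```python
import threading


class AtomicCounter:
    """Counter whose increment is made atomic with a lock (assumed stand-in)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Lock, increment by 1, and return the new value before unlocking.
        with self._lock:
            self._value += 1
            return self._value


def run_groups(patches):
    """Reduce each patch to one texel; the last group to finish reduces those."""
    counter = AtomicCounter()
    threshold = len(patches)          # assumed threshold: one count per patch
    partial = [None] * len(patches)   # one downsampled texel per patch
    result = {}

    def thread_group(i):
        partial[i] = sum(patches[i]) / len(patches[i])
        # Only the group whose increment reaches the threshold continues.
        if counter.increment() == threshold:
            result["last_group"] = i
            result["final_texel"] = sum(partial) / len(partial)

    workers = [threading.Thread(target=thread_group, args=(i,))
               for i in range(len(patches))]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return result


# Hypothetical input: four patches of two texel values each.
result = run_groups([[1.0, 3.0], [5.0, 7.0], [9.0, 11.0], [13.0, 15.0]])
```

Which group ends up as the last active one depends on scheduling, but exactly one group observes the threshold value, so the final reduction runs once.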
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over RESHETOV in view of HARRIS and ALCORN as applied to claim 8 above, and further in view of Kirsch et al. (U.S. PG-PUB 2010/0316254, 'KIRSCH').


Regarding claim 14, RESHETOV-HARRIS-ALCORN disclose the method as recited in claim 8; however, RESHETOV-HARRIS-ALCORN do not explicitly disclose that the method as recited in claim 8 further comprises generating thread indices for fetching pixels based on a Morton ordering pattern, which KIRSCH discloses (KIRSCH; FIG. 3; ¶ 0022; “… a graphical representation of a 16*16 block of pixels is shown, which may be used to demonstrate how image data can be stored and read out from image buffer 204. Block 300 may be stored in raster order--that is, the image data may be stored in a sequence that proceeds from left to right (e.g., from col. 0 to col. 15) across the first line (e.g., row 0), then from left to right across the second line (e.g., row 1), etc. While image buffer 204 may store the pixels in raster order, the pixels may be read out of image buffer 204 in Z-order ([a.k.a.] Morton order). The Z-ordered sequence is illustrated … by arrows linking one pixel to another. The Z order allows groups of four neighboring pixels to be read out one after another (e.g., in a Z-shaped sequence, such as pixels 0, 1, 2, and 3 in the top left corner). Each group of four-pixels may be referred to sometimes as a "Z-quad."”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method as recited in claim 8 of RESHETOV-HARRIS-ALCORN to include generating thread indices for fetching pixels based on a Morton ordering pattern, as taught by KIRSCH. The motivation for this modification could have been to enable graphics processing units to store texture maps in Morton order to increase spatial locality of reference during texture mapped rasterization. This allows cache lines to represent rectangular tiles, increasing the probability that nearby accesses are in the cache. At a larger scale, it also decreases the probability of costly so-called "page breaks" (i.e., the cost of changing rows) in memory.
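For illustration only, the Z-order (Morton order) readout described in the cited KIRSCH passage can be sketched as follows for a 16*16 block stored in raster order. This is a minimal Python sketch and is not code from KIRSCH or from the instant application; the function name morton_to_xy is hypothetical, and placing the column (x) bits in the even bit positions of the index is an assumed convention.

```python
def morton_to_xy(index):
    """De-interleave a Morton index: even bits give x (column), odd bits y (row)."""
    x = y = 0
    bit = 0
    while index >> (2 * bit):
        x |= ((index >> (2 * bit)) & 1) << bit
        y |= ((index >> (2 * bit + 1)) & 1) << bit
        bit += 1
    return x, y


WIDTH = 16  # block width in pixels, matching the 16*16 block in KIRSCH FIG. 3

# Raster-order addresses visited when the block is read out in Z-order.
order = [y * WIDTH + x for x, y in map(morton_to_xy, range(WIDTH * WIDTH))]
first_quad = order[:4]  # raster addresses of the top-left Z-quad
```

Under these assumptions each consecutive run of four indices covers one two-by-two group of neighboring pixels, which is the "Z-quad" grouping that KIRSCH describes.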
Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over RESHETOV in view of HARRIS, ALCORN, and COON as applied to claim 18 above, and further in view of WINSER and SANDER.
Regarding claim 19, RESHETOV-HARRIS-ALCORN-COON disclose the system as recited in claim 18, wherein the second processor is further configured to execute the subset of threads (RESHETOV; FIG. 2; ¶ 0025); however, RESHETOV-HARRIS-ALCORN-COON do not explicitly disclose storing third mipmap level texels to a subset, which WINSER discloses (WINSER; Col. 2, Lines 26-33; Col. 4, Lines 11-23; FIG. 1, element 24 ‘MRAM’; Col. 14, Lines 63-64).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the system as recited in claim 18 of RESHETOV-HARRIS-ALCORN-COON to include storing third mipmap level texels to a subset, as taught by WINSER. The motivation for this modification could have been to open the possibility of the mapping hardware generating within itself at high speed a pyramidal or part-pyramidal array from just one externally generated high-resolution 2-D array (WINSER; [Col. 4, Lines 23-26]).
RESHETOV-HARRIS-ALCORN-COON-WINSER do not explicitly disclose LDS entries, which SANDER discloses (SANDER; p. 1; § “Why Not Just Use LDS?”; “Local data share (LDS) was introduced exactly for that reason: to allow efficient communication and data sharing between threads in the same compute unit. LDS is a low-latency RAM physically located on chip in each compute unit (CU).”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the system as recited in claim 18 of RESHETOV-HARRIS-ALCORN-COON-WINSER to include the LDS entries of SANDER. The motivation for this modification could have been to take advantage of the on-chip architecture of the compute unit within the GPU to provide a low-latency memory access from a random-access memory (RAM).
Conclusion























Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN M COFINO whose telephone number is (303) 297-4268. The examiner can normally be reached Monday-Friday 10A-4P MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KENT W CHANG, can be reached at (571) 272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JONATHAN M COFINO/
Examiner, Art Unit 2619
/KENT W CHANG/
Supervisory Patent Examiner, Art Unit 2619