DETAILED ACTION

Notice of Pre-AIA or AIA Status
	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
	This Office Action is in response to the Applicants' Amendment/Remarks filed on November 17, 2021.  Claims 1, 21 and 24 have been amended; claims 1-26 remain pending in this application.

35 USC § 103- Rejections
Applicant's arguments filed November 17, 2021 have been fully considered but they are not persuasive.
Regarding independent claims 1 and 24:
a.	Applicant’s argument:  Applicant argues on page 16 that “the division of rectangular regions of the screen space is divided by a static pattern (i.e., checkerboard pattern as shown in FIG. 1D), and is not dynamically divided based on any information indicating overlap of pieces of geometry with screen regions. In fact, Dimitrov et al. uses the checkboard pattern shown in FIG. 1D to discard primitives that do not overlap with screen regions assigned by the checkboard pattern to one GPU (e.g., a clipping circuit also shown in FIG. 1D), such that remaining primitives are processed by that GPU…  Further Garritsen and Bakalash et al., taken alone or in combination, fail to overcome the shortcomings of Dimitrov et al. That is, none of the references to Garritsen or 
b. 	Examiner’s response:  Examiner respectfully disagrees with the argument because Garritsen fairly teaches the highlighted claimed invention: the screen is divided into regions, and the primitives overlay each region (e.g., “windows 1002a-1002f”) (see FIG. 10). The method may then include optimizing the size and position of subdivisions to efficiently utilize the on-chip memory. In selected examples, the size and position of subdivisions may be optimized to encompass entire batches of primitives (Garritsen, FIGs. 11-12, pars. [0021], [0060]-[0061]) (please see the detailed rejection below).  Therefore, the argument is not persuasive.

c.	Applicant’s argument:  Applicant argues on page 17 that “On the other hand, embodiments of the present disclosure as recited in independent claims 1 and 24, as amended, disclose that information is generated regarding pieces of geometry and their relations to screen regions, and that screen regions are dynamically assigned to GPUs based on the information during a rendering phase. That is, how pieces of geometry overlap screen regions determines how those screen regions are dynamically allocated to GPUs for performing rendering of the pieces of geometry. The prior art references do not teach any consideration of information relating to overlapping of pieces of geometry with screen regions when allocating screen regions to GPUs, and instead teaches allocation of screen regions based on a checkboard pattern (see FIG. 1D of Dimitrov et al.)”.
d.	Examiner’s response: Examiner respectfully disagrees with the argument because the highlighted features that Applicant argues differ from what is recited in claim 1. Garritsen fairly discloses the highlighted claimed invention, since Garritsen discloses primitives overlaying each region (e.g., windows), which shows the relation between primitives and screen regions (Garritsen, FIGs. 10-11, see pars. [0021], [0060]-[0061]). The argument is similar to the argument above, which has already been addressed.  Therefore, the argument is not persuasive because “dynamically assigning the plurality of screen regions to the plurality of GPUs based on the information for purposes of rendering the plurality of pieces of geometry during a subsequent phase of rendering” is equivalent to what is shown in Garritsen’s FIGs. 10-11.  In addition, Garritsen’s par. [0021] clearly discloses “how pieces of geometry overlap screen regions determines how those screen regions are dynamically allocated to GPUs for performing rendering of the pieces of geometry”, a feature that is not recited in claim 1.
Claims 2-23 and 25-26 depend from either independent claim 1 or 24.  Therefore, for the reasons stated above, and presented in the detailed action below, the rejections from the First Office Action are maintained.

Double Patenting
	The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at 
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For 

	Claims 1, 8, 10 and 14 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over respective claims (see table below) of copending Application No. 16/780,680 in view of Garritsen (US 20090122068 A1) in view of Bakalash et al. (US 20090027383 A1).

Instant claims:	1	8	10	14
16/780680:	1	10 and 11	13	18 and 20



For example:
Instant claim 1: A method for graphics processing, comprising:
16/780680 claim 1: A method for graphics processing, comprising:

Instant claim 1: rendering graphics for an application using a plurality of graphics processing units (GPUs);
16/780680 claim 1: rendering graphics for an application using a plurality of graphics processing units (GPUs);

Instant claim 1: dynamically assigning the plurality of screen regions to the plurality of GPUs based on the information for purposes of rendering the plurality of pieces of geometry during a subsequent phase of rendering.
16/780680 claim 1: assigning a GPU a piece of geometry of an image frame generated by an application for geometry pretesting;

Instant claim 1: generating information at the plurality of GPUs regarding the plurality of pieces of geometry and their relations to the plurality of screen regions based on the overlap of each the plurality of pieces of 
16/780680 claim 1: performing the geometry pretesting at the GPU to generate information regarding the piece of geometry and its relation to each of the plurality of screen regions; and

Instant claim 1: determining in the analysis pre-pass phase overlap of each the plurality of pieces of geometry with each of a plurality of screen regions;
16/780680 claim 1: using the information at each of the plurality of GPUs when rendering the image frame.


Instant claim 1 further recites “determining in the analysis pre-pass phase overlap of each the plurality of pieces of geometry with each of a plurality of screen regions; generating information at the plurality of GPUs regarding the plurality of pieces of geometry and their relations to the plurality of screen regions based on the overlap of each the plurality of pieces of geometry with each of the plurality of screen regions;”.  However, Garritsen and Bakalash meet these limitations as set forth in the rejection of claim 1 below.
	This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim Rejections - 35 USC § 103
	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having 

	Claims 1-4, 6, 9-11, 13-16, 18, 22 and 24-26 are rejected under 35 U.S.C. 103 as being unpatentable over Dimitrov et al. (US 20190206023 A1, hereinafter “Dimitrov”) in view of Garritsen (US 20090122068 A1) and further in view of Bakalash et al. (US 20090027383 A1, hereinafter “Bakalash”).
As to claim 1.  Dimitrov discloses a method for graphics processing, comprising: 
rendering graphics for an application using a plurality of graphics processing units (GPUs) (Dimitrov, see at least par. [0052] and FIG. 1D, “rendering results of rectangular regions 132 are composited together within a single GPU to generate a final rendered frame. In various configurations, any number of GPUs may operate together to generate a final rendered frame. In certain embodiments, two or more final rendered frames are generated”); 
dividing responsibility for processing a plurality of pieces of geometry of an image frame during an analysis pre-pass phase between the plurality of GPUs, wherein each of the plurality of pieces of geometry is assigned to a corresponding GPU (Dimitrov, see FIG. 1D and at least col. 1, lines 1-6 of the par. [0045] “FIG. 1D illustrates a technique for allocating rendering work based on a screen space checkerboard pattern, in accordance with one embodiment. As shown, a screen space scene 130 is divided into rectangular regions 132. Different rectangular regions 132 may be allocated to different GPUs to be rendered,”); 
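For illustration only, the static checkerboard allocation described in the quoted passage can be sketched as follows. The function name, the grid dimensions, and the (column + row) modulo rule are assumptions made for this sketch; Dimitrov describes the pattern only at the level of FIG. 1D and the quoted text.

```python
def checkerboard_assignment(cols, rows, num_gpus=2):
    """Map each rectangular region (col, row) to a GPU index.

    Hypothetical helper: alternates GPU indices so that neighbouring
    regions go to different GPUs, independent of the scene geometry.
    """
    return {(c, r): (c + r) % num_gpus
            for r in range(rows)
            for c in range(cols)}


grid = checkerboard_assignment(cols=4, rows=2)
for r in range(2):
    # Print one row of the screen-space grid as GPU indices.
    print(" ".join(str(grid[(c, r)]) for c in range(4)))
```

Because the mapping depends only on the region coordinates, it is the same for every frame, which is the sense in which such a pattern is static.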
assigning the plurality of screen regions to the plurality of GPUs based on the information for purposes of rendering the plurality of pieces of geometry during a subsequent phase of rendering (Dimitrov, see at least col. 2, and lines 1-5 of the .
Dimitrov does not disclose “dynamically assigning the plurality of screen regions to the plurality of GPUs based on the information for purposes of rendering the plurality of pieces of geometry during a subsequent phase of rendering; determining in the analysis pre-pass phase overlap of each the plurality of pieces of geometry with each of a plurality of screen regions”.  However, 
Garritsen discloses dynamically assigning the plurality of screen regions to the plurality of GPUs based on the information (Garritsen, see at least par. [0021], “the method may include dividing a frame buffer into subdivisions, or windows, and determining which primitives are associated with each subdivision. This may be accomplished by determining which primitive and batch bounding boxes overlap with each subdivision. Each subdivision may then be rendered individually using on-chip memory local to the GPU. In selected examples, the method may include optimizing the size and position of subdivisions to efficiently utilize the on-chip memory. In selected examples, the size and position of subdivisions may be optimized to encompass entire batches of primitives”) for purposes of rendering the plurality of pieces of geometry during a subsequent phase of rendering (Garritsen, see FIGs. 10-11, see at least par. [0061], Referring to FIG. 11, as mentioned, the optimization module 904 may, in certain examples, optimize window dimensions and placement to encompass selected 


determining in the analysis pre-pass phase (Garritsen, see at least par. [0039], “a GPU 108 in accordance with the invention may implement a modified pipeline 307 that is divided into two separate passes: a geometry pass 308 and a rendering pass 310. In general, the geometry pass 308 may be used to convert all or many of the vertices received from the graphics driver 304 to triangles (or other primitives).”) overlap of each the plurality of pieces of geometry with each of a plurality of screen regions (Garritsen, see FIGs. 4 and 7, at least par. [0051], “The graphics driver 304 may use the bounding box to determine if a primitive overlap with a subdivision, or window, as will be shown in FIGS. 10 and 11. In selected examples, the primitive buffer 400b may also store the area 708 of each primitive. The setup engine 208 may use the area to perform interpolation computations. Similarly, the primitive buffer 400b may store other data or attributes 710 as needed”); 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Dimitrov to include “dynamically assigning the plurality of screen regions to the plurality of GPUs based on the information for purposes of rendering the plurality of pieces of geometry during a subsequent phase of rendering; determining in the analysis pre-pass phase overlap of each the plurality of pieces of geometry with each of a plurality of screen regions”, as taught by Garritsen, in order to “provide many of the advantages of retained mode and immediate mode architectures while avoiding many of the disadvantages of each. Further needed is an apparatus and method to dynamically and seamlessly switch between retained and immediate modes of operation in a way that is transparent to an application. Further needed are apparatus and methods to 
Dimitrov in view of Garritsen does not disclose “generating information at the plurality of GPUs regarding the plurality of pieces of geometry and their relations to the plurality of screen regions based on the overlap of each the plurality of pieces of geometry with each of the plurality of screen regions”.  However, Bakalash discloses:
generating information at the plurality of GPUs regarding the plurality of pieces of geometry and their relations to the plurality of screen regions based on the overlap of each the plurality of pieces of geometry with each of the plurality of screen regions (Bakalash, see FIG. 2D1, and at least par. [0050], “FIG. 2D1 is a schematic representation of the complementary-type partial image generation process of the present invention carried out within GPU1 of the dual-GPU embodiment of the parallel graphics rendering system of FIG. 2C, wherein a Global Depth Map (GDM) is generated within the Z Buffer for all objects within the 3D scene (showing three different depth values namely the background having the highest depth (2415), wherein object A is closest to the viewer, has the lowest depth value (2416), its pixels have passed the Z-test and their depth values are written to the Z Buffer of GPU1, wherein object C (2414) has a middle depth value, its pixels have passed the z-test and their depth values are written to the Z buffer of GPU1, wherein object B has the deepest depth values, its pixels have all failed the z-test and their depth values have been replaced by the depth values of its occluding object A (2416) written in the Z Buffer in GPU1, and wherein a color-based complementary-type partial image is generated within the Color Buffer of GPU1 by recompositing (iii) the pixels of assigned object A rendered/drawn in color, (ii) 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Dimitrov in view of Garritsen to include “generating information at the plurality of GPUs regarding the plurality of pieces of geometry and their relations to the plurality of screen regions based on the overlap of each the plurality of pieces of geometry with each of the plurality of screen regions”, as taught by Bakalash, thereby providing an improved method of and apparatus for carrying out parallel 3D graphics processing, while avoiding the shortcomings and drawbacks of the prior art apparatus and methodologies (Bakalash, see par. [0015]).
As to claim 2.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein the analysis pre-pass phase is performed using a vertex shader or compute shader (Garritsen, see FIG. 3 and at least pars. [0039]- [0040], “a GPU 108 in accordance with the invention may implement a modified pipeline 307 that is divided into two separate passes: a geometry pass 308 and a rendering pass 310…  A vertex shader 204 may transform the vertices or perform tasks such as modifying the color, texture, or lighting on the vertices. A primitive assembler 206 may assemble the vertices received from the vertex shader 204 to create triangles or other primitives”).

As to claim 3.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein the determining the overlap includes: 
approximating the overlap of each the plurality of pieces of geometry with each of the plurality of screen regions (Garritsen, see at least par. [0058], “A rendering module 906 may be provided to render the primitives once the window size 912, placement 914, and order 916 has been determined. For example, the rendering module 906 may determine the current window 918 (the window currently being rendered). A primitive/batch determination module 920 may then determine which primitives or batches of primitives overlap or are contained within the current window 918, such as by evaluating the overlap between the primitive and batch bounding boxes 706, 804 and the current window 918. A transmission module 922 may then send the primitives/batches that overlap with the current window 918 to the rendering pass 310 by way of the pipe 316c.”).

	As to claim 4.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein the approximating the overlap includes: 
determining overlap of one or more bounding boxes of one or more primitives of a piece of geometry with each of the plurality of screen regions (Garritsen, see at least par. [0051], “a bounding box 706 may be calculated for each primitive 702 passing through the geometry pass 308.  This bounding box 706 may be stored with the primitive in the primitive buffer 400b.  In certain examples, a left coordinate, a top coordinate, a right coordinate, and a bottom coordinate may be used to identify the bounding box of a primitive 702”).
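As a minimal sketch of the kind of test the quoted passage implies, a bounding box identified by left, top, right, and bottom coordinates can be compared against a rectangular screen region as shown below. The `Box` type, the `overlaps` function, and the screen-space convention (y grows downward, edges that merely touch do not count as overlap) are assumptions for illustration and are not taken from Garritsen.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in screen space (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a: Box, b: Box) -> bool:
    """Return True if the two rectangles share any interior area.

    Assumes left < right and top < bottom for both boxes.
    """
    return (a.left < b.right and b.left < a.right and
            a.top < b.bottom and b.top < a.bottom)


primitive_box = Box(left=0, top=0, right=10, bottom=10)
window = Box(left=5, top=5, right=15, bottom=15)
neighbour = Box(left=10, top=0, right=20, bottom=10)

print(overlaps(primitive_box, window))     # boxes share the area [5,10] x [5,10]
print(overlaps(primitive_box, neighbour))  # boxes only touch along x = 10
```

A bounding-box test of this kind is conservative: it may report overlap for a region that the primitive itself does not touch, but it does not miss a region that the primitive does touch.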
As to claim 6.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses further comprising: 
rendering during the subsequent phase of rendering the plurality of pieces of geometry at each of the plurality of GPUs based on GPU to screen region assignments determined from the assigning the plurality of screen regions to the plurality of GPUs (Dimitrov, see at least lines 1-7 of the par. [0024], “Each of the two or more GPUs render selected primitives within assigned regions. Rendering may include multiple rendering passes, and results from one rendering pass stored in one or more surfaces may be used by the two or more GPUs for one or more subsequent rendering passes. Rendering a given pass for an assigned region on a first GPU may require remote data from a second GPU”).

As to claim 9.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein the information includes an accurate or approximate area that primitives of a piece of geometry occupies in a corresponding region (Garritsen, see FIGs. 7 and 10 and at least par. [0051] “a bounding box 706 may be calculated for each primitive 702 passing through the geometry pass 308. This bounding box 706 may be stored with the primitive in the primitive buffer 400b. In certain examples, a left coordinate, a top coordinate, a right coordinate, and a bottom coordinate may be used to identify the bounding box of a primitive 702. The graphics driver 304 may use the bounding box to determine if a primitive overlap with a subdivision, or window, as will be shown in FIGS. 10 and 11. In selected examples, the primitive buffer 400b may also store the area 708 of each primitive. The setup engine .

As to claim 10.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein the information includes the number of pixels shaded per screen region, or wherein the information includes a vertex count per screen region (Bakalash, see at least par. [0028], “a (color) frame buffer (memory) having a color value for each pixel and a z-buffer with the same number of entries is provided for storing a z-value for each pixel in the frame buffer; and wherein the z-buffer is initialized to zero, representing the z-value at the back clipping plane of the 3D scene, wherein the frame buffer is initialized to the background color, and wherein the largest value that can be stored in the z-buffer represents the z value of the front clipping plane”).

As to claim 11.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein corresponding information may be generated or not generated depending on one or more properties of a corresponding piece of geometry (Dimitrov, see at least par. [0047], “Geometric primitives, such as triangles 134, may be transmitted to both GPU 139(0) and GPU 139(1), with a respective clipping circuit 138 configured to either keep or discard any given geometric primitive based on whether the geometric primitive intersects a rectangular region 132 allocated to be processed in a corresponding GPU”).

As to claim 13.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein the information is generated by one or more shaders, wherein the one or more shaders use at least one dedicated instruction to accelerate generation of the information (Bakalash, see FIG. 2C and at least par. [0049], “FIG. 2C is a schematic representation illustrating the three primary stages of the generalized method of the present invention carried out on a dual-GPU embodiment of the parallel graphics processing system of the present invention, operating in an object division (OD) mode of operation according to the present invention, wherein each GPPL includes (i) a GPU having a geometry subsystem, a rasterizer, and a pixel subsystem with a pixel shader and raster operators including a Z test operator”).

As to claim 14.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein the information is generated by one or more shaders, wherein the one or more shaders do not perform allocations of a position or parameter cache (Dimitrov, see at least par. [0084], “The raster engine 325 includes a number of fixed function hardware units configured to perform various raster operations. In one embodiment, the raster engine 325 includes a setup engine, a course raster engine, a culling engine, a clipping engine, a fine raster engine, and a tile coalescing engine. The setup engine receives transformed vertices and generates plane equations associated with the geometric primitive defined by the vertices. The plane equations are transmitted to the coarse raster engine to generate coverage information (e.g., an x,y coverage mask for a tile) for the primitive. The output of the coarse raster engine may be transmitted to the culling engine where fragments associated with the primitive that z-test are culled, and transmitted to a clipping engine where fragments lying outside a viewing frustum are clipped. Those fragments that survive clipping and culling may be passed to a fine raster engine to generate attributes for the pixel fragments based on the plane equations generated by the setup engine. The output of the raster engine 380 comprises fragments to be processed, for example, by a fragment shader implemented within a TPC 320.”).

As to claim 15.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein the information is generated by one or more shaders, wherein the one or more shaders are configurable to output the information or to output vertex position and parameter information for use by the subsequent phase of rendering (Garritsen, see at least par. [0058], “A rendering module 906 may be provided to render the primitives once the window size 912, placement 914, and order 916 has been determined. For example, the rendering module 906 may determine the current window 918 (the window currently being rendered). A primitive/batch determination module 920 may then determine which primitives or batches of primitives overlap or are contained within the current window 918, such as by evaluating the overlap between the primitive and batch bounding boxes 706, 804 and the current window 918. A transmission module 922 may then send the primitives/batches that overlap with the current window 918 to the rendering pass 310 by way of the pipe 316c”).

As to claim 16.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein at least one of the plurality of GPUs is assigned to a screen region prior to commencement of or during the subsequent phase of rendering (Dimitrov, see at least par. [0186], “Geometric primitives, such as triangles 134, may be transmitted to both GPU 139(0) and GPU 139(1), with a respective clipping circuit 138 configured to either keep or discard any given geometric primitive based on whether the geometric primitive intersects a rectangular region 132 allocated to be processed in a corresponding GPU. In one embodiment, a clipping circuit 138(0) in GPU 139(0) may be configured to discard geometric primitives that do not cover or intersect a rectangular region to be processed by GPU 139(0). Similarly, a clipping circuit 138(1) in GPU 139(1) may be configured to discard geometric primitives that do not cover or intersect a rectangular region to be processed by GPU 139(1). The clipping circuit 138 may perform any necessary transformation operations (e.g., on primitive vertices) to map geometric primitives from an arbitrary space (e.g., a world space) to screen space prior to performing clipping operations in screen space”).

As to claim 18.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein a screen region is assigned to more than one of the plurality of GPUs (Dimitrov, see at least par. [0024], “Each of the two or more GPUs render selected primitives within assigned regions”).

As to claim 22. Dimitrov in view of Garritsen and further in view of Bakalash further discloses wherein the information is used to schedule transfer of a Z-buffer or render target data for a screen region to a first GPU from a second GPU (Bakalash, see FIG. 3A2 and at least par. [0057], “the method of the present invention, carried out on a dual-GPU embodiment of the parallel graphics processing system of the present invention, wherein (a) the first stage involves during the special rendering pass (i.e. GDM creating pass), providing a Global Data Map (GDM) to the Z buffer of each GPPL involving the transmission of graphics commands and data to all GPPLs for all objects in the frame of the 3D scene to be rendered”).

As to claim 24, the claim is rejected under the same rationale as claim 1.
As to claim 25, the claim is rejected under the same rationale as claim 2.
As to claim 26, the claim is rejected under the same rationale as claim 3.

	Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Dimitrov et al. (US 20190206023 A1, hereinafter “Dimitrov”) in view of Garritsen (US 20090122068 A1) and further in view of Bakalash et al. (US 20090027383 A1, hereinafter “Bakalash”) as applied to claim 4 above and further in view of FISHWICK et al. (US 20140292782 A1, hereinafter “FISHWICK”).
As to claim 5.  Dimitrov in view of Garritsen and further in view of Bakalash does not disclose “further comprising: excluding one or more screen regions having no overlap”.  However, FISHWICK discloses further comprising: 
excluding one or more screen regions having no overlap (FISHWICK, see FIG. 9, and at least par. [0080], “exclude non-overlapped tiles G, H, and P to produce the minimal set of seven tiles overlapped by the primitives of the parameter block for use as the bounding region.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Dimitrov in view of Garritsen and further in view of Bakalash to include “further comprising: excluding one or more screen regions having no overlap”, as taught by FISHWICK, in order to “reduce the counts of graphics data items with relatively high count values in the cache” (FISHWICK, see at least par. [0104]).
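For illustration only, excluding screen regions that have no overlap can be sketched by collecting the tiles touched by each bounding box and treating the remainder as excluded. The function name, the integer pixel coordinates, the half-open box convention, and the tile size are assumptions for this sketch. It bins by bounding box, which is a conservative approximation and is not presented as the method of FISHWICK.

```python
def overlapped_tiles(boxes, tile_w, tile_h, cols, rows):
    """Return the set of (col, row) tiles touched by at least one box.

    Each box is (left, top, right, bottom) in integer pixels and is
    treated as half-open: it covers [left, right) x [top, bottom).
    """
    hit = set()
    for left, top, right, bottom in boxes:
        # Clamp the covered tile range to the grid.
        c0 = max(0, left // tile_w)
        c1 = min(cols - 1, (right - 1) // tile_w)
        r0 = max(0, top // tile_h)
        r1 = min(rows - 1, (bottom - 1) // tile_h)
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                hit.add((c, r))
    return hit


# Hypothetical 4 x 2 grid of 32 x 32 pixel tiles and two bounding boxes.
boxes = [(0, 0, 40, 40), (70, 10, 90, 20)]
hit = overlapped_tiles(boxes, tile_w=32, tile_h=32, cols=4, rows=2)
all_tiles = {(c, r) for r in range(2) for c in range(4)}
excluded = all_tiles - hit  # tiles with no overlap need no further work
print(sorted(hit))
print(sorted(excluded))
```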

	Claims 7 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dimitrov et al. (US 20190206023 A1, hereinafter “Dimitrov”) in view of Garritsen (US 20090122068 A1), further in view of Bakalash et al. (US 20090027383 A1, hereinafter “Bakalash”) as applied to claim 1 above and further in view of Diard (US 20050041031 A1).
As to claim 7.  Dimitrov in view of Garritsen and further in view of Bakalash does not disclose “further comprising: determining GPU usage when rendering of a previous image frame; and assigning the plurality of screen regions to the plurality of GPUs based on the information and the GPU usage when rendering the previous image frame”.  However, Diard discloses further comprising:
determining GPU usage when rendering of a previous image frame (Diard, see at least par. [0054], “determine whether an adjustment to the clip rectangles for the GPUs needs to be made. If the GPUs are equally loaded, the likelihood of either GPU finishing a frame first is about 50%, and the average value over a suitable number of frames (e.g., 20) will be about 0.5 if identifier values of 0 and 1 are used. An average value in excess of 0.5 indicates that GPU-1 (which renders the bottom portion of the image) is more heavily loaded than GPU-0, and an average value below 0.5 indicates that GPU-0 (which renders the top portion of the image) is more heavily loaded than GPU-1”); and 
assigning the plurality of screen regions to the plurality of GPUs based on the information and the GPU usage when rendering the previous image frame (Diard, see at least par. [0059], “The identifiers for different GPUs may have any value. Correspondingly, the high threshold and low threshold may have any values, and the two threshold values may be equal (e.g., both equal to 0.5), so long as the high threshold is not less than the low threshold. Both thresholds are advantageously set to values near or equal to the arithmetic mean of the two identifiers; an optimal selection of thresholds in a particular system may be affected by considerations such as the frequency of load rebalancing and any overhead associated with changing the clip rectangles assigned to each GPU. The threshold comparison is advantageously defined such that there is some condition for which the load is considered balanced (e.g., if the average is exactly equal to the arithmetic mean).”).
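The threshold comparison described in the two quoted passages can be sketched roughly as follows. This is a sketch under stated assumptions, not Diard's implementation: it assumes that each frame records the identifier (0 or 1) of the GPU that finished last, that GPU-0 renders rows above `split_row` and GPU-1 the rows below it, and that the boundary moves by a fixed `step`; all of these names and values are hypothetical.

```python
def rebalance(last_finisher_ids, split_row, step=8,
              low=0.5, high=0.5, min_row=0, max_row=1080):
    """Return an adjusted split row from recent per-frame identifiers.

    last_finisher_ids: identifiers (0 or 1) recorded over recent frames.
    An average above `high` is read as GPU-1 (bottom) being more heavily
    loaded; an average below `low` as GPU-0 (top) being more heavily loaded.
    """
    avg = sum(last_finisher_ids) / len(last_finisher_ids)
    if avg > high:
        # Shrink the bottom portion: GPU-0 takes more rows.
        return min(max_row, split_row + step)
    if avg < low:
        # Shrink the top portion: GPU-1 takes more rows.
        return max(min_row, split_row - step)
    return split_row  # load is considered balanced


print(rebalance([1] * 15 + [0] * 5, split_row=540))  # average 0.75
print(rebalance([0, 1] * 10, split_row=540))         # average 0.5
print(rebalance([0] * 20, split_row=540))            # average 0.0
```

With `low` equal to `high`, as in the example values quoted from par. [0059], the load is treated as balanced only when the average equals the threshold exactly.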
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Dimitrov in view of Garritsen and further in view of Bakalash to include “determining GPU usage when rendering of a previous image frame; and assigning the plurality of screen regions 

As to claim 20.  Dimitrov in view of Garritsen, further in view of Bakalash and further in view of Diard further discloses wherein the rendering command buffer is shared between the plurality of GPUs as a common rendering command buffer (Diard, see at least par. [0033], “a graphics driver program (or other program) executing on CPU 102 delivers rendering commands and associated data for processing by GPUs 114a, 114b. In some embodiments, CPU 102 communicates asynchronously with each of GPUs 114a, 114b using a command buffer, which may be implemented in any memory accessible to both the CPU 102 and the GPUs 114a, 114b”), wherein the format of the common rendering command buffer allows a command to be executed only by a subset of the plurality of GPUs (Diard, lines 12-15 of the par. [0045], “Graphics processing subsystems can be implemented using various expansion card formats, including PCI, PCIX (PCI Express), AGP (Accelerated Graphics Port)”).

	Claims 8 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Dimitrov et al. (US 20190206023 A1, hereinafter “Dimitrov”) in view of Garritsen (US 20090122068 A1) and further in view of Bakalash et al. (US 20090027383 A1, .
As to claim 8.  Dimitrov in view of Garritsen and further in view of Bakalash does not disclose “wherein the piece of geometry corresponds to geometry used or generated by a draw call”.  However, Sorgard discloses wherein the piece of geometry corresponds to geometry used or generated by a draw call (Sorgard, see at least par. [0331], “Although in this embodiment the primitives are simply indexed in ascending order, it would be possible to use other indexing arrangements, if desired. For example, where the primitives to be rendered are arranged in draw calls, the individual draw calls could be numbered consecutively, and then the primitives within each draw call numbered consecutively”), or wherein the geometry used or generated by a draw call is subdivided into smaller pieces of geometry corresponding to the plurality of pieces of geometry, such that the information is generated for the smaller pieces of geometry.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Dimitrov in view of Garritsen and further in view of Bakalash to have “wherein the piece of geometry corresponds to geometry used or generated by a draw call”, as taught by Sorgard, in order to “be able to identify and know those primitives that are actually present in a given sub-region (tile), so as to, e.g., avoid unnecessarily rendering primitives that are not actually present in a tile” (Sorgard, see par. [0011]).

As to claim 19.  Dimitrov in view of Garritsen, further in view of Bakalash and further in view of Sorgard further discloses wherein a rendering order of the plurality of pieces of geometry does not match an order of corresponding draw calls in a rendering command buffer (Sorgard, see at least par. [0192], “it may be the case that within a given batch, not all of the available primitive indices will be needed and thus used. In this case, when it comes to using the indices to order the primitives, e.g., for rendering, the system would, once all the primitives for a given batch have been processed (e.g. rendered), then step to the next batch ( drawing call) and start to, e.g., render primitives from there. To facilitate this, where desired, an indicator, such as a jump or link command, could be included in the index to allow the system to recognize when the primitives for a given batch (e.g. drawing call) have been exhausted, such that the system knows to then move on to the next batch ( drawing call) in the index order.”).

	Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Dimitrov et al. (US 20190206023 A1, hereinafter “Dimitrov”) in view of Garritsen (US 20090122068 A1), further in view of Bakalash et al. (US 20090027383 A1, hereinafter “Bakalash”) as applied to claim 1 above and further in view of Broadhurst et al. (US 20180197271 A1, hereinafter “Broadhurst”).
As to claim 12.  Dimitrov in view of Garritsen and further in view of Bakalash further discloses further comprising: 
considering the plurality of costs when assigning the plurality of screen regions to the plurality of GPUs (Dimitrov, see at least par. [0022], “the number of regions is dynamically determined and updated for new frames to reduce remote load balancing among the two or more GPUs. The rectangles are rendered separately by the different GPUs for the frame and combined to form a complete frame in a frame buffer. In one embodiment, the frame buffer is located in local memory for one of the two or more GPUs.”).
Dimitrov in view of Garritsen and further in view of Bakalash does not disclose “determining a plurality of costs for rendering the plurality of pieces of geometry during the subsequent phase of rendering”.  However, Broadhurst discloses determining a plurality of costs for rendering the plurality of pieces of geometry during the subsequent phase of rendering (Broadhurst, see FIG. 9 and at least par. [0112], “the scheduling logic 316 may determine which of the tiles to subdivide for the current render based on information relating to processing costs for corresponding tiles in a previous render”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Dimitrov in view of Garritsen and further in view of Bakalash to have “determining a plurality of costs for rendering the plurality of pieces of geometry during the subsequent phase of rendering”, as taught by Broadhurst, in order to achieve “improvements in the physical implementation of the devices, apparatus, modules, and systems (such as reduced silicon area) may be traded for improved performance. This may be done, for example, by manufacturing multiple instances of a module within a predefined area budget.” (Broadhurst, see par. [0152]).

	Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Dimitrov et al. (US 20190206023 A1, hereinafter “Dimitrov”) in view of Garritsen (US 20090122068 A1), further in view of Bakalash et al. (US 20090027383 A1, hereinafter “Bakalash”) as applied to claim 1 above and further in view of ARTICO et al. (US 20200110670 A1, hereinafter “ARTICO”).
As to claim 17.  Dimitrov in view of Garritsen and further in view of Bakalash does not disclose “wherein a screen region initially assigned to a first GPU is reassigned to a second GPU during the subsequent phase of rendering”.  However, ARTICO discloses wherein a screen region initially assigned to a first GPU is reassigned to a second GPU during the subsequent phase of rendering (ARTICO, see at least par. [0031], “the GPUs communicating amongst each other and reassigning workload therebetween in order to maintain consistency between processing times”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Dimitrov in view of Garritsen and further in view of Bakalash to have “wherein a screen region initially assigned to a first GPU is reassigned to a second GPU during the subsequent phase of rendering”, as taught by ARTICO, in order to achieve “improvements in high performance computing (HPC)” (ARTICO, see par. [0001]).

	Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Dimitrov et al. (US 20190206023 A1, hereinafter “Dimitrov”) in view of Garritsen (US 20090122068 A1) further in view of Bakalash et al. (US 20090027383 A1, hereinafter “Bakalash”) as .
As to claim 21.  Dimitrov in view of Garritsen and further in view of Bakalash does not disclose wherein the information allows relaxation of rendering phase dependencies, resulting in a first GPU proceeding to the subsequent phase of rendering while a second GPU is still processing the analysis pre-pass phase.  However, ANDONIEH discloses wherein the information allows relaxation of rendering phase dependencies, resulting in a first GPU proceeding to the subsequent phase of rendering while a second GPU is still processing the analysis pre-pass phase (ANDONIEH, see at least par. [0156], if any GPUs [0 . . . n] is allowed to sit idle while waiting for another GPU to finish rendering, the distribution of work in this example is less than optimal. By way of example, optimal performance is achieved when the render time for each of the n GPUs is substantially equivalent. This equivalent work distribution can be achieved by adjusting the partition size allocated to each of the n GPUs such that render time is equivalent for each partition).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Dimitrov in view of Garritsen and further in view of Bakalash such that the information allows relaxation of rendering phase dependencies, resulting in a first GPU proceeding to the subsequent phase of rendering while a second GPU is still processing the analysis pre-pass phase, as taught by ANDONIEH, in order to produce extraordinarily realistic video images. In most video systems, a specialized .

	Claim 23 is rejected under 35 U.S.C. 103 as being unpatentable over Dimitrov et al. (US 20190206023 A1, hereinafter “Dimitrov”) in view of Garritsen (US 20090122068 A1), further in view of Bakalash et al. (US 20090027383 A1, hereinafter “Bakalash”) as applied to claim 1 above and further in view of Ford et al. (US 20100115510 A1, hereinafter “Ford”).
As to claim 23.  Dimitrov in view of Garritsen and further in view of Bakalash does not disclose “wherein one or more of the plurality of GPUs are portions of a larger GPU that is configured as a plurality of virtual GPUs”.  However, Ford discloses wherein one or more of the plurality of GPUs are portions of a larger GPU that is configured as a plurality of virtual GPUs (Ford, see at least the last 5 lines of par. [0026], “one or both of GPUs 204 and 206 can be a virtual GPU. As used herein, a virtual GPU refers to a software emulation of a physical GPU. In some embodiments, GPU 204 can be a physical GPU while GPU 206 is a virtual GPU”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention disclosed by Dimitrov in view of Garritsen and further in view of Bakalash to have “wherein one or more of the plurality of GPUs are portions of a larger GPU that is configured as a plurality of virtual .

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KIM THANH THI TRAN whose telephone number is (571)270-1408.  The examiner can normally be reached on Monday-Friday 8:00am-5:00pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JENNIFER MEHMOOD, can be reached at (571)272-2976.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/KIM THANH T TRAN/Examiner, Art Unit 2612

/JACINTA M CRAWFORD/Primary Examiner, Art Unit 2612