DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The indicated allowability of claims 3, 4, 12 and 13 is withdrawn in view of the newly discovered reference(s) to Gruber et al. U.S. Pub. No. 2013/0021360.  Rejections based on the newly cited reference(s) follow.  

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 8/22/2022 has been entered.
 
CLAIM INTERPRETATION
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are:  graphics pipeline, buffer, primitive hub, counter, and shader processor input in claims 1, 3-9 and 19.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.  
The specification discloses:  
in [0021] – “The pipeline circuitry is implemented in some embodiments of the compute units 121-123 or the processor cores 131-133… the pipeline circuitry is used to implement a graphics pipeline that executes shaders of different types including, but not limited to, the vertex shaders, hull shaders, domain shaders, geometry shaders, and pixel shaders.  The pipeline circuitry also includes buffers that hold primitives generated by the shaders… The pipeline circuitry also includes a primitive hub that monitors fullness of the buffers… A shader processor input (SPI) selectively throttles the waves launched by the geometry shader based on a signal from the primitive hub indicating the fullness, an indication of relative resource usage of geometry waves and pixel waves in the graphics pipeline, or an indication of lifetimes of the geometry waves.”  Pipeline circuitry is considered to be the corresponding structure for the graphics pipeline, buffers, primitive hub and the shader processor input.  
in [0031], [0033] – “… the SPI 301-303 include (or are associated with) counters that are used to throttle wave launches for the shaders 311-313… The portion 400 of the graphics pipeline includes counters 435 that indicate how many dead cycles are used to selectively throttle wave launch.”  The graphics pipeline is considered to be the corresponding structure for the counters.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 10-13, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Goel et al. U.S. Pub. No. 2009/0295804 in view of Gould et al. U.S. Pub. No. 2020/0004460 and Gruber et al. U.S. Pub. No. 2013/0021360.
Re:  claims 1 and 10, Goel teaches
1. (Currently Amended) An apparatus comprising:  a graphics pipeline configured to execute a first shader of a first type and a second shader of a second type; at least one buffer configured to hold primitives generated by the first shader and provide the primitives to the second shader; (“When operating on the triangle primitive, geometry shader 130 retrieves the three vertices from memory device 120 and emits a plurality of primitives… Each vertex emitted from geometry shader 130 is stored in memory device 140… Prior to rasterizer 160 processing the data in memory device 140, the data in memory device 140 is copied into a cache memory (not shown) by copy shader 150 so that rasterizer 160 can have rapid access to the data.  Rasterizer 160 converts the data in the cache memory into a raster image… for output, for example, on a video display… Pixel shader 170 manipulates pixels generated by rasterizer 160.”; Goel, [0034], [0035], [0036], [0037], Figs. 1, 2a-2b)
Fig. 1 illustrates a graphics pipeline 100 that executes a geometry shader (first shader of a first type) and a pixel shader (second shader of a second type).  Fig. 1 also illustrates memory device 140 that holds primitives generated by the geometry shader (first shader).  The primitives from the geometry shader that are stored in the memory device are copied to cache (which now holds primitives generated by the geometry shader) by the copy shader, which the rasterizer accesses to convert this data into a raster image.  The pixel shader then manipulates pixels generated by the rasterizer.  Thus, the memory device, holding the primitives generated by the geometry shader, is providing these primitives to the pixel shader via the copy shader, cache and rasterizer.  
Goel is silent; however, Gould and Gruber teach the limitation: and a primitive hub configured to monitor at least one fullness of the at least one buffer, wherein launching of waves from the first shader is throttled based on the at least one fullness and based on a number of dead cycles associated with the graphics pipeline. (“When data is pushed into various FIFO queues, it may be desired to have a mechanism to know how much data was written, and be able to launch shader threads to consume the data… In another example, the Auto-Dispatcher (e.g., FIFO queue work-launching program 330) can manage a variable per FIFO queue to track how much data any consumer (or read) threads are intended to consume… In this example, the Auto-Dispatcher can periodically check if Write Done Pointer 1830 has been updated… Based on prioritization algorithms (which may include tracking of fullness of various FIFO queues…), the Auto-Dispatcher can... select a shader to be launched, determine the number of threads or thread groups to be launched, launch the Shader and corresponding threads/thread groups…”; Gould, [0231])
The Auto-Dispatcher uses prioritization algorithms (primitive hub) to track the fullness of FIFO queues (monitor at least one fullness of the at least one buffer).  Based on the prioritization algorithms’ tracking the fullness of the FIFO queues (based on the at least one fullness), the Auto-Dispatcher can determine the number of threads or thread groups (waves) to be launched (which would include throttling the launching of waves from the first shader).  Goel and Gould are silent, however, Gruber teaches that the throttling is based on a number of dead cycles associated with the graphics pipeline.  (“As an illustrative example, assume that GPU 10 executes geometry shaders 16A and 16B.  Also, assume that the graphics data produced by geometry shader 16A should be consumed before the graphics data produced by geometry shader 16B… In this illustrative example, geometry shader 16B completed producing graphics data before geometry shader 16A… Accordingly, the second storage location within the geometry shader count buffer 18 may store a value indicative of the amount of graphics data produced by geometry shader count buffer 16B before the first storage location within geometry shader count buffer 18 stores a value indicative of the amount of graphics data produced by geometry shader 16A.  Although the second storage location of geometry shader count buffer 18 may store a value, controller 12 may not yet execute one or more pixel shaders 24  because the graphics data produced by geometry shader 16A should be consumed first, and geometry shader 16A has not yet completed the production of graphics data… ”; Gruber, [0071], [0072])
When geometry shader 16B completes producing graphics data before geometry shader 16A, the graphics data of geometry shader 16B is held back or waits (throttled) until geometry shader count buffer 18 stores a value indicative of the amount of graphics data produced by geometry shader 16A.  This wait time for the graphics data of geometry shader 16A is considered to be dead cycles associated with the pipeline.  Thus, the throttling of the graphics data of geometry shader 16B (the launching of waves of the first shader) is based on the wait time (dead cycles associated with the pipeline).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Goel by adding the feature of a primitive hub configured to monitor at least one fullness of the at least one buffer, wherein launching of waves from the first shader is throttled based on the at least one fullness and based on a number of dead cycles associated with the graphics pipeline, in order to have a mechanism to know how much data was written and be able to launch shader threads to consume the data, as taught by Gould ([0231]), and in order to ensure that pixel shaders do not only consume the graphics data when available, but also consume the graphics data in the order in which the graphics data should be consumed, as taught by Gruber ([0073]).
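The mechanism described in the rationale above, launching producer waves at a rate that depends on buffer fullness and on idle (dead) cycles, can be illustrated with a minimal sketch. This is an illustration only, under stated assumptions: the names `Fifo`, `dead_cycles_for`, and `simulate`, the buffer capacity, the drain rate, and the fullness thresholds are hypothetical and are not taken from the application, Goel, Gould, or Gruber.

```python
from collections import deque


class Fifo:
    """Bounded FIFO standing in for the buffer between two shader stages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def fullness(self):
        """Fraction of the buffer currently occupied, from 0.0 to 1.0."""
        return len(self.items) / self.capacity


def dead_cycles_for(fullness):
    """Hypothetical mapping: a fuller buffer yields more idle cycles between launches."""
    if fullness < 0.5:
        return 0
    if fullness < 0.75:
        return 1
    return 3


def simulate(cycles, capacity=8, drain_every=2):
    """Run a toy producer/consumer loop and return (launches, peak_occupancy).

    The consumer removes one entry every `drain_every` cycles. The producer
    launches one wave per cycle unless it is sitting out dead cycles or the
    buffer is full. All parameters are assumptions made for illustration.
    """
    fifo = Fifo(capacity)
    wait = 0
    launches = 0
    peak = 0
    for cycle in range(cycles):
        if cycle % drain_every == 0 and fifo.items:
            fifo.items.popleft()           # consumer stage drains one entry
        if wait > 0:
            wait -= 1                      # producer sits out a dead cycle
        elif len(fifo.items) < fifo.capacity:
            fifo.items.append(cycle)       # producer launches one wave
            launches += 1
            wait = dead_cycles_for(fifo.fullness())
        peak = max(peak, len(fifo.items))
    return launches, peak
```

In this sketch the fullness reading plays the role of the monitored buffer state, and the value returned by `dead_cycles_for` plays the role of the number of dead cycles inserted before the next launch.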
Claim 19 recites an apparatus analogous to the apparatus of claim 1, is similar in scope and is rejected under the same rationale. Claim 19 has an additional limitation.  Re:  claim 19, Goel is silent; however, Gould teaches the limitation: and a shader processor input (SPI) configured to selectively throttle waves launched by the geometry shader based on a number of dead cycles associated with the graphics pipeline and at least one of a signal from the primitive hub indicating the at least one fullness, an indication of relative resource usage of geometry waves and pixel waves in the graphics pipeline, and an indication of lifetimes of the geometry waves. (“When data is pushed into various FIFO queues, it may be desired to have a mechanism to know how much data was written, and be able to launch shader threads to consume the data… In another example, the Auto-Dispatcher (e.g., FIFO queue work-launching program 330) can manage a variable per FIFO queue to track how much data any consumer (or read) threads are intended to consume...  In this example, the Auto-Dispatcher can periodically check if Write Done Pointer 1830 has been updated… Based on prioritization algorithms (which may include tracking of fullness of various FIFO queues…), the Auto-Dispatcher can... select a shader to be launched, determine the number of threads or thread groups to be launched, launch the Shader and corresponding threads/thread groups…”; Gould, [0231])
The Auto-Dispatcher uses prioritization algorithms (primitive hub) and a Write Done Pointer (a signal from the primitive hub) to track the fullness of FIFO queues (indicating the at least one fullness).  Based on the prioritization algorithms’ tracking the fullness of the FIFO queues (based on the at least one fullness), the Auto-Dispatcher can determine the number of threads or thread groups (waves) to be launched (which would include selectively throttling the launching of waves from the geometry shader).  Goel and Gould are silent, however, Gruber teaches that the throttling is based on a number of dead cycles associated with the graphics pipeline.  (“As an illustrative example, assume that GPU 10 executes geometry shaders 16A and 16B.  Also, assume that the graphics data produced by geometry shader 16A should be consumed before the graphics data produced by geometry shader 16B… In this illustrative example, geometry shader 16B completed producing graphics data before geometry shader 16A… Accordingly, the second storage location within the geometry shader count buffer 18 may store a value indicative of the amount of graphics data produced by geometry shader count buffer 16B before the first storage location within geometry shader count buffer 18 stores a value indicative of the amount of graphics data produced by geometry shader 16A.  Although the second storage location of geometry shader count buffer 18 may store a value, controller 12 may not yet execute one or more pixel shaders 24  because the graphics data produced by geometry shader 16A should be consumed first, and geometry shader 16A has not yet completed the production of graphics data… ”; Gruber, [0071], [0072])
When geometry shader 16B completes producing graphics data before geometry shader 16A, the graphics data of geometry shader 16B is held back or waits (throttled) until geometry shader count buffer 18 stores a value indicative of the amount of graphics data produced by geometry shader 16A.  This wait time for the graphics data of geometry shader 16A is considered to be dead cycles associated with the pipeline.  Thus, the throttling of the graphics data of geometry shader 16B (the launching of waves of the first shader) is based on the wait time (dead cycles associated with the pipeline).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Goel by adding the feature of a shader processor input (SPI) configured to selectively throttle waves launched by the geometry shader based on a number of dead cycles associated with the graphics pipeline and at least one of a signal from the primitive hub indicating the at least one fullness, an indication of relative resource usage of geometry waves and pixel waves in the graphics pipeline, and an indication of lifetimes of the geometry waves, in order to have a mechanism to know how much data was written and be able to launch shader threads to consume the data, as taught by Gould ([0231]), and in order to ensure that pixel shaders do not only consume the graphics data when available, but also consume the graphics data in the order in which the graphics data should be consumed, as taught by Gruber ([0073]).
Re:  claims 2 and 11, Goel teaches
2. (Original) The apparatus of claim 1, wherein the first shader is a geometry shader (Goel, Fig. 1, Geometry Shader 130), wherein the second shader is a pixel shader (Goel, Fig. 1, Pixel Shader 170), 
(“Graphics pipeline 100 includes… a geometry shader 130… a pixel shader 170”; Goel, [0031], Fig. 1)
Goel and Gould teach the limitation: and wherein the at least one buffer is a first-in-first-out (FIFO) buffer. (“Memory device 120 (and memory device 140…) can be located either on the same chip as the GPU or off-chip.  Examples of memory devices that can be used for storage of vertex data are dynamic random access memory and static random access memory.  In the alternative, other types of memory storage devices can be used… When operating on the triangle primitive, geometry shader 130 retrieves the three vertices from memory 120 and emits a plurality of primitives… Each vertex emitted from the geometry shader 130 is stored in memory device 140.”; Goel, [0032], [0034], [0035], Figs. 1 and 2a-2b)
Fig. 1 illustrates that the geometry shader saves its output to a memory device 140.  Goel is silent, however, Gould teaches the memory buffer can be a FIFO.  (“For example, data-production shader program 310 can be concurrently executed by multiple threads to write data to one or more given FIFO queues… Data-production shader program 310 can include one or more data-production routine 312 for producing data to be written to the FIFO queue (e.g., data related to performing one or more graphics-related tasks, such as rendering instructions, instructions for defining corresponding primitives, vectors, shading rates, etc…)…”; Gould, [0056], Figs. 1 and 3)
A data-production shader (such as a geometry shader) saves or writes data to a FIFO queue (FIFO buffer).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Goel by adding the feature wherein the at least one buffer is a first-in-first-out (FIFO) buffer, in order to have a mechanism to know how much data was written and be able to launch shader threads to consume the data, as taught by Gould ([0231]).
Re:  claims 3 and 12, Goel is silent; however, Gruber teaches
3. (Currently Amended) The apparatus of claim 2, further comprising:  a counter configured to  indicate the number of dead cycles; and a shader processor input (SPI) configured to selectively throttle waves launched by the geometry shader based on the counter. (“As an illustrative example, assume that GPU 10 executes geometry shaders 16A and 16B.  Also, assume that the graphics data produced by geometry shader 16A should be consumed before the graphics data produced by geometry shader 16B… In this illustrative example, geometry shader 16B completed producing graphics data before geometry shader 16A… Accordingly, the second storage location within the geometry shader count buffer 18 may store a value indicative of the amount of graphics data produced by geometry shader count buffer 16B before the first storage location within geometry shader count buffer 18 stores a value indicative of the amount of graphics data produced by geometry shader 16A.  Although the second storage location of geometry shader count buffer 18 may store a value, controller 12 may not yet execute one or more pixel shaders 24  because the graphics data produced by geometry shader 16A should be consumed first, and geometry shader 16A has not yet completed the production of graphics data… ”; Gruber, [0071], [0072], Fig. 1)
The waves from geometry shader 16B are throttled (selectively throttle waves launched by the geometry shader) based on the counter for geometry shader 16A, since the graphics data of geometry shader 16A must be consumed before that of geometry shader 16B.  The counting of the dead cycles includes the wait time for geometry shader 16A to complete production of graphics data and store a value indicative of the amount of graphics data produced.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Goel by adding the feature of a counter configured to indicate the number of dead cycles; and a shader processor input (SPI) configured to selectively throttle waves launched by the geometry shader based on the counter, in order to ensure that pixel shaders do not only consume the graphics data when available, but also consume the graphics data in the order in which the graphics data should be consumed, as taught by Gruber ([0073]).
Re:  claims 4 and 13, Goel is silent; however, Gruber teaches
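The ordering behavior attributed to Gruber above, in which output that is ready early is held until the output that must be consumed first becomes available, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `consume_in_order`, the cycle numbers, and the one-cycle consumption cost are hypothetical, and the count buffer is modeled simply as a mapping from shader label to the cycle at which its count value is stored.

```python
def consume_in_order(order, completion_cycle, consume_cost=1):
    """Schedule consumption of shader outputs in a required order.

    order            -- shader labels in the order their data must be consumed
    completion_cycle -- hypothetical cycle at which each shader's count is stored
    consume_cost     -- assumed number of cycles needed to consume one output

    Returns one record per shader with the start cycle, the cycles the
    consumer sat idle before starting ("idle"), and the cycles the finished
    data waited before being consumed ("held_back").
    """
    schedule = []
    cycle = 0
    for shader in order:
        ready = completion_cycle[shader]
        start = max(cycle, ready)          # cannot start before the data exists
        schedule.append({
            "shader": shader,
            "start": start,
            "idle": start - cycle,         # consumer waiting on this shader
            "held_back": start - ready,    # finished data waiting its turn
        })
        cycle = start + consume_cost
    return schedule


# Shader 16B finishes first (cycle 2) but must wait for 16A (cycle 5).
result = consume_in_order(["16A", "16B"], {"16A": 5, "16B": 2})
```

With these assumed numbers, the consumer idles for five cycles waiting on 16A, and the output of 16B is held back for four cycles even though it was ready at cycle 2.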
4. (Original) The apparatus of claim 3, wherein the primitive hub is configured to provide a feedback signal indicating the at least one fullness to the SPI, and wherein a first number of dead cycles is determined based on the feedback signal. (“… controller 12 may monitor the values stored in the storage locations within geometry shader count buffer 18.  Controller 12 may wait to execute one or more pixel shaders 24 until the storage location within geometry shader count buffer 18, which is assigned to the geometry shader of geometry shaders 16 whose graphics data should be consumed first, stores a value indicative of the produced graphics data.”; Gruber, [0069])
The geometry shader count buffer stores a value indicative of the amount of produced graphics data.  The amount of produced graphics data is indicative of the amount of graphics data stored in the geometry buffer (fullness of the buffer).  The controller waits (dead cycles) to execute the pixel shaders on the geometry data until the count value is saved to the geometry shader count buffer.  This count value indicates the wait time (dead cycles) as well as the amount of graphics data produced.  Fig. 1 illustrates that the controller (SPI) receives feedback from, for example, the geometry shader count buffer (primitive hub), which indicates the amount of graphics data produced and stored (fullness of the buffer).
 (“As an illustrative example, assume that GPU 10 executes geometry shaders 16A and 16B.  Also, assume that the graphics data produced by geometry shader 16A should be consumed before the graphics data produced by geometry shader 16B… In this illustrative example, geometry shader 16B completed producing graphics data before geometry shader 16A… Accordingly, the second storage location within the geometry shader count buffer 18 may store a value indicative of the amount of graphics data produced by geometry shader count buffer 16B before the first storage location within geometry shader count buffer 18 stores a value indicative of the amount of graphics data produced by geometry shader 16A.  Although the second storage location of geometry shader count buffer 18 may store a value, controller 12 may not yet execute one or more pixel shaders 24  because the graphics data produced by geometry shader 16A should be consumed first, and geometry shader 16A has not yet completed the production of graphics data… ”; Gruber, [0071], [0072], Fig. 1)
The graphics data of geometry shader 16A must be consumed before that of geometry shader 16B.  The counting of the dead cycles includes the wait time for geometry shader 16A to complete production of graphics data and store a value indicative of the amount of graphics data produced in the geometry shader count buffer, which provides this information as feedback to the controller (Fig. 1).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Goel by adding the feature wherein the primitive hub is configured to provide a feedback signal indicating the at least one fullness to the SPI, and wherein a first number of dead cycles is determined based on the feedback signal, in order to ensure that pixel shaders do not only consume the graphics data when available, but also consume the graphics data in the order in which the graphics data should be consumed, as taught by Gruber ([0073]).
Re:  claim 20, Goel is silent; however, Gould teaches
20. (Original) The apparatus of claim 19, wherein the signal from the primitive hub comprises two bits having values mapped to different ranges of the at least one fullness. (“For example, the pointers 216, 218, 220, 222, 228, 232, 233, 234 may be constructed to have a number of low order bits to indicate a memory location within a page (e.g., a number of bits equal to a page size divided by a memory unit size for the FIFO).  For example, for pages that are 64 kB and where the FIFO uses a 16-byte memory unit size, the low order bits can include enough bits to indicate 4096 (2^12) memory locations (e.g., 12 bits).”; Gould, [0053], Fig. 2)
The pointers (signals) include a number of bits (which would include two bits) to indicate a memory location within a page.  For example, for 64 kB pages where the FIFO uses a 16-byte memory unit size, the low order bits can include enough bits to indicate 4096 (2^12) memory locations (e.g., 12 bits).  Thus, the pointer signals include a number of bits to indicate memory locations having, for example, 16-byte memory unit sizes (mapped to different ranges of the at least one fullness) within a page.  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Goel by adding the feature wherein the signal from the primitive hub comprises two bits having values mapped to different ranges of the at least one fullness, in order to have a mechanism to know how much data was written and be able to launch shader threads to consume the data, as taught by Gould ([0231]).
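The arithmetic in the quoted Gould example, and the kind of two-bit encoding recited in claim 20, can be worked through in a short sketch. The first function reproduces the quoted figures (64 kB page, 16-byte unit, 4096 locations, 12 bits), where the bit count equals the base-2 logarithm of the number of locations. The second function is purely hypothetical: the name `fullness_code` and the choice of four equal quarter ranges are assumptions made for illustration, since neither the claim language nor the quoted passages specify the ranges.

```python
def low_order_bits(page_size_bytes, unit_size_bytes):
    """Return (addressable locations in a page, bits needed to index them)."""
    locations = page_size_bytes // unit_size_bytes
    bits = (locations - 1).bit_length()    # ceil(log2(locations)) for locations >= 1
    return locations, bits


def fullness_code(occupied, capacity):
    """Hypothetical 2-bit code: 0, 1, 2, 3 for successive quarters of capacity."""
    if capacity <= 0 or not 0 <= occupied <= capacity:
        raise ValueError("occupied must be within 0..capacity and capacity > 0")
    return min(3, (4 * occupied) // capacity)


# Figures from the quoted example: 64 kB pages, 16-byte memory units.
locations, bits = low_order_bits(64 * 1024, 16)
```

Under these assumptions a two-bit value distinguishes only four ranges of fullness, whereas the twelve low-order pointer bits in the quoted example distinguish 4096 individual locations within a page.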
Allowable Subject Matter
Claims 5-9 and 14-18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.  Claims 6-9 and 15-18 depend directly or indirectly from claims 5 and 14.  None of the prior art teaches or suggests:  
from claim 5 – “wherein the SPI is configured to determine at least one of a first relative allocation of local data store (LDS) resources to in-flight geometry shader waves and in-flight pixel shader waves and a second relative allocation of vector general-purpose registers (VGPRs) to the in-flight geometry shader waves and the in-flight pixel shader waves.”  
from claim 14 – “determining, at the SPI, at least one of a first relative allocation of local data store (LDS) resources to in-flight geometry shader waves and in-flight pixel shader waves and a second relative allocation of vector general-purpose registers (VGPRs) to the in-flight geometry shader waves and the in-flight pixel shader waves.”  
As allowable subject matter has been indicated, applicant's reply must either comply with all formal requirements or specifically traverse each requirement not complied with.  See 37 CFR 1.111(b) and MPEP § 707.07(a).

Response to Arguments
Applicant's arguments filed 6/02/2022 have been fully considered but they are not persuasive.  Applicant argues:  
“As acknowledged by the Examiner in the interview, Gould fails to describe the claimed counter and primitive hub configured to throttle the launching of waves ‘based on the at least one fullness and based on a number of dead cycles associated with the graphics pipeline,’ as provided by claim 1.  In addition, Goel would not make up for Gould’s deficiencies.”
Examiner disagrees.  Gould and Gruber teach this amended limitation.  Gould teaches, “When data is pushed into various FIFO queues, it may be desired to have a mechanism to know how much data was written, and be able to launch shader threads to consume the data… In another example, the Auto-Dispatcher (e.g., FIFO queue work-launching program 330) can manage a variable per FIFO queue to track how much data any consumer (or read) threads are intended to consume… In this example, the Auto-Dispatcher can periodically check if Write Done Pointer 1830 has been updated… Based on prioritization algorithms (which may include tracking of fullness of various FIFO queues…), the Auto-Dispatcher can... select a shader to be launched, determine the number of threads or thread groups to be launched, launch the Shader and corresponding threads/thread groups…” (Gould, [0231]).  The Auto-Dispatcher uses prioritization algorithms (primitive hub) to track the fullness of FIFO queues (monitor at least one fullness of the at least one buffer).  Based on the prioritization algorithms’ tracking the fullness of the FIFO queues (based on the at least one fullness), the Auto-Dispatcher can determine the number of threads or thread groups (waves) to be launched (which would include throttling the launching of waves from the first shader).
Gruber teaches that the throttling is based on a number of dead cycles associated with the graphics pipeline.  Gruber teaches, “As an illustrative example, assume that GPU 10 executes geometry shaders 16A and 16B.  Also, assume that the graphics data produced by geometry shader 16A should be consumed before the graphics data produced by geometry shader 16B… In this illustrative example, geometry shader 16B completed producing graphics data before geometry shader 16A… Accordingly, the second storage location within the geometry shader count buffer 18 may store a value indicative of the amount of graphics data produced by geometry shader count buffer 16B before the first storage location within geometry shader count buffer 18 stores a value indicative of the amount of graphics data produced by geometry shader 16A.  Although the second storage location of geometry shader count buffer 18 may store a value, controller 12 may not yet execute one or more pixel shaders 24  because the graphics data produced by geometry shader 16A should be consumed first, and geometry shader 16A has not yet completed the production of graphics data… ” (Gruber, [0071], [0072], Fig. 1).  When geometry shader 16B completes producing graphics data before geometry shader 16A, the graphics data of geometry shader 16B is held back or waits (throttled) until geometry shader count buffer 18 stores a value indicative of the amount of graphics data produced by geometry shader 16A.  This wait time for the graphics data of geometry shader 16A is considered to be dead cycles associated with the pipeline.  Thus, the throttling of the graphics data of geometry shader 16B (the launching of waves of the first shader) is based on the wait time (dead cycles associated with the pipeline).
Applicant's arguments filed 6/02/2022 have been fully considered but they are not persuasive.  Applicant argues:  
“Claims 10 and 19 each recites features similar to those discussed above with respect to claim 1.  Accordingly, as acknowledged by the Examiner, the alleged combination of Gould and Goel fail to teach the features of these claims.”
Examiner disagrees.  Claims 10 and 19 have been rejected.  Please see the corresponding rejections.  
Applicant's arguments filed 6/02/2022 have been fully considered but they are not persuasive.  Applicant argues:  
“Claims 2-9, 11-18, and 20 depend on claims 1, 10, and 19, respectively.  Accordingly, the cited references individually and in combination, fail to disclose or render obvious the features of dependent claims 2-9, 11-18, and 20, at least by virtue of their respective dependence on claims 1, 10, and 19.  In addition, these dependent claims recite additional novel and non-obvious features.”
Examiner disagrees.  Claims 2-4, 11-13 and 20 have been rejected.  Please see the corresponding rejections.  Claims 5-9 and 14-18 include allowable subject matter.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DONNA J RICKS whose telephone number is (571)270-7532.  The examiner can normally be reached on M-F 7:30am-5pm EST (alternate Fridays off).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached on 571-272-2976.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/Donna J. Ricks/Examiner, Art Unit 2612 



/JENNIFER MEHMOOD/Supervisory Patent Examiner, Art Unit 2612