DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claim(s) 3 and 22 is/are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 3 and 22 recite the limitation "the first and second types" in line 1. There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1-3, 5-14, 17-22, 24-33 and 36 is/are rejected under 35 U.S.C. 103 as being unpatentable over Qiu (US 2018/0322078) in view of Clark (US 2004/0210679).
Regarding claim(s) 1 and 20, Qiu teaches:
A memory request tracking circuit for use with a streaming cache memory, the memory request tracking circuit comprising: a tag check configured to detect cache misses;	Fig. 4 and [0063] If tag processor 416 does not locate a tag, a cache miss occurs and tag processor 416 allocates a tag and requests miss data from external memory, such as local memory or global memory. Tag processor 416 interacts with m-stage 426 to process miss data. [0064] If tag processor 416 does not locate a tag, a cache miss occurs and tag processor 416 allocates a tag and requests miss data for the transaction. In doing so, tag processor 416 interacts with m-stage 426, as mentioned. Tag processor 416 pushes the memory transaction onto t2d FIFO 420.
Qiu does not explicitly teach, but Clark teaches:
plural tracking queues; and a queue mapper coupled to the tag check and the plural tracking queues, the queue mapper being configured to distribute request tracking information to the plural tracking queues to enable in-order and out-of-order memory request returns.		Fig. 2 and [0031] In an embodiment, the queues 205, 210, and 215 are FIFO (First-In-First-Out) queues, meaning that commands are removed only in the same order in which they were inserted. But, in other embodiments, commands are assigned priorities within the queues and are removed and processed in an order based on the priorities. In another embodiment, the commands may be removed and processed from a particular queue in any appropriate order. [0029] shows that any number and type of queues may be present. For example, in other embodiments, a read/write command to internal facilities queue, a 
It would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, to combine the cache memory system/method of Qiu with the multiple command queue method/system of Clark. The motivation for doing so would have been to maximize throughput, so that if a first queue that services commands of a first type is full, a second queue can still accept and make progress executing commands of a second type. This is taught by Clark in [0004].
		
Regarding claim(s) 2 and 21, Clark teaches:	
wherein the queue mapper is programmable to preserve in-order memory request return handling for a first type of memory requests and to enable out-of-order memory request return handling for a second type of memory requests different from the first type of memory requests.	Fig. 2 and [0031] In an embodiment, the queues 205, 210, and 215 are FIFO (First-In-First-Out) queues, meaning that commands are removed only in the same order in which they were inserted. But, in other embodiments, commands are assigned priorities within the queues and are removed and processed in an order based on the priorities. In another embodiment, the commands may be removed and processed from a particular queue in any appropriate order.
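For illustration only, the removal orders described in the cited passage of Clark [0031] (strict first-in-first-out removal versus priority-based removal) can be sketched in Python as follows. This is a minimal sketch and not a characterization of how Clark or the claimed queue mapper is implemented; the command names, the priority values, and the convention that a lower value is removed first are all assumptions.

```python
from collections import deque
import heapq


def drain_fifo(commands):
    """Remove commands in the same order in which they were inserted."""
    queue = deque(commands)
    removed = []
    while queue:
        removed.append(queue.popleft())
    return removed


def drain_by_priority(commands, priority):
    """Remove commands in an order based on assigned priorities.

    Assumption: a lower priority value is removed first. The insertion
    index breaks ties so equal-priority commands keep arrival order.
    """
    heap = [(priority[c], i, c) for i, c in enumerate(commands)]
    heapq.heapify(heap)
    removed = []
    while heap:
        removed.append(heapq.heappop(heap)[2])
    return removed


# Hypothetical command names, used only for this example.
cmds = ["load_a", "tex_b", "load_c"]
fifo_order = drain_fifo(cmds)
prio_order = drain_by_priority(cmds, {"load_a": 2, "tex_b": 0, "load_c": 1})
```

In this sketch the FIFO drain returns the commands in arrival order, while the priority drain returns them reordered, which corresponds to the in-order and out-of-order handling discussed above.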
		
Regarding claim(s) 3 and 22, Qiu teaches:		
wherein the first and second types of memory requests are selected from the group consisting of loads from local or global memory; texture memory/storage; and acceleration data structure storage.	[0063] For a given non-texture memory transaction, tag processor 416 

Regarding claim(s) 5 and 24, Clark teaches:
wherein the plural tracking queues each comprise a first-in-first-out storage.	Fig. 2 and [0031] In an embodiment, the queues 205, 210, and 215 are FIFO (First-In-First-Out) queues, meaning that commands are removed only in the same order in which they were inserted.
		
Regarding claim(s) 6 and 25, the combination of Qiu and Clark teaches:		
further includes a pipelined checker picker that selects tracking queue outputs for application to a check in response to cache miss fill indications, the pipelined checker picker being dynamically configured to perform the selection.	Qiu in [0064] shows that if tag processor 416 does not locate a tag, a cache miss occurs and tag processor 416 allocates a tag and requests miss data for the transaction. In doing so, tag processor 416 interacts with m-stage 426, as mentioned. Tag processor 416 pushes the memory transaction onto t2d FIFO 420. Clark in [0032] shows that the scorecard registers allow the designer to determine the dependencies between the queues at run time, rather than making these determinations early in the design cycle of the chip 126 before the chip has been committed to silicon and coding the command ordering rules into the silicon. Fig. 5 and [0053] Control then continues to block 510 where the
		
Regarding claim(s) 7 and 26, Clark teaches:		
wherein the check comprises tracking structures that indicate memory system return data needed by the head of each tracking queue and are configured to track when the memory system has returned all sectors needed by the head of each tracking queue.	Fig. 5 and [0058] Control then continues to block 530 where the command processor 240 clears the appropriate bit for the just-executed command in every other command's associated hold-off vector. Control then continues to block 535 where the command processor 240 clears the appropriate bit associated with the just-executed command in the in-use vector 230. Control then returns to block 510, as previously described above.
		
Regarding claim(s) 8 and 27, Clark teaches:		
wherein the checks are configured to provide plural sector valid bits for each of plural tag banks.	Fig. 5 and [0058] Control then continues to block 530 where the command processor 240 clears the appropriate bit for the just-executed command in every other 
		
Regarding claim(s) 9 and 28, Clark teaches:		
further including a first commit picker configured to process a first traffic type.	Fig. 5 and [0053] Control then continues to block 510 where the command processor 240 selects a queue to process. In an embodiment, the command processor 240 selects a queue to process based on a round-robin selection technique. In another embodiment, the command processor 240 selects a queue based on a priority scheme where queues are given priorities and certain queues have a higher priority than other queues. In another embodiment, any appropriate technique may be used for selecting the next queue to process.
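For illustration only, the two queue selection techniques named in Clark [0053] (round-robin selection and a priority scheme) can be sketched in Python as follows. This is a hedged sketch rather than Clark's implementation; skipping empty queues, the function names, and the convention that a larger number means higher priority are assumptions.

```python
from collections import deque


def select_round_robin(queues, last_index):
    """Return the index of the next non-empty queue after last_index.

    Assumption: empty queues are skipped. Returns None if all are empty.
    """
    n = len(queues)
    for step in range(1, n + 1):
        idx = (last_index + step) % n
        if queues[idx]:
            return idx
    return None


def select_by_priority(queues, priorities):
    """Return the index of the non-empty queue with the highest priority.

    Assumption: a larger number means a higher priority.
    """
    candidates = [i for i, q in enumerate(queues) if q]
    if not candidates:
        return None
    return max(candidates, key=lambda i: priorities[i])


# Three hypothetical queues; the middle one is empty.
queues = [deque(["a0"]), deque(), deque(["c0", "c1"])]
rr_first = select_round_robin(queues, last_index=0)
rr_second = select_round_robin(queues, last_index=2)
prio_pick = select_by_priority(queues, priorities=[1, 9, 5])
```

Here the round-robin picker advances past the empty queue and wraps around, while the priority picker ignores the empty queue even though it carries the largest priority value.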
		
Regarding claim(s) 10 and 29:		
further including a second commit picker configured to process a second traffic type different from the first traffic type.	Clark does not explicitly teach a "second commit picker". However, having two commit picker/processor instances instead of one instance provides more resources to process and handle commands, but does not produce a new and unexpected result. Therefore, having two commit picker/processor instances instead of one instance is a duplication of parts, which has no patentable significance. See In re Harza, 274 F.2d 669, 124 USPQ 378 (CCPA 1960).
		
Regarding claim(s) 11 and 30, Clark teaches:	
configured to provide a first mode that provides in-order allocation and deallocation of tracking resources and a second mode that provides out-of-order allocation and deallocation of tracking resources.	Fig. 2 and [0031] In an embodiment, the queues 205, 210, and 215 are FIFO (First-In-First-Out) queues, meaning that commands are removed only in the same order in which they were inserted. But, in other embodiments, commands are assigned priorities within the queues and are removed and processed in an order based on the priorities. In another embodiment, the commands may be removed and processed from a particular queue in any appropriate order. [0029] shows that any number and type of queues may be present. For example, in other embodiments, a read/write command to internal facilities queue, a read/write command to external facilities queue, or any other appropriate type of queue may be used.
		
Regarding claim(s) 12 and 31, Clark teaches:		
further including a state packet queue configured to hold state packets and further configured to restrict out-of-order state packet processing so state packets remain in program order relative to memory access requests that reference the state packets.	[0032] The scorecard registers allow the designer to determine the dependencies between the queues at run time, rather than making these determinations early in the design cycle of the chip 126 before the chip has been committed to silicon and coding the command ordering rules into the silicon.  [0034] The hold-off vectors indicate whether their respective command slots contain commands that are ready for execution or whether their respective command slots contain commands that must be held off waiting for another command or commands to execute first. Fig. 2 and [0031] In an embodiment, the queues 205, 210, and 215 are FIFO (First-In-First-Out) 
		
Regarding claim(s) 13 and 32, Clark teaches:		
further including an indicator encoder that encodes an indicator of a last packet in a commit group.	Fig. 2 and [0035] The chip 126 further includes an in-use vector 230, which indicates which slots in which queues contain commands. The in-use vector 230 includes respective in-use slots associated with each respective command slot in each of the queues 205, 210, and 215. In the example shown, each slot contains one bit, and when the bit is "1" the corresponding command slot in the corresponding queue contains a command, and when the bit is "0" the corresponding slot in the corresponding queue does not contain a command.
		
Regarding claim(s) 14 and 33, Clark teaches:	
further including a commit picker configured to selecting tracked memory request entries for moving to a commit queue in response to the indicator indicating that all memory request entries in a commit group have been serviced and are in the cache.	Fig. 5 and [0058] Control then continues to block 530 where the command processor 240 clears the appropriate bit for the just-executed command in every other command's associated hold-off vector. Control then continues to block 535 where the command processor 240 clears the appropriate bit associated with the just-executed command in the in-use vector 230. Control then returns to block 510, as previously described above.
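For illustration only, the bookkeeping described in Clark [0058] (clearing the just-executed command's bit in every other command's hold-off vector at block 530, then clearing its bit in the in-use vector at block 535) can be sketched in Python with integer bitmasks. This is a sketch under stated assumptions, not Clark's circuit: the slot numbering, the dictionary layout, dropping the retired slot's own hold-off entry, and the rule that a slot is ready when it is in use and its hold-off vector is zero are all assumptions made for the example.

```python
def retire(slot, in_use, hold_off):
    """Model blocks 530 and 535 for a just-executed command slot.

    in_use   -- int bitmask; bit s is set when slot s holds a command
    hold_off -- dict mapping slot -> int bitmask of slots it waits on
    """
    mask = ~(1 << slot)
    # Block 530: clear the retired slot's bit in every other hold-off vector.
    new_hold_off = {s: v & mask for s, v in hold_off.items() if s != slot}
    # Block 535: clear the retired slot's bit in the in-use vector.
    new_in_use = in_use & mask
    return new_in_use, new_hold_off


def ready_slots(in_use, hold_off):
    """Assumed readiness rule: in use and no hold-off bits remaining."""
    return sorted(
        s for s, v in hold_off.items() if (in_use >> s) & 1 and v == 0
    )


# Hypothetical state: slot 1 waits on slot 0; slot 2 waits on slots 0 and 1.
in_use = 0b111
hold_off = {0: 0b000, 1: 0b001, 2: 0b011}
before = ready_slots(in_use, hold_off)
in_use, hold_off = retire(0, in_use, hold_off)
after = ready_slots(in_use, hold_off)
```

In this example only slot 0 is ready at first; once it is retired, slot 1 becomes ready while slot 2 is still held off by slot 1.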

Regarding claim(s) 17 and 36, Qiu teaches:	
further including reference counters that count a number of inflight references to a memory slot, 	[0075] As shown in FIG. 6A, a data slot 600 includes a data present bit 602 and a reference (ref) counter 604.	
Clark teaches reference counter outputs and/or operations being configured to adapt to out of order processing.	Fig. 2 and [0031] In an embodiment, the queues 205, 210, and 215 are FIFO (First-In-First-Out) queues, meaning that commands are removed only in the same order in which they were inserted. But, in other embodiments, commands are assigned priorities within the queues and are removed and processed in an order based on the priorities. In another embodiment, the commands may be removed and processed from a particular queue in any appropriate order.
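For illustration only, a data slot with a data present bit and a reference counter, as named in Qiu [0075], can be sketched in Python as follows. This is a minimal sketch and not Qiu's design: the method names, the eviction rule, and the error on an unmatched release are assumptions. It shows only that a count of inflight references does not depend on the order in which the references are released.

```python
from dataclasses import dataclass


@dataclass
class DataSlot:
    data_present: bool = False  # models data present bit 602
    ref_count: int = 0          # models reference (ref) counter 604

    def add_reference(self):
        """Record one more inflight reference to this slot."""
        self.ref_count += 1

    def fill(self):
        """Mark the slot as holding returned data."""
        self.data_present = True

    def release(self):
        """Drop one inflight reference; the order of releases is irrelevant."""
        if self.ref_count == 0:
            raise ValueError("release without a matching reference")
        self.ref_count -= 1

    def can_evict(self):
        """Assumed rule: a slot may be evicted only with no inflight references."""
        return self.ref_count == 0


slot = DataSlot()
slot.add_reference()  # hypothetical request A
slot.add_reference()  # hypothetical request B
slot.fill()
slot.release()        # B completes before A (out of order)
still_pinned = not slot.can_evict()
slot.release()        # A completes
evictable = slot.can_evict()
```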
		
Regarding claim(s) 18, Qiu teaches:		
A streaming cache comprising: a tag pipeline including a coalescer, a tag memory and a tag processor, the tag pipeline detecting cache misses; a commit FIFO; an evict FIFO; and     Fig. 4 and [0061] shows that unified cache 316 includes address logic 400, a parameter queue (PQ) 402, a tag pipeline 410, a tag-to-data (t2d) first-in first-out (FIFO) 420, a commit FIFO 422, an evict FIFO 424, a miss stage (m-stage) 426, a data memory 430, and a crossbar (x-bar) 432. Tag pipeline 410 includes a coalescer 412, a tag memory 414, and a tag processor 416. Tag pipeline 410 is coupled to a texture pipeline 428.	
Clark teaches an out-of-order memory request tracking circuit comprising plural tracking queues.     Fig. 2 and [0031] In an embodiment, the queues 205, 210, and 215 are FIFO (First-In-First-Out) queues, meaning that commands are removed only in the same order in which they were inserted. But, in other embodiments, commands are assigned priorities within the queues and are removed and processed in an order based on the priorities. In another 
		
Regarding claim(s) 19, Clark teaches:		
further including a queue mapper coupled to the tag pipeline and the plural tracking queues, the queue mapper being configured to distribute request tracking information to the plural tracking queues to selectively enable out-of-order memory request returns.	Fig. 2 and [0031] In an embodiment, the queues 205, 210, and 215 are FIFO (First-In-First-Out) queues, meaning that commands are removed only in the same order in which they were inserted. But, in other embodiments, commands are assigned priorities within the queues and are removed and processed in an order based on the priorities. In another embodiment, the commands may be removed and processed from a particular queue in any appropriate order.


Claim(s) 4 and 23 is/are rejected under 35 U.S.C. 103 as being unpatentable over Qiu (US 2018/0322078) and Clark (US 2004/0210679), further in view of John (US 2017/0317944).
Regarding claim(s) 4 and 23, the combination of Qiu and Clark does not explicitly teach, but John teaches:	
wherein the plural tracking queues comprise first through N tracking queues, and the queue mapper allocates a first tracking queue to a particular warp and distributes certain types of memory requests evenly across second through N tracking queues.	[0046] the number of threads allocated to the first latency queue and the second latency queue can be dynamically changed based on a first number of messages in the first latency queue and based on a second number of messages in the second latency queue. Thus, the loads at the first latency queue and the second latency queue may be balanced by adjusting the number of threads allocated to each queue.
It would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, to combine the cache memory system/method of Qiu and Clark with the latency-based queuing method/system of John. The motivation for doing so would have been to reduce delays in sending messages by sending messages to queues based on the historical latency times of their recipients. This is taught by John in [0034].


Claim(s) 15, 16, 34 and 35 is/are rejected under 35 U.S.C. 103 as being unpatentable over Qiu (US 2018/0322078) and Clark (US 2004/0210679), further in view of Davis (US 2016/0041868).
Regarding claim(s) 15 and 34, the combination of Qiu and Clark does not explicitly teach, but Davis teaches:
further including: first and second traffic paths, wherein the first path bypasses the plural tracking queues and the second path is through at least one of the plural tracking queues; and an interlock circuit that guarantees the first path is always faster than the second path.	Fig. 5 and [0037] The CPU 156 determines to obtain a duplicate read bit 522 via a second path 552, which leads to a redundant copy of the particular bit 538. In some embodiments the parity bit can be used to rebuild the page to lead to the redundant copy. The second path 552 bypasses the first path 550, and thus avoids the delays indicated by the feedback from the operations queues 510 along the first path 550.
It would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, to combine the cache memory system/method of Qiu and Clark with the data retrieval method/system of Davis. The motivation for doing so would have been to enable faster data retrieval in storage systems by comparing estimated delays along different paths and choosing the faster path. This is taught by Davis in [0037].
			
Regarding claim(s) 16 and 35, Davis teaches:		
wherein the interlock circuit includes a comparator configured to compare the age of traffic in the fast path with the age of traffic in the slow path.	[0037] In making such a determination, the CPU 156 compares estimated delays along the first path 550 and the second path 552, and chooses the faster path.
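For illustration only, the comparison described in Davis [0037] (comparing estimated delays along a first path and a second path and choosing the faster path) can be sketched in Python as follows. This is a sketch under stated assumptions and not Davis's method: the delay model (pending operations multiplied by a per-operation latency), the tie-breaking rule, and all numeric values are hypothetical.

```python
def estimate_queue_delay(pending_ops, per_op_latency):
    """Hypothetical delay model: queued operations times latency per operation."""
    return pending_ops * per_op_latency


def choose_path(first_path_delay, second_path_delay):
    """Return the name of the path with the smaller estimated delay.

    Assumption: ties are resolved in favor of the first path.
    """
    return "first" if first_path_delay <= second_path_delay else "second"


# Hypothetical numbers: the first path waits behind 8 queued operations,
# while the second path has a fixed estimated cost.
first_delay = estimate_queue_delay(pending_ops=8, per_op_latency=50)
second_delay = 120
chosen = choose_path(first_delay, second_delay)
tie_choice = choose_path(100, 100)
```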


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Pho (US 2015/0095583): discloses a data processing system wherein multiple burst requests are provided by the cache controller to external memory in response to cache misses and the burst requests are generated to either allow in order or out of order data returns.


Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JARED RUTZ, can be reached at 571-272-5535. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/CHARLES J CHOI/
Examiner, Art Unit 2133