DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
The claim objections to claims 7, 16 and 20 have been withdrawn in light of the instant amendments to the claims.
Applicant's arguments filed 09/07/2022 have been fully considered but they are not persuasive. 
For claims 1, 10 and 17, Applicant argues that the cited references do not disclose the amended limitations. The Office disagrees.
	Specifically, Meier (Col. 6, lines 39-45; Col. 7, lines 1-6, 13-17, 26-31, Col. 7) teaches aggregating multiple fetch requests, and Gracia ([0006]) teaches a tag value shared by multiple cache entries. The combination of Meier and Gracia therefore teaches the limitations of claims 1, 10 and 17.
Applicant’s arguments for dependent claims 2-9, 11-15 and 18-20 rely on their respective base independent claims 1, 10 and 17, which are addressed above.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 10, 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Meier et al. (US 20140006465 A1), in view of Gracia et al. (US 20190079874 A1).  
Regarding Claim 1, Meier teaches
An apparatus, comprising: cache circuitry that includes multiple cache entry circuits, wherein a given cache entry circuit is configured to store a cache block; (Meier Col. 6, lines 39-45: The data cache 30 may have any capacity and configuration. For example, set associative, fully associative, and direct mapped configurations may be used in various embodiments. The data cache 30 may be configured to cache data in cache blocks, where a cache block is a set of bytes from contiguous memory locations that are allocated and deallocated space in the cache as a unit.)
tag circuitry configured to maintain a tag value shared by multiple cache entry circuits; (Meier Col. 10, lines 7-21: The tag memory 40A may include multiple entries, each entry storing a tag for a corresponding access map in the map memory 40B. In an embodiment, the access map memory 40 may be fully associative and thus the tag memory 40A may be content addressable memory (CAM).)
cache control circuitry configured to, in response to a miss for a request for a first cache block, initiate a fetch request to a next level cache or memory; (Meier Col. 7, lines 1-6: The external cache 34 may have any capacity and configuration as well. In an embodiment, the external cache 34 may be a level 2 (L2) cache. In another embodiment, the processor 10 may include an L2 cache and the external cache 34 may be a level 3 (L3) cache. The external cache 34 may be any level of cache in a memory hierarchy. Col. 6,  lines 58-63: Cache misses in data cache 30 and instruction cache 14, as well as translation accesses, non-cacheable accesses, etc. may be communicated to the external interface unit 32. The external interface unit 32 may be configured to transmit transactions to the external cache 34 in response to the various accesses generated in the processor 10. Col. 7, lines 13-17: The request queue 36 may be configured to receive requests from the processor 10 (and potentially other processors in a multiprocessor configuration) to access the external cache 34. The requests may be demand fetches, or may be prefetch requests. Col. 7, lines 26-31:  Each of these requests may be queued in the request queue 36; and the requests may be serviced by the external cache 34 from the request queue 36. If the requests are a miss in the external cache 34, the requests may be transmitted to lower level caches and/or a main memory in a system including the processor 10.) (i.e. in response to cache miss, fetch request to next level cache or memory)
aggregation circuitry configured to aggregate multiple fetch requests for cache blocks that share the tag value; (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46.) (i.e. prefetch circuit 20 which comprises the filter buffer is the aggregation circuitry)
and fetch circuitry configured to initiate a single multi-block fetch operation to the next level cache or memory that returns cache blocks for the aggregated multiple fetch requests. (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46. Col. 7, lines 13-17: The request queue 36 may be configured to receive requests from the processor 10 (and potentially other processors in a multiprocessor configuration) to access the external cache 34. The requests may be demand fetches, or may be prefetch requests. Col. 7, lines 26-31:  Each of these requests may be queued in the request queue 36; and the requests may be serviced by the external cache 34 from the request queue 36. If the requests are a miss in the external cache 34, the requests may be transmitted to lower level caches and/or a main memory in a system including the processor 10.) (i.e. Fig. 2 Prefetch Circuit 20 which comprises the filter buffer is the fetch circuitry)
Meier does not teach a tag value shared by multiple cache entry circuits;
However, Gracia teaches a tag value shared by multiple cache entry circuits; (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.)
Meier and Gracia are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Meier and Gracia before him or her, to modify Meier’s system with Gracia’s teaching. The motivation for doing so would be to have (Gracia [0006]) a shareable tag portion which may be associated with a plurality of cache lines.
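For illustration only, the arrangement recited in claim 1 (a tag value shared by several cache entries, aggregation of misses under that tag, and a single multi-block fetch) can be sketched in Python as follows. This is a minimal model under stated assumptions and is not drawn from Meier, Gracia, or the instant specification: the names SharedTagCache, BLOCKS_PER_TAG, flush and memory are hypothetical, and the group size of four blocks per tag is assumed.

```python
from collections import defaultdict

BLOCKS_PER_TAG = 4  # assumption: four cache entries share one tag value


class SharedTagCache:
    """Toy model in which one tag covers BLOCKS_PER_TAG consecutive blocks."""

    def __init__(self, next_level):
        # next_level is a hypothetical callable: (tag, offsets) -> {offset: data}
        self.next_level = next_level
        self.entries = {}                 # tag -> list of blocks (None = absent)
        self.pending = defaultdict(set)   # tag -> offsets with outstanding misses
        self.fetch_ops = 0                # number of fetch operations issued

    def request(self, block_addr):
        tag, offset = divmod(block_addr, BLOCKS_PER_TAG)
        blocks = self.entries.get(tag)
        if blocks is not None and blocks[offset] is not None:
            return blocks[offset]         # hit
        self.pending[tag].add(offset)     # miss: aggregate under the shared tag
        return None

    def flush(self):
        """Issue one multi-block fetch per tag that has pending misses."""
        for tag, offsets in list(self.pending.items()):
            returned = self.next_level(tag, sorted(offsets))
            self.fetch_ops += 1
            blocks = self.entries.setdefault(tag, [None] * BLOCKS_PER_TAG)
            for off, data in returned.items():
                blocks[off] = data
            del self.pending[tag]


def memory(tag, offsets):
    # Stand-in for the next level cache or memory.
    return {off: f"data@{tag * BLOCKS_PER_TAG + off}" for off in offsets}
```

In this sketch, three misses to blocks 8, 9 and 11 fall under the same tag, so a subsequent flush issues one fetch operation that returns all three blocks.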
Regarding Claim 10, Meier teaches
A non-transitory computer readable storage medium having stored thereon design information that specifies a design of at least a portion of a hardware integrated circuit in a format recognized by a semiconductor fabrication system that is configured to use the design information to produce the circuit according to the design, wherein the design information specifies that the circuit includes: (Meier Col. 17, lines 44-46: the system 150 includes at least one instance of a system on a chip (SOC) 152 coupled to one or more peripherals 154 and an external memory 158. Col. 18, line 8:  The peripherals 154 may include any desired circuitry)
cache circuitry that includes multiple cache entry circuits, wherein a given cache entry circuit is configured to store a cache block; (Meier Col. 6, lines 39-45: The data cache 30 may have any capacity and configuration. For example, set associative, fully associative, and direct mapped configurations may be used in various embodiments. The data cache 30 may be configured to cache data in cache blocks, where a cache block is a set of bytes from contiguous memory locations that are allocated and deallocated space in the cache as a unit.)
tag circuitry configured to maintain a tag value shared by multiple cache entry circuits; (Meier Col. 10, lines 7-21: The tag memory 40A may include multiple entries, each entry storing a tag for a corresponding access map in the map memory 40B. In an embodiment, the access map memory 40 may be fully associative and thus the tag memory 40A may be content addressable memory (CAM).)
cache control circuitry configured to, in response to a miss for a request for a first cache block, initiate a fetch request to a next level cache or memory; (Meier Col. 7, lines 1-6: The external cache 34 may have any capacity and configuration as well. In an embodiment, the external cache 34 may be a level 2 (L2) cache. In another embodiment, the processor 10 may include an L2 cache and the external cache 34 may be a level 3 (L3) cache. The external cache 34 may be any level of cache in a memory hierarchy. Col. 6,  lines 58-63: Cache misses in data cache 30 and instruction cache 14, as well as translation accesses, non-cacheable accesses, etc. may be communicated to the external interface unit 32. The external interface unit 32 may be configured to transmit transactions to the external cache 34 in response to the various accesses generated in the processor 10. Col. 7, lines 13-17: The request queue 36 may be configured to receive requests from the processor 10 (and potentially other processors in a multiprocessor configuration) to access the external cache 34. The requests may be demand fetches, or may be prefetch requests. Col. 7, lines 26-31:  Each of these requests may be queued in the request queue 36; and the requests may be serviced by the external cache 34 from the request queue 36. If the requests are a miss in the external cache 34, the requests may be transmitted to lower level caches and/or a main memory in a system including the processor 10.) (i.e. in response to cache miss, fetch request to next level cache or memory)
aggregation circuitry configured to aggregate multiple fetch requests for cache blocks that share the tag value; (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46.) (i.e. prefetch circuit 20 which comprises the filter buffer is the aggregation circuitry)
and fetch circuitry configured to initiate a single multi-block fetch operation to the next level cache or memory that returns cache blocks for the aggregated multiple fetch requests. (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46. Col. 7, lines 13-17: The request queue 36 may be configured to receive requests from the processor 10 (and potentially other processors in a multiprocessor configuration) to access the external cache 34. The requests may be demand fetches, or may be prefetch requests. Col. 7, lines 26-31:  Each of these requests may be queued in the request queue 36; and the requests may be serviced by the external cache 34 from the request queue 36. If the requests are a miss in the external cache 34, the requests may be transmitted to lower level caches and/or a main memory in a system including the processor 10.) (i.e. Fig. 2 Prefetch Circuit 20 which comprises the filter buffer is the fetch circuitry)
Meier does not teach a tag value shared by multiple cache entry circuits;
However, Gracia teaches a tag value shared by multiple cache entry circuits; (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.)
Meier and Gracia are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Meier and Gracia before him or her, to modify Meier’s system with Gracia’s teaching. The motivation for doing so would be to have (Gracia [0006]) a shareable tag portion which may be associated with a plurality of cache lines.
Regarding Claim 17, Meier teaches
A method, comprising: storing, by cache circuitry, multiple cache blocks using multiple different cache entry circuits; (Meier Col. 6, lines 39-45: The data cache 30 may have any capacity and configuration. For example, set associative, fully associative, and direct mapped configurations may be used in various embodiments. The data cache 30 may be configured to cache data in cache blocks, where a cache block is a set of bytes from contiguous memory locations that are allocated and deallocated space in the cache as a unit.)
maintaining, by tag circuitry, a tag value shared by multiple cache entry circuits; (Meier Col. 10, lines 7-21: The tag memory 40A may include multiple entries, each entry storing a tag for a corresponding access map in the map memory 40B. In an embodiment, the access map memory 40 may be fully associative and thus the tag memory 40A may be content addressable memory (CAM).)
in response to a miss for a request for a first cache block, cache control circuitry initiating a fetch request to a next level cache or memory; (Meier Col. 7, lines 1-6: The external cache 34 may have any capacity and configuration as well. In an embodiment, the external cache 34 may be a level 2 (L2) cache. In another embodiment, the processor 10 may include an L2 cache and the external cache 34 may be a level 3 (L3) cache. The external cache 34 may be any level of cache in a memory hierarchy. Col. 6,  lines 58-63: Cache misses in data cache 30 and instruction cache 14, as well as translation accesses, non-cacheable accesses, etc. may be communicated to the external interface unit 32. The external interface unit 32 may be configured to transmit transactions to the external cache 34 in response to the various accesses generated in the processor 10. Col. 7, lines 13-17: The request queue 36 may be configured to receive requests from the processor 10 (and potentially other processors in a multiprocessor configuration) to access the external cache 34. The requests may be demand fetches, or may be prefetch requests. Col. 7, lines 26-31:  Each of these requests may be queued in the request queue 36; and the requests may be serviced by the external cache 34 from the request queue 36. If the requests are a miss in the external cache 34, the requests may be transmitted to lower level caches and/or a main memory in a system including the processor 10.) (i.e. in response to cache miss, fetch request to next level cache or memory)
aggregating, by aggregation circuitry, multiple fetch requests for cache blocks that share the tag value; (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46.) (i.e. prefetch circuit 20 which comprises the filter buffer is the aggregation circuitry)
and initiating, by fetch circuitry, a single multi-block fetch operation to the next level cache or memory that returns cache blocks for the aggregated multiple fetch requests. (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46. Col. 7, lines 13-17: The request queue 36 may be configured to receive requests from the processor 10 (and potentially other processors in a multiprocessor configuration) to access the external cache 34. The requests may be demand fetches, or may be prefetch requests. Col. 7, lines 26-31:  Each of these requests may be queued in the request queue 36; and the requests may be serviced by the external cache 34 from the request queue 36. If the requests are a miss in the external cache 34, the requests may be transmitted to lower level caches and/or a main memory in a system including the processor 10.) (i.e. Fig. 2 Prefetch Circuit 20 which comprises the filter buffer is the fetch circuitry)
Meier does not teach a tag value shared by multiple cache entry circuits;
However, Gracia teaches a tag value shared by multiple cache entry circuits; (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.)
Meier and Gracia are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Meier and Gracia before him or her, to modify Meier’s system with Gracia’s teaching. The motivation for doing so would be to have (Gracia [0006]) a shareable tag portion which may be associated with a plurality of cache lines.
Claim(s) 2, 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Meier et al. (US 20140006465 A1), in view of Gracia et al. (US 20190079874 A1), further in view of Smith (US 20150121009  A1).  
Regarding Claim 2, Meier and Gracia teach
The apparatus of claim 1, 
Meier-Gracia teaches TLB (Meier Col. 6, lines 53-57:  In an embodiment, the data cache 30 may be virtually indexed and a translation lookaside buffer (TLB, not shown in FIG. 1) may be accessed in parallel to translate the virtual address to a physical address of a memory location in the memory. Col. 13, lines 27-32: The control circuit 46 may initialize the tag portion of the allocated entry (in the tag CAM 40A) with the virtual address of the access region (e.g. bits M:P+1 of the VA) and the physical address (PA) provided by a translation lookaside buffer (TLB) associated with the data cache), but Meier-Gracia does not explicitly teach wherein the cache blocks store respective page table entries, wherein the cache circuitry is included in translation circuitry that is configured to convert an input address in a first address space to an output address in a second address space based on a page table entry.
However, Smith teaches wherein the cache blocks store respective page table entries, wherein the cache circuitry is included in translation circuitry that is configured to convert an input address in a first address space to an output address in a second address space based on a page table entry. (Smith [0018] Table entry formatter 150 as shown is processor 100 (more specifically, memory management unit 120) that is executing code.  Table entry formatter 150 is further operable to take PTEs that identify multiple pages (one explicitly, the balance implicitly) and convert the address that is explicit to one storage location to an address that is explicit for a second storage location.) (i.e. the table entry formatter is the translation circuitry that converts an input address in a first address space to an output address in a second address space based on a page table entry)
Meier, Gracia and Smith are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Meier, Gracia and Smith before him or her, to modify the Meier-Gracia system with Smith’s teaching. The motivation for doing so would be to have (Smith [0002]) methods and devices configuring page table entries to match storage constraints presented by cache memory for facilitating increased hit rates in cache.
Regarding Claim 11, Meier and Gracia teach
The non-transitory computer readable storage medium of claim 10,
Meier-Gracia teaches TLB (Meier Col. 6, lines 53-57:  In an embodiment, the data cache 30 may be virtually indexed and a translation lookaside buffer (TLB, not shown in FIG. 1) may be accessed in parallel to translate the virtual address to a physical address of a memory location in the memory. Col. 13, lines 27-32: The control circuit 46 may initialize the tag portion of the allocated entry (in the tag CAM 40A) with the virtual address of the access region (e.g. bits M:P+1 of the VA) and the physical address (PA) provided by a translation lookaside buffer (TLB) associated with the data cache), but Meier-Gracia does not explicitly teach wherein the cache blocks store respective page table entries, wherein the cache circuitry is included in translation circuitry that is configured to convert an input address in a first address space to an output address in a second address space based on a page table entry.
However, Smith teaches wherein the cache blocks store respective page table entries, wherein the cache circuitry is included in translation circuitry that is configured to convert an input address in a first address space to an output address in a second address space based on a page table entry. (Smith abst: A device for and method of storing page table entries in a first cache. [0018] Table entry formatter 150 as shown is processor 100 (more specifically, memory management unit 120) that is executing code.  Table entry formatter 150 is further operable to take PTEs that identify multiple pages (one explicitly, the balance implicitly) and convert the address that is explicit to one storage location to an address that is explicit for a second storage location.) (i.e. the table entry formatter is the translation circuitry that converts an input address in a first address space to an output address in a second address space based on a page table entry)
Meier, Gracia and Smith are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Meier, Gracia and Smith before him or her, to modify the Meier-Gracia system with Smith’s teaching. The motivation for doing so would be to have (Smith [0002]) methods and devices configuring page table entries to match storage constraints presented by cache memory for facilitating increased hit rates in cache.
Claim(s) 3, 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Meier et al. (US 20140006465 A1), in view of Gracia et al. (US 20190079874 A1), further in view of Wang (US 20190042422 A1).  
Regarding Claim 3, Meier and Gracia teach
The apparatus of claim 1, 
	Meier-Gracia does not teach
wherein the cache circuity maintains a validity field per cache entry circuit and wherein the cache circuitry is configured to use the tag circuitry and validity fields to determine whether requests hit or miss in the cache circuitry at cache block granularity.
However, Wang teaches wherein the cache circuity maintains a validity field per cache entry circuit (Wang [0034] The last level cache circuitry 118 includes a number of cache blocks 222, which are referenced by an index 224, according to one embodiment. Each of the number of cache blocks 222 include, but are not limited to, a validity bit 226, a dirty bit 228, a tag 230, data 232, a near memory way ID 122, and a near memory inclusive bit 124, according to one embodiment. The validity bit 226 is an indication of whether or not the information in a particular cache block is valid.)  and wherein the cache circuitry is configured to use the tag circuitry and validity fields to determine whether requests hit or miss in the cache circuitry at cache block granularity. (Wang [0039] At operation 302, the process 300 includes generating a last level cache miss request, according to one embodiment. This may occur when a processor is unable to find information for a particular memory address (which includes a tag, index, and offset) in any higher-level caches (e.g., first level cache) nor in the last level cache.) (i.e. use tag circuitry for tag match to determine cache hit/miss when cache block is valid)
Meier, Gracia and Wang are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Meier, Gracia and Wang before him or her, to modify the Meier-Gracia system with Wang’s teaching. The motivation for doing so would be to have (Wang [0034, 0039]) a validity bit for each cache block, indicating whether the information in a particular cache block is valid, together with a tag match, so that hits and misses are determined at cache block granularity.
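For illustration only, the hit/miss determination discussed above (a tag match combined with a per-entry validity field, evaluated at cache block granularity) can be sketched as follows. The names TagGroup and lookup and the group size of four entries are hypothetical; the sketch assumes a shared-tag arrangement as in the proposed combination and is not taken from Wang.

```python
from dataclasses import dataclass, field


@dataclass
class TagGroup:
    """One shared tag plus one validity bit per cache entry (assumed: 4 entries)."""
    tag: int
    valid: list = field(default_factory=lambda: [False] * 4)


def lookup(groups, tag, offset):
    """Return True on a hit: the tag matches AND that entry's validity bit is set."""
    return any(g.tag == tag and g.valid[offset] for g in groups)
```

Under this model, a request whose tag matches but whose validity bit is clear is still a miss, which is what makes the determination block-granular rather than tag-granular.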
Regarding Claim 12, Meier and Gracia teach
The non-transitory computer readable storage medium of claim 10, 
	Meier-Gracia does not teach
wherein the cache circuity maintains a validity field per cache entry circuit and wherein the cache circuitry is configured to use the tag circuitry and validity fields to determine whether requests hit or miss in the cache circuitry at cache block granularity.
However, Wang teaches wherein the cache circuity maintains a validity field per cache entry circuit (Wang [0034] The last level cache circuitry 118 includes a number of cache blocks 222, which are referenced by an index 224, according to one embodiment. Each of the number of cache blocks 222 include, but are not limited to, a validity bit 226, a dirty bit 228, a tag 230, data 232, a near memory way ID 122, and a near memory inclusive bit 124, according to one embodiment. The validity bit 226 is an indication of whether or not the information in a particular cache block is valid.)  and wherein the cache circuitry is configured to use the tag circuitry and validity fields to determine whether requests hit or miss in the cache circuitry at cache block granularity. (Wang [0039] At operation 302, the process 300 includes generating a last level cache miss request, according to one embodiment. This may occur when a processor is unable to find information for a particular memory address (which includes a tag, index, and offset) in any higher-level caches (e.g., first level cache) nor in the last level cache.) (i.e. use tag circuitry for tag match to determine cache hit/miss when cache block is valid)
Meier, Gracia and Wang are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Meier, Gracia and Wang before him or her, to modify the Meier-Gracia system with Wang’s teaching. The motivation for doing so would be to have (Wang [0034, 0039]) a validity bit for each cache block, indicating whether the information in a particular cache block is valid, together with a tag match, so that hits and misses are determined at cache block granularity.
Claim(s) 4-6, 13-15, 18-19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Meier et al. (US 20140006465 A1), in view of Gracia et al. (US 20190079874 A1), further in view of Jiao et al. (US 20070067567 A1).  
Regarding Claim 4, Meier and Gracia teach
The apparatus of claim 1, a set of cache entry circuits that share the tag value (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.) 
Meier-Gracia teaches access map has metadata for cache blocks (Meier Col. 8 line 67- Col. 9 line 8),  Meier-Gracia does not teach wherein the cache circuitry maintains a fetch pending field for a set of cache entry circuits that share the tag value, wherein the fetch pending field indicates whether a fetch is pending for any of the cache entry circuits that share the tag value.
However, Jiao teaches wherein the cache circuitry maintains a fetch pending field for a set of cache blocks that share the tag value, wherein the fetch pending field indicates whether a fetch is pending for any of the cache entry circuits that share the tag value. (Jiao [0058] FIG. 4 is a block diagram showing various components within the L2 cache 210. [0062] The L2 cache 210 further comprises a missed write request table 420 and a missed read request table 422, which both feed into a pending memory access unit (MXU) request FIFO 424. [0079] FIG. 7 is an illustration of a structure for an entry in a missed read request table 422. The missed read request table 422, within the L2 cache 210, records misses in the L2 cache 210.) (Fig. 7 has a Cache Line No. column; the missed read request/pending memory access is at the cache line/cache block level. The corresponding entry in the missed write request table is fed into the pending memory access unit request FIFO)
Meier, Gracia and Jiao are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Meier, Gracia and Jiao before him or her, to modify the Meier-Gracia access map with Jiao’s missed read request/pending memory access tracking so that a fetch pending field indicates whether a fetch is pending for a cache block. The motivation for doing so would be to improve processing efficiency using a pending request queue (Jiao [0005, 0029]).
Regarding Claim 5, Meier and Gracia teach
The apparatus of claim 1, the aggregation circuitry (Meier Col. 7, line 65 – Col. 8, line 1: prefetch circuit)
cache entry circuits that share the tag value (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.) 
Meier-Gracia does not teach wherein the aggregation circuitry maintains, for one or more tags with outstanding fetches, a set of fields that indicate whether respective cache entry circuits that share the tag value have a valid fetch request.
However, Jiao teaches wherein the aggregation circuitry maintains, for one or more tags with outstanding fetches, a set of fields that indicate whether respective cache entry circuits that share the tag value have a valid fetch request. (Jiao [0058] FIG. 4 is a block diagram showing various components within the L2 cache 210. [0062] The L2 cache 210 further comprises a missed write request table 420 and a missed read request table 422, which both feed into a pending memory access unit (MXU) request FIFO 424. [0057] As shown in FIG. 6, the L2 data structure comprises a 1-bit valid flag (V), a 1-bit dirty flag (D6), a 17-bit tag (T6), and a 2-bit miss reference number (MR), all of which identify an address for a particular data set.) (Fig. 6: T6 is the tag field; tags having a missed read request/pending memory access have a valid fetch request)
Meier, Gracia and Jiao are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teachings of Meier, Gracia and Jiao before him or her, to modify the Meier-Gracia system with Jiao’s teaching. The motivation for doing so would be to improve processing efficiency using a pending request queue (Jiao [0005, 0029]).
Regarding Claim 6, Meier, Gracia and Jiao teach
The apparatus of claim 5, wherein the fetch circuitry is configured to receive a fetch response for the multi-block fetch operation and update the set of fields based on cache blocks indicated as fetched in the fetch response. (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46. Col. 8 line 67- Col. 9 line 8: Each cache block within the access region may have an associated symbol in the access map, indicating the type of access that has occurred. In one embodiment, accesses may include demand-accessed (symbol A), prefetched to data cache 30 (symbol P), prefetched to lower level cache (L), successful prefetch (symbol S), or invalid (symbol “.”). Each symbol may be represented by a different code of a value stored for the cache block in the access map.) (i.e. metadata in the access map has symbol S in response to successful prefetch)
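For illustration only, the limitations of claims 5 and 6 (a set of per-tag fields marking which cache entries have a valid fetch request, updated when a fetch response arrives) can be sketched as below. This is a minimal software model under stated assumptions, not the applicant's circuitry and not the structure of Meier, Gracia or Jiao; the names `TagFetchState`, `BLOCKS_PER_TAG`, `request` and `on_fetch_response` are hypothetical.

```python
from dataclasses import dataclass, field

BLOCKS_PER_TAG = 4  # assumed number of cache entries sharing one tag value


@dataclass
class TagFetchState:
    """Hypothetical per-tag record kept for a tag with outstanding fetches."""
    tag: int
    # One field per cache entry sharing the tag: True means a valid fetch request.
    valid: list = field(default_factory=lambda: [False] * BLOCKS_PER_TAG)

    def request(self, block: int) -> None:
        """Mark the given entry as having a valid fetch request."""
        self.valid[block] = True

    def on_fetch_response(self, fetched_blocks) -> None:
        """Clear the fields for the blocks the response reports as fetched."""
        for block in fetched_blocks:
            self.valid[block] = False

    def outstanding(self) -> bool:
        """True while any entry sharing the tag still has a valid request."""
        return any(self.valid)


state = TagFetchState(tag=0x1A2B)
state.request(0)
state.request(2)
state.on_fetch_response([0])  # response covers block 0 only
```

In this sketch, block 2 remains marked after the partial response, so the tag still counts as having an outstanding fetch.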
Regarding Claim 13, Meier and Gracia teach
The non-transitory computer readable storage medium of claim 12, a set of cache entry circuits that share the tag value (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.) 
Meier-Gracia teaches that the access map has metadata for cache blocks (Meier Col. 8 line 67- Col. 9 line 8), but does not teach wherein the cache circuitry maintains a fetch pending field for a set of cache entry circuits that share the tag value, wherein the fetch pending field indicates whether a fetch is pending for any of the cache entry circuits that share the tag value.
However, Jiao teaches wherein the cache circuitry maintains a fetch pending field for a set of cache blocks that share the tag value, wherein the fetch pending field indicates whether a fetch is pending for any of the cache entry circuits that share the tag value. (Jiao [0058] FIG. 4 is a block diagram showing various components within the L2 cache 210. [0062] The L2 cache 210 further comprises a missed write request table 420 and a missed read request table 422, which both feed into a pending memory access unit (MXU) request FIFO 424. [0079] FIG. 7 is an illustration of a structure for an entry in a missed read request table 422. The missed read request table 422, within the L2 cache 210, records misses in the L2 cache 210.) (Fig. 7 has a Cache Line No. column; the missed read request/pending memory access is at the cache line/cache block level. The corresponding entry in the missed write request table is fed into the pending memory access unit request FIFO)
Meier, Gracia and Jiao are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Meier, Gracia and Jiao before him or her, to modify the Meier-Gracia access map with Jiao’s missed read request/pending memory access to have a fetch pending field that indicates that a fetch is pending for a cache block. The motivation for doing so would be to (Jiao [0005, 0029]) improve processing efficiency using a pending request queue.
Regarding Claim 14, Meier and Gracia teach
The non-transitory computer readable storage medium of claim 13, the aggregation circuitry (Meier Col. 7, line 65 – Col. 8, line 1 prefetch circuit)
cache entry circuits that share the tag value (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.) 
Meier-Gracia does not teach wherein the aggregation circuitry maintains, for one or more tags with outstanding fetches, a set of fields that indicate whether respective cache entry circuits that share the tag value have a valid fetch request.
However, Jiao teaches wherein the aggregation circuitry maintains, for one or more tags with outstanding fetches, a set of fields that indicate whether respective cache entry circuits that share the tag value have a valid fetch request. (Jiao [0058] FIG. 4 is a block diagram showing various components within the L2 cache 210. [0062] The L2 cache 210 further comprises a missed write request table 420 and a missed read request table 422, which both feed into a pending memory access unit (MXU) request FIFO 424. [0057] As shown in FIG. 6, the L2 data structure comprises a 1-bit valid flag (V), a 1-bit dirty flag (D6), a 17-bit tag (T6), and a 2-bit miss reference number (MR), all of which identify an address for a particular data set.) (Fig. 6: T6 is the tag field; tags having a missed read request/pending memory access have a valid fetch request)
Meier, Gracia and Jiao are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Meier, Gracia and Jiao before him or her, to modify the Meier-Gracia system with Jiao’s teaching. The motivation for doing so would be to (Jiao [0005, 0029]) improve processing efficiency using a pending request queue.
Regarding Claim 15, Meier, Gracia and Jiao teach
The non-transitory computer readable storage medium of claim 14, wherein the fetch circuitry is configured to receive a fetch response for the multi-block fetch operation and update the set of fields based on cache blocks indicated as fetched in the fetch response. (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46. Col. 8 line 67- Col. 9 line 8: Each cache block within the access region may have an associated symbol in the access map, indicating the type of access that has occurred. In one embodiment, accesses may include demand-accessed (symbol A), prefetched to data cache 30 (symbol P), prefetched to lower level cache (L), successful prefetch (symbol S), or invalid (symbol “.”). Each symbol may be represented by a different code of a value stored for the cache block in the access map.) (i.e. metadata in the access map has symbol S in response to successful prefetch)
Regarding Claim 18, Meier and Gracia teach
The method of claim 17, a set of cache entry circuits that share the tag value (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.) 
Meier-Gracia teaches that the access map has metadata for cache blocks (Meier Col. 8 line 67- Col. 9 line 8), but does not teach wherein the cache circuitry maintains a fetch pending field for a set of cache entry circuits that share the tag value, wherein the fetch pending field indicates whether a fetch is pending for any of the cache entry circuits that share the tag value.
However, Jiao teaches wherein the cache circuitry maintains a fetch pending field for a set of cache entry circuits that share the tag value, wherein the fetch pending field indicates whether a fetch is pending for any of the cache entry circuits that share the tag value. (Jiao [0058] FIG. 4 is a block diagram showing various components within the L2 cache 210. [0062] The L2 cache 210 further comprises a missed write request table 420 and a missed read request table 422, which both feed into a pending memory access unit (MXU) request FIFO 424. [0079] FIG. 7 is an illustration of a structure for an entry in a missed read request table 422. The missed read request table 422, within the L2 cache 210, records misses in the L2 cache 210.) (Fig. 7 has a Cache Line No. column; the missed read request/pending memory access is at the cache line/cache block level. The corresponding entry in the missed write request table is fed into the pending memory access unit request FIFO)
Meier, Gracia and Jiao are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Meier, Gracia and Jiao before him or her, to modify the Meier-Gracia access map with Jiao’s missed read request/pending memory access to have a fetch pending field that indicates that a fetch is pending for a cache block. The motivation for doing so would be to (Jiao [0005, 0029]) improve processing efficiency using a pending request queue.
Regarding Claim 19, Meier and Gracia teach
The method of claim 17, the aggregation circuitry (Meier Col. 7, line 65 – Col. 8, line 1 prefetch circuit) cache entry circuits that share the tag value (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.) 
Meier-Gracia does not teach further comprising: maintaining, by aggregation circuitry for the tag value, a set of fields that indicate whether respective cache entry circuits that share the tag value have a valid fetch request.
However, Jiao teaches further comprising: maintaining, by aggregation circuitry for the tag value, a set of fields that indicate whether respective cache entry circuits that share the tag value have a valid fetch request. (Jiao [0058] FIG. 4 is a block diagram showing various components within the L2 cache 210. [0062] The L2 cache 210 further comprises a missed write request table 420 and a missed read request table 422, which both feed into a pending memory access unit (MXU) request FIFO 424. [0057] As shown in FIG. 6, the L2 data structure comprises a 1-bit valid flag (V), a 1-bit dirty flag (D6), a 17-bit tag (T6), and a 2-bit miss reference number (MR), all of which identify an address for a particular data set.) (Fig. 6: T6 is the tag field; tags having a missed read request/pending memory access have a valid fetch request)
Meier, Gracia and Jiao are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Meier, Gracia and Jiao before him or her, to modify the Meier-Gracia access map with Jiao’s missed read request/pending memory access to have a fetch pending field that indicates that a fetch is pending for a cache block. The motivation for doing so would be to (Jiao [0005, 0029]) improve processing efficiency using a pending request queue.
Claim(s) 7, 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Meier et al. (US 20140006465 A1), in view of Gracia et al. (US 20190079874 A1), further in view of Di et al. (US 20180307608 A1).  
Regarding Claim 7, Meier and Gracia teach
The apparatus of claim 1, wherein the aggregation circuitry is configured to aggregate fetch requests for (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46.) (i.e. prefetch circuit 20 which comprises the filter buffer is the aggregation circuitry) cache entry circuits that share the tag value (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.)
Meier-Gracia does not teach further comprising: arbitration circuitry configured to arbitrate among requests to use a fetch bus, wherein the aggregation circuitry is configured to aggregate fetch requests for cache entry circuits that share the tag value until a request with the tag value wins arbitration for the fetch bus.
However, Di teaches further comprising: arbitration circuitry configured to arbitrate among requests to use a fetch bus, wherein the aggregation circuitry is configured to aggregate fetch requests for cache entry circuits that share the tag value until a request with the tag value wins arbitration for the fetch bus. (Di abst: A cache memory for a processor including an arbiter, a tag array and a request queue. The arbiter arbitrates among multiple memory access requests and provides a selected memory access request. [0023] The processor 100 includes a first level or level-1 instruction (L1I) cache 102, and a front end pipe including an instruction fetch (FETCH) engine 104. The L2 cache 116 further interfaces an external system memory 130 via a bus interface or memory controller or the like)
Meier, Gracia and Di are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Meier, Gracia and Di before him or her, to modify the Meier-Gracia system with Di’s teaching. The motivation for doing so would be to have (Di abst, [0007]) an arbiter that arbitrates among multiple memory access requests and provides a selected memory access request, and to avoid a “cache pollution” situation in which the cache memory is stuffed with information that is ultimately not used.
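As an aid to reading the claim 7 limitation, the following minimal sketch models requests for cache entries that share a tag value being aggregated until a request with that tag value wins arbitration, at which point one multi-block fetch is issued. It is illustrative only and is not drawn from the application, Meier, Gracia or Di; the class `Aggregator`, its methods, and the lowest-tag-wins arbitration policy are all assumptions.

```python
from collections import defaultdict


class Aggregator:
    """Hypothetical model: collect block requests per tag until the tag wins arbitration."""

    def __init__(self):
        # tag value -> set of block indices with a pending fetch request
        self.pending = defaultdict(set)

    def add_request(self, tag, block):
        """Aggregate a fetch request with any earlier requests sharing the tag."""
        self.pending[tag].add(block)

    def arbitrate(self):
        """Pick a winning tag and return it with all blocks aggregated so far.

        Assumed policy: the lowest tag value wins. A real arbiter could use
        round-robin, priority, or age-based selection instead.
        """
        if not self.pending:
            return None
        winner = min(self.pending)
        blocks = sorted(self.pending.pop(winner))
        return winner, blocks  # one multi-block fetch for the winning tag


agg = Aggregator()
agg.add_request(tag=7, block=1)
agg.add_request(tag=3, block=0)
agg.add_request(tag=7, block=2)
first = agg.arbitrate()          # tag 3 wins; tag 7 keeps aggregating
agg.add_request(tag=7, block=3)  # arrives before tag 7 wins arbitration
second = agg.arbitrate()         # tag 7 wins with all three blocks
```

Because tag 7 loses the first round in this sketch, the later request for block 3 joins the same multi-block fetch rather than producing a separate one.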
Regarding Claim 20, Meier and Gracia teach
The method of claim 17, wherein the aggregation circuitry is configured to aggregate fetch requests for (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46.) (i.e. prefetch circuit 20 which comprises the filter buffer is the aggregation circuitry) cache entry circuits that share the tag value (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.)
Meier-Gracia does not teach further comprising: arbitration circuitry configured to arbitrate among requests to use a fetch bus, wherein the aggregation circuitry is configured to aggregate fetch requests for cache entry circuits that share the tag value until a request with the tag value wins arbitration for the fetch bus.
However, Di teaches further comprising: arbitration circuitry configured to arbitrate among requests to use a fetch bus, wherein the aggregation circuitry is configured to aggregate fetch requests for cache entry circuits that share the tag value until a request with the tag value wins arbitration for the fetch bus. (Di abst: A cache memory for a processor including an arbiter, a tag array and a request queue. The arbiter arbitrates among multiple memory access requests and provides a selected memory access request. [0023] The processor 100 includes a first level or level-1 instruction (L1I) cache 102, and a front end pipe including an instruction fetch (FETCH) engine 104. The L2 cache 116 further interfaces an external system memory 130 via a bus interface or memory controller or the like)
Meier, Gracia and Di are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Meier, Gracia and Di before him or her, to modify the Meier-Gracia system with Di’s teaching. The motivation for doing so would be to have (Di abst, [0007]) an arbiter that arbitrates among multiple memory access requests and provides a selected memory access request, and to avoid a “cache pollution” situation in which the cache memory is stuffed with information that is ultimately not used.
Claim(s) 8, 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Meier et al. (US 20140006465 A1), in view of Gracia et al. (US 20190079874 A1), further in view of Di et al. (US 20180307608 A1), further in view of Ban et al. (US 20090198858 A1).  
Regarding Claim 8, Meier, Gracia and Di teach
The apparatus of claim 7, multiple cache entry circuits that share the tag value (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.) 
Meier-Gracia-Di does not teach wherein the fetch bus has a width that is sufficient to fetch data in parallel for the multiple cache entry circuits that share the tag value.
However, Ban teaches wherein the fetch bus has a width that is sufficient to fetch data in parallel for the multiple cache entry circuits (Ban [0035] More preferably, when the external data bus width is not 2^N and a read command is received, if read data of the memory array begin with data all included in one block, the converter extracts a data for the predetermined data from the read data, converts the data of the block from serial data into parallel data and successively develops such parallel data, and stores, since a data block for the predetermined data of next data extends between two blocks, the first data, converts the first data from serial data into parallel data together with the second data block and successively develops and reads out such parallel data.)
Meier, Gracia, Di and Ban are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Meier, Gracia, Di and Ban before him or her, to modify the Meier-Gracia-Di system with Ban’s teaching. The motivation for doing so would be a simple substitution of one known element (generic bus) for another (Ban’s bus) to obtain predictable results (fetching data in parallel for multiple blocks).
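The width limitation of claim 8 reduces to simple arithmetic, sketched here for illustration: a bus is wide enough when one transfer carries at least as many whole cache blocks as share the tag value. The function names and the example sizes (a 512-bit bus, 16-byte blocks, four blocks per tag) are assumptions and do not come from the application or from Ban.

```python
def blocks_per_transfer(bus_width_bits: int, block_size_bytes: int) -> int:
    """Number of whole cache blocks one bus transfer can carry in parallel."""
    return bus_width_bits // (block_size_bytes * 8)


def bus_is_sufficient(bus_width_bits: int, block_size_bytes: int,
                      blocks_sharing_tag: int) -> bool:
    """True if the bus can fetch all blocks sharing a tag in a single transfer."""
    return blocks_per_transfer(bus_width_bits, block_size_bytes) >= blocks_sharing_tag


# Assumed example sizes: 16-byte blocks, four blocks sharing one tag value.
wide = bus_is_sufficient(bus_width_bits=512, block_size_bytes=16, blocks_sharing_tag=4)
narrow = bus_is_sufficient(bus_width_bits=256, block_size_bytes=16, blocks_sharing_tag=4)
```

With these assumed sizes, the 512-bit bus carries all four blocks at once, while the 256-bit bus carries only two and would need a second transfer.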
Regarding Claim 16, Meier and Gracia teach
The non-transitory computer readable storage medium of claim 10, wherein the aggregation circuitry is configured to aggregate fetch requests for (Meier Col. 7, line 65 – Col. 8, line 1: The filter buffer 48 may be configured to merge multiple memory operations to the same access map and present the operations to the access map memory 40, the shifter 42, and the control circuit 46.) (i.e. prefetch circuit 20 which comprises the filter buffer is the aggregation circuitry) cache blocks that share the tag value (Gracia [0006] A said tag associated with a said cache line comprises an individual tag portion associated with the single said cache line, and a shareable tag portion which may be associated with a plurality of cache lines, said individual tag portion comprising a pointer to a shareable tag storage location comprising the shareable tag portion.) 
Meier-Gracia does not teach arbitration circuitry configured to arbitrate among requests to use a fetch bus, wherein the aggregation circuitry is configured to aggregate fetch requests for cache blocks that share the tag value until a request with the tag value wins arbitration for the fetch bus.
However, Di teaches arbitration circuitry configured to arbitrate among requests to use a fetch bus, wherein the aggregation circuitry is configured to aggregate fetch requests for cache blocks that share the tag value until a request with the tag value wins arbitration for the fetch bus. (Di [0008] A cache memory for a processor according to one embodiment includes an arbiter, a tag array and a request queue. The arbiter arbitrates among multiple memory access requests and provides a selected memory access request. [0023] The processor 100 includes a first level or level-1 instruction (L1I) cache 102, and a front end pipe including an instruction fetch (FETCH) engine 104. The L2 cache 116 further interfaces an external system memory 130 via a bus interface or memory controller or the like)
Meier, Gracia and Di are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Meier, Gracia and Di before him or her, to modify the Meier-Gracia system with Di’s teaching. The motivation for doing so would be to have (Di [0007-0008]) an arbiter that arbitrates among multiple memory access requests and provides a selected memory access request, and to avoid a “cache pollution” situation in which the cache memory is stuffed with information that is ultimately not used.
Meier-Gracia-Di does not teach wherein the fetch bus has a width that is sufficient to fetch data in parallel for the multiple cache blocks that share the tag value.
However, Ban teaches wherein the fetch bus has a width that is sufficient to fetch data in parallel for the multiple cache blocks (Ban [0035] More preferably, when the external data bus width is not 2^N and a read command is received, if read data of the memory array begin with data all included in one block, the converter extracts a data for the predetermined data from the read data, converts the data of the block from serial data into parallel data and successively develops such parallel data, and stores, since a data block for the predetermined data of next data extends between two blocks, the first data, converts the first data from serial data into parallel data together with the second data block and successively develops and reads out such parallel data.)
Meier, Gracia, Di and Ban are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Meier, Gracia, Di and Ban before him or her, to modify the Meier-Gracia-Di system with Ban’s teaching. The motivation for doing so would be a simple substitution of one known element (generic bus) for another (Ban’s bus) to obtain predictable results (fetching data in parallel for multiple blocks).
Claim(s) 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Meier et al. (US 20140006465 A1), in view of Gracia et al. (US 20190079874 A1), further in view of Cooray et al. (US 20190213707 A1).  
Regarding Claim 9, Meier and Gracia teach
The apparatus of claim 1, 
Meier-Gracia does not teach wherein cache circuitry is included in a graphics processor that further includes: one or more shader cores; and a memory management unit.  
However, Cooray teaches wherein cache circuitry is included in a graphics processor that further includes: one or more shader cores; (Cooray [0039] In some embodiments, the processor 102 includes cache memory 104. [0147] Graphics processor 1340 includes one or more shader core(s)) and a memory management unit. (Cooray [0146] Graphics processor 1310 additionally includes one or more memory management units (MMUs))
Meier, Gracia and Cooray are analogous art because they are from the same field of endeavor of memory control. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having the teaching of Meier, Gracia and Cooray before him or her to modify the Meier-Gracia’s system with Cooray’s teaching. The motivation for doing so would be to have a simple substitution of Meier-Gracia’s general processor with Cooray’s GPU to obtain predictable results of data processing and memory control.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WEI MA whose telephone number is (571)272-2468. The examiner can normally be reached Monday through Friday from 8am to 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sanjiv Shah can be reached on 571-272-4098. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/WEI MA/Examiner, Art Unit 2135

/GAUTAM SAIN/Primary Examiner, Art Unit 2135