DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on October 27, 2022 has been entered.

Claim Status
Claims 1, 10 and 19 have been amended. No claims have been cancelled. Claims 1-20 remain pending and are ready for examination.
 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-2, 5-11, and 14-18 are rejected under 35 U.S.C. 103 as being unpatentable over Loh et al. (US Publication No. 2013/0138892 -- "Loh") in view of Rolan et al. (US Publication No. 2015/0278096 -- "Rolan") in further view of Steely, Jr. et al. (US Publication No. 2014/0052920 -- "Steely").

Regarding claim 1, Loh teaches A system comprising: a processing device comprising a cache controller; an interconnect coupled to the processing device; and (Loh paragraph [0029], Each of the cache memory subsystems 124a-124b and 128 may include a cache memory, or cache array, connected to a corresponding cache controller. The cache memory subsystems 124a-124b and 128 may be implemented as a hierarchy of caches. Caches located nearer the processor cores 122a-122b (within the hierarchy) may be integrated into the processor cores 122a-122b, if desired. This level of the caches may be a level-one (L1) of a multi-level hierarchy. In one embodiment, the cache memory subsystems 124a-124b each represent L2 cache structures, and the shared cache memory subsystem 128 represents an L3 cache structure. In another embodiment, cache memory subsystems 114 each represent L1 cache structures, and shared cache subsystem 118 represents an L2 cache structure. Other embodiments are possible and contemplated. The processing device has a cache controller for management of the caches. It is also coupled to an interconnect, see Loh paragraph [0027], Regardless of a given type of processing unit used in the computing system 100, as software applications access more and more data, the memory subsystem is utilized more heavily. Latencies become more crucial. More on-chip memory storage may be used to reduce interconnect latencies. For example, each of the cache memory subsystems 124a-124b may reduce memory latencies for a respective one of the processor cores 122a-122b. In addition, the microprocessor 110 may include the shared cache memory subsystem 128 as a last-level cache (LLC) before accessing the off-chip DRAM 170 and/or the off-chip disk memory 162) wherein the memory controller comprises a comparator and is configured to: (Loh Figure 3; Loh paragraph [0021], Referring to FIG. 1, a generalized block diagram of one embodiment of a computing system 100 is shown. 
As shown, microprocessor 110 may include one or more processor cores 122a-122b connected to corresponding one or more cache memory subsystems 124a-124b. The microprocessor may also include interface logic 140, a memory controller 130, system communication logic 126, and a shared cache memory subsystem 128. In one embodiment, the illustrated functionality of the microprocessor 110 is incorporated upon a single integrated circuit. In another embodiment, the illustrated functionality is incorporated in a chipset on a computer motherboard. Note that while the controller does not describe a physical comparator entity, it does contain a comparator in that the controller performs the function of comparing a plurality of tags from requests, see Loh paragraph [0048], Similar to other DRAM topologies, the 3D DRAM 330 may include multiple memory array banks 332a-332b. Each one of the banks 332a-332b may include a respective one of the row buffers 334a-334b. Each one of the row buffers 334a-334b may store data in an accessed row of the multiple rows within the memory array banks 332a-332b. The accessed row may be identified by a DRAM address in the received memory request. The control logic 336 may perform tag comparisons between a cache tag in a received memory request and the one or more cache tags stored in the row buffer. In addition, the control logic may alter a column access of the row buffers by utilizing the cache tag comparison results rather than a bit field within the received DRAM address) receive a first read request from the cache controller over the interconnect, the first read request comprising first tag data identifying a first cache line in the cache memory; (Loh paragraphs [0011-0012], In one embodiment, a computing system includes a processing unit and an integrated dynamic random access memory (DRAM). 
Examples of the processing unit include a general-purpose microprocessor, a graphics processing unit (GPU), an accelerated processing unit (APU), and so forth. The integrated DRAM may be a three-dimensional (3D) DRAM and may be included in a System-in-Package (SiP) with the processing unit. The processing unit may utilize the 3D DRAM as a cache. [0012] In various embodiments, the 3D DRAM may store both a tag array and a data array. Each row of the multiple rows in the memory array banks of the 3D DRAM may store one or more cache tags and one or more corresponding cache lines indicated by the one or more cache tags. In response to receiving a memory request from the processing unit, the 3D DRAM may perform a memory access according to the received memory request on a given cache line indicated by a cache tag within the received memory request. Performing the memory access may include a single read of a respective row of the multiple rows storing the given cache line. Rather than utilizing multiple DRAM transactions, a single, complex DRAM transaction may be used to reduce latency and power consumption. The memory request can be considered a read request, which is associated with a first tag data that is used to identify a specific portion of the cache, i.e., the "given cache line", as stated in the reference) determine that the first read request comprises a tag read request; (Loh paragraph [0047], The processing unit 220 may include interface logic to I/O devices and other processing units. This interface logic is not shown for ease of illustration. The processing unit 220 may also include the interface logic 324 for communicating with the 3D DRAM 330. Protocols, address formats, and interface signals used in this communication may be similar to the protocols, address formats and interface signals used for off-package DRAM 170. However, when the 3D DRAM 330 is used as a last-level cache (LLC), adjustments may be made to this communication. 
For example, a memory request sent from the processing unit 220 to the 3D DRAM 330 may include a cache tag in addition to a DRAM address identifying a respective row within one of the memory array banks 332a-332b. The received cache tag may be used to compare to cache tags stored in the identified given row within the 3D DRAM 330. As stated in Loh, each memory access request (i.e., read request) may be sent with a cache tag as a means to provide a location for the read request) read second tag data corresponding to the tag read request from the cache memory; (Loh paragraph [0048], Similar to other DRAM topologies, the 3D DRAM 330 may include multiple memory array banks 332a-332b. Each one of the banks 332a-332b may include a respective one of the row buffers 334a-334b. Each one of the row buffers 334a-334b may store data in an accessed row of the multiple rows within the memory array banks 332a-332b. The accessed row may be identified by a DRAM address in the received memory request. The control logic 336 may perform tag comparisons between a cache tag in a received memory request and the one or more cache tags stored in the row buffer. In addition, the control logic may alter a column access of the row buffers by utilizing the cache tag comparison results rather than a bit field within the received DRAM address. The cache tag that is used to perform the tag read request may provide the memory system with additional tag located in the particular cache line that the original tag was identifying, leading to obtaining two separate tag data) compare, using the comparator, the second tag data read from the cache memory to the first tag data received from the cache controller with the first read request; and if the second tag data matches the first tag data, initiate an action with respect to the first cache line in the cache memory (Loh paragraphs [0060-0061], A sequence of steps 1-7 is shown in FIG. 
4 for accessing tags, status information and data corresponding to cache lines stored in a 3D DRAM. When the memory array bank 430 is used as a cache storing both a tag array and a data array within a same row, an access sequence different from a sequence utilizing steps 1-7 for a given row of the rows 432a-432k may have a large latency. For example, a DRAM access typically includes an first activation or opening stage, a stage that copies the contents of an entire row into the row buffer, a tag read stage, a tag comparison stage, a data read or write access stage that includes a column access, a first precharge or closing stage, a second activation or opening stage, a stage that copies the contents of the entire row again into the row buffer, a tag read stage, a tag comparison stage, an update stage for status information corresponding to the matching tag, and a second precharge or closing stage. The two separate tags are compared to one another. If the two tags match, then the memory access request that was previously described may be executed, which can involve a plurality of actions as well as cache lines, see Loh paragraph [0061], Continuing with the access steps within the memory array bank 430, one or more additional precharge and activation stages may be included after each access of the row buffer if other data stored in other rows are accessed in the meantime. Rather than utilize multiple DRAM transactions for a single cache access, the sequence of steps 1-7, may be used to convert a cache access into a single DRAM transaction. Each of the different DRAM operations, such as activation/open, column access, read, write, and precharge/close, has a different respective latency).
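
For illustration only, the tag-read flow mapped to Loh above (receive a request carrying first tag data, read the stored tag data, compare, and act on a match) can be sketched as follows. This is a minimal model under stated assumptions: the names ReadRequest, MemoryController, and is_tag_read are hypothetical and appear in neither the claims nor Loh, and the row contents are made-up sample values.

```python
from dataclasses import dataclass


@dataclass
class ReadRequest:
    tag: int           # first tag data carried with the request
    row: int           # DRAM address selecting a row (assumed to be a plain index)
    is_tag_read: bool  # hypothetical flag marking the request as a tag read


class MemoryController:
    def __init__(self, rows):
        # rows maps a row index to a list of (stored_tag, cache_line) pairs,
        # modeling a row that holds both tags and the lines they identify.
        self.rows = rows

    def handle(self, req):
        if not req.is_tag_read:
            return ("not_tag_read", None)
        # Model a single row access: copy the selected row into a row buffer.
        row_buffer = list(self.rows.get(req.row, []))
        for stored_tag, line in row_buffer:
            # Comparator: second tag data (stored) versus first tag data (request).
            if stored_tag == req.tag:
                return ("hit", line)  # a match permits an action on the line
        return ("miss", None)


controller = MemoryController({3: [(0xA, b"line-a"), (0xB, b"line-b")]})
result = controller.handle(ReadRequest(tag=0xB, row=3, is_tag_read=True))
```

The comparator here is a plain equality test inside the controller, which mirrors the position taken above that the comparing function need not be a separately named physical component.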
Loh does not teach a memory module coupled to the interconnect, wherein the memory module is separated from the processing device by the interconnect, the memory module comprising: a main memory storing data; a cache memory to store at least a portion of the data from the main memory, the cache memory comprising a plurality of sets of cache lines each comprising a plurality of cache storage locations, wherein a first location of a first set of the plurality of sets of cache lines comprises cache tag data for the plurality of sets of cache lines; and a memory controller coupled to the cache memory. Loh also does not teach reading the second tag data corresponding to the tag read request from the first location of the first set of the plurality of sets of cache lines, wherein the first location storing the cache tag data is part of a different set of cache lines in the cache memory than the first cache line identified in the received first read request.
However, Rolan teaches a memory module coupled to the interconnect, wherein the memory module is separated from the processing device by the interconnect, the memory module comprising: a main memory storing data; a cache memory to store at least a portion of the data from the main memory, (Rolan paragraphs [0061-0063], Processor 520 and memory subsystem 530 are coupled to bus/bus system 510. Bus 510 is an abstraction that represents any one or more separate physical buses, communication lines/interfaces, and/or point-to-point connections, connected by appropriate bridges, adapters, and/or controllers. Therefore, bus 510 may include, for example, one or more of a system bus, a Peripheral Component Interconnect (PCI) bus, a HyperTransport or industry standard architecture (ISA) bus, a small computer system interface (SCSI) bus, a universal serial bus (USB), or an Institute of Electrical and Electronics Engineers (IEEE) standard 1394 bus (commonly referred to as “Firewire”). The buses of bus 510 may also correspond to interfaces in network interface 550. [0062] System 500 may also include one or more input/output (I/O) interface(s) 540, network interface 550, one or more internal mass storage device(s) 560, and peripheral interface 570 coupled to bus 510. I/O interface 540 may include one or more interface components through which a user interacts with system 500 (e.g., video, audio, and/or alphanumeric interfacing). Network interface 550 provides system 500 the ability to communicate with remote devices (e.g., servers, other computing devices) over one or more networks. Network interface 550 may include an Ethernet adapter, wireless interconnection components, USB (universal serial bus), or other wired or wireless standards-based or proprietary interfaces. The processor and memory module may be connected via multiple different types of interconnections. 
The memory described here includes non-volatile (main) memory, volatile (cache) memory, as well as a controller, see Rolan paragraph [0063], Storage 560 may be or include any conventional medium for storing large amounts of data in a nonvolatile manner, such as one or more magnetic, solid state, or optical based disks, or a combination. Storage 560 holds code or instructions and data 562 in a persistent state (i.e., the value is retained despite interruption of power to system 500). Storage 560 may be generically considered to be a “memory,” although memory 530 is the executing or operating memory to provide instructions to processor 520. Whereas storage 560 is nonvolatile, memory 530 may include volatile memory (i.e., the value or state of the data is indeterminate if power is interrupted to system 500)) the cache memory comprising a plurality of sets of cache lines each comprising a plurality of cache storage locations, wherein a first location of a first set of the plurality of sets of cache lines comprises cache tag data for the plurality of sets of cache lines; (Rolan paragraphs [0020-0021], The processor core(s) 104 may be coupled—e.g. through interconnect 105—to one or more on-die caches 106 physically located on the same die as the processor core(s) 104. In many embodiments, a cache has a tag storage 114 associated with it that stores tags for all cache memory locations. In many embodiments, tag storage 114 resides on a separate silicon die, e.g. Die 2 112, from the processor core(s) 104. In many embodiments, tag storage 114 is coupled to one or more off-die (non-processor die) cache(s) 116—e.g. through interconnect 105—and is located on the same die as off-die cache(s) 116. [0021] A cache of cache tags (CoCT) 108 may store a subset of the off-die cache tags on processor die 102. Specifically, while tag storage 114 stores all index values and associated tag sets per index value, CoCT 108, on the other hand, may not store all possible index values. 
Rather, to save on storage space, CoCT 108 may store merely a subset of the tags stored in tag storage 114. In some embodiments, not all index locations are represented at any given time in CoCT 108. The cache memory contains a plurality of cache memory locations, wherein the cache tag data for the locations can be stored in a first cache location corresponding to the cache of cache tags. The first cache location can store the cache tags for each of the cache sets in different assigned portions, see Rolan paragraph [0082], In an embodiment, the method further comprises storing at the tag storage a second set including second tags each associated with a respective data location stored within the cache memory, and in response to any determination that a tag of the second set is to be stored to the cache of cache tags, storing all tags of the second set to the second portion. In another embodiment, any storage of tags of the second set to the cache of cache tags by the controller includes storage of the tags of the second set to only the second portion. In another embodiment, of all set of tags of the tag storage, the first portion is to store only tags of the first set) a memory controller coupled to the cache memory, read second tag data corresponding to the tag read request from the first location of the first set of the plurality of sets of cache line (Rolan paragraphs [0050-0051], For example, method 400 may determine, at 410, whether (or not) a tag of a first memory access request corresponds to (e.g. matches) a tag of the first set of tags. In response to the evaluation at 410 determining that the tag of the first memory access request corresponds to such a cached tag, method 400 may, at 415, change a variable of the first entry to indicate increased activity of the first set. Method 400 may further service the first request, as indicated at 420 by operations to access a location in cache memory which corresponds to the tag of the first memory access request. 
Otherwise, method 400 may, at 425, change the variable of the first entry to indicate decreased activity of the first set. [0051] Subsequent to the first memory access request, a second memory access request may be issued from a host processor or other such requestor logic. The second memory access request may target a memory location which is not currently represented in the cache of cache tags. For example, controller logic may evaluate a tag which is included in the second memory access request to determine whether the tag matches any tag stored or otherwise tracked in the cache of cache tags. The tag data can be requested to be read from the aforementioned first location storing the plurality of cache tag data for the cache sets of cache lines).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh with those of Rolan. Rolan teaches storing cache tag data for a plurality of cache sets in a first cache location. This arrangement provides more efficient searching and locating for cache memory access requests and operations, and allows cache resource allocation to be adjusted based on cache tag request frequency for each of the given cache sets (Rolan paragraph [0085], In another implementation, a method comprises, in response to storage of a first set of tags to a cache of cache tags, associating a first entry of a replacement table with the first set of tags, including setting a first activity variable of the first entry to an initial value of a pre-determined plurality of values, wherein a tag storage stores tags which are each associated with a respective data location of a cache memory, and wherein the cache of cache tags stores a subset of tags stored at the tag storage. The method further comprises, if a first memory access request comprises a tag corresponding to one of the first set of tags, then changing the first activity variable to another of the pre-determined plurality of values to indicate an increase of a level of activity of the first set, otherwise changing the first activity variable to another of the pre-determined plurality of values to indicate a decrease of the level of activity of the first set. The method further comprises, in response to a failure to identify any tag of the cache of cache tags which matches a tag of a second memory access request, selecting a set of tags to evict from the cache of cache tags, including searching the replacement table for an activity variable which is equal to the initial value.
Additionally, keeping the cache tag information on the cache storage itself is a method of reducing cache lookup latency, see Rolan paragraph [0006], Each cache has a tag storage structure. If the processor needs data from a certain memory location, it can determine if the data is stored in a given cache by doing a comparison of the memory location address and the tag storage structure for the cache. If the tag storage structure is off-die, the latency to do a tag lookup will be greater than if the tag storage structure is on-die. Thus, although on-die tag storage structures increase the cost of the processor die because they take up valuable space, they help speed up execution by reducing the latencies of tag lookups versus off-die caches).
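
For illustration only, the activity-tracking scheme quoted from Rolan paragraph [0085] can be sketched as below. The class name ReplacementTable, the four activity levels, the choice of initial value, and the behavior when no entry equals the initial value are all assumptions made for this sketch; Rolan's quoted text specifies only that the variable takes one of a pre-determined plurality of values and that eviction searches for an entry equal to the initial value.

```python
class ReplacementTable:
    LEVELS = (0, 1, 2, 3)  # assumed pre-determined plurality of values
    INITIAL = 1            # assumed initial value

    def __init__(self):
        self.activity = {}  # tag-set identifier -> activity variable

    def on_store(self, set_id):
        # A set of tags was stored to the cache of cache tags.
        self.activity[set_id] = self.INITIAL

    def on_request(self, set_id, tag_matched):
        # A matching tag raises the level; otherwise the level is lowered.
        level = self.activity[set_id]
        if tag_matched:
            level = min(level + 1, self.LEVELS[-1])
        else:
            level = max(level - 1, self.LEVELS[0])
        self.activity[set_id] = level

    def select_victim(self):
        # Search for an entry whose activity variable equals the initial value.
        for set_id, level in self.activity.items():
            if level == self.INITIAL:
                return set_id
        return None  # assumption: no candidate found in this simple model


table = ReplacementTable()
table.on_store("set_a")
table.on_store("set_b")
table.on_request("set_a", tag_matched=True)
victim = table.select_victim()
```

In this model a set that has seen a matching request moves away from the initial value and is therefore passed over when a victim is chosen.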

Loh in view of Rolan does not teach wherein the first location storing the cache tag data is part of a different set of cache lines in the cache memory than the first cache line identified in the received first read request.
However, Steely teaches wherein the first location storing the cache tag data is part of a different set of cache lines in the cache memory than the first cache line identified in the received first read request (Steely paragraph [0016], In an embodiment, the domain state field may be assigned a value indicating that all read requests and all write requests for its associated cache line can only be serviced in the domain state field's associated domain. In an embodiment, the domain state field may be assigned a value indicating that all read requests for the cache line can be serviced in the domain state field's associated domain, that no write requests for the cache line can be serviced in the domain state field's associated domain, and that write requests for the cache line must go to a next level of the hierarchy of cache tag directories. In an embodiment, the domain state field may be assigned a value indicating that no read requests or write requests for the cache line can be serviced in the domain state field's associated domain, and that read requests or write requests for the cache line must go to a next level of the hierarchy of cache tag directories. The cache lines utilized for data read requests are separate and distinct from the cache lines that are used to store the cache tag directories, see Steely paragraph [0014], Embodiments may be discussed herein which efficiently maintain cache coherency. In particular, embodiments of the present invention pertain to a feature for reading/writing a domain state field associated with a tag entry within a cache tag directory. In an embodiment, a value may be assigned to a domain state field of a tag entry in a cache tag directory. The cache tag directory may belong to a hierarchy of cache tag directories. Each tag entry may be associated with a cache line from a cache belonging to a domain. The domain may contain multiple caches. 
The value of the domain state field may indicate whether its associated cache line can be read or changed).
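
For illustration only, the three domain state values described in the quoted Steely paragraph [0016] can be modeled as below. The enum member names and the function serviced_in_domain are hypothetical labels chosen for this sketch and are not Steely's terminology.

```python
from enum import Enum


class DomainState(Enum):
    # Hypothetical names for the three values described in Steely [0016].
    READS_AND_WRITES_LOCAL = 1  # reads and writes serviced only in this domain
    READS_LOCAL_ONLY = 2        # reads serviced here, writes go to the next level
    NOTHING_LOCAL = 3           # reads and writes both go to the next level


def serviced_in_domain(state, is_write):
    """Return True if the request can be serviced in the associated domain."""
    if state is DomainState.READS_AND_WRITES_LOCAL:
        return True
    if state is DomainState.READS_LOCAL_ONLY:
        return not is_write
    return False  # request must go to the next level of tag directories


write_stays_local = serviced_in_domain(DomainState.READS_LOCAL_ONLY, is_write=True)
```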

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh and Rolan with those of Steely. Steely teaches a memory location that utilizes a plurality of cache lines for different memory purposes, such as storing cache tag data or completing a read data request from a host device. This allows the memory system to perform several operations concurrently, which can improve performance, and to keep cache tag data updated via its own cache memory usage, resulting in better cache coherence (Steely paragraph [0005], One conventional technique for maintaining cache coherency, particularly in distributed systems, is a directory-based cache coherency scheme. Directory-based coherency schemes utilize a centralized tag directory to record the location and the status of cache lines as they exist throughout the system. For example, the tag directory records which processor caches have a copy of the data, and further records if any of the caches have an updated copy of the data. When a processor makes a cache request to the main memory for a data item, the tag directory is consulted to determine where the most recent copy of the data resides. Based on this information, the most recent copy of the cache line is retrieved so that it may be provided to the requesting processor cache memory. The tag directory is then updated to reflect the new status for that cache line. Thus, each cache line read by a processor is accompanied by a tag directory update (i.e., a write). The directory-based cache coherency scheme may include multiple tag directories, and the tag directories may be arranged in a hierarchy. Furthermore, a hierarchical tag directory structure may include any number of levels).

Claim 10 is the method claim corresponding to system claim 1. It is rejected on the same references and rationale.

Regarding claim 2, Loh in view of Rolan in further view of Steely teaches The system of claim 1, wherein the cache memory comprises a set associative cache implemented on a dynamic random access memory (DRAM) device (Loh paragraph [0008], Utilizing DRAM access mechanisms while storing and accessing the tags and data of the additional cache in the integrated DRAM dissipates a lot of power. In addition, these mechanisms consume a lot of bandwidth, especially for a highly associative on-package cache, and consume too much time as the tags and data are read out in a sequential manner. Therefore, the on-package DRAM provides a lot of extra data storage, but cache and DRAM access mechanisms are inefficient).

Claim 11 is the method claim corresponding to system claim 2. It is rejected on the same references and rationale.

Regarding claim 5, Loh in view of Rolan in further view of Steely teaches The system of claim 1, wherein to initiate the action with respect to the first cache line, the memory controller is configured to: prepare a portion of the cache memory corresponding to the first cache line for a second read request to be subsequently received from the cache controller over the interconnect (Loh paragraph [0029], Each of the cache memory subsystems 124a-124b and 128 may include a cache memory, or cache array, connected to a corresponding cache controller. The cache memory subsystems 124a-124b and 128 may be implemented as a hierarchy of caches. Caches located nearer the processor cores 122a-122b (within the hierarchy) may be integrated into the processor cores 122a-122b, if desired. This level of the caches may be a level-one (L1) of a multi-level hierarchy. In one embodiment, the cache memory subsystems 124a-124b each represent L2 cache structures, and the shared cache memory subsystem 128 represents an L3 cache structure. In another embodiment, cache memory subsystems 114 each represent L1 cache structures, and shared cache subsystem 118 represents an L2 cache structure. Other embodiments are possible and contemplated. Multiple read requests can be sent from the cache controller, see Loh paragraph [0064], A cache tag may be used to determine which of the multiple cache lines are being accessed within a selected row. For example, in a 30-way set-associative cache organization, when the row 432a is selected, the cache tag values stored in the fields 434a-434d may be used to determine which one of the 30 cache lines stored in fields 438a-438d is being accessed. The cache tag stored in field 412 within the address 410 may be used in comparison logic to locate a corresponding cache line of the multiple cache lines stored in the row buffer 440).

Claim 14 is the method claim corresponding to system claim 5. It is rejected on the same references and rationale.

Regarding claim 6, Loh in view of Rolan in further view of Steely teaches The system of claim 1, wherein to initiate the action with respect to the first cache line, the memory controller is configured to: read the first cache line from the cache memory before a second read request is received from the cache controller over the interconnect (Loh paragraph [0062], During sequence 1, a memory request from a processing unit may be received by a 3D DRAM. The memory request may have traversed horizontal or vertical short low-latency interconnect routes available through a 3D integrated fabrication process. A portion of a complete address is shown as address 410. The fields 412 and 414 may store a cache tag and a page index, respectively. Other portions of the complete address may include one or more of a channel index, a bank index, a sub array index, and so forth to identify the memory array bank 430 within the 3D DRAM. During sequence 2, a given row of the rows 432a-432k may be selected from other rows by the page index 414. The first read request is sent and completed before the subsequent read requests are received and acted upon by the memory system).

Claim 15 is the method claim corresponding to system claim 6. It is rejected on the same references and rationale.

Regarding claim 7, Loh in view of Rolan in further view of Steely teaches The system of claim 6, wherein to initiate the action with respect to the first cache line, the memory controller is further configured to: send the first cache line to the cache controller without receiving the second read request from the cache controller over the interconnect (Loh claims 17-19, The method as recited in claim 16, wherein performing the memory access with a single read of the respective row storing the given cache line includes updating the metadata based on the memory access. 18. The method as recited in claim 15, further comprising sending within the memory request the first cache tag in addition to a DRAM address identifying the respective row. 19. The method as recited in claim 15, wherein the DRAM is a three-dimensional (3D) integrated circuit (IC). The first cache line (i.e., the given cache line associated with the first single read) is sent to the controller individually, without any other future read requests. The interconnect component is used to send the cache data).

Claim 16 is the method claim corresponding to system claim 7. It is rejected on the same references and rationale.

Regarding claim 8, Loh in view of Rolan in further view of Steely teaches The system of claim 6, wherein to initiate the action with respect to the first cache line, the memory controller is further configured to: receive the second read request from the cache controller over the interconnect; and send the first cache line to the cache controller (Loh paragraph [0060], A sequence of steps 1-7 is shown in FIG. 4 for accessing tags, status information and data corresponding to cache lines stored in a 3D DRAM. When the memory array bank 430 is used as a cache storing both a tag array and a data array within a same row, an access sequence different from a sequence utilizing steps 1-7 for a given row of the rows 432a-432k may have a large latency. For example, a DRAM access typically includes an first activation or opening stage, a stage that copies the contents of an entire row into the row buffer, a tag read stage, a tag comparison stage, a data read or write access stage that includes a column access, a first precharge or closing stage, a second activation or opening stage, a stage that copies the contents of the entire row again into the row buffer, a tag read stage, a tag comparison stage, an update stage for status information corresponding to the matching tag, and a second precharge or closing stage. A second read request can be sent via the cache controller to cause the first cache line to be sent; see also Loh paragraph [0031], If a cache miss occurs, such as a requested block is not found in a respective one of the cache memory subsystems 124a-124b or in the shared cache memory subsystem 128, then a read request may be generated and transmitted to the memory controller 130. The memory controller 130 may translate an address corresponding to the requested block and send a read request to the off-chip DRAM 170 through the memory bus 150.
The off-chip DRAM 170 may be filled with data from the off-chip disk memory 162 through the I/O controller and bus 160 and the memory bus 150).

Claim 17 is the corresponding method claim to the system claim 8. It is rejected with the same references and rationale.

Regarding claim 9, Loh in view of Rolan in further view of Steely teaches The system of claim 1, wherein the memory controller is further configured to: if the second tag data does not match the first tag data, return an indication of a cache miss to the cache controller (Loh paragraphs [0072-0074], In block 504, the processing unit may determine a given memory request misses within a cache memory subsystem within the processing unit. In block 506, the processing unit may send an address corresponding to the given memory request to an in-package integrated DRAM cache, such as the 3D DRAM. The address may include a non-translated cache tag in addition to a DRAM address translated from a corresponding cache address used within the processing unit to access on-chip caches. In block 508, control logic within the 3D DRAM may identify a given row corresponding to the address within the memory array banks in the 3D DRAM. In block 510, control logic within the 3D DRAM may activate and open the given row. In block 512, the contents of the given row may be copied and stored in a row buffer. In block 514, the tag information in the row buffer may be compared with tag information in the address. The steps described in blocks 506-512 may correspond to the sequences 1-4 described earlier regarding FIG. 4. If the tag comparisons determine a tag hit does not occur (conditional block 516), then in block 518, the memory request may be sent to main memory. The main memory may include an off-chip non-integrated DRAM and/or an off-chip disk memory. If the tag comparisons determine a tag hit occurs (conditional block 516), then in block 520, read or write operations are performed on a corresponding cache line in the row buffer. When the data being compared results in a difference (i.e., not matching), the memory can return a cache miss).

Claim 18 is the corresponding method claim to the system claim 9. It is rejected with the same references and rationale.
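
For illustration only, the tag comparison flow of Loh blocks 508-520 relied upon above for claims 9 and 18 can be sketched as follows. This is a minimal sketch and is not taken from Loh: the names Row, DramCache and access, and the layout of tags and lines within a row, are hypothetical and are assumed here only to make the hit/miss decision concrete.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Row:
    """One DRAM row holding cache tags next to their cache lines (assumed layout)."""
    tags: List[int]
    lines: List[bytes]


@dataclass
class DramCache:
    """Hypothetical stand-in for the control logic and memory array banks."""
    rows: Dict[int, Row] = field(default_factory=dict)

    def access(self, row_addr: int, tag: int) -> Tuple[bool, Optional[bytes]]:
        # Blocks 508-512: identify the row, open it, and copy it into a row buffer.
        row_buffer = self.rows.get(row_addr)
        if row_buffer is None:
            return (False, None)
        # Block 514: compare the stored tags with the tag carried in the request.
        for way, stored_tag in enumerate(row_buffer.tags):
            if stored_tag == tag:
                # Block 520: tag hit, so operate on the matching line in the row buffer.
                return (True, row_buffer.lines[way])
        # Block 518: no tag hit, so the request would be sent on to main memory.
        return (False, None)
```

In this sketch the first element of the returned pair plays the role of the hit or miss indication discussed above; how Loh's hardware actually signals a miss is not specified by the sketch.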


Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Loh in view of Rolan in further view of Steely as applied to claims 1 and 10 above, and further in view of Nale (US Publication No. 2019/0102313 -- "Nale").

Regarding claim 3, Loh in view of Rolan in further view of Steely and further in view of Nale teaches The system of claim 1, wherein to determine that the first read request comprises a tag read request, the memory controller is configured to: read an identifier in the first read request, the identifier indicating that the first read request is a tag read request (Nale paragraph [0013], Various embodiments described herein include a memory controller that can store a copy of a portion of a critical chunk in a spare lane such that the entire critical chunk can be provided to a CPU using one half of a cache line. In some embodiments, the memory controller may utilize the spare lane to store an entire critical chunk in each half of a cache line. For example, the critical chunk may include metadata (e.g., a cache tag or a Read ID) stored in both halves of a cache line. In such examples, the metadata that is normally in the first half of the cache line may be copied or mapped to spare lane bits associated with the second half of the cache line and metadata that is normally in the second half of the cache line may be copied or mapped to spare lane bits associated with the first half of the cache line. The read/access request includes metadata (i.e., an identifier) which indicates whether or not a tag is present for the read request target).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh and Rolan and Steely with those of Nale. Nale teaches using an identifier to indicate whether or not the first read request is a tag read request, which allows the system to more easily identify and classify when a tag read request occurs versus a normal read occurring, resulting in improved memory performance (Nale paragraph [0013], Various embodiments described herein include a memory controller that can store a copy of a portion of a critical chunk in a spare lane such that the entire critical chunk can be provided to a CPU using one half of a cache line. In some embodiments, the memory controller may utilize the spare lane to store an entire critical chunk in each half of a cache line. For example, the critical chunk may include metadata (e.g., a cache tag or a Read ID) stored in both halves of a cache line. In such examples, the metadata that is normally in the first half of the cache line may be copied or mapped to spare lane bits associated with the second half of the cache line and metadata that is normally in the second half of the cache line may be copied or mapped to spare lane bits associated with the first half of the cache line. In embodiments, the memory controller may allow critical chunk operations to be used in 2LM and/or DDR-T2 environments when a spare lane is implemented by the memory controller and the DDR-T interface. In one or more embodiments, the memory controller may store the critical chunk in the same locations in the two halves of the cache line such that the memory controller does not have to multiplex (MUX) the data depending on which half comes first. In various embodiments, the memory controller may arrange the bits in a critical chunk separately, such as depending on which half of a cache line is requested. 
In these and other ways the memory controller may enable reliable and efficient critical chunk operation to achieve improved memory performance, such as by reducing the overall number of memory operations required to provide a critical chunk to a CPU, resulting in several technical effects and advantages).

Claim 13 is the corresponding method claim to the system claim 3. It is rejected with the same references and rationale.


Claims 4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Loh in view of Rolan in further view of Steely as applied to claims 1 and 10 above, and further in view of Kim et al. (US Publication No. 2017/0168931 -- "Kim").

Regarding claim 4, Loh in view of Rolan in further view of Steely and further in view of Kim teaches The system of claim 1, wherein the memory controller is further configured to: send the second tag data read from the cache memory to the cache controller over the interconnect (Kim claim 11, The nonvolatile memory module of claim 9, wherein, when a read request is received, a second match signal indicating the cache hit corresponding to the read request is generated by reading a second tag from the tag array and comparing the read second tag with second tag information received with the second tag, and second cache data corresponding to the read request is read from the data array in response to the second match signal.  Second tag data is read from the cache memory to the controller, which is used for all read operations, see Kim paragraph [0056], FIG. 3 is a block diagram for conceptually illustrating the tag DRAM 331 and the data DRAM 332 of FIG. 2. Referring to FIG. 3, the tag DRAM 331 and the data DRAM 332 may include the same elements, for example, memory cell arrays 331-1 and 332-1, tag comparison circuits 331-5 and 332-5, and multiplexers (Mux Circuit) 331-6 and 332-6. In some embodiments, each of the tag DRAM 331 and the data DRAM 332 may include a dual port DRAM. The dual port DRAM may include input/output ports respectively corresponding to different kinds of devices, for example, data buffer/nonvolatile memory controller. A data path of the dual port DRAM may be connected to a first external device, for example, a data buffer, or a second external device, for example, a nonvolatile memory controller, based on the selection of the multiplexer, that is, multiplexers, 331-6 or 332-6).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh and Rolan and Steely with those of Kim. Kim teaches sending a second tag data from the cache memory to the cache controller, to later be used in a comparison, which allows the memory system to compare and contrast the two separate tags for improving memory reliability and consistency (Kim paragraph [0049-0050], At least one DRAM 331 of the plurality of first DRAMs 330-1 and the plurality of second DRAMs 330-2 may store a tag corresponding to a cache line and compare stored tag information with input tag information. The remaining DRAMs may be implemented to store cache data corresponding to the tag. Hereinafter, a DRAM, which stores tags, may be referred to as "tag DRAM", and each of the remaining DRAMs may be referred to as "data DRAM". The at least one DRAM 331 may be a tag DRAM. DRAM 332 may be a data DRAM. [0050] In some embodiments, the tag DRAM 331 may store a 4-byte tag. In some embodiments, the tag DRAM 331 may store tags in a 2-way, 1:8 direct mapping scheme. The tag may include location information about cache data stored in the data DRAMs and dirty/clear information indicating validity of cache data. In some embodiments, the tag may include an error correction value for error correction. Thus, the tag DRAM 331 may further include an error correction circuit for correcting an error. The memory module control device 350 may provide tag information to the DRAM 330-2).

Claim 12 is the corresponding method claim to the system claim 4. It is rejected with the same references and rationale.


Claims 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Loh in view of Rolan in further view of Steely and further in view of Jiang (US Publication No. 2018/0173640 -- "Jiang").

Regarding claim 19, Loh teaches A device comprising: a cache memory; and a memory controller coupled to the cache memory, wherein the memory controller is configured to: (Loh Figure 3; Loh paragraph [0021], Referring to FIG. 1, a generalized block diagram of one embodiment of a computing system 100 is shown. As shown, microprocessor 110 may include one or more processor cores 122a-122b connected to corresponding one or more cache memory subsystems 124a-124b. The microprocessor may also include interface logic 140, a memory controller 130, system communication logic 126, and a shared cache memory subsystem 128. In one embodiment, the illustrated functionality of the microprocessor 110 is incorporated upon a single integrated circuit. In another embodiment, the illustrated functionality is incorporated in a chipset on a computer motherboard. Note that while the controller does not describe a physical comparator entity, it does contain a comparator in that the controller performs the function of comparing a plurality of tags from requests, see Loh paragraph [0048], Similar to other DRAM topologies, the 3D DRAM 330 may include multiple memory array banks 332a-332b. Each one of the banks 332a-332b may include a respective one of the row buffers 334a-334b. Each one of the row buffers 334a-334b may store data in an accessed row of the multiple rows within the memory array banks 332a-332b. The accessed row may be identified by a DRAM address in the received memory request. The control logic 336 may perform tag comparisons between a cache tag in a received memory request and the one or more cache tags stored in the row buffer. 
In addition, the control logic may alter a column access of the row buffers by utilizing the cache tag comparison results rather than a bit field within the received DRAM address) receive a first write request from a cache controller over an interconnect, the first write request comprising first tag data identifying a first cache line in the cache memory; (Loh paragraphs [0011-0012], In one embodiment, a computing system includes a processing unit and an integrated dynamic random access memory (DRAM). Examples of the processing unit include a general-purpose microprocessor, a graphics processing unit (GPU), an accelerated processing unit (APU), and so forth. The integrated DRAM may be a three-dimensional (3D) DRAM and may be included in a System-in-Package (SiP) with the processing unit. The processing unit may utilize the 3D DRAM as a cache. [0012] In various embodiments, the 3D DRAM may store both a tag array and a data array. Each row of the multiple rows in the memory array banks of the 3D DRAM may store one or more cache tags and one or more corresponding cache lines indicated by the one or more cache tags. In response to receiving a memory request from the processing unit, the 3D DRAM may perform a memory access according to the received memory request on a given cache line indicated by a cache tag within the received memory request. Performing the memory access may include a single read of a respective row of the multiple rows storing the given cache line. Rather than utilizing multiple DRAM transactions, a single, complex DRAM transaction may be used to reduce latency and power consumption. The memory request can be considered a read request, which is associated with a first tag data that is used to identify a specific portion of the cache, i.e., the "given cache line", as stated in the reference) wherein the cache memory is separated from the cache controller by the interconnect; (Loh paragraph [0043], Turning now to FIG. 3, a generalized block diagram of one embodiment of a computing system 300 utilizing a three-dimensional (3D) DRAM is shown. Circuitry and logic described earlier are numbered identically. The computing system 300 may utilize three-dimensional (3D) packaging, such as a System in Package (SiP) as described earlier. The computing system 300 may include a SiP 310. In one embodiment, the SiP 310 may include the processing unit 220 described earlier and a 3D DRAM 330 that communicate through low-latency interconnect 340. The in-package low-latency interconnect 340 may be horizontal and/or vertical with shorter lengths than long off-chip interconnects when a SiP is not used. Loh Figure 3; Reference #330, 340, 220. Note that the processing unit 220 is separated from the 3D DRAM 330 (i.e., memory module) via the interconnect 340 (in this case, a low-latency interconnect)) compare the second tag data read from the cache memory to the first tag data received from the cache controller with the first read request; if the second tag data matches the first tag data and the first cache line is not already marked as dirty, modify a dirty status indicator for the first cache line before a second write request is received from the cache controller over the interconnect; (Loh paragraphs [0060-0061], A sequence of steps 1-7 is shown in FIG. 4 for accessing tags, status information and data corresponding to cache lines stored in a 3D DRAM. When the memory array bank 430 is used as a cache storing both a tag array and a data array within a same row, an access sequence different from a sequence utilizing steps 1-7 for a given row of the rows 432a-432k may have a large latency. 
For example, a DRAM access typically includes an first activation or opening stage, a stage that copies the contents of an entire row into the row buffer, a tag read stage, a tag comparison stage, a data read or write access stage that includes a column access, a first precharge or closing stage, a second activation or opening stage, a stage that copies the contents of the entire row again into the row buffer, a tag read stage, a tag comparison stage, an update stage for status information corresponding to the matching tag, and a second precharge or closing stage. The two separate tags are compared to one another. If the two tags match, then the memory access request that was previously described may be executed, which can involve a plurality of actions as well as cache lines, see Loh paragraph [0061], Continuing with the access steps within the memory array bank 430, one or more additional precharge and activation stages may be included after each access of the row buffer if other data stored in other rows are accessed in the meantime. Rather than utilize multiple DRAM transactions for a single cache access, the sequence of steps 1-7, may be used to convert a cache access into a single DRAM transaction. Each of the different DRAM operations, such as activation/open, column access, read, write, and precharge/close, has a different respective latency).
Loh does not teach a cache memory comprising a plurality of sets of cache lines each comprising a plurality of cache storage locations, wherein a first location of a first set of the plurality of sets of cache lines comprises cache tag data for the plurality of sets of cache lines; the first write request from the first location of the first set of the plurality of sets of cache lines in the cache memory; wherein the first location storing the cache tag data is part of a different set of cache lines in the cache memory than the first cache line identified in the received first read request; determine that the first write request comprises a tag write request; read second tag data corresponding to the first write request from the cache memory; receive the second write request from the cache controller over the interconnect; perform a write operation on the first cache line.
However, Rolan teaches a cache memory comprising a plurality of sets of cache lines each comprising a plurality of cache storage locations, wherein a first location of a first set of the plurality of sets of cache lines comprises cache tag data for the plurality of sets of cache lines; (Rolan paragraphs [0020-0021], The processor core(s) 104 may be coupled—e.g. through interconnect 105—to one or more on-die caches 106 physically located on the same die as the processor core(s) 104. In many embodiments, a cache has a tag storage 114 associated with it that stores tags for all cache memory locations. In many embodiments, tag storage 114 resides on a separate silicon die, e.g. Die 2 112, from the processor core(s) 104. In many embodiments, tag storage 114 is coupled to one or more off-die (non-processor die) cache(s) 116—e.g. through interconnect 105—and is located on the same die as off-die cache(s) 116. [0021] A cache of cache tags (CoCT) 108 may store a subset of the off-die cache tags on processor die 102. Specifically, while tag storage 114 stores all index values and associated tag sets per index value, CoCT 108, on the other hand, may not store all possible index values. Rather, to save on storage space, CoCT 108 may store merely a subset of the tags stored in tag storage 114. In some embodiments, not all index locations are represented at any given time in CoCT 108. The cache memory contains a plurality of cache memory locations, wherein the cache tag data for the locations can be stored in a first cache location corresponding to the cache of cache tags. 
The first cache location can store the cache tags for each of the cache sets in different assigned portions, see Rolan paragraph [0082], In an embodiment, the method further comprises storing at the tag storage a second set including second tags each associated with a respective data location stored within the cache memory, and in response to any determination that a tag of the second set is to be stored to the cache of cache tags, storing all tags of the second set to the second portion. In another embodiment, any storage of tags of the second set to the cache of cache tags by the controller includes storage of the tags of the second set to only the second portion. In another embodiment, of all set of tags of the tag storage, the first portion is to store only tags of the first set) the first write request from the first location of the first set of the plurality of sets of cache line in the cache memory; (Rolan paragraphs [0050-0051], For example, method 400 may determine, at 410, whether (or not) a tag of a first memory access request corresponds to (e.g. matches) a tag of the first set of tags. In response to the evaluation at 410 determining that the tag of the first memory access request corresponds to such a cached tag, method 400 may, at 415, change a variable of the first entry to indicate increased activity of the first set. Method 400 may further service the first request, as indicated at 420 by operations to access a location in cache memory which corresponds to the tag of the first memory access request. Otherwise, method 400 may, at 425, change the variable of the first entry to indicate decreased activity of the first set. [0051] Subsequent to the first memory access request, a second memory access request may be issued from a host processor or other such requestor logic. The second memory access request may target a memory location which is not currently represented in the cache of cache tags. 
For example, controller logic may evaluate a tag which is included in the second memory access request to determine whether the tag matches any tag stored or otherwise tracked in the cache of cache tags. The tag data can be requested to be read from the aforementioned first location storing the plurality of cache tag data for the cache sets of cache lines).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh with those of Rolan. Rolan teaches storing cache tag data for a plurality of cache sets in a first cache location. This approach allows cache memory access requests to be searched and located more efficiently, and allows cache resource allocation to be adjusted based on how frequently the cache tags of each cache set are requested (Rolan paragraph [0085], In another implementation, a method comprises, in response to storage of a first set of tags to a cache of cache tags, associating a first entry of a replacement table with the first set of tags, including setting a first activity variable of the first entry to an initial value of a pre-determined plurality of values, wherein a tag storage stores tags which are each associated with a respective data location of a cache memory, and wherein the cache of cache tags stores a subset of tags stored at the tag storage. The method further comprises, if a first memory access request comprises a tag corresponding to one of the first set of tags, then changing the first activity variable to another of the pre-determined plurality of values to indicate an increase of a level of activity of the first set, otherwise changing the first activity variable to another of the pre-determined plurality of values to indicate a decrease of the level of activity of the first set. The method further comprises, in response to a failure to identify any tag of the cache of cache tags which matches a tag of a second memory access request, selecting a set of tags to evict from the cache of cache tags, including searching the replacement table for an activity variable which is equal to the initial value. 
Additionally, keeping the cache tag information on the cache storage itself is a method of reducing cache lookup latency, see Rolan paragraph [0006], Each cache has a tag storage structure. If the processor needs data from a certain memory location, it can determine if the data is stored in a given cache by doing a comparison of the memory location address and the tag storage structure for the cache. If the tag storage structure is off-die, the latency to do a tag lookup will be greater than if the tag storage structure is on-die. Thus, although on-die tag storage structures increase the cost of the processor die because they take up valuable space, they help speed up execution by reducing the latencies of tag lookups versus off-die caches).

Loh in view of Rolan does not teach determine that the first write request comprises a tag write request; read second tag data corresponding to the first write request from the cache memory; wherein the first location storing the cache tag data is part of a different set of cache lines in the cache memory than the first cache line identified in the received first read request; receive the second write request from the cache controller over the interconnect; perform a write operation on the first cache line.
However, Steely teaches wherein the first location storing the cache tag data is part of a different set of cache lines in the cache memory than the first cache line identified in the received first read request (Steely paragraph [0016], In an embodiment, the domain state field may be assigned a value indicating that all read requests and all write requests for its associated cache line can only be serviced in the domain state field's associated domain. In an embodiment, the domain state field may be assigned a value indicating that all read requests for the cache line can be serviced in the domain state field's associated domain, that no write requests for the cache line can be serviced in the domain state field's associated domain, and that write requests for the cache line must go to a next level of the hierarchy of cache tag directories. In an embodiment, the domain state field may be assigned a value indicating that no read requests or write requests for the cache line can be serviced in the domain state field's associated domain, and that read requests or write requests for the cache line must go to a next level of the hierarchy of cache tag directories. The cache lines utilized for data read requests are separate and distinct from the cache lines that are used to store the cache tag directories, see Steely paragraph [0014], Embodiments may be discussed herein which efficiently maintain cache coherency. In particular, embodiments of the present invention pertain to a feature for reading/writing a domain state field associated with a tag entry within a cache tag directory. In an embodiment, a value may be assigned to a domain state field of a tag entry in a cache tag directory. The cache tag directory may belong to a hierarchy of cache tag directories. Each tag entry may be associated with a cache line from a cache belonging to a domain. The domain may contain multiple caches. 
The value of the domain state field may indicate whether its associated cache line can be read or changed).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh and Rolan with those of Steely. Steely teaches a memory location that utilizes a plurality of cache lines for different memory purposes, such as storing cache tag data or completing a read data request from a host device. This allows the memory system to perform several operations at a single time which can improve data performance and keep cache tag data updated via its own cache memory usage, resulting in better cache coherence (Steely paragraph [0005], One conventional technique for maintaining cache coherency, particularly in distributed systems, is a directory-based cache coherency scheme. Directory-based coherency schemes utilize a centralized tag directory to record the location and the status of cache lines as they exist throughout the system. For example, the tag directory records which processor caches have a copy of the data, and further records if any of the caches have an updated copy of the data. When a processor makes a cache request to the main memory for a data item, the tag directory is consulted to determine where the most recent copy of the data resides. Based on this information, the most recent copy of the cache line is retrieved so that it may be provided to the requesting processor cache memory. The tag directory is then updated to reflect the new status for that cache line. Thus, each cache line read by a processor is accompanied by a tag directory update (i.e., a write). The directory-based cache coherency scheme may include multiple tag directories, and the tag directories may be arranged in a hierarchy. Furthermore, a hierarchical tag directory structure may include any number of levels).

Loh in view of Rolan in further view of Steely does not teach determine that the first write request comprises a tag write request; read second tag data corresponding to the first write request from the cache memory; receive the second write request from the cache controller over the interconnect; perform a write operation on the first cache line.
However, Jiang teaches determine that the first write request comprises a tag write request; read second tag data corresponding to the first write request from the cache memory; (Jiang paragraph [0004], Embodiments of the present disclosure also provide a method of operating a cache that comprises a tag array, a tag control buffer, a data array, and a write buffer. The method comprises: receiving a first write request including write data and a memory address; receiving a second data access request; determining a tag address based on the memory address; performing a first read operation to the tag array to determine if there is a cache-hit; and responsive to determining that there is a cache-hit: performing a write operation to the write buffer to store information related to the first write request, performing a write operation to the tag control buffer to update stored cache control information, performing second read operations to the tag array and to the data array for the second data access request, and performing a write operation to a first data entry of the data array based on the information related to the first write request stored in the write buffer. The write request can be a tag write request and can read the tag data) receive the second write request from the cache controller over the interconnect; perform a write operation on the first cache line (Jiang claim 12, The method of claim 11, wherein the second read operations to the tag array and to the data array for second request are performed before the write operation to the tag control buffer is completed.  The second write/read requests can be received to perform the operation on the initial storage/operation location. Also see Jiang paragraph [0005], Embodiments of the present disclosure also provide a computer system. The computer system comprises a hardware processor and a hierarchical memory system coupled with the hardware processor. 
The hierarchical memory system comprises a dynamic random access memory device and a cache. The cache comprises a tag array configured to store one or more tag addresses, a tag control buffer configured to store cache control information, a data array configured to store data acquired from the dynamic random access memory device, and a write buffer configured to store information related to a write request from the hardware processor. The tag array is configured to be accessed independently from the tag control buffer, and the data array is configured to be accessed independently from the write buffer).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to combine the teachings of Loh and Rolan and Steely with those of Jiang. Jiang teaches using a write request to determine if there is a write tag request associated with it, and to use the tag data to perform write operations. This allows the system to more accurately monitor data that is stored in the data arrays and entries, as well as performing operations in a more efficient manner based on the information that the tags relay to the memory controller (Jiang paragraphs [0006-0007], Embodiments of the present disclosure also provide a cache. The cache comprises a tag array configured to store one or more tag addresses and cache control information, a data array configured to store data acquired from a memory device, and a write buffer configured to store information related to a write request. The data array is configured to be accessed independently from the write buffer. [0007] Embodiments of the present disclosure also provide a method of operating a cache that comprises a tag array, a data array, and a write buffer. The method comprises: receiving a write request including a first data and a memory address; determining a tag address based on the memory address; performing a read operation to the tag array to determine if there is a cache-hit; responsive to determining that there is a cache-hit, performing a write operation to the write buffer to store the first data, and performing a write operation to the tag array to update stored cache control information; and responsive to determining that preset condition is satisfied, performing a write operation to the data array based on the first data stored in the write buffer).
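
For illustration only, the write sequence quoted from Jiang paragraph [0007] above can be sketched as follows. This is a minimal sketch under stated assumptions and is not Jiang's implementation: the class WriteBufferedCache, its method names, the direct-mapped address split, and the use of an explicit drain call in place of Jiang's unspecified "preset condition" are all hypothetical.

```python
class WriteBufferedCache:
    """Hypothetical direct-mapped cache with a tag array, data array and write buffer."""

    def __init__(self, num_sets):
        self.num_sets = num_sets
        self.tag_array = {}     # set index -> (tag, dirty flag as control information)
        self.data_array = {}    # set index -> cached data
        self.write_buffer = []  # pending (set index, data) pairs

    def split(self, address):
        # Assumed address split: the quotient is the tag, the remainder is the set index.
        return address // self.num_sets, address % self.num_sets

    def fill(self, address, data):
        # Helper for the example only: install a clean line.
        tag, index = self.split(address)
        self.tag_array[index] = (tag, False)
        self.data_array[index] = data

    def write(self, address, data):
        tag, index = self.split(address)
        entry = self.tag_array.get(index)  # read operation on the tag array
        if entry is None or entry[0] != tag:
            return False                   # no cache hit
        self.write_buffer.append((index, data))  # store the write data in the write buffer
        self.tag_array[index] = (tag, True)      # update the stored control information
        return True

    def drain(self):
        # Stand-in for the "preset condition": the caller decides when to drain.
        for index, data in self.write_buffer:
            self.data_array[index] = data
        self.write_buffer.clear()
```

In this sketch the dirty flag is updated at the time of the tag hit, while the data array is written only later from the write buffer, which is the ordering the quoted passage describes.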

Regarding claim 20, Loh in view of Rolan in further view of Steely and further in view of Jiang teaches The device of claim 19, wherein the memory controller is further configured to: if the second tag data does not match the first tag data, send at least one of the second tag data read from the cache memory or an indication of a cache miss to the cache controller over the interconnect (Loh paragraphs [0072-0074], In block 504, the processing unit may determine a given memory request misses within a cache memory subsystem within the processing unit. In block 506, the processing unit may send an address corresponding to the given memory request to an in-package integrated DRAM cache, such as the 3D DRAM. The address may include a non-translated cache tag in addition to a DRAM address translated from a corresponding cache address used within the processing unit to access on-chip caches. In block 508, control logic within the 3D DRAM may identify a given row corresponding to the address within the memory array banks in the 3D DRAM. In block 510, control logic within the 3D DRAM may activate and open the given row. In block 512, the contents of the given row may be copied and stored in a row buffer. In block 514, the tag information in the row buffer may be compared with tag information in the address. The steps described in blocks 506-512 may correspond to the sequences 1-4 described earlier regarding FIG. 4. If the tag comparisons determine a tag hit does not occur (conditional block 516), then in block 518, the memory request may be sent to main memory. The main memory may include an off-chip non-integrated DRAM and/or an off-chip disk memory. If the tag comparisons determine a tag hit occurs (conditional block 516), then in block 520, read or write operations are performed on a corresponding cache line in the row buffer. When the data being compared results in a difference (i.e., not matching), the memory can return a cache miss).
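For illustration only, the row-buffer tag comparison quoted from Loh paragraphs [0072-0074] (blocks 508-520) can be sketched as below. The function name, the representation of a DRAM row as a list of (tag, cache line) pairs, and the dictionary standing in for main memory are assumptions made for the sketch and do not appear in Loh.

```python
def access_dram_cache(rows, main_memory, row_index, request_tag, address):
    """Hypothetical model of the Loh flow: open a row, compare tags, hit or miss."""
    # Blocks 508-512: identify and open the row, then copy it into a row buffer.
    row_buffer = list(rows[row_index])
    # Block 514: compare the tag in the request with each tag held in the row buffer.
    for stored_tag, cache_line in row_buffer:
        if stored_tag == request_tag:
            # Block 520: tag hit, so operate on the matching line in the row buffer.
            return ("hit", cache_line)
    # Block 518: no tag hit, so the request is sent on to main memory.
    return ("miss", main_memory[address])


# Example: row 0 jointly stores two tags and their cache lines.
rows = {0: [(0x1A, "line A"), (0x2B, "line B")]}
main_memory = {0x3C00: "line C from main memory"}

print(access_dram_cache(rows, main_memory, 0, 0x2B, 0x2B00))
print(access_dram_cache(rows, main_memory, 0, 0x3C, 0x3C00))
```

Because the tags and the data sit in the same modeled row, a single row activation serves both the tag comparison and the data access, which is the arrangement the cited passage describes.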

Response to Arguments
Applicant’s arguments, see pages 1-5 (numbered pages 8-12), filed August 26th, 2022, with respect to the rejection(s) of claim(s) 1-20 under 35 U.S.C. 103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made over Loh in view of Rolan in further view of Steely, JR. et al. (US Publication No. 2014/0052920 – “Steely”).

The newly amended claim limitation has been determined to overcome the previous 35 U.S.C. 103 rejection, but is now rejected in view of the Steely reference cited above. The arguments below are a reiteration of the arguments provided in the after-final submission and advisory action.

Applicant argues:
“Loh is directed to a DRAM cache with tags and data jointly stored in physical rows. (Loh, Abstract.) Loh describes a computing system 100 including a microprocessor 110 including processor cores 122a-122b and cache memory subsystems 124a-124b. (Loh, Paragraph 21.) The microprocessor 110 is connected to an off-chip disk memory 162 via memory bus 150 and I/O controller and bus 160. (Loh, Paragraph 31.) Loh teaches that a memory request sent from a processing unit 220 to a 3D DRAM 330 may include a cache tag in addition to a DRAM address identifying a respective row within one of the memory array banks 332a-332b. (Loh, Paragraph 47.) Loh, however, does not teach or suggest reading second tag data corresponding to the tag read request from the first location of the first set of the plurality of sets of cache lines in the cache memory, wherein the first location storing the cache tag data is part of a different set of cache lines in the cache memory than the first cache line identified in the received first read request, as claimed. Rather, Loh teaches that the received cache tag may be used to compare to cache tags stored in the identified given row within the 3D DRAM 330. (Loh, Paragraph 47.) Thus, it is clear in Loh that the cache tags are stored in the same row (i.e., the same set of cache lines) as the row identified by the DRAM address in the received memory request. Loh cannot be properly interpreted as teaching that the cache tag data is stored in a different set of cache lines than the cache line identified in the request. Therefore, Loh does not teach or suggest this feature of the claims.
Rolan is directed to caching sets of tags of an off-die cache memory. (Rolan, Abstract.) Rolan, however, does not cure the deficiencies of Loh noted above, because Rolan also does not teach or suggest reading second tag data corresponding to the tag read request from the first location of the first set of the plurality of sets of cache lines in the cache memory, wherein the first location storing the cache tag data is part of a different set of cache lines in the cache memory than the first cache line identified in the received first read request, as claimed. 
For at least the reasons stated above, the combination of Loh and Rolan fails to teach or suggest all of the features of the claims. Therefore, Applicant respectfully submits that claims 1 and 10 are patentable over the cited references.”

The applicant refers to the cited portion of the Loh reference, which is used to disclose the concept of reading cache tag data corresponding to a first location of a cache line in cache memory (see Loh paragraph [0048], Similar to other DRAM topologies, the 3D DRAM 330 may include multiple memory array banks 332a-332b. Each one of the banks 332a-332b may include a respective one of the row buffers 334a-334b. Each one of the row buffers 334a-334b may store data in an accessed row of the multiple rows within the memory array banks 332a-332b. The accessed row may be identified by a DRAM address in the received memory request. The control logic 336 may perform tag comparisons between a cache tag in a received memory request and the one or more cache tags stored in the row buffer. In addition, the control logic may alter a column access of the row buffers by utilizing the cache tag comparison results rather than a bit field within the received DRAM address. The cache tag that is used to perform the tag read request may provide the memory system with an additional tag located in the particular cache line that the original tag was identifying, so that two separate tag data are obtained). The Rolan reference is added to the Loh reference as a secondary reference, in order to disclose the teaching of reading second cache tag data in response to a cache tag read request from a first cache location (see Rolan paragraphs [0050-0051], For example, method 400 may determine, at 410, whether (or not) a tag of a first memory access request corresponds to (e.g. matches) a tag of the first set of tags. In response to the evaluation at 410 determining that the tag of the first memory access request corresponds to such a cached tag, method 400 may, at 415, change a variable of the first entry to indicate increased activity of the first set. 
Method 400 may further service the first request, as indicated at 420 by operations to access a location in cache memory which corresponds to the tag of the first memory access request. Otherwise, method 400 may, at 425, change the variable of the first entry to indicate decreased activity of the first set. [0051] Subsequent to the first memory access request, a second memory access request may be issued from a host processor or other such requestor logic. The second memory access request may target a memory location which is not currently represented in the cache of cache tags. For example, controller logic may evaluate a tag which is included in the second memory access request to determine whether the tag matches any tag stored or otherwise tracked in the cache of cache tags. The tag data can be requested to be read from the aforementioned first location storing the plurality of cache tag data for the cache sets of cache lines). This cited portion of Rolan discloses a flexible cache tag reading structure wherein a plurality of cache tag read requests can be issued in a variety of ways to access a plurality of cache tag data from different regions/entries. However, the examiner agrees with the applicant’s argument that, once the claims clarify that the first location storing the cache tag data is part of a different set of cache lines than the first cache line identified in the received first read request, the cited references above are overcome. Previously, the claims referred only to first and second cache tag data and did not require that the cache lines from which the cache tag data is read be in a different set of cache lines in the cache memory.
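For illustration only, the activity tracking quoted from Rolan paragraph [0050] (steps 410-425 of method 400) can be sketched as below. The class name, the integer activity counter, and the boolean return value are assumptions made for the sketch; Rolan states only that a variable of the entry is changed to indicate increased or decreased activity.

```python
class TagSetEntry:
    """Hypothetical entry in a cache of cache tags, covering one set of tags."""

    def __init__(self, tags):
        self.tags = set(tags)  # tags of the set currently held on-die
        self.activity = 0      # assumed form of the per-entry activity variable


def handle_request(entry, request_tag):
    """Return True if the request tag matches a cached tag of the entry."""
    if request_tag in entry.tags:  # step 410: compare request tag with cached tags
        entry.activity += 1        # step 415: indicate increased activity
        return True                # step 420: the request would then be serviced
    entry.activity -= 1            # step 425: indicate decreased activity
    return False


entry = TagSetEntry(tags=[0x10, 0x20])
handle_request(entry, 0x10)  # match
handle_request(entry, 0x30)  # no match
```

The sketch covers only the first memory access request; the handling of a second request whose tag is not represented in the cache of cache tags (Rolan paragraph [0051]) is not modeled.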
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONAH C KRIEGER whose telephone number is (571)272-3627.  The examiner can normally be reached on Monday - Friday 8 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Rones, can be reached on (571)272-4085.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/J.C.K./           Examiner, Art Unit 2136          

/CHARLES RONES/           Supervisory Patent Examiner, Art Unit 2136