The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-28 are presented for examination in this application (16/548,116) filed on August 22, 2019.
The Examiner cites particular sections in the references as applied to the claims below for the convenience of the applicant(s). Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant(s) fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
Claims 1-28 are pending for consideration. 
Drawings
The drawings submitted on August 22, 2019 have been considered and accepted.
Information Disclosure Statement
Acknowledgment is made of the information disclosure statements filed on August 22, 2019, December 20, 2019, February 24, 2020, April 18, 2020, April 29, 2020, July 10, 2020 and February 12, 2021. U.S. patents and Foreign Patents have been considered.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
 (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 2, 5, 7, 12, 15, 20, 21, 24 and 25 are rejected under 35 U.S.C. 102(a)(1) and 102(a)(2) as being anticipated by Wallin et al. (US PGPUB 2004/0260883, hereinafter "Wallin").
As per independent claim 1, Wallin discloses a programmable switch, comprising: a plurality of ports for communication with devices on a network [(Paragraph 0028; FIGs. 1 and 2) wherein Wallin teaches that the node interconnect 20 may be implemented as a circuit-switched network or a packet-switched network. In embodiments where node interconnect 20 is a packet-switched network, packets may be sent through the data network using techniques such as wormhole, store and forward, or virtual cut-through. In a circuit-switched network, a particular device may communicate directly with a second device via a dedicated point-to-point link that may be established through a switched interconnect mechanism. To communicate with a different device, a different link is established through the switched interconnect. In some embodiments, separate address and data networks may be employed to correspond to the claimed limitation]; and circuitry configured to:
receive program instructions to program the programmable switch for processing packets within the network [(Paragraphs 0015, 0028, 0032-0034 and 0041-0043; FIGs. 1 and 2) wherein Wallin teaches that while bundling the original Read, ReadExclusive and Upgrade requests together with the prefetch requests may reduce the number of address transactions conveyed on interconnect 20, the technique may not reduce the number of snoop lookups each cache 18 is required to perform. In addition, in some instances, the technique may create a multi-source situation, where a single address transaction would result in data packets being transferred from many different sources. In various systems, such a situation may violate some basic assumptions for cache coherence to correspond to the claimed limitation];
receive a cache line request from a client of a plurality of clients on the network to obtain a cache line [(Paragraphs 0028, 0032-0034 and 0041-0043; FIGs. 1 and 2) wherein Wallin teaches that when a request corresponding to a read operation is received by cache controller 202 and a miss occurs, fetch/prefetch controller 210 generates a bundled transaction that is conveyed on bus interconnect 20 that specifies a Read request corresponding to the cache line and that additionally specifies a prefetch Read request(s) to one or more sequential cache lines. Similarly, when a request corresponding to a write operation is received and a valid copy of the cache line does not exist in the cache memory, fetch/prefetch controller 210 may generate a bundled transaction that is conveyed on interconnect 20 that specifies a ReadExclusive request corresponding to the cache line and that specifies one or more prefetch ReadExclusive requests to the next sequential cache line(s). Likewise, when a request is received that requires a write access right to a cache line that exists in the cache memory in a read only state (e.g., the Shared state), fetch/prefetch controller 210 may generate a bundled transaction that is conveyed on interconnect 20 that specifies an Upgrade request to the corresponding cache line and that also specifies one or more Upgrade requests to the next sequential cache line(s). It is noted that in some implementations the fetch/prefetch controller 210 may specify the additional prefetched Upgrade requests only if the corresponding cache line(s) also exist in the cache memory in a readable state. FIG. 3 is a flow diagram illustrating aspects of operation of one embodiment of fetch/prefetch controller 210. In step 302, cache controller 202 receives a request for a particular cache line. The request may correspond to a read operation or a write operation initiated by the corresponding processor 16. In response to receiving the request, cache controller 202 performs a lookup within cache memory 204 in step 304 to determine whether a cache line corresponding to the address of the request resides in the cache memory, and to determine whether the access right to the line as indicated by the state field is sufficient to satisfy the request. A cache hit occurs when a line exists within cache memory 204 that can be used to satisfy the request. If a hit occurs (step 306), cache controller 202 may perform subsequent operations (not shown) to satisfy the request, such as providing the data to the requesting processor in the case of a read operation or writing a new data entry to the cache line in the case of a write operation to correspond to the claimed limitation];
identify one or more additional cache lines to obtain based on the received cache line request and prefetch information received from a host in communication with the programmable switch, in accordance with the received program instructions [(Paragraphs 0028, 0032-0033 and 0041-0043; FIGs. 1, 2 and 5) wherein Wallin teaches that FIG. 2 is a block diagram of a cache subsystem illustrative of each of the caches 18 of FIG. 1. As illustrated, the cache subsystem includes a cache controller 202 coupled to a cache memory 204. Cache controller 202 includes a fetch/prefetch controller 210 configured to perform prefetching operations. As will be described in further detail below, in various embodiments, fetch/prefetch controller 210 may be configured to bundle a request generated due to a cache miss together with one or more associated prefetch requests to form a single request transaction conveyed on interconnect 20. In this manner, the amount of address traffic on interconnect 20 (and network 40) may be reduced. Further, as illustrated in FIG. 5, if a request resulting in a cache miss necessitates issuance of a Read request (step 502), fetch/prefetch controller 210 may bundle the original request with one or more prefetch requests into a single request transaction conveyed on interconnect 20 in step 504. A single bundled read transaction may include the address A of the cache miss and information about the address offsets to the K prefetches. For example, in one implementation, the transaction encoding illustrated in FIG. 4 may be employed. Thus, the address offsets relative to the original address may be encoded in a prefetch bit mask. In one embodiment, all caching devices and memory devices on interconnect 20 may need to perform a snoop lookup for address A but only the device owning cache line A performs snoop lookups for the prefetched cache lines. This device will reply with data for each prefetched address for which it is the owner. Otherwise, an empty NACK data packet may be supplied for the prefetched cache line. Since the states of the other caches are not affected by the prefetch transaction, they do not need to snoop the prefetch addresses to correspond to the claimed limitation];
request the cache line and the identified one or more additional cache lines from one or more memory devices on the network via one or more ports of the plurality of ports [(Paragraphs 0028, 0032-0033 and 0041-0043; FIGs. 1, 2 and 5) wherein Wallin teaches that as illustrated in FIG. 5, if a request resulting in a cache miss necessitates issuance of a Read request (step 502), fetch/prefetch controller 210 may bundle the original request with one or more prefetch requests into a single request transaction conveyed on interconnect 20 in step 504. A single bundled read transaction may include the address A of the cache miss and information about the address offsets to the K prefetches. For example, in one implementation, the transaction encoding illustrated in FIG. 4 may be employed. Thus, the address offsets relative to the original address may be encoded in a prefetch bit mask. In one embodiment, all caching devices and memory devices on interconnect 20 may need to perform a snoop lookup for address A but only the device owning cache line A performs snoop lookups for the prefetched cache lines. This device will reply with data for each prefetched address for which it is the owner. Otherwise, an empty NACK data packet may be supplied for the prefetched cache line. Since the states of the other caches are not affected by the prefetch transaction, they do not need to snoop the prefetch addresses to correspond to the claimed limitation];
receive the requested cache line and the one or more additional cache lines from the one or more memory devices; and send the requested cache line and the one or more additional cache lines to the client [(Paragraphs 0028, 0032-0033 and 0041-0046; FIGs. 1, 2 and 5) wherein Wallin teaches that if a cache miss in step 306 necessitates issuance of an upgrade request in step 502, the invalidate requests (i.e., corresponding to the upgrade requests) for each of the K consecutive cache lines being in the Shared state in the requesting device are bundled in step 504 with the Invalidate request of address A on the bus. Address A is snooped by all devices, possibly causing a cache invalidation. If a device has address A in the Owner_2 state, it will also invalidate each of the prefetch cache lines it currently has in the Owner_2 state. It is noted that cache lines in the Owner_2 state are shared by at most one other device, i.e., the requesting device, so the copy in the requesting device will be the only copy left. The device owning address A will send a reply to the requesting node indicating which of the bundled upgrade cache lines it now safely can put into the Modified state. Cache lines being invalidated in the Owner_m state cannot be handled in the same way since the number of sharers is not known. In this case, only the original address will be invalidated to correspond to the claimed limitation].
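For illustration only, the bundled read transaction relied upon above (an address A plus a prefetch bit mask encoding the offsets of the K prefetches, with the owner of line A answering each prefetched address with data or an empty NACK) may be sketched as follows. This is a minimal sketch and not the encoding of FIG. 4 of Wallin: the 64-byte line size, the bit ordering of the mask, and the names BundledRead, bundle, prefetch_addresses and owner_reply are all assumptions made for the example.

```python
from dataclasses import dataclass

LINE_SIZE = 64  # assumed cache line size in bytes; not stated in the cited passage


@dataclass(frozen=True)
class BundledRead:
    """Hypothetical bundled transaction: demand address plus prefetch bit mask."""
    address: int        # address A of the demand miss
    prefetch_mask: int  # assumed layout: bit i set => prefetch line A + (i + 1) lines


def bundle(address, line_offsets):
    """Fold the K prefetch offsets (in lines, relative to A) into one bit mask."""
    mask = 0
    for k in line_offsets:
        if k < 1:
            raise ValueError("offsets are positive line counts relative to A")
        mask |= 1 << (k - 1)
    return BundledRead(address, mask)


def prefetch_addresses(txn):
    """Decode the bit mask back into the prefetched line addresses."""
    addrs = []
    mask, i = txn.prefetch_mask, 0
    while mask:
        if mask & 1:
            addrs.append(txn.address + (i + 1) * LINE_SIZE)
        mask >>= 1
        i += 1
    return addrs


def owner_reply(txn, owned_lines):
    """Owner of line A replies DATA for prefetch lines it owns, NACK otherwise."""
    return {a: ("DATA" if a in owned_lines else "NACK")
            for a in prefetch_addresses(txn)}


txn = bundle(0x1000, [1, 2, 4])
replies = owner_reply(txn, owned_lines={0x1040})
```

Under these assumptions a single address transaction carries the demand request and all K prefetch requests, which is the reduction in address traffic that the cited paragraphs describe.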
As per dependent claim 2, Wallin discloses wherein the circuitry is further configured to send cache miss data to the host, the cache miss data representing cache line requests received by the programmable switch [(Paragraphs 0028, 0032-0033 and 0041-0047; FIGs. 1, 2, 3 and 5) wherein Wallin teaches that FIG. 2 is a block diagram of a cache subsystem illustrative of each of the caches 18 of FIG. 1. As illustrated, the cache subsystem includes a cache controller 202 coupled to a cache memory 204. Cache controller 202 includes a fetch/prefetch controller 210 configured to perform prefetching operations. As will be described in further detail below, in various embodiments, fetch/prefetch controller 210 may be configured to bundle a request generated due to a cache miss together with one or more associated prefetch requests to form a single request transaction conveyed on interconnect 20. In this manner, the amount of address traffic on interconnect 20 (and network 40) may be reduced. Further, as illustrated in FIG. 5, if a request resulting in a cache miss necessitates issuance of a Read request (step 502), fetch/prefetch controller 210 may bundle the original request with one or more prefetch requests into a single request transaction conveyed on interconnect 20 in step 504. A single bundled read transaction may include the address A of the cache miss and information about the address offsets to the K prefetches. For example, in one implementation, the transaction encoding illustrated in FIG. 4 may be employed. Thus, the address offsets relative to the original address may be encoded in a prefetch bit mask. In one embodiment, all caching devices and memory devices on interconnect 20 may need to perform a snoop lookup for address A but only the device owning cache line A performs snoop lookups for the prefetched cache lines. This device will reply with data for each prefetched address for which it is the owner. Otherwise, an empty NACK data packet may be supplied for the prefetched cache line. Since the states of the other caches are not affected by the prefetch transaction, they do not need to snoop the prefetch addresses. If a cache miss in step 306 necessitates issuance of an upgrade request in step 502, the invalidate requests (i.e., corresponding to the upgrade requests) for each of the K consecutive cache lines being in the Shared state in the requesting device are bundled in step 504 with the Invalidate request of address A on the bus. Address A is snooped by all devices, possibly causing a cache invalidation. If a device has address A in the Owner_2 state, it will also invalidate each of the prefetch cache lines it currently has in the Owner_2 state. It is noted that cache lines in the Owner_2 state are shared by at most one other device, i.e., the requesting device, so the copy in the requesting device will be the only copy left. The device owning address A will send a reply to the requesting node indicating which of the bundled upgrade cache lines it now safely can put into the Modified state. Cache lines being invalidated in the Owner_m state cannot be handled in the same way since the number of sharers is not known. In this case, only the original address will be invalidated. As illustrated in FIG. 5, in one embodiment, for requests other than Read or Upgrade requests, fetch/prefetch controller 210 does not convey bundled prefetch transaction requests. Instead, in step 506 fetch/prefetch controller 210 may generate a transaction on interconnect 20 corresponding to the address of the miss, and may additionally convey separate prefetch transactions on interconnect 20, as desired to correspond to the claimed limitation].
As per dependent claim 5, Wallin discloses wherein the client is configured to execute an internal memory access prediction algorithm for loading locally stored cache lines into a memory of the client from a storage device of the client in addition to requesting cache lines from the programmable switch [(Paragraphs 0034-0035; FIG.3) wherein FIG. 3 is a flow diagram illustrating aspects of operation of one embodiment of fetch/prefetch controller 210. In step 302, cache controller 202 receives a request for a particular cache line. The request may correspond to a read operation or a write operation initiated by the corresponding processor 16. In response to receiving the request, cache controller 202 performs a lookup within cache memory 204 in step 304 to determine whether a cache line corresponding to the address of the request resides in the cache memory, and to determine whether the access right to the line as indicated by the state field is sufficient to satisfy the request. A cache hit occurs when a line exists within cache memory 204 that can be used to satisfy the request. If a hit occurs (step 306), cache controller 202 may perform subsequent operations (not shown) to satisfy the request, such as providing the data to the requesting processor in the case of a read operation or writing a new data entry to the cache line in the case of a write operation. A miss may occur in cache memory 204 for various reasons. For example, a request to cache controller 202 that corresponds to a write operation initiated by the associated processor 16 may require that a line be in a valid, writable state, such as the modified state of the MOSI protocol. If a writable copy of the cache line does not exist in cache memory 204, the cache controller 202 may initiate a ReadExclusive request on interconnect 20 in step 312 to obtain a writable copy of the cache line. 
Alternatively, if the cache line exists in the cache memory 204 but is not in a writable state (e.g., a copy exists in the shared state of the MOSI protocol), cache controller 202 may transmit an upgrade request on interconnect 20 in step 312 to allow the line to be upgraded to a writable state. Similarly, if a request to cache controller 202 is received that corresponds to a read operation initiated by the associated processor 16, but a copy of the cache line does not already exist in the cache memory 204 or the cache line exists but is in an invalid state, cache controller 202 may transmit a Read request on interconnect 20 in step 312 to obtain a readable copy of the cache line. It is noted that the requests initiated on interconnect 20 may be responded to by a memory 22 or by another cache 18 that owns the cache line to correspond to the claimed limitation].
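For illustration only, the lookup outcome described in the cited paragraphs (a hit when the line is present with a sufficient access right, and otherwise a Read, ReadExclusive or Upgrade request on the interconnect) may be sketched as a small decision function. This is a sketch under stated assumptions and not Wallin's implementation: the names State and select_request are hypothetical, and treating the Owned state as readable but not writable is an assumption about the MOSI protocol that the cited passage does not spell out.

```python
from enum import Enum


class State(Enum):
    """MOSI cache line states."""
    MODIFIED = "M"
    OWNED = "O"
    SHARED = "S"
    INVALID = "I"


def select_request(state, is_write):
    """Return the interconnect request needed, or None on a cache hit.

    `state` is None when no copy of the line exists in the cache memory.
    """
    if state is None or state is State.INVALID:
        # No usable copy: fetch a writable copy for writes, readable for reads.
        return "ReadExclusive" if is_write else "Read"
    if is_write and state is not State.MODIFIED:
        # A copy exists but is read only (assumed for both Shared and Owned).
        return "Upgrade"
    return None  # access right is sufficient: cache hit


outcomes = {
    "read miss": select_request(None, is_write=False),
    "write miss": select_request(State.INVALID, is_write=True),
    "write to shared": select_request(State.SHARED, is_write=True),
    "read of shared": select_request(State.SHARED, is_write=False),
}
```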
As per dependent claim 7, Wallin discloses wherein the cache line request follows a custom packet format including one or more fields indicating a memory message [(Paragraph 0040; FIG. 4) wherein the bundled transaction request may be encoded differently from the implementation illustrated in FIG. 4. For example, in various embodiments, certain bits may be provided in the transaction encoding to specify a stride associated with the prefetch requests. Likewise, bits may be provided within the transaction encoding to specify prefetch types that may differ from the request type of the original request (e.g., the request type of an original request may specify an Upgrade request, while the request type associated with a bundled prefetch request may specify a ReadExclusive operation) to correspond to the claimed limitation].

As for independent claims 12, 20, 21 and 25, the applicant is directed to the rejection of claim 1 set forth above, as these claims are rejected based on the same rationale.
As for dependent claim 15, the applicant is directed to the rejection of claim 5 set forth above, as this claim is rejected based on the same rationale.
As for dependent claim 24, the applicant is directed to the rejection of claim 7 set forth above, as this claim is rejected based on the same rationale.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 13 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Wallin, as applied to claims 1, 12, 20, 21 and 25 above, and further in view of Wang et al. (US PGPUB 2018/0262459) (hereinafter ‘Wang’).
As per dependent claim 3, Wallin teaches the programmable switch of claim 1.
Wallin does not appear to explicitly disclose wherein the circuitry is further configured to mirror packets for received cache line requests to a port of the plurality of ports to provide the host with the cache miss data.
However, Wang discloses wherein the circuitry is further configured to mirror packets for received cache line requests to a port of the plurality of ports to provide the host with the cache miss data [(Paragraphs 0018-0020; FIGs. 1 and 2) where Wang teaches that FIG. 2 illustrates ARP traffic flow in a network, according to certain aspects of the present disclosure. As illustrated, when a virtual machine (e.g., VM 140_1) wants to communicate with another virtual machine (e.g., a destination VM) in the network whose MAC address does not appear in its ARP cache, VM 140_1 transmits an ARP request packet 202 to virtual switch 112. According to aspects, the destination MAC address of an ARP request packet may be set to a broadcast MAC address (e.g., FF:FF:FF:FF:FF:FF), which causes the ARP request packet to be flooded on all the virtual ports of the virtual switch 112. For example, as illustrated, upon receiving an ARP request packet 202 from the VM 140_1, the virtual switch 112 may duplicate the ARP request packet and transmit the duplicated ARP request packet on each virtual port of virtual switch 112 corresponding to the network. For example, virtual switch 112 may forward the ARP request packet on each virtual port to which a VM 140 (e.g., VM 140_2 through VM 140_n+1) is connected, and further to a virtual port (also referred to as an uplink port) where virtual switch 112 connects to a physical switch 115_1 (e.g., of physical network 130 via PNIC 102) to access entities outside host machine 100 implementing virtual switch 112. In physical switch 115_1, the ARP request packet may also get duplicated and flooded to all the physical switch ports in physical switch 115_1 and may even traverse another physical switch 115_2 and be duplicated and flooded to all the physical switch ports in that physical switch. The ARP request packet duplication behavior described above presents several issues. For example, such ARP request packet duplication may utilize CPU and memory resources in host machine 100. For example, virtual switch 112 is tasked with duplicating the ARP request packet multiple times according to the number of ports in virtual switch 112 corresponding to the network and sending out the packets even though the destination virtual machine may be connected to the same virtual switch (i.e., virtual switch 112) as the source virtual machine (i.e., the virtual machine that initially transmits the ARP request packet). Moreover, each of the recipients of the ARP request packets needs to process these packets, causing additional utilization of CPU and memory resources. Additionally, the CPU and memory resource use may increase as the number of virtual ports in the virtual switch 112 increases. Further, this problem also exists in the situation where hypervisor 105 receives additional ARP broadcast packets from an external network (e.g., physical network 130) and needs to process such received ARP broadcast packets, where it would have been obvious to one of ordinary skill in the art to use Wang's duplication of packets to switch ports to modify the programmable switch of Wallin so that packets for received cache line requests, as described by Wallin, are mirrored to a port of the plurality of ports to provide the host with the cache miss data as described by Wallin, to correspond to the claimed limitation].
Wallin and Wang are analogous art because they are from the same field of endeavor of packet switching management.
Before the effective filing date, it would have been obvious to one of ordinary skill in the art, having the teachings of Wallin and Wang before him or her, to modify the apparatus of Wallin to include the networking switch of Wang because it will enhance bandwidth.
The motivation for doing so would be that [the duplicate ARP request packets which flow into a physical switch through an uplink port also get duplicated in the physical switch and sent to all the switch ports in the network, which utilizes network bandwidth. Additionally, every VM and host in the network has to check if the duplicate ARP request packets are addressed to itself, which also increases CPU and memory resource utilization in each VM and host (Paragraph 0021 by Wang)].
Therefore, it would have been obvious to combine Wallin and Wang to obtain the invention as specified in the instant claim.
As for dependent claims 13 and 22, the applicant is directed to the rejection of claim 3 set forth above, as these claims are rejected based on the same rationale.
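For illustration only, the packet duplication that the rejection of claim 3 cites from Wang (a frame addressed to the broadcast MAC address being copied to every port of the switch) may be sketched as follows. This is a minimal sketch and not Wang's virtual switch: the port names and the function egress_ports are hypothetical, and excluding the ingress port from the flood is an assumption based on ordinary switch behavior rather than on the cited paragraphs.

```python
BROADCAST_MAC = "FF:FF:FF:FF:FF:FF"


def egress_ports(dst_mac, ingress_port, ports):
    """Return the ports that receive a duplicate of a broadcast frame."""
    if dst_mac.upper() != BROADCAST_MAC:
        return []  # unicast forwarding is outside the scope of this sketch
    # Assumed: one copy per port, skipping the port the frame arrived on.
    return [p for p in ports if p != ingress_port]


ports = ["vm1", "vm2", "vm3", "uplink"]
copies = egress_ports(BROADCAST_MAC, "vm1", ports)
```

The number of copies grows with the number of ports, which corresponds to the CPU, memory and bandwidth cost that the cited paragraphs of Wang attribute to ARP flooding.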
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Wallin, as applied to claim 1 above, and further in view of Pruss et al. (US PGPUB 2014/0269716) (hereinafter ‘Pruss’).
As per dependent claim 6, Wallin teaches the programmable switch of claim 1.
Wallin does not appear to explicitly disclose wherein the programmable switch forms part of a Software Defined Networking (SDN).
However, Pruss discloses wherein the programmable switch forms part of a Software Defined Networking (SDN) [(Paragraphs 0013, 0026, 0030 and 0035; FIGs. 1 and 2) where Pruss teaches that embodiments presented herein support tagging network traffic in order to allow programmable applications running on a router to manage traffic flows based on the labels. For example, this approach, which may be referred to as software defined networking (SDN), brings programmability of data networking into the network elements directly. The single platform kit provides programming constructs (e.g., application programming interfaces (APIs)) that abstract a variety of network functions and support both packet tagging operations as well as operations based on the tags assigned to the packets of a given data flow; the data center 200 incorporates software defined networking (SDN), which is an approach to building a computer network that involves separating and abstracting elements of the network. Applications running on clients 228, routers 227, and switches can add, change, and/or respond to tags in packets of traffic flows. The elements include the control plane and the data plane. SDN decouples the system that decides where traffic is sent (the control plane) from the system that forwards traffic to the selected destination (the data plane). This technology simplifies networking and enables new applications, such as network virtualization in which the control plane is separated from the data plane and is implemented in a software application (e.g., a virtual machine of the virtual server 224). The architecture of the data center 300 enables a network administrator to have programmable control of network traffic without requiring physical access to the network's hardware devices to correspond to the claimed limitation].
Wallin and Pruss are analogous art because they are from the same field of endeavor of switching management.
Before the effective filing date, it would have been obvious to one of ordinary skill in the art, having the teachings of Wallin and Pruss before him or her, to modify the apparatus of Wallin to include the networking switch SDN of Pruss because it will enhance performance.
The motivation for doing so would be to [provide a general mechanism to add information to a packet. So, a tag becomes a generalized way of communicating with the entire network (e.g., any network device that is on the transmission path of the packet). Also, a network programmer can define a protocol for particular traffic after a switch is purchased and installed on the network. The network programmer can even build and set protocols dynamically (e.g., during network runtime) (Paragraph 0018 by Pruss)].
Therefore, it would have been obvious to combine Wallin and Pruss to obtain the invention as specified in the instant claim.
Claims 8, 9, 16, 17 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Wallin, as applied to claims 1, 12, 20, 21 and 25 above, and further in view of Anantaraman et al. (US PGPUB 2015/0378919) (hereinafter ‘Anantaraman’).
As per dependent claim 8, Wallin teaches the programmable switch of claim 1.
Wallin does not appear to explicitly disclose wherein before sending the one or more additional cache lines to the client, the circuitry is further configured to: store the one or more additional cache lines in a memory of the programmable switch; send the requested cache line and one or more prefetch hints to the client indicating the one or more additional cache lines; receive a permission request from the client in response to the one or more prefetch hints, the permission request requesting access to the one or more additional cache lines; and in response to receiving the permission request from the client, sending the one or more additional cache lines stored in the memory.
However, Anantaraman discloses wherein before sending the one or more additional cache lines to the client, the circuitry is further configured to: store the one or more additional cache lines in a memory of the programmable switch; send the requested cache line and one or more prefetch hints to the client indicating the one or more additional cache lines [(Paragraphs 0016, 0029-0031, 0039, 0047-0048 and 0053-0058; FIGs 12 and 13) where the present invention of Anantaraman teaches that lower level memory generates a request vector for the cache line that triggered the cache miss, including a field for each cache line of the superline. The request vector includes a demand request for the cache line that caused the cache miss, and the lower level memory can enhance the request vector with prefetch hint information. The prefetch hint information can indicate a prefetch request for one or more other cache lines in the superline. The lower level memory sends the request vector to the higher level memory with the prefetch hint information. Based on the prefetch hint information, the higher level memory makes determinations about what prefetches to service, if any. After fetching the prefetch data, the higher level memory device can push the prefetched data line(s) back to the lower level memory. Thus, the higher level memory services the demand request for the cache line that caused a cache miss in the lower level memory, and selectively either services a prefetch hint or drops the prefetch hint and to mark a cache line as invalid or prefetch, cache 130 can be configured to be more or less aggressive with prefetching. For example, in one embodiment, prefetch request engine 132 is configured with a limit on how many valid bits should be in a superline before requesting prefetch of another cache line in the superline. 
In one embodiment, unless there are a threshold number of valid cache lines in the superline, prefetch request engine 132 will not execute prefetching where lower level memory generates a request vector for the cache line that triggered the cache miss, including a field for each cache line of the superline. The request vector includes a demand request for the cache line that caused the cache miss, and the lower level memory can enhance the request vector with prefetch hint information. The prefetch hint information can indicate a prefetch request for one or more other cache lines in the superline; the prefetch vector is a prefetch hint vector, which can provide a mechanism for piggybacking prefetch hint information from the cache controller or other lower level cache 130 to the memory controller or other higher level cache 140 without increasing pressure on the demand request interface between the two caches. As mentioned above, in one embodiment, cache 140 is not obligated to respond to every prefetch hint in the prefetch vector, and can selectively drop prefetch information (e.g., hints) based on availability or lack of availability of memory bandwidth. As mentioned above, in one embodiment, every demand read request made by cache 130 can include a prefetch hint vector. In one embodiment, the prefetch vector indicates the status of each cache line in a superline that the demand read request belongs to. It will be understood that different implementations can use different labels and indications to pass request information to the higher level cache. In one embodiment, the status of each cache line can be identified or marked as valid (V), invalid (I), prefetch (P), or demand (D). The status of the individual cache lines can be placed in a field or bit for each cache line. Valid indicates that the cache line is already present in the cache. 
Invalid status indicates that the cache line is not present in the cache; a status of invalid does not necessarily indicate a request by the lower level cache for the cache line. In one embodiment, the lower level cache can mark a cache line as invalid to indicate that the cache line is missing from the cache, but that it is not a candidate for prefetching. Prefetch status indicates that the cache line is not present in the lower level cache and is a candidate to be prefetched. The states I and P provide a mechanism for the lower level cache to control which lines are candidates for prefetching. Demand status indicates the position of the cache line in the superline that is the subject of the demand request to correspond to the claimed limitation]; receive a permission request from the client in response to the one or more prefetch hints, the permission request requesting access to the one or more additional cache lines [(Paragraphs 0016, 0029, 0039, 0047-0048 and 0053-0058; FIGs 12 and 13) where the present invention of Anantaraman teaches the prefetch vector is a prefetch hint vector, which can provide a mechanism for piggybacking prefetch hint information from the cache controller or other lower level cache 130 to the memory controller or other higher level cache 140 without increasing pressure on the demand request interface between the two caches. As mentioned above, in one embodiment, cache 140 is not obligated to respond to every prefetch hint in the prefetch vector, and can selectively drop prefetch information (e.g., hints) based on availability or lack of availability of memory bandwidth. As mentioned above, in one embodiment, every demand read request made by cache 130 can include a prefetch hint vector. In one embodiment, the prefetch vector indicates the status of each cache line in a superline that the demand read request belongs to. Service engine 350 includes update logic 356. 
Update logic 356 can be responsible for changing pending requests to the correct state. For example, update logic 356 can change a request to page hit/miss when an activate command is sent, or to page empty when a precharge command is sent. Additionally, the value of request vectors 344 should be changed (e.g., from P to I) when a data access request is generated for prefetch data, to prevent prefetching the same cache line multiple times. Update logic 356 can also invalidate a prefetch vector when a page corresponding to the prefetch vector is closed. In one embodiment, memory controller 340 only services prefetch requests for open pages, and drops prefetch hints for pages that are not open, where it would have been obvious to one of ordinary skill in the art to modify the system of Wallin to include the features of the memory controller to drop or service access to the one or more additional cache lines based on the prefetch hints to correspond to the claimed limitation]; and in response to receiving the permission request from the client, sending the one or more additional cache lines stored in the memory [(Paragraphs 0016, 0028-0029, 0039, 0047-0048 and 0053-0058; FIGs 12 and 13) where the present invention of Anantaraman teaches the prefetch request engine 132 is part of and/or executed by a sectored cache controller. In one embodiment, prefetch servicing engine 142 is part of and/or executed by a memory controller (MC). Thus, a cache controller can be responsible for determining which cache lines are good prefetch candidates while the memory controller can be responsible for prefetching the data and sending the prefetched data back to the cache controller. In one embodiment, lower level cache 130, higher level cache 140, and the interface between them is extended in accordance with a prefetcher framework such as set forth in FIG. 
3, and as described below with reference to system 300 and the lower level memory can selectively mark cache lines for prefetch or not, and the higher level memory (the bandwidth provider) can selectively drop the prefetch hint information without servicing the prefetch request. Such a loosely coupled prefetching mechanism allows the lower level cache to make determinations as to what cache lines should be prefetched, but allows the higher level memory to determine whether or not a prefetch request can or will be serviced. Thus, the prefetch requests can be referred to as prefetch hints, since the higher level memory has autonomy to ignore the request. Also, the system designer can control the aggression of the prefetching by the lower level memory. By placing a threshold of three valid cache lines before the lower level memory can make a prefetch request, the lower level memory will not be as aggressive in prefetching. By setting a threshold of 1, or even eliminating the threshold, the lower level memory can be more aggressive in making prefetch requests to correspond to the claimed limitation]. 
Wallin and Anantaraman are analogous art because they are from the same field of endeavor of memory and cache management.
Before the effective filing date, it would have been obvious to one of ordinary skill in the art, having the teachings of Wallin and Anantaraman before him or her, to modify the apparatus of Wallin to include the prefetch hint information of Anantaraman because it will enhance bandwidth.
The motivation for doing so would be that [the risk of wasting memory bandwidth is reduced by allowing the higher level memory or higher level cache to selectively service the prefetch requests (Paragraph 0034 of Anantaraman)].
Therefore, it would have been obvious to combine Wallin and Anantaraman to obtain the invention as specified in the instant claim.
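For illustration only, the prefetch hint vector mechanism cited from Anantaraman above can be sketched as follows. This is a minimal sketch and not code from the reference: the names LineState, build_request_vector and service_request, the boolean valid_bits input, and the spare_bandwidth counter are all hypothetical, and the policy shown (mark missing lines as P only when a threshold number of lines are valid; service hints only for open pages with spare bandwidth) is one assumed reading of the cited paragraphs.

```python
from enum import Enum


class LineState(Enum):
    # Per-line statuses named in the cited passage.
    VALID = "V"
    INVALID = "I"
    PREFETCH = "P"
    DEMAND = "D"


def build_request_vector(valid_bits, demand_index, threshold):
    """Lower level cache side (hypothetical): one status per line of a superline."""
    # Only mark missing lines as prefetch candidates when enough lines are valid.
    enough_valid = sum(valid_bits) >= threshold
    vector = []
    for i, valid in enumerate(valid_bits):
        if i == demand_index:
            vector.append(LineState.DEMAND)
        elif valid:
            vector.append(LineState.VALID)
        elif enough_valid:
            vector.append(LineState.PREFETCH)
        else:
            vector.append(LineState.INVALID)
    return vector


def service_request(vector, page_open, spare_bandwidth):
    """Higher level memory side (hypothetical): return indices of lines fetched."""
    # The demand request is always serviced.
    fetched = [i for i, s in enumerate(vector) if s is LineState.DEMAND]
    if not page_open:
        return fetched  # hints for pages that are not open are dropped
    for i, state in enumerate(vector):
        if state is LineState.PREFETCH and spare_bandwidth > 0:
            fetched.append(i)
            vector[i] = LineState.INVALID  # P -> I, so the line is not prefetched twice
            spare_bandwidth -= 1
    return fetched
```

In this sketch the lower level cache decides which lines are candidates, while the higher level memory remains free to ignore them, which is why the cited passage refers to the requests as hints.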
As per dependent claim 9, Anantaraman discloses wherein before requesting the one or more additional cache lines from the one or more memory devices, the circuitry is further configured to: send the requested cache line and one or more prefetch hints to the client indicating the one or more additional cache lines [(Paragraphs 0016, 0029-0031, 0039, 0047-0048 and 0053-0058; FIGs 12 and 13) where the present invention of Anantaraman teaches that, to mark a cache line as invalid or prefetch, cache 130 can be configured to be more or less aggressive with prefetching. For example, in one embodiment, prefetch request engine 132 is configured with a limit on how many valid bits should be in a superline before requesting prefetch of another cache line in the superline. In one embodiment, unless there are a threshold number of valid cache lines in the superline, prefetch request engine 132 will not execute prefetching where lower level memory generates a request vector for the cache line that triggered the cache miss, including a field for each cache line of the superline. The request vector includes a demand request for the cache line that caused the cache miss, and the lower level memory can enhance the request vector with prefetch hint information. The prefetch hint information can indicate a prefetch request for one or more other cache lines in the superline; the prefetch vector is a prefetch hint vector, which can provide a mechanism for piggybacking prefetch hint information from the cache controller or other lower level cache 130 to the memory controller or other higher level cache 140 without increasing pressure on the demand request interface between the two caches. As mentioned above, in one embodiment, cache 140 is not obligated to respond to every prefetch hint in the prefetch vector, and can selectively drop prefetch information (e.g., hints) based on availability or lack of availability of memory bandwidth. 
As mentioned above, in one embodiment, every demand read request made by cache 130 can include a prefetch hint vector. In one embodiment, the prefetch vector indicates the status of each cache line in a superline that the demand read request belongs to. It will be understood that different implementations can use different labels and indications to pass request information to the higher level cache. In one embodiment, the status of each cache line can be identified or marked as valid (V), invalid (I), prefetch (P), or demand (D). The status of the individual cache lines can be placed in a field or bit for each cache line. Valid indicates that the cache line is already present in the cache. Invalid status indicates that the cache line is not present in the cache; a status of invalid does not necessarily indicate a request by the lower level cache for the cache line. In one embodiment, the lower level cache can mark a cache line as invalid to indicate that the cache line is missing from the cache, but that it is not a candidate for prefetching. Prefetch status indicates that the cache line is not present in the lower level cache and is a candidate to be prefetched. The states I and P provide a mechanism for the lower level cache to control which lines are candidates for prefetching. 
Demand status indicates the position of the cache line in the superline that is the subject of the demand request to correspond to the claimed limitation]; receive a permission request from the client in response to the one or more prefetch hints, the permission request requesting access to the one or more additional cache lines [(Paragraphs 0016, 0029, 0039, 0047-0048 and 0053-0058; FIGs 12 and 13) where the present invention of Anantaraman teaches the prefetch vector is a prefetch hint vector, which can provide a mechanism for piggybacking prefetch hint information from the cache controller or other lower level cache 130 to the memory controller or other higher level cache 140 without increasing pressure on the demand request interface between the two caches. As mentioned above, in one embodiment, cache 140 is not obligated to respond to every prefetch hint in the prefetch vector, and can selectively drop prefetch information (e.g., hints) based on availability or lack of availability of memory bandwidth. As mentioned above, in one embodiment, every demand read request made by cache 130 can include a prefetch hint vector. In one embodiment, the prefetch vector indicates the status of each cache line in a superline that the demand read request belongs to. Service engine 350 includes update logic 356. Update logic 356 can be responsible for changing pending requests to the correct state. For example, update logic 356 can change a request to page hit/miss when an activate command is sent, or to page empty when a precharge command is sent. Additionally, the value of request vectors 344 should be changed (e.g., from P to I) when a data access request is generated for prefetch data, to prevent prefetching the same cache line multiple times. Update logic 356 can also invalidate a prefetch vector when a page corresponding to the prefetch vector is closed. 
In one embodiment, memory controller 340 only services prefetch requests for open pages, and drops prefetch hints for pages that are not open, where it would have been obvious to one of ordinary skill in the art to modify the system of Wallin to include the features of the memory controller to drop or service access to the one or more additional cache lines based on the prefetch hints to correspond to the claimed limitation]; and in response to receiving the permission request from the client, requesting the one or more additional cache lines from the one or more memory devices to send to the client [(Paragraphs 0016, 0028-0029, 0039, 0047-0048 and 0053-0058; FIGs 12 and 13) where the present invention of Anantaraman teaches the prefetch request engine 132 is part of and/or executed by a sectored cache controller. In one embodiment, prefetch servicing engine 142 is part of and/or executed by a memory controller (MC). Thus, a cache controller can be responsible for determining which cache lines are good prefetch candidates while the memory controller can be responsible for prefetching the data and sending the prefetched data back to the cache controller. In one embodiment, lower level cache 130, higher level cache 140, and the interface between them is extended in accordance with a prefetcher framework such as set forth in FIG. 3, and as described below with reference to system 300 and the lower level memory can selectively mark cache lines for prefetch or not, and the higher level memory (the bandwidth provider) can selectively drop the prefetch hint information without servicing the prefetch request. Such a loosely coupled prefetching mechanism allows the lower level cache to make determinations as to what cache lines should be prefetched, but allows the higher level memory to determine whether or not a prefetch request can or will be serviced. 
Thus, the prefetch requests can be referred to as prefetch hints, since the higher level memory has autonomy to ignore the request. Also, the system designer can control the aggression of the prefetching by the lower level memory. By placing a threshold of three valid cache lines before the lower level memory can make a prefetch request, the lower level memory will not be as aggressive in prefetching. By setting a threshold of 1, or even eliminating the threshold, the lower level memory can be more aggressive in making prefetch requests to correspond to the claimed limitation]. 
As for dependent claims 16 and 26, the applicant is directed to the rejections to claim 8 set forth above, as they are rejected based on the same rationale.
As for dependent claim 17, the applicant is directed to the rejection of claim 9 set forth above, as it is rejected based on the same rationale.
Claims 10, 11, 18, 19, 27 and 28 are rejected under 35 U.S.C. 103(a) as being unpatentable over Wallin, as applied to claims 1, 12, 20, 21 and 25 above, and further in view of Hooker et al. (US PGPUB 2011/0238923) (hereinafter ‘Hooker’).
As per dependent claim 10, Wallin teaches the programmable switch of claim 1.
Wallin does not appear to explicitly disclose wherein the circuitry is further configured to compare an address for the data requested by the cache line request to addresses stored in a match-action table to identify a matching address.
However, Hooker discloses wherein the circuitry is further configured to compare an address for the data requested by the cache line request to addresses stored in a match-action table to identify a matching address [(Paragraphs 0099-0103; FIGs 12 and 13) where the present invention of Hooker teaches, with reference to FIG. 13, a flowchart illustrating operation of the prefetch unit 124 of FIG. 12. Flow begins at block 1302. At block 1302, the prefetch unit 124 receives the L1D memory address 196 of FIG. 12 from the L1 data cache 116. Flow proceeds to block 1304. At block 1304, the prefetch unit 124 detects that the L1D memory address 196 falls within a block (e.g., page) for which the prefetch unit 124 has previously detected an access pattern and has begun prefetching cache lines from system memory into the L2 cache 118, as described above with respect to FIGS. 1 through 11. Specifically, the prefetch unit 124 maintains a block number 303 that specifies the base address of the memory block for which the access pattern has been detected. The prefetch unit 124 detects that the L1D memory address 196 falls within the memory block by detecting that the bits of the block number 303 match the corresponding bits of the L1D memory address 196. Flow proceeds to block 1306. At block 1306, beginning at the L1D memory address 196, the prefetch unit 124 finds the next two cache lines in the detected access direction within the memory block that are implicated by the previously detected access pattern. The operation performed at block 1306 is described in more detail below with respect to FIG. 14. Flow proceeds to block 1308. At block 1308, the prefetch unit 124 provides to the L1 data cache 116 the physical addresses of the next two cache lines found at block 1306 as the pattern-predicted cache line address 194. Other embodiments are contemplated in which the number of cache line addresses provided by the prefetch unit 124 is more or less than two. Flow proceeds to block 1312. 
At block 1312, the L1 data cache 116 pushes the addresses provided at block 1308 into the queue 198. Flow proceeds to block 1314. At block 1314, whenever the queue 198 is non-empty, the L1 data cache 116 takes the next address out of the queue 198 and makes an allocation request 192 to the L2 cache 118 for the cache line at the address. However, if an address in the queue 198 is already present in the L1 data cache 116, the L1 data cache 116 dumps the address and foregoes requesting its cache line from the L2 cache 118. The L2 cache 118 subsequently provides the requested cache line data 188 to the L1 data cache 116. Flow ends at block 1314 to correspond to the claimed limitation]. 
Wallin and Hooker are analogous art because they are from the same field of endeavor of memory and cache management.
Before the effective filing date, it would have been obvious to one of ordinary skill in the art, having the teachings of Wallin and Hooker before him or her, to modify the apparatus of Wallin to include the prefetch unit and matching mechanism of Hooker because it will enhance data prefetching.
The motivation for doing so would be [effectively prefetching data for programs that exhibit no clear trend when considering their memory accesses within relatively small time windows, but present a clear trend when examined in relatively large samples (Paragraph 0007, lines 1-4 of Hooker)].
Therefore, it would have been obvious to combine Wallin and Hooker to obtain the invention as specified in the instant claim.
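For illustration only, the address comparison cited from Hooker above (blocks 1304 and 1306 of FIG. 13) can be sketched as follows. This is a minimal sketch and not code from the reference: the 4 KiB block size, the 64-byte line size, the function names, and the representation of the detected access pattern as a set of line indices within the block are all assumptions. The sketch also begins its search at the line after the supplied address, which is a simplification.

```python
BLOCK_SHIFT = 12  # assumed: 4 KiB memory block (e.g., page)
LINE_SHIFT = 6    # assumed: 64-byte cache lines


def in_block(addr, block_number):
    """True when the upper address bits match the tracked block number."""
    return (addr >> BLOCK_SHIFT) == block_number


def next_predicted_lines(addr, block_number, pattern_lines, direction=1, count=2):
    """Return addresses of the next `count` pattern lines in the access direction."""
    if not in_block(addr, block_number):
        return []  # no match: the address lies outside the tracked block
    lines_per_block = 1 << (BLOCK_SHIFT - LINE_SHIFT)
    line = (addr >> LINE_SHIFT) % lines_per_block
    found = []
    line += direction
    # Walk within the block only, collecting lines implicated by the pattern.
    while 0 <= line < lines_per_block and len(found) < count:
        if line in pattern_lines:
            found.append((block_number << BLOCK_SHIFT) | (line << LINE_SHIFT))
        line += direction
    return found
```

A match on the block number plays the role of identifying a matching address; the returned addresses correspond to the pattern-predicted cache line addresses that are then queued for allocation.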
As per dependent claim 11, Wallin discloses wherein the circuitry is further configured to calculate one or more offset addresses for the matching address to identify the one or more additional cache lines [(Paragraphs 0028, 0032-0033 and 0041-0047; FIGs. 1, 2, 3 and 5) wherein Wallin teaches that as illustrated in FIG. 5, if a request resulting in a cache miss necessitates issuance of a Read request (step 502), fetch/prefetch controller 210 may bundle the original request with one or more prefetch requests into a single request transaction conveyed on interconnect 20 in step 504. A single bundled read transaction may include the address A of the cache miss and information about the address offsets to the K prefetches. For example, in one implementation, the transaction encoding illustrated in FIG. 4 may be employed. Thus, the address offsets relative to the original address may be encoded in a prefetch bit mask. In one embodiment, all caching devices and memory devices on interconnect 20 may need to perform a snoop lookup for address A but only the device owning cache line A performs snoop lookups for the prefetched cache lines. This device will reply with data for each prefetched address for which it is the owner. Otherwise, an empty NACK data packet may be supplied for the prefetched cache line. Since the states of the other caches are not affected by the prefetch transaction, they do not need to snoop the prefetch addresses to correspond to the claimed limitation]. 
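For illustration only, the bundling of a miss address with offset-encoded prefetches cited from Wallin above can be sketched as follows. This is a minimal sketch and not the encoding of FIG. 4, which is not reproduced here: the 64-byte line size, the 8-bit mask width, the function names, and the convention that bit i of the mask selects the line at offset i+1 above address A are all assumptions.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


def encode_bundle(miss_addr, prefetch_addrs, mask_bits=8):
    """Pack prefetch addresses into a bit mask of line offsets relative to miss_addr."""
    mask = 0
    for addr in prefetch_addrs:
        offset, rem = divmod(addr - miss_addr, LINE_SIZE)
        if rem != 0 or not 1 <= offset <= mask_bits:
            raise ValueError("prefetch address not encodable in the mask")
        mask |= 1 << (offset - 1)
    return miss_addr, mask


def decode_bundle(miss_addr, mask):
    """Recover the prefetch addresses by adding each encoded offset to miss_addr."""
    addrs = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            addrs.append(miss_addr + (bit + 1) * LINE_SIZE)
        bit += 1
    return addrs
```

Under these assumptions, a single transaction carrying address A and the mask is enough for the owning device to compute every prefetch address, which is the offset calculation the rejection relies on.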
As for dependent claims 18 and 27, the applicant is directed to the rejections to claim 10 set forth above, as they are rejected based on the same rationale.
As for dependent claims 19 and 28, the applicant is directed to the rejections to claim 11 set forth above, as they are rejected based on the same rationale.


CLOSING COMMENTS
    a.   STATUS OF CLAIMS IN THE APPLICATION
	a(1) CLAIMS REJECTED IN THE APPLICATION
Per the instant office action, claims 1-3, 5-13, 15-22 and 24-28 have received a first action on the merits and are the subject of a first non-final action.
	a(2) CLAIMS ALLOWED IN THE APPLICATION
Claims 4, 14 and 23 would be allowable if rewritten to include all of the limitations of the base claim and any intervening claims.
The reasons for allowance of claims 4, 14 and 23 are that the prior art of record neither anticipates nor renders obvious the recited combination as a whole, including, as for claim 4, the limitations of “wherein the circuitry is further configured to receive updated prefetch information from the host, the updated prefetch information having been prepared by the host based on the cache miss data sent from the programmable switch to the host and execution of a memory access prediction algorithm by the host”. The closest prior art, Shen (US PGPUB 2019/0196987), teaches that computing resources are selected to process certain applications. At times, the behavior of the processing of a certain application includes a given number of memory access misses in the cache memory subsystem for the particular computing resource. These misses lead to a number of memory requests being sent to the system memory via the memory controller. In some embodiments, the OS scheduler also sends a unique application ID to the memory controller identifying a particular application or type of application corresponding to the miss. The application type can also be used to predict memory access misses and determine the predicted latency. Shen further teaches that the computing resource sends updates of cache miss rate information corresponding to the cache memory subsystem used by the computing resource. In an embodiment, a miss rate for each level of the hierarchical cache memory subsystem is sent by the computing resource. In another embodiment, a single combined miss rate is sent. Similar to the number of threads, the miss rate information is sent when changes that cross thresholds are detected (Paragraphs 0017-0020). Shen does not, however, teach the specific limitations as claimed in claims 4, 14 and 23. 
Pertinent Prior art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Shen et al., US PGPUB 2019/0196987 - teaches Dynamic per-link and all-bank refresh.
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMED GEBRIL whose telephone number is (571)270-1857. The examiner can normally be reached on Monday-Friday, 8:00am-5:00pm, ALT. Friday.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sanjiv Shah, can be reached on 571-272-4098. The fax phone number for the organization where this application or proceeding is assigned is 571-270-2857. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMED M GEBRIL/Primary Examiner, Art Unit 2135