Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
As per the instant Application having Application number 16/560,217, the examiner acknowledges the applicant's submission of the amendment dated 11/19/2020.  Claim 18 is canceled. Claims 1-17 and 19-21 are pending.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 6-7, 10 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Beard (US20180293169) in view of Jalal (US20190340147).
As per claim 1. Beard teaches a computing system, comprising: a producer comprising a first processing element configured to generate processed data (Beard: Fig. 2, first processing device 202); a producer cache configured to store the processed data generated by the producer (Beard: Fig. 2, cache 208; "First processing device 202 includes at least one cache 208 together with a cache controller 210." (paragraph 0034)); a consumer comprising a second processing element configured to receive and process the processed data generated by the producer (Beard: Fig. 2, second processing device 204); and a consumer cache configured to store the processed data generated by the consumer (Beard: Fig. 2, cache 212), wherein the producer is configured to, in response to receiving a …, perform a direct cache transfer (DCT) to transfer the processed data from the producer cache to the consumer cache (Beard: paragraph 0034).
Beard does not teach stash cache maintenance operation (stash-CMO); however, Jalal teaches stash cache maintenance operation (stash-CMO) (Jalal: "When data is received from an I/O interface it is directed to a storage resource of the data processing system, such as a memory or cache. Cache Stashing is a mechanism to install data within a particular cache in a data processing system. Cache stashing ensures that data is located close to its point of use, thereby improving the system performance" (paragraph 0003); "In one embodiment, the protocol for interfacing a node with the interconnect (such as a Coherent Hub Interface (CHI) protocol) is enhanced for WR_UNIQ_STASH requests to add an optional COMP_ACK packet response and to add a WR_DATA_CANCEL data operation. Here, WR_UNIQ_STASH is a request to stash data to a cache of a CPU (the stash target) and to tag the data as having the coherence state UNIQUE" (paragraph 0045)).

Before the effective filing date of the claimed inventions, it would have been obvious to a person of ordinary skill in the art to modify Beard to include stash cache maintenance operation (stash-CMO) as taught by Jalal since doing so would provide the benefit of [Jalal: "When data is received from an I/O interface it is directed to a storage resource of the data processing system, such as a memory or cache. Cache Stashing is a mechanism to install data within a particular cache in a data processing system. Cache stashing ensures that data is located close to its point of use, thereby improving the system performance" (paragraph 0003)].
Therefore, it would have been obvious to combine Beard and Jalal for the benefit of creating the computing system as specified in claim 1.
As per claim 6. Beard teaches wherein the producer does not know a location of the consumer before the producer has completed a task that instructs the producer to generate the processed data (Beard: "In some embodiments, the first cache line is transferred to the link controller, stored in a line buffer in a memory of the data processing system and, at a later time, transferred the first cache line from the line buffer to the second cache line of the consumer processing device. The line buffer may be a first-in, first-out line (FIFO) buffer or a first-in, last-out (FILO) line buffer, or in a relaxed ordering between producer and consumer. In other embodiments, the link controller may be configured via signal to perform one of the aforementioned orderings" (paragraph 0050)). Here, the producer transfers the completed data to the link controller without knowing the consumer destination in advance.
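For illustration only, and not as part of the cited reference, the decoupling relied upon in the Beard passage above can be sketched in Python. The class name LinkController and its methods accept and deliver are hypothetical names chosen for this sketch. It shows only that a producer can hand a cache line to an intermediate line buffer without identifying a consumer, and that the line is later drained to a consumer in FIFO or FILO order.

```python
from collections import deque


class LinkController:
    """Hypothetical stand-in for a link controller with a line buffer."""

    def __init__(self, order="fifo"):
        if order not in ("fifo", "filo"):
            raise ValueError("order must be 'fifo' or 'filo'")
        self.order = order
        self.line_buffer = deque()

    def accept(self, cache_line):
        # Producer side: the line is buffered; no consumer is named here.
        self.line_buffer.append(cache_line)

    def deliver(self, consumer_cache):
        # Later: move one buffered line into the chosen consumer's cache.
        if not self.line_buffer:
            return None
        if self.order == "fifo":
            line = self.line_buffer.popleft()  # oldest line first
        else:
            line = self.line_buffer.pop()      # newest line first
        consumer_cache.append(line)
        return line
```

In this sketch the consumer is supplied only when deliver is called, which is after the producer has finished writing its lines.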
As per claim 7. Beard teaches wherein coherency of the computing system is maintained by a software management unit (Beard: Fig. 2, coherence control 216; paragraph 0034) … the software management unit (Beard: Fig. 2, coherence control 216; paragraph 0034).
Beard does not teach wherein the producer is configured to notify ... when a task that generates the processed data is complete, and wherein ... is configured to instruct a home of the processed data to initiate the DCT; however, Jalal teaches wherein the producer is configured to notify ... when a task that generates the processed data is complete, and wherein ... is configured to instruct a home of the processed data to initiate the DCT (Jalal: "When data is received from an I/O interface it is directed to a storage resource of the data processing system, such as a memory or cache. Cache Stashing is a mechanism to install data within a particular cache in a data processing system. Cache stashing ensures that data is located close to its point of use, thereby improving the system performance" (paragraph 0003); "In one embodiment, the protocol for interfacing a node with the interconnect (such as a Coherent Hub Interface (CHI) protocol) is enhanced for WR_UNIQ_STASH requests to add an optional COMP_ACK packet response and to add a WR_DATA_CANCEL data operation. Here, WR_UNIQ_STASH is a request to stash data to a cache of a CPU (the stash target) and to tag the data as having the coherence state UNIQUE" (paragraph 0045)).
Before the effective filing date of the claimed inventions, it would have been obvious to a person of ordinary skill in the art to modify Beard to include wherein the producer is configured to notify ... when a task that generates the processed data is complete, and wherein ... is configured to instruct a home of the processed data to initiate the DCT as taught by Jalal.
Therefore, it would have been obvious to combine Beard and Jalal for the benefit of creating the computing system as specified in claim 7.
As per claim 10. A method, comprising: generating processed data at a producer comprising a first hardware processing element; storing the processed data in a producer cache; performing, in response to receiving a stash-CMO, a DCT to transfer the processed data from the producer cache to a consumer cache; processing, after the DCT, the processed data at a consumer comprising a second hardware processing element; and storing the processed data generated by the consumer in the consumer cache.  The rationale in the rejections of claim 1 is herein incorporated.
As per claim 15. The method of claim 10, wherein the producer does not know a location of the consumer before the producer has completed a task that instructs the producer to generate the processed data.  The rationale in the rejections of claim 6 is herein incorporated.
As per claim 16. The method of claim 10, wherein coherency between the consumer and producer is maintained by a software management unit, the method further comprising: notifying the software management unit when a task that generates the processed data is completed by the producer; and instructing a home of the processed data to initiate the DCT.  The rationale in the rejections of claim 7 is herein incorporated.
Claims 2 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Beard (US20180293169) in view of Jalal (US20190340147) as applied to claims 1 and 10 above, and further in view of Tian et al. (US20070079298).
As per claim 2. Beard teaches at least one coherent interconnect communicatively coupling the producer to the consumer (Beard: Fig. 2, on-chip interconnect 206; paragraph 0034).
Beard in view of Jalal does not teach wherein the computing system is a Cache-Coherent Non-Uniform Memory Access (CC-NUMA) system; however, Tian teaches wherein the computing system is a Cache-Coherent Non-Uniform Memory Access (CC-NUMA) system (Tian: "In contrast the cc-NUMA multiprocessing architecture has memory separated into close and distant banks. In the shared-memory multi-core processor and multiprocessor systems, all processing elements access a common memory at the same speed. In cc-NUMA, memory on the same processor board as the processing element (local memory) is accessed faster than memory on other processor boards (shared memory), hence the “non-uniform” nomenclature. As a result, the cc-NUMA architecture scales much better to higher numbers of processing elements than the shared-memory multi-core processor and multiprocessor systems. “Cache coherent NUMA” means that caching is supported in the local system. As a practical matter, most large scale NUMA systems are cc-NUMA systems, NUMA and cc-NUMA will be used interchangeable in this description. The differences between NUMA and cc-NUMA are not of particular relevance for the understanding of the various embodiments of the invention described herein" (paragraph 0006)).
Beard, Jalal, and Tian are analogous art because they are from the same field of endeavor of memory access and data processing.
Before the effective filing date of the claimed inventions, it would have been obvious to a person of ordinary skill in the art to modify Beard in view of Jalal to include wherein the computing system is a Cache-Coherent Non-Uniform Memory Access (CC-NUMA) system as taught by Tian since doing so would provide the benefit of [Tian: "In contrast the cc-NUMA multiprocessing architecture has memory separated into close and distant banks. In the shared-memory multi-core processor and multiprocessor systems, all processing elements access a common memory at the same speed. In cc-NUMA, memory on the same processor board as the processing element (local memory) is accessed faster than memory on other processor boards (shared memory), hence the “non-uniform” nomenclature. As a result, the cc-NUMA architecture scales much better to higher numbers of processing elements than the shared-memory multi-core processor and multiprocessor systems. “Cache coherent NUMA” means that caching is supported in the local system. As a practical matter, most large scale NUMA systems are cc-NUMA systems, NUMA and cc-NUMA will be used interchangeable in this description. The differences between NUMA and cc-NUMA are not of particular relevance for the understanding of the various embodiments of the invention described herein" (paragraph 0006)].
Therefore, it would have been obvious to combine Beard, Jalal, and Tian for the benefit of creating the computing system as specified in claim 2.
As per claim 11. The method of claim 10, wherein the DCT is performed using at least one coherent interconnect communicatively coupling the producer to the consumer, wherein the producer and the consumer are part of a CC-NUMA system. The rationale in the rejections of claim 2 is herein incorporated.
Claims 3 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Beard (US20180293169) in view of Jalal (US20190340147) as applied to claims 1 and 10 above, and further in view of Moir et al. (US20060123156).
As per claim 3. Beard in view of Jalal does not teach wherein the producer knows a location of the consumer before the producer has completed a task that instructs the producer to generate the processed data; however, Moir teaches wherein the producer knows a location of the consumer before the producer has completed a task that instructs the producer to generate the processed data (Moir: "An elimination technique for lock-free data transfer, or elimination, between producers and consumers is described herein. Producer and consumer elimination, as described herein, is mainly described with regard to how a consumer and producer exchange information after they have already agreed upon a location to perform the exchange. Producers and consumers, such as the enqueue and dequeue operations in a FIFO queue implementation, may agree on a transfer location in any of various ways, according to various embodiments. For example, in one embodiment, a producer and a consumer may transfer data using a single, predetermined location. In other embodiments, each producer and consumer may select transfer locations randomly. In general the method for producer and consumer elimination described herein may be used with virtually any elimination application, including a scalable first-in-first-out (FIFO) queue implementation that utilizes elimination 
Beard, Jalal, and Moir are analogous art because they are from the same field of endeavor of memory access and data processing.
Before the effective filing date of the claimed inventions, it would have been obvious to a person of ordinary skill in the art to modify Beard in view of Jalal to include wherein the producer knows a location of the consumer before the producer has completed a task that instructs the producer to generate the processed data as taught by Moir since doing so would provide the benefit of [Moir: "An elimination technique for lock-free data transfer, or elimination, between producers and consumers is described herein. Producer and consumer elimination, as described herein, is mainly described with regard to how a consumer and producer exchange information after they have already agreed upon a location to perform the exchange. Producers and consumers, such as the enqueue and dequeue operations in a FIFO queue implementation, may agree on a transfer location in any of various ways, according to various embodiments. For example, in one embodiment, a producer and a consumer may transfer data using a single, predetermined location. In other embodiments, each producer and consumer may select transfer locations randomly. In general the method for producer and consumer elimination described herein may be used with virtually any elimination application, including a scalable first-in-first-out (FIFO) queue implementation that utilizes elimination 
Therefore, it would have been obvious to combine Beard, Jalal, and Moir for the benefit of creating the computing system as specified in claim 3.
As per claim 12. The method of claim 10, further comprising: informing the producer of a location of the consumer before the producer has completed a task that instructs the producer to generate the processed data.  The rationale in the rejections of claim 3 is herein incorporated.
Claims 4, 8-9, 13, 17 and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over Beard (US20180293169) in view of Jalal (US20190340147) as applied to claims 1 and 10 above, and further in view of Vasudevan et al. (US20190004958).
As per claim 4. Beard does not teach the stash-CMO; however, Jalal teaches the stash-CMO (Jalal: "When data is received from an I/O interface it is directed to a storage resource of the data processing system, such as a memory or cache. Cache Stashing is a mechanism to install data within a particular cache in a data processing system. Cache stashing ensures that data is located close to its point of use, thereby improving the system performance" (paragraph 0003); "In one embodiment, the protocol for interfacing a node with the interconnect (such as a Coherent Hub Interface (CHI) protocol) is enhanced for WR_UNIQ_STASH requests to add an optional COMP_ACK packet response and to add a WR_DATA_CANCEL data operation. Here, WR_UNIQ_STASH is a request to stash data to a cache of a CPU (the stash target) and to tag the data as having the coherence state UNIQUE" (paragraph 0045)).
Beard in view of Jalal does not teach wherein coherency of the computing system is maintained by hardware elements, wherein … and the DCT are executed in response to the producer determining to update a main memory in the computing system with the processed data currently stored in the producer cache, wherein … is a flush type …; however, Vasudevan teaches wherein coherency of the computing system is maintained by hardware elements, wherein … and the DCT are executed in response to the producer determining to update a main memory in the computing system with the processed data currently stored in the producer cache, wherein … is a flush type … (Vasudevan: Figs. 4A and 4B; "Referring to FIG. 4A, I/O interface block 420 (i.e., the consumer) seeks to read cache line (CL) 432. However, the only valid instance of cache line 432 resides in the local cache 412 of processor 1 410 (i.e., the producer). As such, the read operation initiated by I/O interface block 420 misses its local cache 422 in operation 1. As a result of this miss, the home agent 402 issues a snoop request for cache line 432 to processor 410, as illustrated by operation 2. Next, since the cache line 432 in the local cache 412 of processor 410 has been modified and therefore “dirty,” cache line 432 is written back to memory 404 as cache line 434 via a writeback, as illustrated by operation 3 in FIG. 4B. The coherency state of the original cache line 432 stored in the local cache 412 of processor 410 is changed from modified (M) to shared (S), as illustrated by operation 4. In addition to the writeback to memory, a copy of cache line 432 is provided to the I/O interface block 420 via operation 5. This copy of cache line 432 is saved into the local cache 422 as cache line copy 436 and marked as shared (S). Thus, as a result of the read issued by the I/O interface 
Beard, Jalal, and Vasudevan are analogous art because they are from the same field of endeavor of memory access and data processing.
Before the effective filing date of the claimed inventions, it would have been obvious to a person of ordinary skill in the art to modify Beard in view of Jalal to include wherein coherency of the computing system is maintained by hardware elements, wherein … and the DCT are executed in response to the producer determining to update a main memory in the computing system with the processed data currently stored in the producer cache, wherein … is a flush type … as taught by Vasudevan since doing so would provide the benefit of [Vasudevan: "The read snapshot operation is used in producer consumer usage models wherein the consumer is sourcing data from the producer and storing it into a memory location local to the consumer, without disturbing the source address or cache line containing the data. In one embodiment, the memory location local to the consumer is a register used by the consumer. In other embodiments, the local memory location is a memory address or cache line in the consumer's local cache that is different from the source memory address or cache line. In yet another embodiment, the local memory location is a data buffer, such as one residing in the consumer's local cache and may be used repeatedly by the producer to update with new data for the consumer. The read snapshot operation may source data for a memory address wherever that data exists, without changing the data's existing coherency state or the its location in the caching hierarchy. For example, if the most current (i.e., modified) data being requested for a memory address happens to be cached in a cache line of the L1 cache, the read snapshot operation would read the data from the cache line in the L1 cache and provide it to the consumer requesting the data. In contrast to current approaches, such as using a regular read or load operation, which tend to force any modified (i.e., dirty) data to be written back to memory, at the completion of a read snapshot operation, the requested data will continue to reside in the cache line in the L1 cache. The read snapshot operation does not cause a change to the cache line's location or coherency state as a consequence of carrying out the operation" (paragraph 0038)].
Therefore, it would have been obvious to combine Beard, Jalal, and Vasudevan for the benefit of creating the computing system as specified in claim 4.
As per claim 8. Beard teaches wherein the software management unit (Beard: Fig. 2, coherence control 216; paragraph 0034).
Beard does not teach a first stash-CMO; however, Jalal teaches a first stash-CMO (Jalal: "When data is received from an I/O interface it is directed to a storage resource of the data processing system, such as a memory or cache. Cache Stashing is a mechanism to install data within a particular cache in a data processing system. Cache stashing ensures that data is located close to its point of use, thereby improving the system performance" (paragraph 0003); "In one embodiment, the protocol for interfacing a node with the interconnect (such as a Coherent Hub Interface (CHI) protocol) is enhanced for WR_UNIQ_STASH requests to add an optional COMP_ACK packet response and to add a WR_DATA_CANCEL data operation. Here, WR_UNIQ_STASH is a request to stash data to a cache of a CPU (the stash target) and to tag the data as having the coherence state UNIQUE" (paragraph 0045)).
Beard in view of Jalal does not teach transmits … to the home to initiate the DCT on the producer; however, Vasudevan teaches transmits … to the home to initiate the DCT on the producer (Vasudevan: Fig. 6; paragraphs 0040-41; "At initial state, there is only one copy of the cache line which is cached in the producer's local cache 604 and has a modified (M) cache coherency state indicating that its “dirty.” The initial cache line is illustrated by reference number 612. The consumer 610 desires to obtain a copy of this cache line and therefore issues a read request 622. Since consumer's local cache 608 does not contain a copy of the cache line, the read request results in a miss 624 which is then forwarded to the home agent 606. In response to the read miss 624, the home agent 606 determines at 626, such as checking a directory, which cache has the request cache line. Upon determining that the producer's local cache 604 contains cache line 612, the home agent 606 sends a message 628 to the producer's local cache 604 to request the cache line. In turn, producer's cache 604 sends a response 630 containing a copy of the cache line back to the home agent 606. A copy of the cache line is retained in the producer's cache 604 with its cache coherency state set to shared (S). This is illustrated by reference number 614. Responsive to receiving the response 630 from the producer's local cache 604, the home agent 606 writes back the cache line into memory to ensure that any modification made to the cache line is not lost. In addition, the home agent 606 forwards the response along with the requested cache line to the consumer's local cache 608 via response 632. The requested cache line is then saved in the consumer's local cache 608 as cache line copy 616 and marked as shared (S)" (paragraph 0050)).
Before the effective filing date of the claimed inventions, it would have been obvious to a person of ordinary skill in the art to modify Beard in view of Jalal to include transmits … to the home to initiate the DCT on the producer as taught by Vasudevan since doing so would provide the benefit of [Vasudevan: "The read snapshot operation is used in producer consumer usage models wherein the consumer is sourcing data from the producer and storing it into a memory location local to the consumer, without disturbing the source address or cache line containing the data. In one embodiment, the memory location local to the consumer is a register used by the consumer. In other embodiments, the local memory location is a memory address or cache line in the consumer's local cache that is different from the source memory address or cache line. In yet another embodiment, the local memory location is a data buffer, such as one residing in the consumer's local cache and may be used repeatedly by the producer to update with new data for the consumer. The read snapshot operation may source data for a memory address wherever that data exists, without changing the data's existing coherency state or the its location in the caching hierarchy. For example, if the most current (i.e., modified) data being requested for a memory address happens to be cached in a cache line of the L1 cache, the read snapshot operation would read the data from the cache line in the L1 cache and provide it to the consumer requesting the data. In contrast to current approaches, such as using a regular read or load operation, which tend to force any modified (i.e., dirty) data to be written back to memory, at the completion of a read snapshot operation, the requested data will continue to reside in the cache line in the L1 cache. The read snapshot operation does not cause a change to the cache line's location or coherency state as a consequence of carrying out the operation" (paragraph 0038)].
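For illustration only, the read-miss sequence quoted from Vasudevan paragraph 0050 can be sketched as a small Python model. The class HomeAgent, its methods, and the dictionary layout are hypothetical simplifications introduced here and are not taken from the reference. The sketch tracks only the modified (M) and shared (S) states named in the passage.

```python
class HomeAgent:
    """Hypothetical, simplified model of a home agent with a directory."""

    def __init__(self):
        self.memory = {}   # addr -> data
        self.caches = {}   # cache name -> {addr: (data, state)}

    def add_cache(self, name):
        self.caches[name] = {}

    def read(self, requester, addr):
        local = self.caches[requester]
        if addr in local:
            return local[addr][0]              # local hit, nothing to do
        # Read miss: look for another cache that holds the line.
        for name, cache in self.caches.items():
            if name != requester and addr in cache:
                data, state = cache[addr]
                if state == "M":
                    self.memory[addr] = data   # write back the dirty line
                cache[addr] = (data, "S")      # owner keeps a shared copy
                local[addr] = (data, "S")      # requester gets a shared copy
                return data
        # No cache holds the line: fall back to memory (KeyError if absent).
        data = self.memory[addr]
        local[addr] = (data, "S")              # simplification: mark as S
        return data
```

A usage example under the same assumptions: after registering caches named "producer" and "consumer" and placing a line in state M in the producer cache, calling read("consumer", addr) leaves both caches holding the line in state S and memory holding the written-back data.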
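As per claim 9. Beard does not teach the stash-CMO; however, Jalal teaches the stash-CMO (Jalal: "When data is received from an I/O interface it is directed to a storage resource of the data processing system, such as a memory or cache. Cache Stashing is a mechanism to install data within a particular cache in a data processing system. Cache stashing ensures that data is located close to its point of use, thereby improving the system performance" (paragraph 0003); "In one embodiment, the protocol for interfacing a node with the interconnect (such as a Coherent Hub Interface (CHI) protocol) is enhanced for WR_UNIQ_STASH requests to add an optional COMP_ACK packet response and to add a WR_DATA_CANCEL data operation. Here, WR_UNIQ_STASH is a request to stash data to a cache of a CPU (the stash target) and to tag the data as having the coherence state UNIQUE" (paragraph 0045)).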
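Beard in view of Jalal does not teach wherein the home, in response to receiving …, transmits a snoop comprising … to the producer that instructs the producer to perform the DCT; however, Vasudevan teaches wherein the home, in response to receiving …, transmits a snoop comprising … to the producer that instructs the producer to perform the DCT (Vasudevan: Figs. 2A and 2B; paragraphs 0040-41; "At initial state, there is only one copy of the cache line which is cached in the producer's local cache 604 and has a modified (M) cache coherency state indicating that its “dirty.” The initial cache line is illustrated by reference number 612. The consumer 610 desires to obtain a copy of this cache line and therefore issues a read request 622. Since consumer's local cache 608 does not contain a copy of the cache line, the read request results in a miss 624 which is then forwarded to the home agent 606. In response to the read miss 624, the home agent 606 determines at 626, such as checking a directory, which cache has the request cache line. Upon determining that the producer's local cache 604 contains cache line 612, the home agent 606 sends a message 628 to the producer's local cache 604 to request the cache line. In turn, producer's cache 604 sends a response 630 containing a copy of the cache line back to the home agent 606. A copy of the cache line is retained in the producer's cache 604 with its cache coherency state set to shared (S). This is illustrated by reference number 614. Responsive to receiving the response 630 from the producer's local cache 604, the home agent 606 writes back the cache line into memory to ensure that any modification made to the cache line is not lost. In addition, the home agent 606 forwards the response along with the requested cache line to the consumer's local cache 608 via response 632. The requested cache line is then saved in the consumer's local cache 608 as cache line copy 616 and marked as shared (S)" (paragraph 0050)).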
As per claim 13. The method of claim 10, wherein coherency between the consumer and producer is maintained by hardware elements, wherein the stash-CMO and the DCT are executed in response to the producer determining to update a main memory with the processed data currently stored in the producer cache, wherein the stash-CMO is a flush type stash-CMO.  The rationale in the rejections of claim 4 is herein incorporated.
As per claim 17. The method of claim 16, wherein instructing a home of the processed data to initiate the DCT comprises: transmitting, from the software management unit, a first stash-CMO to the home to initiate the DCT on the producer; and transmitting, in response to receiving the first stash-CMO, a snoop from the home to the producer, the snoop comprising the stash-CMO that instructs the producer to perform the DCT.  The rationale in the rejections of claims 8 and 9 is herein incorporated.
As per claim 19. Beard in view of Jalal does not teach updating, in response to the snoop, main memory to include the processed data stored in the producer cache; however, Vasudevan teaches updating, in response to the snoop (Vasudevan: Figs. 2A and 2B), main memory to include the processed data stored in the producer cache (Vasudevan: Figs. 4A and 4B; paragraphs 0040-41; "Referring to FIG. 4A, I/O interface block 420 (i.e., the consumer) seeks to read cache line (CL) 432. However, the only valid instance of cache line 432 resides in the local cache 412 of processor 1 410 (i.e., the producer). As such, the read operation initiated by I/O interface block 420 misses its local cache 422 in operation 1. As a 
As per claim 20. Beard does not teach informing … the consumer that the DCT is complete; however, Jalal teaches informing … the consumer that the DCT is complete (Jalal: "In accordance with certain aspects of the disclosure, a data processing network and method of operation thereof are provided for efficient transfer of ordered data from a Request Node (RN-I) to a target node. The RN-I sends write requests to a Home Node (HN) and the HN responds to a first write request when resources have been allocated by the HN. The RN-I then sends the data to be written. The HN also responds with a completion (COMP) message when a coherency action has been performed at the HN. The RN-I acknowledges receipt of the COMP message with a COMP_ACK message. This message is not sent until COMP messages have been received for all write requests that are older than the first write request for the ordered data, 
Beard in view of Jalal does not teach after transmitting the snoop; however, Vasudevan teaches after transmitting the snoop (Vasudevan: Figs. 2A and 2B; paragraphs 0040-41; "At initial state, there is only one copy of the cache line which is cached in the producer's local cache 604 and has a modified (M) cache coherency state indicating that its “dirty.” The initial cache line is illustrated by reference number 612. The consumer 610 desires to obtain a copy of this cache line and therefore issues a read request 622. Since consumer's local cache 608 does not contain a copy of the cache line, the read request results in a miss 624 which is then forwarded to the home agent 606. In response to the read miss 624, the home agent 606 determines at 626, such as checking a directory, which cache has the request cache line. Upon determining that the producer's local cache 604 contains cache line 612, the home agent 606 sends a message 628 to the producer's local cache 604 to request the cache line. In turn, producer's cache 604 sends a response 630 containing a copy of the cache line back to the home agent 606. A copy of the cache line is retained in the producer's cache 604 with its cache coherency state set to shared (S). This is illustrated by reference number 614. Responsive to receiving the response 630 from the producer's local cache 604, the home agent 606 writes back the cache line into memory to ensure that any modification made to the cache line is not lost. In addition, the home agent 606 forwards the response along with the requested cache line to the consumer's local cache 608 via response 632. The requested cache line is then saved in the consumer's local cache 608 as cache line copy 616 and marked as shared (S)" (paragraph 0050)).
As per claim 21. A computing system, comprising: a producer comprising a first processing element configured to generate processed data; a producer cache configured to store the processed data generated by the producer; a consumer comprising a second processing element configured to receive and process the processed data generated by the producer; and a consumer cache configured to store the processed data generated by the consumer, wherein the producer is configured to, in response to receiving a stash cache maintenance operation (stash-CMO), perform a direct cache transfer (DCT) to transfer the processed data from the producer cache to the consumer cache, wherein the stash CMO comprises at least one of: a flush type stash-CMO, a copy back type stash-CMO, or a first-stash-CMO transmitted from a software management unit to a home of the processed data. The rationale in the rejections of claims 1 and 4 is herein incorporated.
Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Beard (US20180293169) in view of Jalal (US20190340147) as applied to claims 1 and 10 above, and further in view of Ghai et al. (US20150127910).
As per claim 5. Beard does not teach the stash-CMO; however, Jalal teaches the stash-CMO (Jalal: "When data is received from an I/O interface it is directed to a storage resource of the data processing system, such as a memory or cache. Cache Stashing is a mechanism to install data within a particular cache in a data processing system. Cache stashing ensures that data is located close to its point of use, thereby improving the system performance" (paragraph 0003); "In one embodiment, the protocol for interfacing a node with the interconnect (such as a Coherent Hub Interface (CHI) protocol) is enhanced for WR_UNIQ_STASH requests to add an optional COMP_ACK packet response and to add a WR_DATA_CANCEL data operation. Here, WR_UNIQ_STASH is a request to stash data to a cache of a CPU (the stash target) and to tag the data as having the coherence state UNIQUE" (paragraph 0045)).
Beard in view of Jalal does not teach wherein coherency of the computing system is maintained by hardware elements, wherein … and the DCT are executed in response to the producer cache issuing a capacity eviction to remove at least a portion of the processed data, wherein … is a copy back type ...; however, Ghai teaches wherein coherency of the computing system is maintained by hardware elements, wherein … and the DCT are executed in response to the producer cache issuing a capacity eviction to remove at least a portion of the processed data, wherein … is a copy back type ... (Ghai: "In various embodiments, when HA data 302 is victimized (i.e., when a cache line is evicted from producer cache 204 to make room for additional data), producer cache 204 moves (e.g., responsive to issuance of a lateral cast-out command from cast-out engine 205) HA data 302 to consumer cache 214 for later processing. Alternatively, producer cache 204 may move (e.g., responsive to a lateral cast-out command) HA data 302 to consumer cache 214 prior to victimization. In any event, consumer core 212 utilizes HA log 304 to determine the location of HA data 302 when HA processing is initiated" (paragraph 0052); paragraph 0053). 
Beard, Jalal, and Ghai are analogous art because they are from the same field of endeavor of memory access and data processing.
Before the effective filing date of the claimed inventions, it would have been obvious to a person of ordinary skill in the art to modify Beard in view of Jalal to include wherein coherency of the computing system is maintained by hardware elements, wherein … and the DCT are executed in response to the producer cache issuing a capacity eviction to remove at least a portion of the processed data, wherein … is a copy back type ... as taught by Ghai since doing so would provide the benefit of [Ghai: "Modern processor designs commonly include some form of cast-out engine and snoop-intervention engine. A cast-out engine is responsible for writing data evicted from the cache back to main memory (or system memory) or into a cache associated with another processor. A snoop-intervention engine is responsible for providing data from a given cache to another processor that is trying to gain access to a cache line that includes the data. Operation of the cast-out engine may be triggered by, for example, a requirement to free-up space in a cache for incoming data. Operation of the snoop-intervention engine may be triggered to, for example, provide another processor exclusive access to a cache line in the event that the processor wishes to modify data in the cache line. In general, during a checkpoint interval (i.e., a time between two checkpoints), every cache line modified between checkpoints is either resident as dirty data in a cache or has been through a cast-out or snoop-intervention engine and, at a checkpoint, a cache walk/scrub can be triggered" (paragraph 0027)].
Therefore, it would have been obvious to combine Beard, Jalal, and Ghai for the benefit of creating the computing system as specified in claim 5.
As per claim 14. The method of claim 10, wherein coherency between the consumer and producer is maintained by hardware elements, wherein the stash-CMO and the DCT are executed in response to the producer cache issuing a capacity eviction to remove at least a portion of the processed data, wherein the stash-CMO is a copy back type stash-CMO. The rationale in the rejection of claim 5 is herein incorporated.
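For illustration only, the eviction behavior described in the quoted Ghai passage (paragraph 0052), in which a victimized cache line is moved from the producer cache to the consumer cache, can be sketched as follows. This is a minimal model under stated assumptions, not Ghai's cast-out engine: the class LruCache, its install method, and the cast_out_target parameter are hypothetical, and least-recently-used replacement is assumed only to make the example concrete.

```python
from collections import OrderedDict


class LruCache:
    # Hypothetical cache with LRU replacement (an assumption of this sketch).
    def __init__(self, capacity, cast_out_target=None):
        self.capacity = capacity
        self.cast_out_target = cast_out_target  # peer cache receiving victims
        self.lines = OrderedDict()              # address -> data, oldest first

    def install(self, addr, data):
        if addr in self.lines:
            self.lines.move_to_end(addr)        # refresh recency on rewrite
        self.lines[addr] = data
        # Capacity eviction: remove the oldest lines until the cache fits.
        while len(self.lines) > self.capacity:
            victim_addr, victim_data = self.lines.popitem(last=False)
            if self.cast_out_target is not None:
                # Lateral cast-out: move the victim to the peer cache
                # instead of discarding it.
                self.cast_out_target.install(victim_addr, victim_data)
```

In this sketch, a producer cache constructed as LruCache(2, cast_out_target=consumer) moves its oldest line into the consumer cache when a third line is installed.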
Response to Amendment
Applicant's arguments filed on 11/19/2020 with respect to the 35 U.S.C. 103 rejections have been fully considered but are moot in view of the new ground(s) of rejection.
Claims must be given the broadest reasonable interpretation during examination and limitations appearing in the specification but not recited in the claim are not read into the claim (See M.P.E.P. 2111 [R-1]).
In Applicant’s arguments, Applicant states that the combination of Beard and Jalal fails to disclose the amended subject matter of claim 1, specifically: “perform a direct cache transfer (DCT) to transfer the processed data from the producer cache to the consumer cache” and “stash cache maintenance operation (stash-CMO)”.
In response, these arguments have been fully considered but they are not deemed persuasive.
With regard to “perform a direct cache transfer (DCT) to transfer the processed data from the producer cache to the consumer cache” and “stash cache maintenance operation (stash-CMO)”, Beard discloses: “FIG. 2 is a simplified diagram of a data processing system 200. By way of example, system 200 comprises first processing device 202 and second processing device 204 that are coupled by an interconnect structure 206. Devices 202 and 204 and interconnect structure 206 may be implemented on the same chip (as shown) or on separate chips. In general, system 200 may have any number of devices. First processing device 202 includes at least one cache 208 together with a cache controller 210. Similarly, second processing device 204 includes at least one cache 212 together with a cache controller 214. The at least one cache may be a hierarchy of caches, such as level one (L1) cache and a level two (L2) cache. Interconnect structure 206 includes a coherence controller 216 that 
Furthermore, Jalal teaches the feature of stash-CMO (Jalal: "When data is received from an I/O interface it is directed to a storage resource of the data processing system, such as a memory or cache. Cache Stashing is a mechanism to install data within a particular cache in a data processing system. Cache stashing ensures that data is located close to its point of use, thereby improving the system performance" (paragraph 0003); "In one embodiment, the protocol for interfacing a node with the interconnect (such as a Coherent Hub Interface (CHI) protocol) is enhanced for WR_UNIQ_STASH requests to add an optional COMP_ACK packet response and to add a WR_DATA_CANCEL data operation. Here, WR_UNIQ_STASH is a request to stash data to a cache of a CPU (the stash target) and to tag the data as having the coherence state UNIQUE" (paragraph 0045)). Should the applicant wish to distinguish between the different Stashing operations, this could be further clarified through an amendment.
All arguments by the applicant are believed to be covered in the body of the office action; thus, this action constitutes a complete response to the issues raised in the remarks dated 11/19/2020.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
    a.   STATUS OF CLAIMS IN THE APPLICATION
	a(1) CLAIMS REJECTED IN THE APPLICATION
Per the instant office action, claims 1-17 and 19-21 have received an action on the merits and are subject to a final rejection.
    b.   DIRECTION OF FUTURE CORRESPONDENCES
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Kendrick Lam whose telephone number is (408)918-7586.  The examiner can normally be reached on Monday - Friday 8-5 PT.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sanjiv Shah, can be reached at (571)272-4098.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/K.L./Examiner, Art Unit 2135 

/SANJIV SHAH/Supervisory Patent Examiner, Art Unit 2135