DETAILED ACTION
It is hereby acknowledged that the following papers have been received and placed of record in the file:
Amended Claims						-Receipt Date 10/28/2022
Applicant Arguments						-Receipt Date 10/28/2022		
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This office action is in response to the amendment filed on 10/28/2022. Claims 1-21 are pending. Claims 1-7, 9-11, 13, and 15 are amended. Claim 8 is canceled. Claim 21 is new. 

Allowable Subject Matter
Claims 1-10 and 21 are allowed. The following is an examiner’s statement of reasons for allowance:
The reasons for allowance are the same as the reasons for indicating allowable subject matter for claim 8 in the Non-Final Office action dated 06/29/2022, since the amendments to claim 1 incorporate the limitations of claim 8. Claims 2-10 and 21 are allowed based on their dependence from claim 1.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”


Response to Arguments
Applicant's arguments filed 10/28/2022 have been fully considered but they are not persuasive.
Applicant submits:
“Claim 11 has been amended to recite "the first cache includes a memory controller" and "determining, based on the first instruction and the second instruction being associated with the first group, whether the second cache has been updated based on the write of the first instruction using the memory controller." Even when combined, the cited reference do not teach or suggest at least these elements.”
However, this argument is not persuasive because the broadest reasonable interpretation of “determining…whether the second cache has been updated based on the write of the first instruction using the memory controller” only requires determining whether the second cache has been updated based on the write of the first instruction, where the first instruction uses the memory controller. Ray teaches instructions using a memory controller in the L1 cache (see Fig. 1, 120), and the combination of Ray in view of Shen, Panwar, and Frey, as previously mapped, teaches “determining, based on the first instruction and the second instruction being associated with the first group, whether the second cache has been updated based on the write of the first instruction.” Therefore, this combination also teaches “determining, based on the first instruction and the second instruction being associated with the first group, whether the second cache has been updated based on the write of the first instruction using the memory controller” as recited in claim 11.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 11-13, 15-17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ray et al. US 2004/0044847 in view of Shen et al. US 8,671,248, Panwar et al. US 5,999,727, and Frey et al. US 5,185,871.
	Regarding claim 11, Ray teaches:
11. A method comprising: 
receiving a first instruction that is associated with a write to a first cache ([0021]: a normal store instruction, i.e. a first instruction, writes the L1 cache by sending the write data from the GPR to the store queue and then to the L1 cache, see also [0020] describing that the LSU receives instructions from the instruction fetch unit); 
the first cache includes a memory controller (Fig. 1, 120);
performing the first instruction on the first cache ([0021]: the write of the normal store instruction is provided to the L1 cache); 
receiving a second instruction that is associated with an access of a second cache that is higher in a memory hierarchy than the first cache ([0029]: the third DCBT instruction, which is the claimed second instruction, accesses L2 to prefetch data into L0, where L2 is higher in the hierarchy than L1), wherein: 
the second instruction is to be performed via a data path that does not include the first cache ([0023]: the access of L2 is made via 136/138 which does not include L1 in its data path to L2, see Fig. 1); 
the first instruction using the memory controller ([0020]-[0021]: instructions, including normal store instructions, are issued to the LSU on bus 142, where they are decoded at 108, and used by L1 cache control to control L1 cache read and write operations, see also [0023] describing the function of the L2 cache control which is similar to the L1 cache control as indicated by the connection from 120 to 138 in Fig. 1).
	Ray does not teach:
wherein the first instruction is associated with a first group;
the second instruction is associated with the first group; and 
determining, based on the first instruction and the second instruction being associated with the first group, whether the second cache has been updated based on the write of the first instruction; and 
performing the second instruction on the second cache after the second cache has been updated based on the write of the first instruction.
However, Shen teaches:
wherein a first instruction is associated with a first group (col 2 line 65-col 3 line 2: color information can be explicitly specified by a memory store instruction, i.e. in a field of the memory store instruction; col 5 lines 6-25: the color information may associate the store instruction with a memory region, i.e. a first group);
wherein a second instruction is associated with the first group (col 2 lines 47-53 and col 2 line 65-col 3 line 2: color information can be explicitly specified by a memory load instruction, i.e. in a field of the memory load instruction, where the color associates the instruction with other instructions of that color, i.e. a first group); 
determining, based on the first instruction and the second instruction being associated with the first group, whether to perform the load and store operations in order (col 5 lines 6-25: based on the instructions being associated with the same color the processor may determine to perform the load and store operations in order); and 
cause the second instruction to be performed in order (col 5 lines 6-25: load and store operations of the same color may be performed in order).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Ray to color its memory access instructions to enforce memory consistency semantics as taught by Shen. In this combination, the load, store, and DCBT instructions of Ray would each include a field for color information. One of ordinary skill in the art would have been motivated to make this modification to enable enforcement of different memory consistency models (Shen col 2 lines 2-4), which would improve performance and flexibility of the cache by allowing an optimal memory consistency model to be used for each memory region.
	Further, Panwar teaches that a store miss to L1 will proceed to L2 (col 10 lines 9-10) and teaches using coloring information to ensure that a load does not execute until a similarly colored store executes (col 14 lines 22-39).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Ray in view of Shen to store data that misses in L1 into L2 as taught by Panwar and to assign a common color to load instructions that are dependent on store instructions as further taught by Panwar, such that a store instruction and a DCBT instruction that depends on the store instruction use a common color and such that store misses in L1 write to L2. In this combination, the LSU of Ray would execute the dependent DCBT instruction to control 136/138 after executing the store instruction having the same color as the DCBT instruction. One of ordinary skill in the art would have been motivated to modify the LSU of Ray to store misses in L1 to L2 because storing data that misses to the next level of cache is a known technique on the known device of a computer processor for handling store misses and would yield the predictable result of preventing the loss of data by ensuring the data is written to the correct location. Further, one of ordinary skill in the art would have been motivated to modify the LSU of Ray to execute a dependent load instruction to control the streaming engine only after the store instruction it depends on is performed to avoid RAW hazards, such as loading incorrect data, which would occur if a dependent load instruction executes ahead of a store it depends on.  
	Furthermore, Frey teaches:
determine whether a cache has been updated based on a write of the first instruction (col 6 lines 26-32, col 6 lines 38-39, and col 15 lines 20-24: the L2 cache may send a store completion acknowledgement signal indicating that a store operation has successfully completed and that its data has been copied into L2).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the L2 cache of Ray to send an acknowledgement once data has been copied into L2 as taught by Frey such that the LSU of Ray receives the acknowledgement from L2. In this combination, a store instruction and dependent DCBT instruction which share the same color would cause Ray to determine whether it has received an acknowledgement of a write from L2 indicating that the second cache has been updated by the store instruction, which uses the L1 cache control 120/memory controller of Ray, so that it can determine when to execute the dependent DCBT instruction. One of ordinary skill in the art would have been motivated to make this modification because sending write acknowledgements is a known technique on the known device of a processor for indicating a write has occurred and would yield the predictable result of enabling the processor to ensure safe execution of instructions dependent on the write.
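	For illustration only, the ordering behavior of the proposed combination can be sketched as a small Python model. The names ToyLSU, issue_store, l2_ack, l2_updated_for, and try_prefetch are hypothetical and do not appear in Ray, Shen, Panwar, or Frey. The sketch assumes that every store is forwarded to L2 and that the color is a simple label; it is not a description of any reference's actual hardware.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLSU:
    """Hypothetical load/store unit that tracks un-acked stores per color."""
    pending: dict = field(default_factory=dict)  # color -> set of store ids awaiting an L2 ack
    log: list = field(default_factory=list)

    def issue_store(self, store_id, color):
        # Assumption: the store is forwarded to L2 and stays pending until L2 acks it.
        self.pending.setdefault(color, set()).add(store_id)
        self.log.append(f"store {store_id} issued (color {color})")

    def l2_ack(self, store_id, color):
        # L2 signals that the store data has been written.
        self.pending.get(color, set()).discard(store_id)
        self.log.append(f"store {store_id} acked by L2")

    def l2_updated_for(self, color):
        # The "determining" step: no same-color store is still awaiting an ack.
        return not self.pending.get(color)

    def try_prefetch(self, color):
        # A same-color prefetch accesses L2 only after all same-color stores are acked.
        if not self.l2_updated_for(color):
            self.log.append(f"prefetch (color {color}) held")
            return False
        self.log.append(f"prefetch (color {color}) performed on L2")
        return True
```

	In this sketch, a prefetch of one color is held while a store of the same color is un-acked, whereas a prefetch of a different color proceeds immediately; the group membership, not program order alone, decides whether the acknowledgement is awaited.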

	Regarding claim 12, Ray in view of Shen, Panwar, and Frey teaches:
12. The method of claim 11, 
	Ray in view of Shen, Panwar, and Frey does not teach:
wherein the first instruction and the second instruction are associated with the first group based on a value stored in a processor task state register.
However, Shen further teaches:
wherein instructions are associated with a group based on a value stored in a processor task state register (col 3 lines 8-11 and lines 35-37: a color instruction is used to assign colors; the color instruction can provide the color information as a value maintained in a register, i.e. a task state register).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Ray in view of Shen and Panwar to support a color instruction that assigns colors using a register as further taught by Shen such that the memory access instructions of Ray are assigned colors based on a value stored in a register. One of ordinary skill in the art would have been motivated to make this modification to provide flexibility to the software (Shen col 3 lines 8-11).

	Regarding claim 13, Ray in view of Shen, Panwar, and Frey teaches:
13. The method of claim 12 further comprising: 
receiving a third instruction (Shen col 3 lines 35-37: the color instruction that assigns color information is a third instruction); and 
updating the value stored in the processor task state register based on the third instruction (Shen col 3 lines 8-11 and lines 35-37: the color instruction provides color information as a value maintained in a register, where the writing of the value to the register is an update to the value stored in the register).
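	For illustration only, the register-based grouping mapped above for claims 12 and 13 can be sketched in Python. The names ToyCore, execute_color, and tag are hypothetical and are not taken from Shen; the sketch assumes the color is a single integer held in one register and applied to every memory instruction issued afterwards.

```python
class ToyCore:
    """Hypothetical core with one register holding the current color (group)."""

    def __init__(self):
        self.color_reg = 0  # assumed reset value

    def execute_color(self, value):
        # The "third instruction": updates the value stored in the register.
        self.color_reg = value

    def tag(self, op):
        # Each memory instruction is associated with the group currently in the register.
        return (op, self.color_reg)
```

	Under these assumptions, a store and a prefetch issued after the same color instruction carry the same tag and therefore fall in the same group.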
	
Regarding claim 15, Ray in view of Shen, Panwar, and Frey teaches: 
15. The method of claim 11, wherein the determining of whether the second cache has been updated based on the write of the first instruction includes determining whether data associated with the first instruction has been provided from the first cache to the second cache (in the combination, Ray will determine whether or not it has received an ack from L2 in order to proceed with the execution of the DCBT instruction; see Frey col 15 lines 20-24, teaching L2 sending an ack. This is a determination of whether or not L2 has been updated by/based on the write of the store instruction to L1 when the store instruction misses in L1, since the ack from L2 will tell the LSU that the store instruction missed in L1; see Panwar col 10 lines 9-10, disclosing that a store miss in L1 writes to L2. This determination is based on whether or not the data of the store instruction is provided from L1 to L2, since the LSU does not determine that L2 has been updated based on the store instruction until it receives the ack).

	Regarding claim 16, Ray in view of Shen, Panwar, and Frey teaches:
16. The method of claim 11, wherein the first cache includes a level-one (L1) cache and the second cache includes a level-two (L2) cache (Ray Fig. 1 118 is an L1 D-cache and 132 is an L2 cache).

	Regarding claim 17, Ray in view of Shen, Panwar, and Frey teaches:
17. The method of claim 11, wherein the first cache includes a level-one data (L1 D) cache and the second cache includes a level-two (L2) cache (Ray Fig. 1 118 is an L1 D-cache and 132 is an L2 cache).

Regarding claim 20, Ray in view of Shen, Panwar, and Frey teaches:
20. The method of claim 11, wherein the access of the second cache by the second instruction includes a prefetch of data from the second cache (Ray [0029]: the third DCBT instruction prefetches data from L2).

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Ray et al. US 2004/0044847 in view of Shen et al. US 8,671,248, Panwar et al. US 5,999,727, Frey et al. US 5,185,871, and Codrescu et al. US 7,523,295.
	Regarding claim 14, Ray in view of Shen, Panwar, and Frey teaches:
14. The method of claim 11, 
	Ray in view of Shen, Panwar, and Frey does not teach:
wherein the first instruction and the second instruction are associated with the first group based on the first instruction and the second instruction being in a same execute packet.
However, Codrescu teaches:
an architecture which executes instruction packets wherein a first instruction and a second instruction are interdependent and the first instruction and the second instruction are in a same execute packet (col 3 lines 28-43 and 53-56: the interleaved multithreading architecture executes instruction packets which may have interdependent instructions).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Ray to have an interleaved multithreading architecture which executes instruction packets and supports interdependent instructions in the packets as taught by Codrescu. In this combination, Ray in view of Shen, Panwar, and Codrescu would assign the same color to store and load instructions in the same fetch packet to maintain ordering based on being in the same fetch packet since instructions in the same packet may be interdependent. One of ordinary skill in the art would have been motivated to modify the architecture of Ray to be multithreaded and to support the execution of instruction packets to increase clock frequency while maintaining high core and memory utilization (Codrescu col 1 lines 23-25) and to increase flexibility to manipulate program logic (Codrescu col 1 lines 36-39).

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Ray et al. US 2004/0044847 in view of Shen et al. US 8,671,248, Panwar et al. US 5,999,727, Frey et al. US 5,185,871, and Van Dyke et al. US 7,340,577.
Regarding claim 18, Ray in view of Shen, Panwar, and Frey teaches:
18. The method of claim 11, 
	Ray in view of Shen, Panwar, and Frey does not teach:
wherein the performing of the second instruction on the second cache after the second cache has been updated based on the write of the first instruction includes inserting a set of no-op instructions prior to the second instruction.
	However, the background of Van Dyke teaches:
performing a second instruction after a memory has been updated based on a write of a first instruction by inserting a set of no-op instructions (col 1 line 65-col 2 line 9: a read instruction is performed after memory has been updated by a write instruction by inserting a series of no-ops to delay the read instruction).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the LSU of Ray to insert NOPs before a load instruction dependent on a store instruction such that the load instruction is performed to control 136/138 after the store instruction updates L2 cache. One of ordinary skill in the art would have been motivated to make this modification because using NOPs is a known technique on the known device of a computer processor for delaying an operation and would yield the predictable result of simplifying the implementation of a delay between two instructions since NOPs are a simple way to insert a delay into a pipeline.
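	For illustration only, the NOP-based delay discussed above can be sketched in Python. The function insert_nops, the (op, addr) tuple format, and the store_latency parameter are hypothetical and are not taken from Van Dyke or Ray; the sketch assumes one instruction issues per cycle and that a store needs a fixed number of cycles to reach the memory being read.

```python
def insert_nops(program, store_latency):
    """Pad NOPs before each load that reads an address written by an earlier store.

    program: list of (op, addr) tuples (hypothetical format).
    store_latency: assumed number of cycles a store needs before its data is visible.
    """
    out = []
    last_store = {}  # addr -> position in `out` of the most recent store to that address
    for op, addr in program:
        if op == "load" and addr in last_store:
            # Instructions already sitting between the store and this load.
            gap = len(out) - last_store[addr] - 1
            out.extend([("nop", None)] * max(0, store_latency - gap))
        if op == "store":
            last_store[addr] = len(out)
        out.append((op, addr))
    return out
```

	Under these assumptions, a load immediately following a store to the same address with a latency of two cycles is preceded by two NOPs, and independent instructions already between the two reduce the number of NOPs needed.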

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Ray et al. US 2004/0044847 in view of Shen et al. US 8,671,248, Panwar et al. US 5,999,727, Frey et al. US 5,185,871, and Palanca et al. US 9,342,310.
	Regarding claim 19, Ray in view of Shen, Panwar, and Frey teaches:
19. The method of claim 11 
	Ray in view of Shen, Panwar, and Frey does not teach:
receiving a third instruction, wherein the third instruction includes a field that includes a predetermined value; and 
determining, based on the third instruction including the predetermined value, whether the second cache has been updated based on all pending instructions regardless of group.
	However, Palanca teaches:
a third instruction includes a field that includes a predetermined value (col 6 lines 59-60: the opcode of the MFENCE instruction is a field whose value is predetermined by the ISA); and 
determining, based on a third instruction including the predetermined value, whether a memory has been updated based on all pending instructions regardless of group (col 7 lines 1-17 and lines 26-30: an MFENCE causes a determination of whether all pending instructions in the memory subsystem are retired and all reads and writes in the cache controller are globally observed before the MFENCE is deallocated; whether the MFENCE is deallocated will determine whether memory has been updated based on all the pending instructions).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Ray to support an MFENCE instruction as taught by Palanca such that the LSU may receive the MFENCE instruction and determine the L2 cache has been updated based on all pending instructions when the MFENCE instruction is retired. One of ordinary skill in the art would have been motivated to make this modification to maintain strong memory ordering when required (Palanca col 5 lines 29-30).
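	For illustration only, the difference between a group-based check and a fence that applies regardless of group can be sketched in Python. The functions same_color_ready and fence_may_retire and the pending_by_color mapping are hypothetical and are not taken from Palanca or Shen; the sketch assumes pending writes are tracked as a set of identifiers per color.

```python
def same_color_ready(pending_by_color, color):
    """Group-based check: only un-acked writes of the given color matter."""
    return not pending_by_color.get(color)


def fence_may_retire(pending_by_color):
    """Fence-style check: every color must have no un-acked writes."""
    return all(not ids for ids in pending_by_color.values())
```

	Under these assumptions, an instruction of one color may proceed while writes of another color are outstanding, but the fence may not retire until the pending sets for all colors are empty.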

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KASIM ALLI whose telephone number is (571)270-1476. The examiner can normally be reached Monday - Friday, 9am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at (571) 270-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KASIM ALLI/Examiner, Art Unit 2183          

/JYOTI MEHTA/Supervisory Patent Examiner, Art Unit 2182