DETAILED ACTION
Status of Claims 
Claims 1-20 have been considered. It is hereby acknowledged that the following papers have been received and placed of record in the file:
Applicant Remarks 						-Receipt Date 03/14/2022
Amended Claims 						-Receipt Date 03/14/2022

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 03/14/2022 has been entered.
 

Response to Amendment
This office action is in response to the amendment filed on 03/14/2022. Claims 1-20 are pending. Claims 1 and 12 are amended. 

Response to Arguments
Applicant's arguments filed 03/14/2022 have been fully considered but they are not persuasive. 
Applicant submits:
“even if it were assumed, for the sake of argument only (and Applicant does not concede this) that the LFDQ instruction of Blainey in some sense corresponds to "...a load micro-operation with twice the data size and a load data only micro operation...", Blainey does not disclose or teach "...wherein the load data only micro-operation allocates a destination physical register for the load micro-operation with twice the data size." as in Claim 1 at least for the reason that the expression "r200=(r200, d1+d2)" does not update a destination register. 
In the passage of Blainey cited above, the quad load instruction LFQU loads data from memory starting at an address stored in r200 into registers fp101 and fp102, and updates the address stored in r200 by (d1+d2). However, r200 is not a destination register - rather, it stores an address of memory from which data is retrieved, and then stored in fp 101 and fp 102. Thus, even if it were assumed, for the sake of argument (and Applicant does not concede this) that the expression "r200=(r200, d1+d2)" were to correspond in some sense to another micro-operation, it does not disclose a micro-operation which "...allocates a destination physical register for the load micro-operation with twice the data size."” (Remarks, pages 10-11)
	However, this argument is not persuasive. A destination register is a register to which an instruction or operation writes a result; such a register is a “destination” of the operation, as opposed to a source register, which is a register from which the operation reads. In Blainey, the update operation r200=(r200, d1+d2) writes the sum r200+d1+d2 to r200, which makes r200 a destination register of this update operation. See Blainey Appendix 1, which describes this operation as part of the LFQU instruction.
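For illustration only, the source/destination distinction relied on above can be sketched in Python. This is a minimal model of the update operation as characterized in this action, not of Blainey's implementation; the function name `apply_base_update`, the register-file dictionary, and the sample values are assumptions introduced for the example.

```python
def apply_base_update(regs, base, d1, d2):
    """Model the update r200 = r200 + d1 + d2 (hypothetical helper).

    Returns the new register file plus the sets of registers the
    operation reads (sources) and writes (destinations).
    """
    sources = {base}        # the old value of the base register is read
    destinations = {base}   # the sum is written back to the same register
    updated = dict(regs)    # copy so the caller's register file is untouched
    updated[base] = regs[base] + d1 + d2
    return updated, sources, destinations


# Assumed sample values: base address 1000, displacements 16 and 8.
regs, sources, destinations = apply_base_update({"r200": 1000}, "r200", 16, 8)
```

In this model r200 appears in both sets: it is a source because its old value is read, and a destination because the result of the operation is written to it.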

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 3-5, 7, 11-12, 14, 16, and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Blainey US 5,613,121.
Regarding claim 1, Blainey teaches:
1. (Currently Amended) A method for fusing load micro-operations, the method comprising: 
determining whether two adjacent micro-operations are consecutive load micro-operations (col 9 lines 1-5: the determination of whether there are any intervening instructions between two LFDU instructions is a determination of whether the two LFDU instructions are consecutive load micro-operations); 
determining whether the two adjacent micro-operations each have a data size that is the same (col 8 lines 17-39: by scanning the window to find other double precision load instructions LFDU after a first LFDU is identified, it is determined whether the remaining loads are double precision loads/whether they have a data size that is the same); 
determining whether the two adjacent micro-operations access consecutive addresses (col 7 lines 43-58 and col 8 lines 48-61: by determining whether the displacement of the second load d2 is +/-8, it is determined whether the second load is accessing the 8 bytes before the first load or the 8 bytes after the first load, i.e. consecutive addresses of double precision data); and 
fusing the two adjacent micro-operations into a load micro-operation with twice the data size (col 8 lines 46-61: eligible pairs of consecutive double precision load instructions LFDU are fused into/replaced with a quad load instruction LFQU which loads twice the data size) and a load data only micro-operation (col 8 lines 46-61: the update of the register r200 is the load data only micro-operation since the data it loads into, i.e. r200+d1+d2, is only for the LFQU load instruction, see also Appendix 1) on a condition that the two adjacent micro-operations are determined to be consecutive load micro-operations (col 9 lines 1-9: two loads that are determined to have no intervening instructions, i.e. determined to be consecutive loads, are eligible to be replaced without having to examine any intervening instructions, i.e. are fused on the condition that they are consecutive), to both have the data size (col 8 lines 17-20: pairs of double precision loads, i.e. loads that have a same data size, are searched for to be replaced), and to access consecutive addresses (col 8 lines 46-55: a pair of loads are eligible to be replaced if they meet the criteria of accessing consecutive addresses), wherein the load data only micro-operation allocates a destination physical register to which to write data of the load micro-operation with twice the data size (col 8 lines 46-61: the operation that updates the destination register r200, i.e. r200=r200+d1+d2, is a load data only micro-operation that allocates r200 to store (r200, d1+d2) which is data of the LFQU instruction).
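
For illustration only, the three determinations and the conditional fusion mapped above can be sketched in Python. This is a simplified model under stated assumptions, not Blainey's implementation or the claimed hardware: the `Load` and `FusedLoad` types and the functions `can_fuse` and `fuse` are hypothetical, and the model assumes update-form addressing in which the second load's displacement is applied to the base already updated by the first load, so that a displacement of +/-8 denotes an adjacent double precision location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    dest: str   # destination register, e.g. "fp101"
    base: str   # base address register, e.g. "r200"
    disp: int   # displacement in bytes
    size: int   # data size in bytes (8 for double precision)


@dataclass(frozen=True)
class FusedLoad:
    dests: tuple      # both destination registers
    base: str         # shared base address register
    size: int         # twice the data size of each original load
    base_update: int  # total amount added to the base, i.e. d1 + d2


def can_fuse(ops, i):
    """Check the three conditions for ops[i] and ops[i + 1]."""
    if i + 1 >= len(ops):
        return False
    a, b = ops[i], ops[i + 1]
    # (1) consecutive loads: both are loads with nothing in between
    if not (isinstance(a, Load) and isinstance(b, Load)):
        return False
    # (2) same data size
    if a.size != b.size:
        return False
    # (3) consecutive addresses: same base, second displacement is +/- size
    return a.base == b.base and abs(b.disp) == a.size


def fuse(a, b):
    """Replace an eligible pair with one double-width load plus base update."""
    return FusedLoad(dests=(a.dest, b.dest), base=a.base,
                     size=a.size + b.size, base_update=a.disp + b.disp)


# Assumed sample pair: two 8-byte loads off r200 with d1 = 16 and d2 = 8.
pair = [Load("fp101", "r200", 16, 8), Load("fp102", "r200", 8, 8)]
fused = fuse(pair[0], pair[1]) if can_fuse(pair, 0) else None
```

The sketch does not model the d2 = -8 case in full (register ordering and the start address of the double-width access), and it omits the handling of intervening instructions discussed at col 9.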

Regarding claim 3, Blainey teaches:
3. The method of claim 1, further comprising: 
reviewing an addressing mode of each the two adjacent micro-operations (col 8 lines 33-61: by looking for LFDU instructions that employ base-displacement addressing and by looking at the displacement of the second load d2, the addressing mode of each of the consecutive micro-operations/instructions is reviewed).

Regarding claim 4, Blainey teaches:
4. The method of claim 1, wherein of the two adjacent micro-operations, a micro-operation having a lower address is converted to the load micro-operation with twice the data size (col 9 lines 45-48: the first instruction of the pair is replaced/converted to the load with twice the data size; col 8 lines 46-61: the first instruction of the pair has a lower address in the case where d2=+8).

Regarding claim 5, Blainey teaches:
5. The method of claim 4, wherein of the two adjacent micro-operations, a micro-operation having a higher address is converted to the load data only micro-operation (col 8 lines 46-61 and col 9 lines 45-50: when d2=+8, the second load instruction has a higher address and is deleted and the update of the r200 register, i.e. the load data only micro-operation, is performed instead, i.e. the higher address micro-operation is converted to the load data only micro-operation).

Regarding claim 7, Blainey teaches:
7. The method of claim 1, further comprising: 
marking a memory renamed load micro-operation as ineligible if memory renaming operates before load fusion (col 8 lines 33-61: a memory renamed load does not meet the eligibility criteria, i.e. marked ineligible, since the eligibility of the renamed registers of the load, i.e. whether the base registers are the same or the source registers are adjacent, are not able to be determined).

Regarding claim 11, Blainey teaches:
11. The method of claim 1, further comprising: 
inserting a bubble into a load pipeline on a cycle immediately following a fused load micro-operation to allow the fused load micro-operation an extra cycle to process a HI portion of load results (col 9 lines 15-27: a copy instruction/bubble is inserted into the load pipeline after a fused load which allows an extra cycle to process the register of the second load instruction, i.e. a HI portion of load results).

	Regarding claim 12, Blainey teaches:
12. A processor configured to fuse load micro- operations, comprising: 
dispatch logic configured to dispatch micro-operations (col 12 lines 18-25: the processor executing the benchmark program dispatches micro-operations of the program); and 
load fusion detection logic connected to the dispatch logic (col 4 lines 28-33: the logic for replacing/fusing loads is connected to dispatch logic since the fused loads are dispatched during execution after they are fused), the load fusion detection logic configured to: 
determine whether two adjacent micro-operations are consecutive load micro-operations (col 9 lines 1-5: the determination of whether there are any intervening instructions between two LFDU instructions is a determination of whether the two LFDU instructions are consecutive load micro-operations); 
determine whether the two adjacent micro-operations each have a data size that is the same (col 8 lines 17-39: by scanning the window to find other double precision load instructions LFDU after a first LFDU is identified, it is determined whether the remaining loads are double precision loads/whether they have a data size that is the same); 
determine whether the two adjacent micro-operations access consecutive addresses (col 7 lines 43-58 and col 8 lines 48-61: by determining whether the displacement of the second load d2 is +/-8, it is determined whether the second load is accessing the 8 bytes before the first load or the 8 bytes after the first load, i.e. consecutive addresses of double precision data); and 
fuse the two adjacent micro-operations into a load micro-operation with twice the data size (col 8 lines 46-61: eligible pairs of consecutive double precision load instructions LFDU are fused into/replaced with a quad load instruction LFQU which loads twice the data size) and a load data only micro-operation (col 8 lines 46-61: the update of the register r200 is the load data only micro-operation) on a condition that the two adjacent micro-operations are determined to be consecutive load micro-operations (col 9 lines 1-9: two loads that are determined to have no intervening instructions, i.e. determined to be consecutive loads, are eligible to be replaced without having to examine any intervening instructions, i.e. are fused on the condition that they are consecutive), to both have the data size (col 8 lines 17-20: pairs of double precision loads, i.e. loads that have a same data size, are searched for to be replaced), and to access consecutive addresses (col 8 lines 46-55: a pair of loads are eligible to be replaced if they meet the criteria of accessing consecutive addresses), wherein the load data only micro-operation allocates a destination physical register to which to write data of the load micro-operation with twice the data size (col 8 lines 46-61: the operation that updates the destination register r200, i.e. r200=(r200, d1+d2), is a load data only micro-operation that allocates r200 to write (r200, d1+d2) which is data of the LFQU instruction).

	Regarding claim 14, Blainey teaches:
14. The processor of claim 12, wherein the load fusion detection logic is configured to convert a micro-operation of the two adjacent micro-operations having a lower address to the load micro-operation with twice the data size (col 9 lines 45-48: the first instruction of the pair is replaced/converted to the load with twice the data size; col 8 lines 46-61: the first instruction of the pair has a lower address in the case where d2=+8) and to convert a micro-operation of the two adjacent micro-operations having a higher address to the load data only micro-operation (col 8 lines 46-61 and col 9 lines 45-50: when d2=+8, the second load instruction has a higher address and is deleted and the update of the r200 register, i.e. the load data only micro-operation, is performed instead, i.e. the higher address micro-operation is converted to the load data only micro-operation).

	Regarding claim 16, Blainey teaches: 
16. The processor of claim 12, wherein the processor is further configured to mark a memory renamed load micro-operation as ineligible when memory renaming operates before load fusion (col 8 lines 33-61: a memory renamed load does not meet the eligibility criteria, i.e. marked ineligible, since the eligibility of the renamed registers of the load, i.e. whether the base registers are the same or the source registers are adjacent, are not able to be determined).  

	Regarding claim 20, Blainey teaches:
20. The processor of claim 12, wherein the load fusion detection logic is configured to insert a bubble into a load pipeline on a cycle immediately following a fused load micro-operation to allow the fused load micro-operation an extra cycle to process a HI portion of load results (col 9 lines 15-27: a copy instruction/bubble is inserted into the load pipeline after a fused load which allows an extra cycle to process the register of the second load instruction, i.e. a HI portion of load results).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 2, 6, 13, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Blainey US 5,613,121 in view of Levison et al. US 2018/0129498 (hereinafter, Levison).
	Regarding claim 2, Blainey teaches:
		 2. The method of claim 1, 
	Although Blainey teaches replacing pairs of load instructions with a load instruction that loads twice the data size and a load data only micro-operation (Blainey col 8 lines 46-61), Blainey does not explicitly discuss the effects this has on the load queue entries and the address generation scheduler queue entries. That is, Blainey does not explicitly teach:
wherein the load data only micro- operation suppresses use of load queue entries and address generation scheduler queue entries.
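	However, Levison discusses fusing consecutive instructions (Levison, Abstract and [0003]). In particular, Levison teaches: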
merging instructions suppresses use of load queue entries and address generation scheduler queue entries ([0003]: the instruction merge saves a place/suppresses use of an entry in the renamer, the ROB, and the schedulers, which would include the load queue and the address generation scheduler queue).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Blainey to avoid using a load queue entry and address generation scheduler queue entry for a merged/fused load as taught by Levison such that the update register portion of the LFQU instruction does not use/suppresses use of load queue entries and address generation scheduler queue entries. One of ordinary skill in the art would have been motivated to make this modification in implementing the load fusion in order to improve performance and free up space in the load queue and address generation scheduler queue (Levison [0003]).

Regarding claim 6, Blainey teaches:
6. The method of claim 1, further comprising: 
Although Blainey teaches replacing pairs of load instructions with a load instruction that loads twice the data size and a load data only micro-operation (Blainey col 8 lines 46-61), Blainey does not explicitly discuss the effects this has on memory renaming. That is, Blainey does not explicitly teach:
 marking fused loads as ineligible for memory renaming if memory renaming operates after load fusion.
	However, Levison discusses fusing consecutive instructions (Levison, Abstract and [0003]). In particular, Levison teaches:
marking fused loads as ineligible for memory renaming when memory renaming operates after load fusion ([0003]: the instruction merge saves a place in the renamer, i.e. the merged/fused instruction, which may be a load instruction, does not go to/is ineligible for the memory renamer).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Blainey to avoid renaming a fused load as taught by Levison. One of ordinary skill in the art would have been motivated to make this modification in implementing the load fusion in order to improve performance (Levison [0003]) and free up the memory renamer.

	Regarding claim 13, Blainey teaches:
13. The processor of claim 12, 
	Although Blainey teaches replacing pairs of load instructions with a load instruction that loads twice the data size and a load data only micro-operation (Blainey col 8 lines 46-61), Blainey does not explicitly discuss the effects this has on the load queue entries and the address generation scheduler queue entries. That is, Blainey does not explicitly teach:
wherein the load data only micro- operation suppresses use of load queue entries and address generation scheduler queue entries.
However, Levison discusses fusing consecutive instructions (Levison, Abstract and [0003]). In particular, Levison teaches:
merging instructions suppresses use of load queue entries and address generation scheduler queue entries ([0003]: the instruction merge saves a place/suppresses use of an entry in the renamer, the ROB, and the schedulers, which would include the load queue and the address generation scheduler queue).
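It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Blainey to avoid using a load queue entry and address generation scheduler queue entry for a merged/fused load as taught by Levison such that the update register portion of the LFQU instruction does not use/suppresses use of load queue entries and address generation scheduler queue entries. One of ordinary skill in the art would have been motivated to make this modification in implementing the load fusion in order to improve performance and free up space in the load queue and address generation scheduler queue (Levison [0003]).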

	Regarding claim 15, Blainey teaches: 
15. The processor of claim 12, 
Although Blainey teaches replacing pairs of load instructions with a load instruction that loads twice the data size and a load data only micro-operation (Blainey col 8 lines 46-61), Blainey does not explicitly discuss the effects this has on memory renaming. That is, Blainey does not explicitly teach:
wherein the processor is further configured to mark fused loads as ineligible for memory renaming when memory renaming operates after load fusion.  
	However, Levison discusses fusing consecutive instructions (Levison, Abstract and [0003]). In particular, Levison teaches:
marking fused loads as ineligible for memory renaming when memory renaming operates after load fusion ([0003]: the instruction merge saves a place in the renamer, i.e. the merged/fused instruction, which may be a load instruction, does not go to/is ineligible for the memory renamer).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processor of Blainey to avoid renaming a fused load as taught by Levison. One of ordinary skill in the art would have been motivated to make this modification in implementing the load fusion in order to improve performance (Levison [0003]) and free up the memory renamer.

Claims 8 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Blainey US 5,613,121 in view of Sperber et al. US 2004/0199755 (hereinafter, Sperber).
	Regarding claim 8, Blainey teaches:
8. The method of claim 1, 
	Blainey does not discuss how exceptions are handled. That is, Blainey does not explicitly teach:
wherein an occurrence of an exception with respect to at least one of the load micro-operation with twice the data size and the load data only micro-operation results in re-execution of the adjacent micro- operations without fusing.
	However, Sperber discusses handling exceptions that occur during execution of fused micro-operations (Sperber, Abstract). In particular, Sperber teaches: 
wherein an occurrence of an exception with respect to the fused micro-operation results in re-execution of the adjacent micro- operations without fusing ([0048]-[0051]: if an exception occurs during execution of a fused-op, the fused exception handler selects an unfused mode which re-executes the instructions without fusing).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system of Blainey to handle exceptions during execution of fused loads as taught by Sperber. One of ordinary skill in the art would have been motivated to make this modification to resolve exceptions of operations that are steps of fused uops (Sperber [0045]). 
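
For illustration only, the fallback behavior relied on in this combination can be sketched in Python. This is a simplified software analogy under stated assumptions, not Sperber's fused exception handler: the `LoadFault` exception, the `run_pair` function, and its callable parameters are hypothetical, and a Python exception stands in for a hardware exception raised during execution of the fused operation.

```python
class LoadFault(Exception):
    """Stand-in for an exception raised while executing a fused load."""


def run_pair(pair, run_fused, run_single):
    """Try the fused path; on a fault, re-execute each load without fusing."""
    try:
        return run_fused(pair), "fused"
    except LoadFault:
        # Unfused mode: the original adjacent operations are re-executed
        # one at a time instead of as a single fused operation.
        return [run_single(op) for op in pair], "unfused"


def faulting_fused(pair):
    """Hypothetical fused executor that always faults, for demonstration."""
    raise LoadFault("fault during fused load")


result, mode = run_pair(["ld_a", "ld_b"], faulting_fused, lambda op: op.upper())
```

When the fused path completes without a fault, `run_pair` returns the fused result and the mode string "fused"; the unfused path is taken only after a fault.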

	Regarding claim 17, Blainey teaches:
17. The processor of claim 12,
	Blainey does not discuss how exceptions are handled. That is, Blainey does not explicitly teach:
wherein an occurrence of an exception with respect to at least one of the load micro-operation with twice the data size and the load data only micro-operation results in re-execution of the adjacent micro- operations without fusing.
	However, Sperber discusses handling exceptions that occur during execution of fused micro-operations (Sperber, Abstract). In particular, Sperber teaches: 
wherein an occurrence of an exception with respect to the fused micro-operation results in re-execution of the adjacent micro- operations without fusing ([0048]-[0051]: if an exception occurs during execution of a fused-op, the fused exception handler selects an unfused mode which re-executes the instructions without fusing).
	It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system of Blainey to handle exceptions during execution of fused loads as taught by Sperber. One of ordinary skill in the art would have been motivated to make this modification to resolve exceptions of operations that are steps of fused uops (Sperber [0045]). 

Allowable Subject Matter
Claims 9-10 and 18-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The reasons for allowance are the same as the reasons for allowance given in the Non-Final Rejection dated 11/15/2019.

Conclusion	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KASIM ALLI whose telephone number is (571)270-1476. The examiner can normally be reached Monday - Friday, 9am to 5pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta can be reached on (571) 270-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KASIM ALLI/
Examiner, Art Unit 2183

/JYOTI MEHTA/
Supervisory Patent Examiner, Art Unit 2182