DETAILED ACTION
Claims 1-17 are pending.
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 3/7/2022 has been entered. 
The office acknowledges the following papers:
Power of Attorney, ADS, Assignee showing ownership filed on 3/4/2022,
Claims and remarks filed on 3/7/2022,
Power of Attorney, ADS, Assignee showing ownership filed on 3/11/2022.

	Allowable Subject Matter
Claims 3-6 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
As allowable subject matter has been indicated, applicant's reply must either comply with all formal requirements or specifically traverse each requirement not complied with.  See 37 CFR 1.111(b) and MPEP § 707.07(a).

New Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claim 17 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
The term “make the instruction execution more efficient” in claim 17 is a relative term which renders the claim indefinite. The term “make the instruction execution more efficient” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. For examination purposes, the limitation containing the indefinite term is not examined.

New Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 7, 12, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Hu et al. (U.S. 2008/0201591), in view of Chen et al. (U.S. 2008/0294881), in view of Xu et al. (U.S. 2010/0318972).
As per claim 1:
Hu disclosed a method of tracing instruction execution on a processor of an integrated circuit chip in real time whilst the processor continues to execute instructions during clock cycles of the processor, the method comprising:
at tracing circuitry on the integrated circuit chip (Hu: Figures 1-2 elements 101 and 210, paragraph 26):
monitoring the instruction execution of the processor by:
counting a number of successive instructions which are retired contiguously in time to form an instruction count (Hu: Figure 2 element 212, paragraph 27)(The broadest reasonable interpretation of contiguously in time is next or near in time or sequence. The first counter counts the number of retired micro-operations.).
Hu failed to teach monitoring the instruction execution of the processor by: counting a number of subsequent contiguous clock cycles of the processor during which no instruction is retired to form a stall count; generating a trace message which comprises the instruction count and the stall count; and outputting the trace message.
However, Chen combined with Hu disclosed monitoring the instruction execution of the processor by:
counting a number of subsequent contiguous clock cycles of the processor during which no instruction is retired to form a stall count (Chen: Figure 1 element 145, paragraphs 3, 17, and 28)(Hu: Figure 2 elements 210 and 216, paragraph 28)(The broadest reasonable interpretation of contiguously in time is next or near in time or sequence. Chen disclosed a completion stall counter that increments when the next to complete instruction cannot retire. Hu disclosed a generic stall counter that increments on a plurality of different stall reasons. The combination adds the completion stall counter as an additional aggregate performance counter in Hu.).
The advantage of using completion stall counters is that they can provide better information for stall analysis (Chen: Paragraph 3). Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the completion stall counter of Chen into the processor of Hu for the above advantage. 
Hu and Chen failed to teach generating a trace message which comprises the instruction count and the stall count; and outputting the trace message.
However, Xu combined with Hu and Chen disclosed generating a trace message which comprises the instruction count and the stall count (Xu: Figures 1 and 3-4 elements 156A5-25, 160, 440, and 470-475, paragraphs 16, 18, 29-30, and 37-39)(Chen: Figure 1 element 145, paragraphs 3, 17, and 28)(Hu: Figure 2 element 212, paragraphs 27 and 29-30)(Xu disclosed a trace unit that samples performance monitoring data on performance monitor events and generates trace messages. Hu disclosed a performance monitor reading performance counters on pre-configured intervals. The combination allows for generation of trace messages on performance monitor events that includes the completion stall counter of Chen and the retired micro-operation counter of Hu.); and 
outputting the trace message (Xu: Figures 2-3 elements 156A35, 157, and 210, paragraphs 25 and 27)(Chen: Figure 1 element 145, paragraphs 3, 17, and 28)(Hu: Figure 2 element 210, paragraphs 27 and 29-30)(The combination allows for generation of trace messages on performance monitor events that includes the completion stall counter of Chen and the retired micro-operation counter of Hu. The generated messages are output for analysis.).
The advantage of generating trace messages is that performance monitors can receive a standardized format. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the generation of trace messages in Hu for the advantage of outputting standardized messages to the performance monitoring element.
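For illustration only, the counting and message generation recited in claim 1 can be sketched in software. This is a minimal sketch and not the circuitry of Hu, Chen, or Xu: the names `TraceMessage` and `trace` are hypothetical, and the sketch assumes at most one instruction retires per clock cycle, with each cycle represented by a flag indicating whether an instruction retired.

```python
from typing import Iterable, List, NamedTuple


class TraceMessage(NamedTuple):
    """Hypothetical message format: one instruction count and one stall count."""
    instruction_count: int
    stall_count: int


def trace(retired_flags: Iterable[bool]) -> List[TraceMessage]:
    """Pair each run of retiring cycles with the run of stall cycles that follows it."""
    messages: List[TraceMessage] = []
    instr = 0  # instructions retired in consecutive cycles
    stall = 0  # subsequent consecutive cycles with no retirement
    for retired in retired_flags:
        if retired:
            if stall > 0:
                # A stall run just ended: emit the message and start a new run.
                messages.append(TraceMessage(instr, stall))
                instr = 0
                stall = 0
            instr += 1
        else:
            stall += 1
    if instr or stall:
        # Flush whatever was counted when the input ended.
        messages.append(TraceMessage(instr, stall))
    return messages
```

Under these assumptions, the cycle pattern retire, retire, retire, stall, stall, retire yields one message with an instruction count of 3 and a stall count of 2, followed by a final message for the last retired instruction.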
As per claim 7:
Claim 7 essentially recites the same limitations of claim 1. Claim 7 additionally recites the following limitations:
counting a number of instructions retired in each clock cycle to form a clock cycle count (Chen: Figure 1 elements 170-180, paragraph 20)(Hu: Figure 2 element 212, paragraph 27)(The combination implements the first counter to count the number of retired micro-operations each clock cycle.).
As per claim 12:
Claim 12 essentially recites the same limitations of claim 1. Claim 12 additionally recites the following limitations:
a processor configured to execute instructions during clock cycles of the processor (Chen: Figure 1 elements 170-180, paragraph 20)(Hu: Figure 2 element 101, paragraph 27)(The combination implements the multiple execution units of Chen into the CPU of Hu.);
tracing circuitry configured to trace instruction execution of the processor in real time (Xu: Figure 1 elements 155-160, paragraphs 16-18)(Hu: Figures 1-2 elements 101 and 210, paragraph 26).
As per claim 16:
Hu, Chen, and Xu disclosed the method of claim 1, wherein the trace message identifies how many successive instructions the processor has retired contiguously and how long the processor has stalled before resuming instruction execution (Xu: Figures 1 and 3-4 elements 156A5-25, 160, 440, and 470-475, paragraphs 16, 18, 29-30, and 37-39)(Chen: Figure 1 element 145, paragraphs 3, 17, and 28)(Hu: Figure 2 element 212, paragraphs 27 and 29-30)(Xu disclosed a trace unit that samples performance monitoring data on performance monitor events and generates trace messages. Hu disclosed a performance monitor reading performance counters on pre-configured intervals. The combination allows for generation of trace messages on performance monitor events that includes the completion stall counter of Chen and the retired micro-operation counter of Hu.).
As per claim 17:
Hu, Chen, and Xu disclosed the method of claim 1, further comprising:
analyzing the trace message (Xu: Figures 2-3 elements 156A35, 157, and 210, paragraphs 25 and 27)(Chen: Figure 1 element 145, paragraphs 3, 17, and 28)(Hu: Figure 2 element 210, paragraphs 27 and 29-30)(The combination allows for generation of trace messages on performance monitor events that includes the completion stall counter of Chen and the retired micro-operation counter of Hu. The generated messages are output for analysis by the debugger.); and 
identifying potential changes to an instruction set and/or operation of the processor based on the analysis to optimize the operation of the processor and make the instruction execution more efficient (In view of the 112(b) rejection, this limitation isn’t examined.).

Claims 2 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Hu et al. (U.S. 2008/0201591), in view of Chen et al. (U.S. 2008/0294881), in view of Xu et al. (U.S. 2010/0318972), in view of Official Notice.
As per claim 2:
Hu, Chen, and Xu disclosed the method of claim 1, further comprising:
comparing the stall count to a threshold stall count (Xu: Figure 1 element 160, paragraph 16)(Chen: Figure 1 element 145, paragraphs 3, 17, and 28)(Hu: Figure 2 elements 210 and 216, paragraph 28)(The combination adds the completion stall counter as an additional aggregate performance counter in Hu. Xu disclosed that a count of an event can trigger a performance monitor event. Official notice is given that thresholds can be compared to current count values for the advantage of determining if the threshold has been met or exceeded. Thus, it would have been obvious to one of ordinary skill in the art to compare the completion stall counter to a count value that triggers a performance monitor event.); and
generating and outputting the trace message when the stall count is the same as or exceeds the threshold stall count (Xu: Figures 2-3 elements 156A35, 157, and 210, paragraphs 25 and 27)(Hu: Figure 2 element 210, paragraphs 27 and 29-30)(The combination allows for generation of trace messages on performance monitor events that includes the completion stall counter of Chen and the retired micro-operation counter of Hu. In view of the above official notice, a comparison of the completion stall counter to a count value triggers a performance monitor event. This causes a message to be generated to be output for analysis.).
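The threshold comparison discussed for claim 2 can be sketched the same way. This is a hedged illustration, not a disclosure of any cited reference: the function name `trace_with_threshold` and the choice to compare at the end of each stall run are assumptions, as is a threshold of at least 1.

```python
from typing import Iterable, List, Tuple


def trace_with_threshold(retired_flags: Iterable[bool],
                         threshold: int) -> List[Tuple[int, int]]:
    """Emit (instruction_count, stall_count) only when the stall count
    is the same as or exceeds the threshold."""
    messages: List[Tuple[int, int]] = []
    instr = 0
    stall = 0
    for retired in retired_flags:
        if retired:
            if stall > 0:
                # Stall run ended: compare the stall count to the threshold.
                if stall >= threshold:
                    messages.append((instr, stall))
                instr = 0
                stall = 0
            instr += 1
        else:
            stall += 1
    # Handle a stall run that is still open when the input ends.
    if stall > 0 and stall >= threshold:
        messages.append((instr, stall))
    return messages
```

In this sketch, runs whose stall count stays below the threshold produce no message at all, so their instruction counts are dropped; a different design could carry them forward into the next message.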
As per claim 13:
The additional limitation(s) of claim 13 basically recite the additional limitation(s) of claim 2. Therefore, claim 13 is rejected for the same reason(s) as claim 2.
As per claim 14:
Claim 14 essentially recites the same limitations of claim 7. Claim 14 additionally recites the following limitations:
a processor configured to retire more than one instruction per clock cycle of the processor (Chen: Figure 1 elements 170-180, paragraph 20)(Hu: Figure 2 element 101, paragraph 27)(The combination implements the multiple execution units of Chen into the CPU of Hu. Official Notice is given that retirement units can retire multiple instructions per clock cycle for the advantage of higher instruction throughput. Thus, it would have been obvious to one of ordinary skill in the art that the complete unit of Chen can retire multiple instructions each clock cycle.);
tracing circuitry configured to trace instruction execution of the processor in real time (Xu: Figure 1 elements 155-160, paragraphs 16-18)(Hu: Figures 1-2 elements 101 and 210, paragraph 26).

Claims 8-9 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Hu et al. (U.S. 2008/0201591), in view of Chen et al. (U.S. 2008/0294881), in view of Xu et al. (U.S. 2010/0318972), further in view of Quach et al. (U.S. 2009/0019317).
As per claim 8:
Hu, Chen, and Xu disclosed the method of claim 7.
Hu, Chen, and Xu failed to teach counting a number of non-retired instruction executions to form a non-retired count.
However, Quach combined with Hu, Chen, and Xu disclosed counting a number of non-retired instruction executions to form a non-retired count (Quach: Figure 7 element 704, paragraphs 69-70)(Hu: Figure 2 element 210, paragraphs 26-30)(Quach disclosed recording a maximum number of instructions that aren’t able to retire in a clock cycle. The combination allows for recording this maximum number of instructions that aren’t able to retire during an interval time in a performance counter.).
The advantage of recording the maximum number of instructions that aren’t able to retire in a clock cycle is that it can provide better information for stall analysis. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the recording of the maximum number of instructions that aren’t able to retire, as taught by Quach, into the processor of Hu for the above advantage. 
As per claim 9:
Hu, Chen, Xu, and Quach disclosed the method of claim 8, wherein the trace message further comprises the non-retired count (Quach: Figure 7 element 704, paragraphs 69-70)(Xu: Figures 2-3 elements 156A35, 157, and 210, paragraphs 25 and 27)(Hu: Figure 2 element 210, paragraphs 27 and 29-30)(The combination allows for generation of trace messages on performance monitor events that include a counter of the maximum number of instructions that aren’t able to retire and the retired micro-operation counter of Hu. The generated messages are output for analysis.).
As per claim 15:
The additional limitation(s) of claim 15 basically recite the additional limitation(s) of claim 8. Therefore, claim 15 is rejected for the same reason(s) as claim 8.

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Hu et al. (U.S. 2008/0201591), in view of Chen et al. (U.S. 2008/0294881), in view of Xu et al. (U.S. 2010/0318972), further in view of Treue et al. (U.S. 7,231,551).
As per claim 10:
Hu, Chen, and Xu disclosed the method of claim 7.
Hu, Chen, and Xu failed to teach encoding the counts in the trace message using run length encoding.
However, Treue combined with Hu, Chen, and Xu disclosed encoding the counts in the trace message using run length encoding (Treue: Figure 9 element 910, column 22 lines 23-47)(Xu: Figures 1 and 3-4 elements 156A5-25, 160, 440, and 470-475, paragraphs 16, 18, 29-30, and 37-39)(Chen: Figure 1 element 145, paragraphs 3, 17, and 28)(Hu: Figure 2 element 212, paragraphs 27 and 29-30)(Treue disclosed compressing trace data using run length encoding. The combination allows for generation of compressed trace messages using run length encoding, where the messages occur on performance monitor events and include performance counter values.).
The advantage of run length encoding is that data can be compressed into smaller values for data transfers. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement the run length encoding compression of trace data of Treue into the combination for the above advantage.
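As general background on the technique named above, run length encoding replaces each run of repeated values with the value and its repeat count. The following is a generic sketch and not the specific scheme of Treue; the function names and the use of per-cycle retired flags as input are assumptions for illustration.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def run_length_encode(values: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse each run of identical values into a (value, run_length) pair."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def run_length_decode(pairs: Iterable[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into the original sequence."""
    decoded: List[int] = []
    for value, length in pairs:
        decoded.extend([value] * length)
    return decoded
```

Applied to per-cycle retired flags, the run lengths of the 1 values and 0 values correspond to counts of consecutive retiring cycles and consecutive stall cycles, which is why long uniform runs compress well.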

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Hu et al. (U.S. 2008/0201591), in view of Chen et al. (U.S. 2008/0294881), in view of Xu et al. (U.S. 2010/0318972), in view of Treue et al. (U.S. 7,231,551), further in view of Ryan (U.S. 2007/0247350).
As per claim 11:
Hu, Chen, Xu, and Treue disclosed the method of claim 10.
Hu, Chen, Xu, and Treue failed to teach encoding the counts in the trace message using Elias Gamma code.
However, Ryan combined with Hu, Chen, Xu, and Treue disclosed encoding the counts in the trace message using Elias Gamma code (Treue: Figure 9 element 910, column 22 lines 23-47)(Xu: Figures 1 and 3-4 elements 156A5-25, 160, 440, and 470-475, paragraphs 16, 18, 29-30, and 37-39)(Chen: Figure 1 element 145, paragraphs 3, 17, and 28)(Hu: Figure 2 element 212, paragraphs 27 and 29-30)(Ryan disclosed data compression using Elias Gamma Coding. Treue disclosed compressing trace data using run length encoding. The combination allows for generation of compressed trace messages using Elias Gamma Coding and Run Length Encoding, where the messages occur on performance monitor events and include performance counter values.).
The advantage of Elias Gamma Coding is that data can be compressed into smaller values for data transfers. Thus, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date to implement Elias Gamma Coding compression of Ryan of trace data within Treue into the combination for the above advantage.
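As general background, Elias gamma coding writes a positive integer as its binary representation preceded by one zero bit for every bit after the leading one, so small counts take few bits. The sketch below is a generic illustration and not the implementation of Ryan; the function names are hypothetical, bit strings are used only for readability, and the decoder assumes well-formed input.

```python
from typing import List


def elias_gamma_encode(n: int) -> str:
    """Encode a positive integer as an Elias gamma bit string."""
    if n < 1:
        raise ValueError("Elias gamma coding is defined for integers >= 1")
    binary = bin(n)[2:]  # binary digits, beginning with the leading 1
    return "0" * (len(binary) - 1) + binary


def elias_gamma_decode(bits: str) -> List[int]:
    """Decode a concatenation of Elias gamma codewords (assumed well formed)."""
    values: List[int] = []
    i = 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":  # the zero prefix gives the payload length
            zeros += 1
            i += 1
        values.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return values
```

Because the code has no representation for zero, a count that can be zero would need an adjustment such as encoding the count plus one; that adjustment is an assumption of this note rather than something stated in the cited references.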

Response to Arguments
The arguments presented by Applicant in the response received on 3/7/2022 are not considered persuasive.
Applicant argues regarding claim 1:
“The cited art does not teach or suggest such a method. To begin, Hu does not teach or suggest monitoring the instruction execution of the processor by counting the number of successive instructions which are retired contiguously in time to form an instruction count. The Office has argued that counter 212 of Figure 2 and paragraph [0027] of Hu disclose such a feature. (Office Action, p. 3.) Applicant respectfully disagrees. While Applicant has previously argued this feature as missing from Hu, the Office has not provided any remark or argument as to the reference disclosing or suggesting a number of operations that are retired contiguously in time. In other words, counter 212 within Hu discloses a performance monitoring counter used to count a number of executed micro-operations that occur within a time interval. While paragraph [0027] mentions that this counter tallies a number of retired/completed operations, the reference fails to teach or suggest that the counter is counting a number of operations that are retired or executed contiguously in time.”  

This argument is not found to be persuasive for the following reason. Merriam-Webster defines “contiguous” as “next or near in time or sequence.” In the context of clock cycles occurring on the nanosecond time-scale, occasional stalls or bubbles between the clock cycles in which instructions are retired would still read upon “contiguous in time.” Paragraph 27 of Hu describes “a first counter (number of retired micro-operations counter) 212 that is used to count the number of executed micro-operations (mops) occurring in the current time interval” (emphasis added). The first counter in the time interval incrementing for a plurality of consecutive clock cycles for retired instructions reads upon the claimed language. The first counter incrementing in non-consecutive clock cycles due to pipeline stalls still increments for the clock cycles with retired instructions. The latter also reads upon the claimed limitation. 
Applicant argues for claim 1:
“Second, Chen does not teach or suggest counting a number of subsequent contiguous clock cycles of the processor during which no instruction is retired to form a stall count. The Office has argued that Figures 1 and 2 and paragraphs [0003], [0017], and [0028] of Chen disclose such a feature. (Office Action, p. 4.) Applicant respectfully disagrees. Paragraph [0017] of Chen describes a stall counter register 145 that stores data relating to a number of clock cycles that the next to complete instruction uses during instruction execution. Paragraph [0028] discloses that the stall counter register is incremented each time a currently executing instruction does not complete, wherein an instruction takes one clock cycle longer to complete than in ideal conditions, therein indicating a potential stall. Each iteration of the stall counter register 145 therefore indicates the number of clock cycle iterations that will be used to determine when the next to complete instruction does not complete. Applicant has previously argued that Chen does not teach or suggest a counting of subsequent contiguous clock cycles, as recited in claim 1. Again, Applicant submits that while Chen discloses a stall counter register, there is no discussion within the reference regarding a count of contiguous clock cycles in which no instruction is retired. Instead, the reference only discloses that the stall counter increases when a next to complete instruction is not completed. This does not teach or suggest a counting of contiguous clock cycles as defined in claim 1. In fact, the effect of the word "contiguous" is completely omitted from the Office's remarks.” 

This argument is not found to be persuasive for the following reason. Merriam-Webster defines “contiguous” as “next or near in time or sequence.” In the context of clock cycles occurring on the nanosecond time-scale, occasional retirements between the clock cycles in which retirement is delayed would still read upon “contiguous in time.” Chen described the “completion stall counter register” element 145, and paragraph 28 states “If the next to complete instruction currently executing in HIS is not complete, decision block 330 increments the stall counter register 145” (emphasis added). Stated another way, when the oldest instruction to retire in the processor can’t retire, then the stall counter register 145 is incremented due to the lack of retirement of any instruction. No retirements occur when the next to complete instruction can’t retire due to the general requirement of in-order completion to ensure proper execution results. The added second counter in the time interval incrementing for a plurality of consecutive clock cycles when no instructions are retired reads upon the claimed language. The added second counter incrementing in non-consecutive clock cycles when no instructions are retired also reads upon the claimed limitation.
Applicant argues for claim 1:
“Nonetheless, assuming, arguendo, that Chen discloses the claimed stall counter, Applicant submits that there is no motivation to combine Chen with Hu. Hu discloses a stall counter 216 that is used purely for monitoring CPU utilization, and as such, there is no stall analysis within Hu. Applicant submits that one of skill in the art would not consider adding the teachings of Chen to improve the operation of Hu, as the stall counter within Chen provides no advantage or improvement for the operation of Hu. It is superfluous to the process of monitoring CPU utilization within Hu.” 

This argument is not found to be persuasive for the following reason. Chen explicitly states the advantage of using completion stall counters, namely that they can provide better information for stall analysis (Chen: Paragraph 3). This advantage was used as proper motivation for the combination.
Applicant argues for claim 1:
“Furthermore, the addition of Xu merely provides that trace messages can include count data. The actual count of the PMCs in Hu is used during a training period to make frequency scaling decisions, which is done by having the CPU change to another power state with the desired frequency (see paragraph [0042]). A clock frequency scaling decision is made based on the number of finished micro-operations of the previous time interval and the current time interval as well as the processor utilization of the current interval. Applicant submits that there is no need for stall analysis in this, nor the need for trace messaging to output temporal information in relation to the stall count. As such, Applicant submits that one of skill in the art would not consider the combination of Hu with Chen and Xu to arrive at claim 1. Furthermore, adding additional data (i.e., the trace messages) and requirements with regard to the stall count would defeat the object of creating run-time power optimization in Hu.” 

This argument is not found to be persuasive for the following reason. In response to applicant’s argument that there is no teaching, suggestion, or motivation to combine the references, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art.  See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007).  In this case, simplifying received inputs to the performance monitor was provided as sufficient motivation to combine the references.

	Conclusion
The following is text cited from 37 CFR 1.111(c): In amending in reply to a rejection of claims in an application or patent under reexamination, the applicant or patent owner must clearly point out the patentable novelty which he or she thinks the claims present in view of the state of the art disclosed by the references cited or the objections made. The applicant or patent owner must also show how the amendments avoid such references or objections.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACOB A. PETRANEK whose telephone number is (571)272-5988.  The examiner can normally be reached on M-F 8:00-4:30.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee Li can be reached on (571) 272-4169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JACOB PETRANEK/Primary Examiner, Art Unit 2183