Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-25 are presented for examination.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-3, 5-10, 12-17 and 19-24 are rejected under 35 U.S.C. 103 as being unpatentable over Lee (US 2017/0205863 A1) in view of Bodas (US 2017/0185132 A1).

As per claim 1, Lee teaches an apparatus comprising:
a model to generate adjusted tuning parameters of a thread scheduling policy based on a tradeoff indication value of a target system; (Lee [0030] In various embodiments, a power management model of a power controller of a processor can be dynamically updated, e.g., after the processor has been put into service in the field. In this way, balancing of power management and performance can be dynamically tuned for given workloads/applications to execute on the processor. In various embodiments, reinforcement learning algorithms can be used to dynamically (online) update the power management model).
compare the tradeoff indication value to a criterion; (Lee [0165] For example, if the overall reward values (a weighted combination of short term and long term rewards indicate the model is making too aggressive predictions and adversely affecting performance), this power configuration update does not occur. Further, as described, with regard to FIG. 20, self-learning may occur in which trained model parameters can be adjusted by update to the trained model parameters, e.g., to be more conservative with regard to performance impact. And Fig 20 Blocks 2020 and 2040 and [0169] As seen, method 2000 begins by estimating a performance/energy impact of prior configuration predictions for a workload under analysis (block 2010). As an example, this impact may be with regard to performance loss as a result of a reduced number of active cores (and/or reduced frequency of core operation). Next, control passes to diamond 2020 to determine whether this impact is greater than a first impact threshold. For example, in some embodiments this impact threshold may be a high impact threshold and may be set at a predetermined level, e.g., a performance impact measured in percentage of performance loss. As an example, this threshold level may be less than approximately a 10% performance loss. If it is determined that the impact is greater than the first impact threshold, control passes to block 2030 where one or more trained model parameters may be updated. More specifically, for a particular workload type under analysis, one or more trained model parameters may be updated to limit the performance impact of the power management configurations identified in the trained model parameters. As such, dynamic machine or self-learning occurs over processor lifetime, such that an appropriate balance of performance and power management can be realized. And [0170] Still with reference to FIG. 20, if instead it is determined at block 2020 that the impact is not greater than this first impact threshold, control passes next to diamond 2040 to determine whether the impact is less than a second impact threshold.) and
based on the comparison, initiate the model to re-adjust the adjusted tuning parameters. (Lee Fig 20 Block 2030 and 2050 and [0169] If it is determined that the impact is greater than the first impact threshold, control passes to block 2030 where one or more trained model parameters may be updated. More specifically, for a particular workload type under analysis, one or more trained model parameters may be updated to limit the performance impact of the power management configurations identified in the trained model parameters. As such, dynamic machine or self-learning occurs over processor lifetime, such that an appropriate balance of performance and power management can be realized. [0170] Otherwise, if it is determined that the impact is less than the second impact threshold, control passes to block 2050, where one or more trained model parameters may be updated. More specifically, for a particular workload type under analysis, one or more trained model parameters may be updated to provide for further power savings control by way of the power management configurations identified in the trained model parameters. As such, dynamic machine or self-learning occurs over processor lifetime, such that an appropriate balance of performance and power management can be realized. Understand while shown at this high level in the embodiment of FIG. 20, many variations and alternatives are possible.)
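
For illustration only, the two-threshold flow quoted above from Lee FIG. 20 (diamonds 2020 and 2040, blocks 2030 and 2050) can be sketched as follows. This is a minimal sketch and not Lee's implementation: the function name `readjust_model`, the single `aggressiveness` value standing in for the trained model parameters, the step size, and the 2% lower threshold are all hypothetical. Only the 10% upper figure echoes the example given in Lee [0169], and leaving the parameters unchanged between the two thresholds is an assumption.

```python
def readjust_model(aggressiveness, impact, high_threshold=0.10,
                   low_threshold=0.02, step=0.1):
    """Return (new_aggressiveness, action) for one pass of a FIG. 20-style flow.

    aggressiveness: hypothetical stand-in for the trained model parameters,
                    kept in the range 0.0 (conservative) to 1.0 (aggressive).
    impact: estimated performance loss as a fraction (block 2010).
    """
    if impact > high_threshold:
        # Diamond 2020 -> block 2030: update to limit the performance impact.
        return max(0.0, aggressiveness - step), "more_conservative"
    if impact < low_threshold:
        # Diamond 2040 -> block 2050: update to allow further power savings.
        return min(1.0, aggressiveness + step), "more_power_saving"
    # Assumption: impact between the thresholds leaves the parameters as-is.
    return aggressiveness, "unchanged"
```

Under these assumptions, an estimated 15% performance loss would make the stand-in parameter more conservative, while a 1% loss would move it toward further power savings.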

The examiner will take this adjustment/readjustment to be a change to the model not the actual workload. This is consistent with what is shown in Figs 1 and 2 of the specification.

Lee must be read in conjunction with Bodas. The data collected in Bodas is compared against the learned model of Lee (see Lee, Fig. 19, Block 1910, "Receive Workload Characteristic Information"); that block is where the data collection of Bodas comes in.

Lee does not teach a workload monitor to: execute a workload based on the thread scheduling policy; obtain a performance score and a power score from the target system based on execution of the workload, the performance score and the power score corresponding to a tradeoff indication value.
However, Bodas teaches a workload monitor to: (Bodas [0071] FIGS. 5A-B illustrate an example of measuring energy efficiency of an HPC system with one or more workloads according to one embodiment. In one embodiment, system 500 includes a graph 501 and a sequence of workloads 502. Graph 501 shows a workload running on a SUT, and a measurement run to determine the power and performance when the workload is performed.)
execute a workload based on the thread scheduling policy; obtain a performance score and a power score from the target system based on execution of the workload, the performance score and the power score corresponding to a tradeoff indication value; (Bodas [0061] FIG. 4 illustrates one or more tables 401-406 that show exemplary performance and power values obtained from different workloads operating under varying power constraints according to one embodiment. Generally, each table (e.g., tables 401-406) illustrates a workload (i.e., workload i 410) and its corresponding measured performance and power values. For example, as shown in FIG. 6A, workload i=1 may correspond to one workload (e.g., HPL), while workload i=6 corresponds to another workload (e.g., miniMD) and also paragraph 65 which shows a normalized baseline performance and values).

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Bodas with the system of Lee to execute a workload. One having ordinary skill in the art would have been motivated to incorporate Bodas into the system of Lee for the purpose of assessing energy efficiency of high performance computing systems that are operated with and without power constraints (Bodas paragraph 01).

As per claim 2, Bodas teaches including a performance evaluation controller to determine the performance score of the target system during multiple iterations of evaluating the workload execution to re-adjust the adjusted tuning parameters based on the performance score. (Bodas [0080] FIG. 8 is a flowchart of a method 800 to assess energy efficiency of an HPC system with and without power constraints according to one embodiment. Method 800 can be performed by processing logic that may be implemented in software, firmware, hardware, or any combination thereof. In one embodiment, method 800 is performed by an EEBC (e.g., EEBC 110). In one embodiment, method 800 is implemented on an EE benchmarking engine (e.g., EE benchmarking engine 300). Method 800 is configured to perform one or more workloads (i is a workload iterator from 1 to N1, where N1 is the total number of workloads) at one or more power levels (j is a power level iterator from 1 to N2, where N2 is the total number of unconstrained and constrained power levels). [0084] At block 809, determine whether the calculated average cluster power is less than, or equal to, the addition of the cluster power at W(i,j) and a threshold. In one embodiment, the threshold is calculated as 2%. If M(i,j) is not less than, or equal to, the addition of the cluster power at W(i,j) and the threshold, the run is set as “invalid” at block 811. In one embodiment, the run is set as “invalid” if the system power is over 2% of the allocated power. If M(i,j) is less than, or equal to, the addition of the cluster power at W(i,j) and the threshold, the power level iterator (j) adds a value of one (j=j+1) at block 810. And [0085] At block 812, determine whether the power level iterator is greater than the total number of power levels (j>N2). If the power level iterator is not greater than the total number of power levels, the method returns to block 813 to calculate performance and power at the following power level (e.g., W(i,2)).
If the power level iterator is greater than the total number of power levels, the workload iterator (i) adds a value of one (i=i+1) at block 814. At block 815, determine whether the workload iterator (i) is greater than the total number of workloads. If the workload iterator (i) is not greater than the total number of workloads, the method returns to block 806 to launch the following workload (e.g., i=2). If the workload iterator (i) is greater than the total number of workloads, a figure of merits and one or more energy efficiencies are calculated at block 816).
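
For illustration only, the nested iteration quoted above from Bodas method 800 (workload iterator i, power level iterator j, and the 2% validity check of block 809) can be sketched as follows. This is a simplified sketch and not Bodas's implementation: the function name `run_benchmark` and the `measure` callback are hypothetical, the threshold is assumed to mean 2% above the allocated power level, and the sketch simply records an invalid run and continues, which the quoted passage does not specify.

```python
def run_benchmark(workloads, power_levels, measure, tolerance=0.02):
    """Run every workload at every power level and flag over-budget runs.

    measure(workload, limit) is a caller-supplied, hypothetical callback that
    returns (performance, average_power) for a single run.
    """
    results = []
    for workload in workloads:            # workload iterator i = 1..N1
        for limit in power_levels:        # power level iterator j = 1..N2
            performance, avg_power = measure(workload, limit)
            # Block 809-style check (assumed reading): the measured average
            # power must not exceed the allocated power plus the threshold.
            valid = avg_power <= limit * (1 + tolerance)
            results.append((workload, limit, performance, avg_power, valid))
    return results
```

Each returned tuple pairs a performance value with a power value for one (workload, power level) run, which is the kind of per-iteration data the rejection relies on.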

As per claim 3, Bodas teaches including a power evaluation controller to determine the power score of the target system during multiple iterations of evaluating the workload execution to re-adjust the adjusted tuning parameters based on the power score. (Bodas [0011] FIG. 4 is a table illustrating performance and power values obtained for different workloads operating under varying power constraints according to one embodiment and [0061] FIG. 4 illustrates one or more tables 401-406 that show exemplary performance and power values obtained from different workloads operating under varying power constraints according to one embodiment. Generally, each table (e.g., tables 401-406) illustrates a workload (i.e., workload i 410) and its corresponding measured performance and power values. For example, as shown in FIG. 6A, workload i=1 may correspond to one workload (e.g., HPL), while workload i=6 corresponds to another workload (e.g., miniMD).)

As per claim 5, Lee teaches wherein the workload monitor is to determine a point of the adjusted tuning parameters of the model at which the adjusted tuning parameters are maximized. (Lee [0126] Because compute-bound and memory-bound workloads have very different system requirements, the model solves a classification problem. Given a set of power configuration parameters (e.g., number of cores, number of threads, clock frequency and voltage) and runtime statistics (e.g., various performance/energy counters) at a current time sample, the goal is to find the optimal power configuration to maximize performance and energy efficiency for the next time interval. In different embodiments, two types of prediction models, expert heuristic and machine learning, may be used. and [0182] To learn to minimize long term energy consumption, an immediate reward R is assigned a value of −1×(energy consumption in an interval), so that the cumulative reward is the negative of the total energy consumption. As a result, maximizing total reward is equivalent to minimizing the total energy).
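
For illustration only, the equivalence stated in Lee [0182] (maximizing total reward is the same as minimizing total energy when each immediate reward is −1 × the energy consumed in an interval) can be checked with a small sketch. The configuration names and energy figures below are hypothetical and are not taken from Lee.

```python
def cumulative_reward(energy_per_interval):
    """Sum the immediate rewards R = -1 * (energy consumed in an interval)."""
    return sum(-1 * energy for energy in energy_per_interval)


# Hypothetical per-interval energy readings for two candidate configurations.
candidates = {"config_a": [3, 4, 5], "config_b": [2, 6, 2]}

# Selecting by maximum cumulative reward ...
best_by_reward = max(candidates, key=lambda k: cumulative_reward(candidates[k]))
# ... picks the same configuration as selecting by minimum total energy.
least_energy = min(candidates, key=lambda k: sum(candidates[k]))
```

With these made-up readings, both selections land on the same configuration, which is the point at which the quantity being tuned is maximized.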

As per claim 7, Bodas teaches the model is to determine initial tuning parameters based on the baseline performance score and the baseline power score of the target system, the model to generate the adjusted tuning parameters based on the initial tuning parameters to configure the thread scheduling policy for a target optimization. (Bodas [0041] In one embodiment, benchmark configuration engine 206 may be utilized to establish a baseline power and a baseline performance. Benchmark configuration engine 206 may also be configured to provide one or more power levels, an idle time between runs, a total number of runs, and any other benchmark configuration that is applicable to the HPC system. In one embodiment, benchmark configuration engine 206 implements various benchmarking techniques in order to provide one or more energy efficiency benchmarks (e.g., workload energy efficiency, power level energy efficiency, and overall EE benchmark metric) for each HPC node within the collection of HPC nodes being analyzed. Benchmark configuration engine 206 includes various parameters that perform a number of functions at the start of each workload [initial run], for example, providing the power levels and number of runs for the selected workload. and [0046] In one embodiment, at the start of each workload, control program manager 210 inputs the selected workloads from workload store 205 and the benchmark parameters (e.g., baseline power, baseline performance, power levels, etc.) from benchmark configuration engine 206. and [0063] In one embodiment, each workload i 410 is initially run under no power limit (100%), as shown in row 450, to determine baseline values for each workload. The baseline values are calculated for each workload as a baseline performance (i) and a baseline power WB(i), which are used to normalize the measured P[i,n]/W[i,n] values at every power level 460. 
In one embodiment, an energy efficiency calculation for each workload depends upon the condition in which the workload is run.)

As to claims 8, 15 and 22, they are rejected based on the same reason as claim 1.
As to claims 9, 16 and 23, they are rejected based on the same reason as claim 2.
As to claims 10, 17 and 24, they are rejected based on the same reason as claim 3.
As to claims 12 and 19, they are rejected based on the same reason as claim 5.
As to claims 6, 13, 14, 20 and 21, they are rejected based on the same reason as claim 7.

Claims 4, 11, 18 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Lee (US 2017/0205863 A1) in view of Bodas (US 2017/0185132 A1) in further view of Didehban (US 2019/0196912 A1).

As per claim 4, Lee and Bodas do not teach detect an unexpected state of the target system; and revert the target system to a last known good state to enable multiple iterations of evaluating the thread scheduling policy to continue re-adjusting the adjusted tuning parameters.
However, Didehban teaches detect an unexpected state of the target system; and revert the target system to a last known good state to enable multiple iterations of evaluating the thread scheduling policy to continue re-adjusting the adjusted tuning parameters. (Didehban [0032] Aspects of the disclosure relate to providing a software checkpoint and recovery technique (also referred to herein as InCheck) for complete, safe & timely recovery from soft errors. InCheck makes light-weight error-free checkpoints at basic block granularity, and safely reverts the application execution to the beginning of a last executed basic block using preserved checkpoints. Features of InCheck include verified register file preservation, single memory-location checkpointing, and safe and timely recovery).
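
For illustration only, the claim 4 behavior (detect an unexpected state, revert to a last known good state, and continue iterating) can be sketched as follows. This is a minimal sketch of the claim language, not of Didehban's InCheck technique, which checkpoints at basic block granularity: the function name `tune_with_rollback`, the dictionary used as the target system state, and the `apply_params` and `is_expected` callbacks are all hypothetical.

```python
import copy


def tune_with_rollback(state, candidates, apply_params, is_expected):
    """Try each candidate parameter set, reverting on an unexpected state.

    state: dict standing in for the target system state (hypothetical).
    apply_params(state, params): mutates state with the candidate parameters.
    is_expected(state): returns False when the state is unexpected.
    """
    last_good = copy.deepcopy(state)           # initial checkpoint
    for params in candidates:
        apply_params(state, params)
        if is_expected(state):
            last_good = copy.deepcopy(state)   # record a new checkpoint
        else:
            # Unexpected state detected: restore the last known good state
            # so the remaining iterations can still run.
            state.clear()
            state.update(copy.deepcopy(last_good))
    return state
```

The loop keeps evaluating later candidates after a rollback, which corresponds to the "enable multiple iterations ... to continue" language of the claim.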

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Didehban with the system of Lee and Bodas to revert to the last good state. One having ordinary skill in the art would have been motivated to incorporate Didehban into the system of Lee and Bodas for the purpose of allowing for safe re-execution from recoverable soft errors (Didehban paragraph 09).

As to claims 11, 18 and 25, they are rejected based on the same reason as claim 4.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

US 20200310510 A1 – discloses applying Machine Learning (ML) techniques for power management at different levels of a power management stack. An apparatus may comprise a first circuitry, a second circuitry, and a third circuitry. The first circuitry may have a plurality of memory registers. The second circuitry may be operable to establish values for a plurality of features based on samples of values of the plurality of memory registers taken at one or more times within a range of time of predetermined length. The third circuitry may be operable to compare the plurality of features against a plurality of learned parameters for a reference workload.

US 20200167196 A1 – discloses executing a workload in an edge environment. An example apparatus includes a node scheduler to accept a task from a workload scheduler, the task including a description of a workload and tokens, a workload executor to execute the workload, the node scheduler to access a result of execution of the workload and provide the result to the workload scheduler, and a controller to access the tokens and distribute at least one of the tokens to at least one provider, the provider to provide a resource to the apparatus to execute the workload.

US 10120727 B2 – discloses techniques for allocating configurable computing resources from a pool of configurable computing resources to a logical server or virtual machine. The logical server or virtual machine may use allocated configurable computing resources to implement, execute or run a workload.

US 20180027055 A1 – discloses allocating resources of managed nodes to workloads to balance multiple resource allocation objectives, including an orchestrator server to receive resource allocation objective data indicative of multiple resource allocation objectives to be satisfied. The orchestrator server is additionally to determine an initial assignment of a set of workloads among the managed nodes and receive telemetry data from the managed nodes. The orchestrator server is further to determine, as a function of the telemetry data and the resource allocation objective data, an adjustment to the assignment of the workloads to increase an achievement of at least one of the resource allocation objectives without decreasing an achievement of another of the resource allocation objectives, and apply the adjustments to the assignments of the workloads among the managed nodes as the workloads are performed. Other embodiments are also described and claimed.


Any inquiry concerning this communication or earlier communications from the examiner should be directed to MEHRAN KAMRAN whose telephone number is (571)272-3401.  The examiner can normally be reached on 9-5.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emerson Puente can be reached on (571)272-3652.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MEHRAN KAMRAN/           Primary Examiner, Art Unit 2196