DETAILED ACTION
Claims 1-32 and 34-35 are pending.
Priority: 4/17/2017 (Domestic)
Assignee: Intel


Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.



Response to Arguments
Applicant’s arguments with respect to claims 1-32 and 34-35 have been considered but are not persuasive. The amended limitations of claims 1, 12, and 22 are anticipated by Fig. 3 of Banerjee, in which the macro-TLB is bypassed in case of a hit in the micro-TLB or nano-TLB. Claims 34-35 are each anticipated by Fig. 2 of Sodhi. Therefore, all rejections are maintained.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 35 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The term “smallest possible” in claim 35 is a relative term which renders the claim indefinite. The term “smallest possible” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. Neither the claim nor the specification provides guidance on how to determine which amount of cache is the “smallest possible”.


Claim Rejections - 35 USC § 103

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 12, 22, and 34-35 are rejected under 35 U.S.C. 103 as being unpatentable over Sodhi et al. (20120159074) and further in view of Banerjee et al. (20060206686).

As per claim 1, Sodhi discloses
An apparatus
 a processor to monitor cache utilization of an application during execution of the application for a workload(Sodhi, [0026 -- LLC 120 has output bus 122. PCU 140 includes monitor code 160 which is coupled to LLC 120 by monitor signal line 142. FSM is coupled to LLC 120 by LLC size control line 154. FSM is coupled to PCU 140 by signal line 146.]);
 and a memory, coupled to the processor, to store cache utilization statistics responsive to the monitored cache utilization(Sodhi, [0043 -- Line 142 may represent one or more monitor signals used by code 160 to monitor cache 120. Line 142 may represent signals from cache 120 that indicate the amount of cache actually being used by the processor to process data. Line 142 may represent signals sent to or monitored by code 160 using one or more signal lines.], [0058 -- the cache is monitored to identify a reduced or increased amount of the cache being used by a processor to process data. Block 220 may include monitor code of a power control unit monitoring the cache to identify a reduced or increased amount of cache being used by the processor to process data.]); 
wherein the processor is to determine an optimal cache configuration for the application based at least in part on the cache utilization statistics for the workload during the execution of the application for the workload,(Sodhi, [0064 -- block 230 includes a PCU or monitor code identifying a reduced amount of cache being used by the processor to process data (e.g., which may be the amount monitored or detected at block 220 and/or block 225). If an amount of the cache being used by the processor to process data is reduced (e.g., sufficiently to cause a decrease in the amount of cache available for use by the processor), processing continues to block 240.]);
and wherein the processor is to cause turning off of one or more cache ways for subsequent executions of the workload by the application that cause a reduction in cache performance during the execution of the application for the workload based at least in part on the monitored cache utilization (Sodhi, [0035 -- A single "way" may be a fraction of the total cache size (e.g., such as 1/16), so reduced ways may reduce the size of the cache available to the processor, allowing the operating power of the processor to also be reduced. For example, PCU 140 may reduce the execution unit operating voltage when in DCS mode (e.g., since the size of the cache has been reduced). Also, when more cache is being used or needed by the processor, expand logic (e.g., part of unit 140) may exit DCS mode (expand cache size available to the processor to "full ways", thus allowing for operating power to the processor to also be increased)], [0064 -- If an amount of the cache being used by the processor to process data is reduced (e.g., sufficiently to cause a decrease in the amount of cache available for use by the processor), processing continues to block 240. In some cases, block 230 includes a PCU or monitor code identifying an increased amount of cache being used by the processor to process data]);
Sodhi does not disclose the following, however Banerjee discloses:
wherein a multi-layer Translation Lookaside Buffer (TLB) comprises a first TLB and a second TLB, wherein the first TLB is smaller than the second TLB,(Banerjee, [0022 -- The instruction fetcher includes a fetch scheduler that selects each clock cycle the virtual fetch address of one of the plurality of threads for fetching from the instruction cache. The instruction fetcher also includes a macro-TLB, a micro-TLB, and a plurality of nano-TLBs each associated with a respective one of the plurality of threads. The macro-TLB caches physical address translations for memory pages for the plurality of threads. The micro-TLB has a plurality of entries configured to cache the translations for a subset of the memory pages cached in the macro-TLB. The nano-TLBs each cache a physical address translation of at least one memory page for the respective one of the plurality of threads.]);
wherein the first TLB is to process per surface (Banerjee, [0053 -- At decision block 316, the control logic 208 determines whether the hit determined at decision block 308 was in the micro-TLB 222 or the nano-TLB 202 of the thread context selected for fetching at block 302. If the hit was in the nano-TLB 202, then flow proceeds to block 322]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the features of Banerjee into the system of Sodhi, for the benefit of efficient execution of multiple instruction threads using a three-tiered translation lookaside buffer (TLB) architecture.
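
For illustration only, and not as a characterization of the actual implementation in either reference, the tiered lookup cited from Banerjee [0022] and [0053] may be sketched as follows. The class name, the page size, the micro-TLB capacity, and the fill policy (copying a translation into the smaller tiers after a hit in a larger tier) are assumptions made for this sketch and are not taken from Banerjee. The sketch shows only that the macro-TLB is not consulted when the nano-TLB or micro-TLB hits.

```python
PAGE_SIZE = 4096  # assumed page size, not taken from Banerjee


class ThreeTierTLB:
    """Hypothetical model: per-thread nano-TLB, shared micro-TLB, large macro-TLB."""

    def __init__(self, num_threads, micro_capacity=4):
        self.nano = [dict() for _ in range(num_threads)]  # one entry per thread
        self.micro = {}                  # small shared tier (insertion ordered)
        self.micro_capacity = micro_capacity
        self.macro = {}                  # largest tier
        self.macro_lookups = 0           # counts how often the macro-TLB is consulted

    def map_page(self, vpn, pfn):
        """Install a translation in the macro-TLB."""
        self.macro[vpn] = pfn

    def translate(self, thread, vaddr):
        """Return (physical address, tier that supplied the translation)."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.nano[thread]:
            return self.nano[thread][vpn] * PAGE_SIZE + offset, "nano"
        if vpn in self.micro:
            pfn = self.micro[vpn]
            self.nano[thread] = {vpn: pfn}   # assumed fill policy
            return pfn * PAGE_SIZE + offset, "micro"
        # Only a miss in both smaller tiers reaches the macro-TLB.
        self.macro_lookups += 1
        pfn = self.macro[vpn]                # KeyError models a TLB miss
        if len(self.micro) >= self.micro_capacity:
            self.micro.pop(next(iter(self.micro)))  # evict the oldest entry
        self.micro[vpn] = pfn
        self.nano[thread] = {vpn: pfn}
        return pfn * PAGE_SIZE + offset, "macro"
```

In this sketch, a repeated fetch by the same thread is served from the nano-TLB, and a fetch of the same page by another thread is served from the micro-TLB, so the macro-TLB lookup counter does not advance in either case.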

Claims 12 and 22 are similar to claim 1 and therefore the same rejections are incorporated.

As per claim 34, the rejection of claim 1 is incorporated, in addition Sodhi discloses:
wherein the processor is to cause turning on of an amount of cache for subsequent executions of the workload by the application,(Sodhi, [0068 -- At block 260 an amount of cache available for use by the processor is increased, based on the increased amount of cache being used or identified at blocks 220-230. In some cases, the change (increase) may be based on: (1) the detected size of the cache currently being used by the processor (if 4 Mb used, then size to 4 Mb available); (2) metrics and heuristics considering (1); or (3) both (1) and (2). The change or increase may be performed in order to increase the amount of cache available and power consumed so that more data can be processed or so that processing can be performed more quickly.]);

As per claim 35, the rejection of claim 1 is incorporated, in addition Sodhi discloses:
wherein the amount of cache is a smallest possible amount of cache sufficient for subsequent executions of the workload by the application(Sodhi, [0068 -- The change or increase may be performed in order to increase the amount of cache available and power consumed so that more data can be processed or so that processing can be performed more quickly. Since a larger amount of cache is needed for processing to continue at the same or a greater speed and performance level, the amount of cache available and power consumed can be increased to avoid decreasing processing speed and performance.]).
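
As a purely illustrative reading of the cited passages (Sodhi [0035], a way as a fraction of the total cache such as 1/16; Sodhi [0068], "if 4 Mb used, then size to 4 Mb available"), sizing the available cache to the observed usage can be sketched as rounding the used amount up to a whole number of ways. The function name, the default of 16 ways, and the floor of one enabled way are assumptions of this sketch and are not disclosed by Sodhi.

```python
def ways_to_enable(used_bytes, total_bytes, total_ways=16, min_ways=1):
    """Return the number of ways needed to cover used_bytes (hypothetical helper).

    Each way is assumed to hold total_bytes / total_ways bytes.
    """
    if total_bytes <= 0 or total_ways <= 0:
        raise ValueError("cache size and way count must be positive")
    if not 0 <= used_bytes <= total_bytes:
        raise ValueError("used_bytes must lie within the cache size")
    # Integer ceiling of used_bytes / (total_bytes / total_ways).
    needed = -(-used_bytes * total_ways // total_bytes)
    # Keep at least min_ways enabled (assumed floor) and never exceed total_ways.
    return max(min_ways, min(total_ways, needed))
```

Under these assumptions, a cache of 8 MiB with 4 MiB in use maps to 8 of 16 ways, and one additional byte of usage maps to 9 ways.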

Claims 2, 7-8, 10-11, 13, 18-19, 21, 23, 28-29, and 31 are rejected under 35 U.S.C. 103 as being unpatentable over Sodhi et al. (20120159074), and further in view of Banerjee et al. (20060206686), and further in view of Reed et al. (20170300418).

As per claim 2, the rejection of claim 1 is incorporated, and Sodhi does not explicitly disclose the following, however Reed discloses:
wherein the cache utilization statistics comprise per frame cache statistics (Reed, [0013 – utilization trends]; [0019 – trends for a set group/frame]; [0050 – hit rate of set group]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the controller of Reed into the dynamic cache sizing of Sodhi for the benefit of an efficiently functioning computing system which enables and disables power from a cache as appropriate according to the cache utilization trend calculated based on a granule of cache space (group) to tailor it towards the needs of the running program (Reed, 0085).


As per claim 7, the rejection of claim 1 is incorporated, and Sodhi does not explicitly disclose the following, however Reed discloses:
wherein the cache utilization statistics are accumulated over two or more frames (Reed, [Fig. 6 – each group/frame collects a hit rate; or, in another interpretation, each iteration includes a set/frame]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the controller of Reed into the dynamic cache sizing of Sodhi for the benefit of an efficiently functioning computing system which enables and disables power from a cache as appropriate according to the cache utilization trend calculated based on a granule of cache space/group to tailor it towards the needs of the running program (Reed, 0085).

As per claim 8, the rejection of claim 1 is incorporated, and Sodhi does not explicitly disclose the following, however Reed discloses:
wherein the cache utilization statistics include cache allocation or cache hit ratio, or a combination thereof (Reed, [Fig. 6 – hit rate monitored]; [Fig. 7 – prefetch but not used]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the controller of Reed into the dynamic cache sizing of Sodhi for benefit of an efficiently functioning computing system which enables and disables power from 

As per claim 10, the rejection of claim 1 is incorporated, in addition Sodhi discloses,
wherein the cache utilization statistics are updated after an iteration of execution of the workload (Sodhi, [Fig. 2 220 – monitoring cache]; [0071 – processor continues to monitor after iteration]);

As per claim 11, the rejection of claim 1 is incorporated, in addition Sodhi discloses,
wherein the processor is to comprise one or more of: a Graphics Processing Unit (GPU) or a processor core, or a combination thereof (Sodhi, [0108 – graphics processor]);

Claim 21 is similar to claim 10 and therefore the same rejections are incorporated. 

Claim 31 is similar to claim 10 and therefore the same rejections are incorporated. 

Claims 13, 18-19 are similar to claims 2, 7-8, respectively and therefore the same rejections are incorporated. 

Claims 23, 28-29 are similar to claims 2, 7-8, respectively and therefore the same rejections are incorporated. 

Claims 3-4, 14-15, 24-25, and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Sodhi et al. (20120159074), and further in view of Banerjee et al. (20060206686), and further in view of Branover et al. (20110283124).

As per claim 3, the rejection of claim 1 is incorporated, and Sodhi discloses different types of cache sizing and operating voltage management configurations, however:
Branover further discloses,
wherein the optimal cache configuration is stored in the memory as a per workload profile for the application, (Branover, [0039 – information of Table I stored in memory based on workload]);
and the smallest amount of cache to be turned on for subsequent executions of the workload is based at least in part on the per workload profile (Branover, [0041 – based on workload, turn on portions of cache]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the processor of Branover into the dynamic cache sizing of Sodhi for benefit of selectively altering power of cache memory responsive to a processor (100) 

As per claim 4, the rejection of claim 1 is incorporated, and Sodhi discloses storing cache heuristics/parameters, however:
Branover further discloses,
wherein the processor is to store parameters for the optimal cache configuration in a context image in the memory (Branover, [0039 – information of Table I stored in memory based on workload]; [0041 – state changes for re-sizing]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the processor of Branover into the dynamic cache sizing of Sodhi, Reed, Ananthakrishnan for benefit of selectively altering power of cache memory responsive to a processor (100) having changed between the operating points. The power management of processing units is configured to minimize the use of power by corresponding memory subsystem. The controller can perform power allocation actions so as to enable the processor to maximize the performance of power consumed. The 

Claim 14 is similar to claim 3 and therefore the same rejections are incorporated. 

Claim 15 is similar to claim 4 and therefore the same rejections are incorporated. 

Claim 24 is similar to claim 3 and therefore the same rejections are incorporated. 

Claim 25 is similar to claim 4 and therefore the same rejections are incorporated. 

As per claim 32, the rejection of claim 1 is incorporated, and Sodhi does not explicitly disclose the following, however Branover discloses:
wherein the processor is to: 
load a context state and to cause powering on of an amount of cache according to a cache configuration for each cache in the context state before execution of the workload (Branover, [0041 -- The information received by state controller 135 from the various sources may be used to determine an appropriate operating point for the processor, based on factors such as workload, available power, power and frequency limits, and so forth]);
collect the cache utilization statistics during execution of the workload (Branover, [0053 -- Activity monitor 164 in this embodiment is coupled to receive information from execution unit(s) 124 (e.g., of FIG. 2). Using the information received from the execution unit(s) 124, activity monitor may calculate the average activity for the corresponding processing node (e.g., core 101 in this case) over a predetermined time interval. Activity monitor 164 may receive information about instructions executed, pipeline stalls, or any other type of information indicative of the activity of the execution unit(s)], [0055 -- CIPS unit 154, which may receive information indicating the number of executed instructions in which the results therefrom are committed to registers, over an interval of one second. As noted above, CIPS is "committed instructions per second", and is thus indicative of a workload of the processor]);
store the cache utilization statistics at an end of each execution(Branover, [0056 -- decision unit 156 may consider both the P-state and a CIPS value in determining whether or not to perform a cache resizing action, as well as determining what the action is to be]); 
and update a cache configuration parameter of the cache configuration for the subsequent executions of the workload by the application(Branover, [0048 -- If decision unit 156 determines that the cache is to be resized by powering down or powering up selected ways or sets, one or more signals (`Resize Cache`) may be sent to switching unit 158. Responsive to receiving these signals, switching unit 158 may generate one or more signals (`Pwr Dn [N:0]`) for powering up or powering down selected cache ways or sets.]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the processor of Branover into the dynamic cache sizing of Sodhi, Reed, Ananthakrishnan for benefit of selectively altering power of cache memory responsive to a processor (100) having changed between the operating points. The power management of processing units is configured to minimize the use of power by corresponding memory subsystem. The controller can perform power allocation actions so as to enable the processor to maximize the performance of power consumed. The power can be selectively removed from ways of cache memory responsive to changing the performance state (Branover, 0009). 


Alternatively, claims 4, 15, 25, and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Sodhi et al. (20120159074), and further in view of Banerjee et al. (20060206686), and further in view of Cohen et al. (20050080994).

As per claim 4, the rejection of claim 1 is incorporated, and Sodhi does not explicitly disclose the following, however Cohen further discloses,
wherein the processor is to store parameters for the optimal cache configuration in a context image in the memory (Cohen, [0100 - <cacheValue>  represents cache requirements]; [0101 - <cacheValue> loaded into register]; [0102 – context switch swaps requirements]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the processor methods of Cohen into the dynamic cache sizing of Sodhi, Reed, Ananthakrishnan for the benefit of reduced power consumption, since partitioning along the ways allows the cache partitions to be powered on/off without affecting address compare logic or data in other cache partitions (Cohen, 0089).

Claim 15 is similar to claim 4 and therefore the same rejections are incorporated. 

Claim 25 is similar to claim 4 and therefore the same rejections are incorporated. 

As per claim 32, the rejection of claim 1 is incorporated, and Sodhi does not explicitly disclose the following, however Cohen discloses:
wherein the processor is to: 
load a context state and to cause powering on of an amount of cache according to a cache configuration for each cache in the context state before execution of the workload(Cohen, [0102 -- At each context switch, the operating system can be responsible to maintain the current cache requirements of each process. When the process is loaded, the current <cacheValue> is loaded into the process table. At each context switch, the current processes <cacheValue> is saved in the process table and the new processes <cacheValue> is loaded into the processor register.], [0108 -- In this procedure, a cache working set size is kept along with the task table. When a new task is swapped in, its cache footprint is loaded into the current cache parameters and the old task's cache information is swapped out with the task. Tables are used to track cache utilization with tasks currently in use.]); 
collect the cache utilization statistics during execution of the workload(Cohen, [0098 -- The third approach, "use profile information," is to capture working-set information (via profiling) and feed the profiling information back into the compiler using a subsequent compile], [0108 -- a cache working set size is kept along with the task table. When a new task is swapped in, its cache footprint is loaded into the current cache parameters and the old task's cache information is swapped out with the task. Tables are used to track cache utilization with tasks currently in use.]); 
store the cache utilization statistics at an end of each execution (Cohen, [0098 -- The third approach, "use profile information," is to capture working-set information (via profiling) and feed the profiling information back into the compiler using a subsequent compile], [0108 -- a cache working set size is kept along with the task table. When a new task is swapped in, its cache footprint is loaded into the current cache parameters and the old task's cache information is swapped out with the task. Tables are used to track cache utilization with tasks currently in use.]);
and update a cache configuration parameter of the cache configuration for the subsequent executions of the workload by the application(Cohen, [0092 -- Described below are various methods that use static software to determine how much L2 cache is required to run a given software application prior to use and to adjust the cache size accordingly to save power.]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the processor methods of Cohen into the dynamic cache sizing of Sodhi, Reed, Ananthakrishnan for the benefit of reduced power consumption, since partitioning along the ways allows the cache partitions to be powered on/off without affecting address compare logic or data in other cache partitions (Cohen, 0089).
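
For illustration only, the save and restore of cache requirements at a context switch, as cited from Cohen [0102] and [0108], may be sketched as follows. The class and method names are hypothetical, and the cache requirement is modeled as a plain integer; none of these details are taken from Cohen, and the sketch is not a characterization of Cohen's actual implementation.

```python
class CacheContextSwitcher:
    """Hypothetical model of a process table that carries a per-process cacheValue."""

    def __init__(self):
        self.process_table = {}     # pid -> saved cacheValue
        self.current_pid = None
        self.cache_register = None  # models the processor register holding cacheValue

    def load_process(self, pid, cache_value):
        """Record the initial cache requirement when a process is loaded."""
        self.process_table[pid] = cache_value

    def update_requirement(self, cache_value):
        """Model the running process changing its cache requirement."""
        self.cache_register = cache_value

    def context_switch(self, new_pid):
        """Save the outgoing process's cacheValue, then load the incoming one."""
        if new_pid not in self.process_table:
            raise KeyError(f"unknown process: {new_pid}")
        if self.current_pid is not None:
            self.process_table[self.current_pid] = self.cache_register
        self.cache_register = self.process_table[new_pid]
        self.current_pid = new_pid
        return self.cache_register
```

In this sketch, a requirement updated while a process is running is written back to the process table when that process is switched out, and is restored to the register when the process is next switched in.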


Claims 5-6, 9, 16-17, 20, 26-27, and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Sodhi et al. (20120159074), and further in view of Banerjee et al. (20060206686), and further in view of Balakrishnan et al. (20100275049).

As per claim 5, the rejection of claim 1 is incorporated, and Sodhi discloses dynamic cache sizing with optimal power and performance. Since banks are part of cache memory, Sodhi discloses banks. However Balakrishnan further discloses:
wherein the processor is to power on one or more banks of caches based at least in part on the optimal cache configuration (Balakrishnan, [Fig. 5 – turn on banks]; [0067 – turn on bank group]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the apparatus of Balakrishnan into the dynamic cache sizing of Sodhi, Reed, Ananthakrishnan for the benefit of enabling operation of multiple banks (140) of a non-uniform cache access (NUCA) cache (135), where the cache is vertically distributed across multiple banks. Individual banks are disabled by sequentially turning off the individual banks having the larger and smaller access latencies of the NUCA cache at discrete power states. The system enables disabling banks by sequentially turning off banks with larger and smaller access latencies of the NUCA cache at discrete power states, thus conserving power in an effective manner (Balakrishnan, 0009).

As per claim 6, the rejection of claim 1 is incorporated, and Sodhi discloses dynamic cache sizing with optimal power and performance. Since banks are part of cache memory, Sodhi discloses banks. However Balakrishnan further discloses:

wherein the processor is to power off, or leave powered off, one or more banks of caches based at least in part on the optimal cache configuration (Balakrishnan, [Fig. 5 – turn off banks]; [0068 – turn off bank group]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the apparatus of Balakrishnan into the dynamic cache sizing of Sodhi, Reed, Ananthakrishnan for the benefit of enabling operation of multiple banks (140) of a non-uniform cache access (NUCA) cache (135), where the cache is vertically distributed across multiple banks. Individual banks are disabled by sequentially turning off the individual banks having the larger and smaller access latencies of the NUCA cache at discrete power states. The system enables disabling banks by sequentially turning off banks with larger and smaller access latencies of the NUCA cache at discrete power states, thus conserving power in an effective manner (Balakrishnan, 0009).

As per claim 9, the rejection of claim 1 is incorporated, and Sodhi discloses dynamic cache sizing with optimal power and performance. Since banks are part of cache memory, Sodhi discloses banks.
Balakrishnan further discloses:
wherein one or more banks of caches are turned on or turned off during execution of the application according to accumulated cache utilization statistics (Balakrishnan, [Fig. 5 – turn on/turn off] ; [0067 – turn on/off banks]);
Therefore it would have been obvious to a person of ordinary skill at the time of filing to incorporate the apparatus of Balakrishnan into the dynamic cache sizing of Sodhi, Reed, Ananthakrishnan for the benefit of enabling operation of multiple banks (140) of a non-uniform cache access (NUCA) cache (135), where the cache is vertically distributed across multiple banks. Individual banks are disabled by sequentially turning off the individual banks having the larger and smaller access latencies of the NUCA cache at discrete power states. The system enables disabling banks by sequentially turning off banks with larger and smaller access latencies of the NUCA cache at discrete power states, thus conserving power in an effective manner (Balakrishnan, 0009).

Claims 16-17, 20 are similar to claims 5-6, 9, respectively, and therefore the same rejections are incorporated. 

Claims 26-27, 30 are similar to claims 5-6, 9, respectively and therefore the same rejections are incorporated. 



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARVIND TALUKDAR whose telephone number is (571) 270-3177. The examiner can normally be reached M-F, 10 am-6 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi, can be reached at 571-270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-

Arvind Talukdar
Primary Examiner
Art Unit 2132



/ARVIND TALUKDAR/Primary Examiner, Art Unit 2132