DETAILED ACTION
This is in response to the amendment filed 21 January 2021.
As a result of the amendment, claims 1 - 2, 7 - 8 and 13 - 14 are amended, and claims 19 - 21 are canceled. Therefore, claims 1 - 18 are currently pending in the application. Claims 1, 7 and 13 are independent claims.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 7, and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (U.S. PGPub 2018/0332252; Nov. 14, 2018; hereinafter “Wang”)1 in view of Lo et al. (U.S. PGPub 2018/0321980; Nov. 8, 2018; hereinafter “Lo”)2.
Regarding claim 1, Wang teaches a computer implemented method [Figs 2,3] of controlling energy consumption of a battery powered device, the method comprising: [Fig 1, including a power supply (e.g., a battery) (¶ [0026])] 
[¶ [0003]] the state of the device based on a CPU utilization rate of a CPU of the device; [The state of the device is determined by the collected statistics, including the frame rate and a current OPP; the current OPP is a CPU utilization rate. Prober module 210 is responsible for collecting statistics from the hardware components of the image processing apparatus 100 including the controller 10, the co-controller 20, and the storage device 30. Specifically, the statistics include the current frame rate used for image rendering, the frame rendering times required by the controller 10 and the co-controller 20, the current and maximum Operating Performance Points (OPP) of the controller 10 and the co-controller 20 (¶ [0028]); the current OPP/DVFS may be represented by the frequency and voltage currently applied to the controller (e.g., CPU or GPU) (¶ [0044]).]
selecting, by the device, a policy of a plurality of different policies [current frame rate (Fig 3, S350), upper limit frame rate (S370), target frame rate (S380)] based on the determined state, each policy comprising a respective memory bandwidth setting; and [Fig 3: steps S330, S340 to S350 or step S360 to step S370 or S380; Step S330 determines “frame miss rate” based on collected frame rate (part of the determined state), the missed frame rate is used in Step S340 to determine which policy to select (¶¶ [0045]-[0046]); ¶ [0042] teaches that the memory bandwidth is dependent on the frame rate.]
applying, by the device, the memory bandwidth setting of the selected policy to a speed setting of a memory bus of the device. [S390, Apply adjusted frame rate; (¶ [0042]); ¶ [0042] teaches that the memory bandwidth is dependent on the frame rate.]
While Wang teaches applying a selected memory bandwidth from a plurality of policies and that there is a current and maximum CPU frequency setting (current and maximum OPP/DVFS), Wang does not specifically teach that the policy is the highest reward policy and further comprises a CPU frequency setting, or that the device sets the CPU frequency setting of a selected policy to the CPU. Wang further does not teach the plurality of different policies having been derived via reinforcement learning performed on the device.
However, in the related art of optimization of electrical consumption [Abstract, i], Lo teaches selecting the best policy [¶ [0139] teaches selecting the best policy, i.e. highest reward] from the plurality of different policies having been derived via reinforcement learning performed on the device by repeatedly changing CPU frequency setting, memory bandwidth settings, and associated rewards based on a combination of energy savings and video quality. [¶ [0070] teaches using predictive control for a video player. ¶ [0078] teaches that continuous on-line learning allows a DVFS controller to adapt to run-time interference and run effectively. ¶ [0078] teaches updating the associated rewards by continuously updating the model, including the DVFS controller. ¶ [0158] teaches changing CPU frequency setting. ¶ [0163] teaches controlling memory bandwidth. See also Fig 19 and ¶¶ [0102]-[0115] for further description of reinforcement learning.]
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to have applied the reinforcement learning of Lo to the method of Wang to achieve a method in which the best DVFS policy is selected using reinforcement learning for DVFS based on energy savings and video quality to determine memory bandwidth settings for the benefit of adapting to run-time interference and to run effectively on diverse platforms (¶ [0078]).
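For illustration only, the sequence of steps mapped above (determining a state from a CPU utilization rate, selecting the highest reward policy for that state, and applying the settings of that policy) can be sketched as follows. This sketch is not taken from Wang, Lo, or the application. The names `Policy`, `determine_state`, `select_policy`, `apply_policy`, and `POLICIES`, the utilization thresholds, and every numeric setting are hypothetical assumptions, and the device is stood in for by a plain dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    cpu_freq_khz: int   # hypothetical CPU frequency setting
    mem_bw_mbps: int    # hypothetical memory bandwidth setting
    reward: float       # reward previously learned for this policy (assumed given)


# Hypothetical policy table; in the combination these would be learned, not hard-coded.
POLICIES = {
    "low": [Policy(600_000, 800, 0.9), Policy(1_200_000, 1600, 0.4)],
    "medium": [Policy(1_200_000, 1600, 0.7)],
    "high": [Policy(1_800_000, 3200, 0.6), Policy(1_200_000, 1600, 0.3)],
}


def determine_state(cpu_utilization: float) -> str:
    """Bucket a CPU utilization rate in [0, 1] into a coarse state (assumed thresholds)."""
    if not 0.0 <= cpu_utilization <= 1.0:
        raise ValueError("cpu_utilization must be in [0, 1]")
    if cpu_utilization < 0.3:
        return "low"
    if cpu_utilization < 0.7:
        return "medium"
    return "high"


def select_policy(state: str, policies_by_state: dict) -> Policy:
    """Return the highest reward policy recorded for the given state."""
    return max(policies_by_state[state], key=lambda p: p.reward)


def apply_policy(policy: Policy, device: dict) -> dict:
    """Write the selected settings into a dict standing in for the device."""
    device["cpu_freq_khz"] = policy.cpu_freq_khz
    device["mem_bus_mbps"] = policy.mem_bw_mbps
    return device
```

Under these assumptions, a call such as `apply_policy(select_policy(determine_state(0.2), POLICIES), {})` would pick the lower-frequency, lower-bandwidth entry for the "low" state because it carries the larger stored reward.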
Regarding claim 7, Wang teaches a battery powered device [Fig 1; a power supply (e.g., a battery) (¶ [0026])] comprising: a memory storage device comprising instructions; [item 30] and a central processing unit (CPU) in communication with the memory storage device, wherein the CPU is configured to execute the instructions to perform operations [items 10, 20; ¶ [0018]]

Regarding claim 13, Wang teaches a non-transitory computer-readable media storing computer instruction [Fig 1, item 30] for controlling energy consumption of a device, that when executed by a central processing unit (CPU), cause the CPU to perform the steps comprising: [items 10, 20; ¶ [0018]]
Claims 2 – 6, 8 – 12 and 14 – 18 are rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Lo and further in view of Molnos (U.S. PGPub 2019/0187778; Jun. 20, 2019; hereinafter “Molnos”)3.
Regarding claim 2, the combination of Wang/Lo teaches the method of claim 1. While Wang in the combination teaches the device playing a first video [¶ [0003] teaches that the material played is video material], the combination does not teach determining a respective first state of the device responsive to the playing; applying the combination of a CPU frequency setting and the memory bandwidth setting; computing a reward value based on a fps and power utilization; and associating the first state and the reward value with the combination.
However, in the related art of optimization of electrical consumption [Abstract], Molnos teaches for each combination of a plurality of different combinations of CPU frequency settings and memory bandwidth settings: [Fig 3, loop starting at step 104]
determining, by the device, a respective first state of the device responsive to the device playing; [step 104; ¶ [0104]]
applying, by the device, the CPU frequency setting of the combination to the CPU and the memory bandwidth setting of the combination to the speed of the memory bus and, [steps 114, 116 – 124; the method goes on to step 114 of exploitation of each correspondence table 58 in accordance with what was mentioned above concerning the operation of the learning engine 48. (¶ [0109]); Molnos teaches a loop that tests each correspondence table (Fig 1, item 58), correspondence tables include adjusting clock frequency and other settings. That loop can either be an exploitation (step 114) or a random exploration (step 116).] thereafter, computing a reward value for combination based on a fps of the first video and power utilization of the device during playing of the first video; and [step 108; current reward value r(t) (¶ [0106])]
associating, by the device, the first state and the reward value with the combination. [step 110; ¶ [0107]]
Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to have applied the teaching of Molnos with the method of the combination of Wang/Lo to achieve a method in which a reward value is calculated for and associated with frequency and memory bandwidth settings for the benefit of optimization of electrical energy consumption (¶ [0100]).
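For illustration only, the per-combination loop mapped above (determine a state, apply both settings of a combination, compute a reward, and associate the state and reward with the combination) can be sketched as follows. This sketch does not reproduce Fig 3 of Molnos or the claimed method. The function `explore` and its callback parameters `observe_state`, `apply_settings`, and `measure_reward` are hypothetical, and the shuffled evaluation order is an assumption.

```python
import itertools
import random


def explore(cpu_freqs, mem_bws, observe_state, apply_settings, measure_reward, seed=None):
    """Try every (CPU frequency, memory bandwidth) combination exactly once.

    Returns a list of (state, combination, reward) records, one per combination.
    """
    combos = list(itertools.product(cpu_freqs, mem_bws))
    random.Random(seed).shuffle(combos)  # evaluate in a random order (assumed)
    records = []
    for combo in combos:
        state = observe_state()      # state observed while the video is playing
        apply_settings(*combo)       # apply both settings of this combination
        reward = measure_reward()    # reward measured after the settings take effect
        records.append((state, combo, reward))  # associate state and reward with combo
    return records
```

The callbacks keep the sketch independent of any particular platform; on a real device they would wrap whatever interfaces expose the CPU governor, the memory bus speed, the frame rate, and the power reading.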
Regarding claim 3, the combination of Wang/Lo/Molnos teaches the method of claim 2, and Molnos in the combination further teaches selecting, by the device, a combination having a greatest reward value among combinations associated with each different first state to produce the plurality of policies. [determining the maximum value among the quality values (¶ [0092])]
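For illustration only, selecting the combination having the greatest reward value for each state can be sketched as a per-state maximum over recorded results. The function name `derive_policies` and the `(state, combination, reward)` record layout are hypothetical assumptions and are not taken from Molnos or the application.

```python
def derive_policies(records):
    """Keep, for each state, the combination with the greatest recorded reward.

    `records` is an iterable of (state, combination, reward) tuples.
    Ties keep the first combination seen for that state.
    """
    best = {}
    for state, combo, reward in records:
        if state not in best or reward > best[state][1]:
            best[state] = (combo, reward)
    return {state: combo for state, (combo, _) in best.items()}
```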
Regarding claim 4, the combination of Wang/Lo/Molnos teaches the method of claim 2, and Molnos in the combination further teaches computing the reward value for the combination comprises: calculating a function regarding the values to be rewarded. [r(t)=F[s(t), a(t-1)] (¶ [0085]); the objective of the function F is to predefine, according to the envisaged application context, a reward function seeking to optimize the accumulation of rewards r(t) over time, as is well known in the field of learning by reinforcement. A large number of functions for computing said cumulative reward could therefore be suitable for the present invention, in particular those taught in the state of the art. (¶ [0085])]
While Molnos teaches a formula to optimize the reward using various physical and operating parameters of the device, neither Molnos nor the combination of Wang/Lo/Molnos specifically discloses the specific formula claimed. However, the device of the combination of Wang/Lo/Molnos in operation is to play video, and Wang teaches the impact of the frame rate on the power utilization. Molnos ¶ [0085] teaches that reward functions suitable for a given function are well known in the art. In a case where values such as power and fps are parameters to be evaluated and rewarded, one of ordinary skill in the art would find it obvious to create a function that uses the parameters of power and fps of Wang to calculate a reward that emphasizes those values for the benefit of optimizing the rewards over time.
Regarding claim 5, the combination of Wang/Lo/Molnos teaches the method of claim 4. 
Molnos ¶ [0085] teaches that reward functions suitable for a given function are well known in the art; however, neither Molnos nor the combination of Wang/Lo/Molnos teaches the specific values for fps and λ as claimed.
It would have been obvious to one of ordinary skill in the art, in a case where values such as power and fps are to be rewarded, to make a design choice selecting specific values of fps and λ for a function that calculates a reward emphasizing those values, for the benefit of optimizing the rewards over time.
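For illustration only, one simple form that a reward function of fps and power utilization could take is sketched below. The claimed formula is not reproduced in this action, and this sketch is not that formula and is not taken from Molnos or Wang. The linear form, the function name `reward`, the parameter `target_fps`, and its default value are all assumptions, with `lam` standing in for a weighting factor λ.

```python
def reward(fps: float, power_watts: float, lam: float, target_fps: float = 30.0) -> float:
    """Assumed reward: normalized frame rate minus a lambda-weighted power penalty.

    The frame-rate term is capped at 1.0 so that rendering faster than the
    target earns no extra reward, while every additional watt is penalized.
    """
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    quality = min(fps / target_fps, 1.0)
    return quality - lam * power_watts
```

With such a form, a larger λ shifts the trade-off toward energy savings and a smaller λ shifts it toward video quality, which is the kind of design choice discussed above.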
Regarding claim 6, the combination of Wang/Lo/Molnos teaches the method of claim 2, and Molnos in the combination further teaches the combinations are evaluated in a random order. [random exploration (¶ [0110])]
Regarding claims 8 – 12, the claims depend on claim 7, and repeat the limitations of claims 2 – 6 respectively. The claims are rejected under a similar rationale as regarding the respective claim above.
Regarding claims 14 – 18, the claims depend on claim 13, and repeat the limitations of claims 2 – 6 respectively. The claims are rejected under a similar rationale as regarding the respective claim above.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kim Huynh, can be reached on (571) 272-4147. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/J.N./ Examiner, Art Unit 2186

/KIM HUYNH/ Supervisor Patent Examiner, Art Unit 2186

1 Reference effectively filed May 10, 2017.
2 Reference effectively filed Dec. 4, 2015.
3 Reference effectively filed Dec. 15, 2017.