Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 12/10/21 has been entered.

Response to Arguments
Applicant’s arguments with respect to claim(s) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant states: In an effort to advance prosecution, claims 1 through 7 have been amended to set forth “circuitry” and, as such, any further interpretation under §112(f) cannot be proper. As amended, all structure set forth in claims 1 through 7 recite supported structure. Withdrawal of the rejection under 35 U.S.C. §112(b) is respectfully requested.
Examiner states: Examiner respectfully disagrees. The term “circuitry” may not explicitly limit the interpretation to hardware. Examiner recommends using or including at least a memory or processor.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations, a weight manager to apply a first weight, a state identifier to identify a first state, an action identifier to identify candidate actions, a reward calculator to determine reward value, a quality function definer to determine a relative highest state and the code updater to replace at least a portion, in 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 4-9, 11-16 and 18-23 are rejected under 35 U.S.C. 103 as being unpatentable over Nag et al. (US 2020/0065128) and in view of Schibler et al. (US 2019/0312800) in further view of Teng (Pub. No. US 2020/0326943).

As to claims 1, 15 and 22, Nag teaches an apparatus to modify candidate code, the apparatus comprising: 
a weight manager circuitry to apply a first weight value to a first objective function [Nag, par. 0059 wherein each observation o can be thought of as a vector of numeric values selected from a set of possible observation vector … the metrics or observed operational characteristics may indicate the amount of memory allocated for applications and/or application instances, networking latencies experienced by one or more applications, …; the observation value is interpreted as a first weight value]; 
a state identifier circuitry to identify a first state corresponding to the candidate code [Nag, par. 0061 wherein the labels for the states and actions are essentially unique identifiers for the corresponding states and actions]; 
an action identifier circuitry to identify candidate actions corresponding to the identified first state [Nag, par. 0061 wherein the labels for the states and actions are essentially unique identifiers for the corresponding states and actions]; 
Nag does not explicitly disclose determine reward values, a quality function definer to determine a relative highest state and action pair reward value based on respective ones of the reward values and a code updater to replace at least a portion of the candidate code with modified code corresponding to the relative highest state and action pair reward value, at least one of the quality function definer implemented by a logic circuit; however, in an analogous art of Method, Apparatus and System for Real-Time Optimization of Computer-Implemented Application Operations using Machine Learning Techniques, Schibler teaches:
a reward calculator circuitry to determine reward values corresponding to respective ones of (a) the identified first state, (b) one of the candidate actions and (c) the first weight value [Schibler, par. 0049 wherein the first scoring function corresponds to a scoring function selected from a group consisting of: performance measurement W1/cost where W1 represents a weighted value; 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings]; 
a quality function definer circuitry to determine a relative highest state and action pair reward value based on respective ones of the reward values [Schibler, pars. 0283-0289 and 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings]. 
a code updater circuitry to replace at least a portion of the candidate code with modified code corresponding to the relative highest state and action pair reward value [Schibler, pars. 0292-0299 wherein the optimizer system causes an updated measurement to be determined in relation to the first object], at least one of the weight manager, the state identifier, the action identifier, the reward calculator, the quality function definer [Schibler, pars. 0283-0289 and 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings] or the code updater implemented by a logic circuit [Schibler, par. 0030].
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings, since Nag and Schibler are in the same field of endeavor, such as machine-learning techniques, to provide a method and system that provide a modular reinforcement-learning-based application manager interface to a user-specifiable reward generation, allowing the rewards to provide feedback from the computational environment based on the highest state and action pair.
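For illustration only, the weighted reward and highest state and action pair selection mapped above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Nag, Schibler, or the application: the names reward_value and highest_pair, the objective functions, the candidate actions, and all numeric values are hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical type: an objective function scores a (state, action) pair.
Objective = Callable[[str, str], float]


def reward_value(state: str, action: str,
                 objectives: Dict[str, Objective],
                 weights: Dict[str, float]) -> float:
    """Weighted sum of objective scores for one (state, action) pair."""
    return sum(weights[name] * fn(state, action)
               for name, fn in objectives.items())


def highest_pair(state: str, actions: Iterable[str],
                 objectives: Dict[str, Objective],
                 weights: Dict[str, float]) -> Tuple[Tuple[str, str], float]:
    """Return the (state, action) pair with the highest reward value."""
    scored = {(state, a): reward_value(state, a, objectives, weights)
              for a in actions}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Illustrative (made-up) scores for two objectives over three candidate actions.
speed = {"unroll_loop": 0.8, "inline_call": 0.5, "no_change": 0.0}
size = {"unroll_loop": -0.6, "inline_call": -0.1, "no_change": 0.0}
objectives = {
    "speed": lambda s, a: speed[a],
    "size": lambda s, a: size[a],
}
weights = {"speed": 0.7, "size": 0.3}  # assumed first weight values

best, value = highest_pair("s0", list(speed), objectives, weights)
print(best, round(value, 2))
```

In this sketch the weights trade off execution speed against code size; changing them changes which candidate action is selected.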
However, the combination of Nag and Schibler may not explicitly teach the newly added pair validator and code updater limitations.
Teng teaches “pair validator circuitry to identify valid ones of the highest state and action pairs based on capabilities of a target platform; and code updater circuitry to replace at least a portion of the candidate code with modified code corresponding to the valid ones of the highest state and action pair ([0102] … In some embodiments, the first instruction set(s) may be determined or selected by one or more components of the image processing system 100 according to application scenario(s). For example, the processing device 140 may determine the first instruction set(s) based on operation characteristics of the initial computation unit(s) (i.e. pair). In some embodiments, the processing device 140 may determine the first instruction set(s) according to optimization purpose(s). For example, the processing device 140 may select the first instruction set(s) that can optimize operation times of the initial computation unit(s). As another example, the processing device may select the first instruction set(s) that may optimize operational power consumptions of the initial computation unit(s). [0104] In some embodiments, the processing device 140 may select a target instruction set from the one or more second instruction sets that the at least one processor supports for optimizing one or more initial computation units. In some embodiments, the processing device 140 may select the target instruction set based on performances of the one or more second instruction sets that the at least one processor supports according to specific optimization purpose(s). For example, the processing device 140 may select an instruction set with highest computational capability from the one or more second instruction sets as the target instruction set. 
[0126] In some embodiments, the processing device 140 may optimize the one or more initial computation units by compiling instructions included in the one or more initial computation units using (or based on) the at least one instruction set. The processing device 140 may further obtain an optimized processing program based on the one or more optimized computation units and the rest of the plurality of computation units that are not optimized (if any). In some embodiments, the optimized processing program may be presented as an executable file. For example, the processing device 140 may generate an executable file based on the one or more optimized computation units and the rest of the plurality of computation units that are not optimized (if any).)”
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings, since Nag, Schibler, and Teng are in the same field of endeavor, such as machine-learning techniques, to provide a method and system that provide a modular reinforcement-learning-based application manager interface to a user-specifiable reward generation, allowing the rewards to provide feedback from the computational environment based on the highest state and action pair.
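For illustration only, the pair validation limitation discussed above (identifying valid state and action pairs based on capabilities of a target platform) can be sketched as a filter over ranked pairs. This is a sketch under stated assumptions, not the implementation of Teng or the application: the function valid_pairs, the action names, the capability labels, and the Q-values are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (state, action)


def valid_pairs(ranked: List[Tuple[Pair, float]],
                required: Dict[str, Set[str]],
                platform_caps: Set[str]) -> List[Tuple[Pair, float]]:
    """Keep only pairs whose action needs no capability the platform lacks."""
    return [(pair, q) for pair, q in ranked
            if required.get(pair[1], set()) <= platform_caps]


# Hypothetical ranked pairs with made-up Q-values.
ranked = [
    (("s0", "offload_to_gpu"), 0.9),
    (("s0", "vectorize_avx2"), 0.7),
    (("s0", "no_change"), 0.0),
]
# Hypothetical capabilities each action requires of the target platform.
required = {"offload_to_gpu": {"gpu"}, "vectorize_avx2": {"avx2"}}
# Assumed target platform: no GPU present.
platform_caps = {"avx2", "fpga"}

valid = valid_pairs(ranked, required, platform_caps)
best_valid = max(valid, key=lambda item: item[1])
print(best_valid[0])
```

Here the highest-valued pair is discarded because the assumed platform has no GPU, so the code update would follow the best remaining valid pair.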

As to claims 2, 9 and 16, Nag, Schibler, and Teng teach the apparatus as defined in claim 1, further including a machine learning engine circuitry to estimate a quality function by applying the respective ones of the reward values to a neural network [Schibler, Fig. 3 and pars. 0144 and 0292-0299].

As to claims 4, 11 and 18, Nag, Schibler, and Teng teach the apparatus as defined in claim 1, further including an objective function selector circuitry to: 
select alternate objective functions [Nag, par. 0077]; and 
invoke the weight manager to apply alternate weight values to the alternate objective functions [Nag, par. 0077].

As to claims 5, 12 and 19, Nag, Schibler, and Teng teach the apparatus as defined in claim 4, wherein the reward calculator circuitry is to calculate an aggregate reward for the reward values based on the objective function and the alternate objective functions [Nag, pars. 0062 and 0067].

As to claims 6, 13 and 20, Nag, Schibler, and Teng teach the apparatus as defined in claim 1, wherein the state identifier circuitry is to iteratively identify additional states corresponding to the candidate code, the action identifier circuitry to identify additional candidate actions corresponding to the respective additional states [Nag, par. 0062, wherein a transition from state to state as a result of action produces observation. Each state transition is associated with a probability].

As to claims 7, 14 and 21, Nag, Schibler, and Teng teach the apparatus as defined in claim 1, wherein the weight manager circuitry is to determine first weight values for the objective functions and second weight values for the alternate objective functions based on behavioral observation [Nag, par. 0061 wherein the observations generated by the environment and transmitted to the manager reflect the state of the environment at the time that the observation are made] of a code developer associated with the candidate code [Nag, pars. 0072 and 0074].

As to claim 8, Nag teaches a non-transitory computer readable storage medium comprising computer readable instructions that, when executed, cause at least one processor to at least: 
apply weight values to objective functions [Nag, par. 0059 wherein each observation o can be thought of as a vector of numeric values selected from a set of possible observation vector … the metrics or observed operational characteristics may indicate the amount of memory allocated for applications and/or application instances, networking latencies experienced by one or more applications, …; the observation value is interpreted as a first weight value]; 
identify a first state corresponding to the candidate code [Nag, par. 0061 wherein the labels for the states and actions are essentially unique identifiers for the corresponding states and actions]; 
identify candidate actions corresponding to the identified first state [Nag, par. 0061 wherein the labels for the states and actions are essentially unique identifiers for the corresponding states and actions]; 
Nag does not explicitly disclose determine reward value, a relative highest state and action pair reward value based on respective ones of the reward values and replace at least a portion of the candidate code with modified code corresponding to the relative highest state and action pair reward value, at least one of the quality function definer implemented by a logic circuit; however, in an analogous art of Method, Apparatus and System for Real-Time Optimization of Computer-Implemented Application Operations using Machine Learning Techniques, Schibler teaches:
determine reward values corresponding to respective ones of (a) the identified first state, (b) one of the candidate actions and (c) respective ones of the weight values [Schibler, par. 0049 wherein the first scoring function corresponds to a scoring function selected from a group consisting of: performance measurement W1/cost where W1 represents a weighted value; 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings]; 
determine a relative highest state and action pair reward value based on respective ones of the reward values [Schibler, pars. 0283-0289 and 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings]. 
replace at least a portion of the candidate code with modified code corresponding to the relative highest state and action pair reward value [Schibler, pars. 0292-0299 wherein the optimizer system causes an updated measurement to be determined in relation to the first object].

The limitations “identify valid ones of state and action pairs based on processing hardware of a target device; and replace at least a portion of the candidate code with modified code corresponding to valid ones of the state and action pairs” are taught by Teng, as provided in the rejection of claim 1.

As to claim 23, the combination teaches the claim, wherein Teng teaches “the apparatus as defined in claim 1, wherein capabilities of a target platform include at least one of graphical processing unit (GPU) hardware or field programmable gate array (FPGA) hardware ([0023] In some embodiments, the processor 210 may include one or more hardware processors, such as a microcontroller, a microprocessor, a reduced instruction set computer (RISC), an application specific integrated circuits (ASICs), an application-specific instruction-set processor (ASIP), a central processing unit (CPU), a graphics processing unit (GPU), a physics processing unit (PPU), a microcontroller unit, a digital signal processor (DSP), a field programmable gate array (FPGA), an advanced RISC machine (ARM), a programmable logic device (PLD), any circuit or processor capable of executing one or more functions, or the like, or a combinations thereof.)”
The rationale applied to claim 1 applies here.


Claims 3, 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Nag, in view of Schibler in view of Teng and in view of Tran et al. (US 2019/0354849).

As to claims 3, 10 and 17, Nag, Schibler, and Teng teach the apparatus as defined in claim 2.

The combination does not explicitly teach defining the quality function as a Bellman estimation; however, Tran teaches that the quality function definer circuitry is to define the quality function as a Bellman estimation [Tran, par. 0067 wherein in order to update the artificial neural network, a Bellman equation may be used].
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings, since Nag, Schibler, Teng, and Tran are in the same field of endeavor, such as machine-learning techniques, to provide a method and system that provide a modular reinforcement-learning-based application manager interface to a user-specifiable reward generation, allowing the rewards to provide feedback from the computational environment using a Bellman function.
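For illustration only, a Bellman-style estimate of a quality function can be sketched with a tabular update. This is a simplification adopted for brevity: Tran is cited for using a Bellman equation to update an artificial neural network, whereas the sketch below uses a lookup table instead. The function bellman_update, the learning rate alpha, the discount factor gamma, and all states, actions, and values are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, Tuple

QTable = DefaultDict[Tuple[str, str], float]


def bellman_update(q: QTable, state: str, action: str, reward: float,
                   next_state: str, next_actions: Iterable[str],
                   alpha: float = 0.5, gamma: float = 0.9) -> float:
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical table: unseen pairs default to 0.0.
q: QTable = defaultdict(float)
q[("s1", "inline_call")] = 1.0  # assumed prior estimate

updated = bellman_update(q, "s0", "unroll_loop", reward=0.38,
                         next_state="s1",
                         next_actions=["inline_call", "no_change"])
print(round(updated, 2))
```

The update moves the estimate for the current pair toward the observed reward plus the discounted best estimate available from the next state.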

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WYNUEL S AQUINO whose telephone number is (571)272-7478. The examiner can normally be reached 9AM-5PM EST M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock, can be reached at 571-272-3759. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance 





/WYNUEL S AQUINO/Primary Examiner, Art Unit 2199