DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is responsive to communication (16/456,984) filed on 06/28/2019.
Claims 1-22 are pending.
Claim 22 is new.
Claims 1, 8 and 15 are amended.
Claims 1-22 will be examined.

Response to Arguments
The objection to the Abstract is withdrawn in view of applicant’s amendment.
The 35 USC § 101 rejections are withdrawn in view of applicant’s amendments.
Applicant’s arguments filed 04/27/2021 have been fully considered.
As an initial matter, the examiner notes that Applicant added a new limitation “a code updater to replace at least a portion of the candidate code with modified code corresponding to the relative highest state and action pair reward value, at least one of the weight manager, the state identifier, the action identifier, the reward calculator, the quality function definer or the code updater implemented by a logic circuit” into claims 1, 8 and 15.
Accordingly, Applicant’s amendment necessitated the new ground(s) of rejection presented in detail below.

 

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
The claim limitations reciting a weight manager to apply a first weight, a state identifier to identify a first state, an action identifier to identify candidate actions, a reward calculator to determine reward value, a quality function definer to determine a relative highest state, and the code updater to replace at least a portion use a generic placeholder in place of the word “means” (or “step”) and are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, other claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.



In claim 1, the claimed element “a code updater to replace at least a portion of the candidate code with modified code corresponding to the relative highest state and action pair reward value, at least one of the weight manager, the state identifier, the action identifier, the reward calculator, the quality function definer or the code updater implemented by a logic circuit” invokes 35 U.S.C. 112(f). However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. The claimed element “at least one of the weight manager, the state identifier, the action identifier, the reward calculator, the quality function definer or the code updater implemented by a logic circuit” refers to only one of these elements being implemented by a circuit while the others are not implemented by a circuit. Therefore, the claim is indefinite and is rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
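If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: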
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.
Dependent claims 2-7 are rejected based on their dependency from parent claim 1.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 4-9, 11-16 and 18-22 are rejected under 35 U.S.C. 103 as being unpatentable over Nag et al. (US 2020/0065128) in view of Schibler et al. (US 2019/0312800).

As to claims 1, 15 and 22, Nag teaches an apparatus to modify candidate code, the apparatus comprising: 
a weight manager to apply a first weight value to a first objective function [Nag, par. 0059 wherein each observation o can be thought of as a vector of numeric values selected from a set of possible observation vector … the metrics or observed operational characteristics may indicate the amount of memory allocated for applications and/or application instances, networking latencies experienced by one or more applications, …; the observation value is interpreted as a first weight value]; 
a state identifier to identify a first state corresponding to the candidate code [Nag, par. 0061 wherein the labels for the states and actions are essentially unique identifiers for the corresponding states and actions]; 
an action identifier to identify candidate actions corresponding to the identified first state [Nag, par. 0061 wherein the labels for the states and actions are essentially unique identifiers for the corresponding states and actions]; 
Nag does not explicitly disclose a reward calculator to determine reward values, a quality function definer to determine a relative highest state and action pair reward value based on respective ones of the reward values, and a code updater to replace at least a portion of the candidate code with modified code corresponding to the relative highest state and action pair reward value, at least one of the recited elements being implemented by a logic circuit; however, in an analogous art of Method, Apparatus and System for Real-Time Optimization of Computer-Implemented Application Operations using Machine Learning Techniques, Schibler teaches:
a reward calculator to determine reward values corresponding to respective ones of (a) the identified first state, (b) one of the candidate actions and (c) the first weight value [Schibler, par. 0049 wherein the first scoring function corresponds to a scoring function selected from a group consisting of: performance measurement W1/cost where W1 represents a weighted value; 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings]; 
a quality function definer to determine a relative highest state and action pair reward value based on respective ones of the reward values [Schibler, pars. 0283-0289 and 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings]; 
a code updater to replace at least a portion of the candidate code with modified code corresponding to the relative highest state and action pair reward value [Schibler, pars. 0292-0299 wherein the optimizer system causes an updated measurement to be determined in relation to the first object], at least one of the weight manager, the state identifier, the action identifier, the reward calculator, the quality function definer [Schibler, pars. 0283-0289 and 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings] or the code updater implemented by a logic circuit [Schibler, par. 0030].
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Nag and Schibler, since both are in the same field of endeavor, namely machine-learning techniques, in order to provide a method and system having a modular reinforcement-learning-based application manager interface to a user-specifiable reward generation, allowing rewards that provide feedback from the computational environment based on the highest state and action pair.
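
For illustration only, the arrangement mapped above (a weight applied to an objective function, reward values determined per state and action, selection of the highest state and action pair value, and replacement of the candidate code accordingly) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Nag, Schibler, or the application: the toy objective function, the action table, and the names reward, best_action and update_code are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical candidate actions: each maps candidate code to modified code.
ACTIONS: Dict[str, Callable[[str], str]] = {
    "keep": lambda code: code,
    "strip_blank_lines": lambda code: "\n".join(
        line for line in code.splitlines() if line.strip()
    ),
}


def objective_size(code: str) -> float:
    """Toy objective function (assumed): fewer lines scores higher."""
    return -float(len(code.splitlines()))


def reward(state: str, action: str, weight: float) -> float:
    """Reward for a (state, action) pair: the weighted objective of the result."""
    modified = ACTIONS[action](state)
    return weight * objective_size(modified)


def best_action(state: str, weight: float) -> str:
    """Return the action whose (state, action) reward value is highest."""
    return max(ACTIONS, key=lambda action: reward(state, action, weight))


def update_code(candidate: str, weight: float = 1.0) -> str:
    """Replace the candidate code with the code produced by the best action."""
    return ACTIONS[best_action(candidate, weight)](candidate)
```

Here the state is simply the candidate code text; a learned quality function, as in the Q-value selection cited from Schibler, would take the place of the direct reward comparison.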

As to claims 2, 9 and 16, Nag and Schibler teach the apparatus as defined in claim 1, further including a machine learning engine to estimate a quality function by applying the respective ones of the reward values to a neural network [Schibler, Fig. 3 and par. 0144; 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings]. 

As to claims 4, 11 and 18, Nag and Schibler teach the apparatus as defined in claim 1, further including an objective function selector to: 
select a second objective function [Nag, par. 0077]; and 
invoke the weight manager to apply a second weight value to the second objective function [Nag, par. 0077].

As to claims 5, 12 and 19, Nag and Schibler teach the apparatus as defined in claim 4, wherein the reward calculator is to calculate an aggregate reward for the reward values based on the first and second objective functions [Nag, pars. 0062 and 0067].

As to claims 6, 13 and 20, Nag and Schibler teach the apparatus as defined in claim 1, wherein the state identifier is to iteratively identify additional states corresponding to the candidate code, the action identifier to identify additional candidate actions corresponding to the respective additional states [Nag, par. 0062, wherein a transition from state to state as a result of an action produces an observation. Each state transition is associated with a probability].

As to claims 7, 14 and 21, Nag and Schibler teach the apparatus as defined in claim 1, wherein the weight manager is to determine the first weight value for the first objective function and a second weight value for a second objective function based on behavioral observation [Nag, par. 0061 wherein the observations generated by the environment and transmitted to the manager reflect the state of the environment at the time that the observations are made] of a code developer associated with the candidate code [Nag, pars. 0072 and 0074]. 

As to claim 8, Nag teaches a non-transitory computer readable storage medium comprising computer readable instructions that, when executed, cause at least one processor to at least: 
apply a first weight value to a first objective function [Nag, par. 0059 wherein each observation o can be thought of as a vector of numeric values selected from a set of possible observation vector … the metrics or observed operational characteristics may indicate the amount of memory allocated for applications and/or application instances, networking latencies experienced by one or more applications, …; the observation value is interpreted as a first weight value]; 
identify a first state corresponding to the candidate code [Nag, par. 0061 wherein the labels for the states and actions are essentially unique identifiers for the corresponding states and actions]; 
identify candidate actions corresponding to the identified first state [Nag, par. 0061 wherein the labels for the states and actions are essentially unique identifiers for the corresponding states and actions]; 
Nag does not explicitly disclose determining reward values, determining a relative highest state and action pair reward value based on respective ones of the reward values, and replacing at least a portion of the candidate code with modified code corresponding to the relative highest state and action pair reward value, at least one of which is implemented by a logic circuit; however, in an analogous art of Method, Apparatus and System for Real-Time Optimization of Computer-Implemented Application Operations using Machine Learning Techniques, Schibler teaches:
determine reward values corresponding to respective ones of (a) the identified first state, (b) one of the candidate actions and (c) the first weight value [Schibler, par. 0049 wherein the first scoring function corresponds to a scoring function selected from a group consisting of: performance measurement W1/cost where W1 represents a weighted value; 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings]; 
determine a relative highest state and action pair reward value based on respective ones of the reward values [Schibler, pars. 0283-0289 and 0292-0299 wherein the optimizer may select the action with the highest Q-value to determine the updated application settings]; and 
replace at least a portion of the candidate code with modified code corresponding to the relative highest state and action pair reward value [Schibler, pars. 0292-0299 wherein the optimizer system causes an updated measurement to be determined in relation to the first object].
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Nag and Schibler, since both are in the same field of endeavor, namely machine-learning techniques, in order to provide a method and system having a modular reinforcement-learning-based application manager interface to a user-specifiable reward generation, allowing rewards that provide feedback from the computational environment based on the highest state and action pair.

Claims 3, 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Nag in view of Schibler and further in view of Tran et al. (US 2019/0354849).

As to claims 3, 10 and 17, 
Nag and Schibler teach the apparatus as defined in claim 2.
Nag and Schibler do not explicitly disclose a Bellman estimation; however, in an analogous art of Automatic Data Preprocessing, Tran teaches:
the quality function definer is to define the quality function as a Bellman estimation [Tran, par. 0067 wherein in order to update the artificial neural network, a Bellman equation may be used].  
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teachings of Nag, Schibler and Tran, since all three are in the same field of endeavor, namely machine-learning techniques, to provide a method and system which .
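
As background on the term “Bellman estimation,” a quality function is commonly estimated with the update Q(s, a) ← Q(s, a) + α (r + γ max Q(s', a') − Q(s, a)), where the maximum is taken over the actions a' available in the next state s'. The following is a minimal tabular sketch of that update, assuming a dictionary-based table rather than the artificial neural network referenced in Tran; the function name bellman_update and the parameters alpha and gamma are hypothetical and do not come from any cited reference.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, Tuple

# Hypothetical tabular quality function keyed by (state, action).
QTable = DefaultDict[Tuple[str, str], float]


def bellman_update(
    q: QTable,
    state: str,
    action: str,
    reward_value: float,
    next_state: str,
    next_actions: Iterable[str],
    alpha: float = 0.5,  # learning rate (assumed value)
    gamma: float = 0.9,  # discount factor (assumed value)
) -> float:
    """Apply one Bellman-style update to Q(state, action) and return the new value."""
    # Highest estimated value reachable from the next state (0.0 if no actions).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward_value + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]
```

Repeated updates move each table entry toward the observed reward plus the discounted best value of the following state.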

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.  See PTO 892.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TINA HUYNH whose telephone number is (408)918-7598.  The examiner can normally be reached on 8:00 - 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock can be reached on 571-272-3759.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/TINA HUYNH/Examiner, Art Unit 2199 

/LEWIS A BULLOCK  JR/Supervisory Patent Examiner, Art Unit 2199