DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  

Such claim limitations are: 
“machine learning module”, “controlled process module”, “reinforcement learning module”, “loss or reward calculator”, “process states augmentation and differentiation module”, and “differentiator” in claims 1, 3-5, 12-13 and 18.

Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitations to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-2, 9-11 and 19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Claim 1 recites:
A PID controller system comprising:
a proportional, integral and derivative (PID) controller;
a machine learning module having an output connection to the PID controller; and
a controlled process module that provides controlled variables to the PID controller via a differentiator that also takes in set-points for the controlled variables and the differentiator outputs a control error to the PID controller, wherein the control error is a difference between the set-points and the controlled variables; and
wherein:
the controlled variables and set-points for the controlled variables are input to the machine learning module; and
the machine learning module calculates an instantaneous loss or reward which is an increasing or decreasing, respectively, function of the control error.
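
For reference, the relationship recited in the last two limitations can be sketched numerically. The sketch below is illustrative only and is not taken from the application or from any cited reference. The function names `control_error`, `instantaneous_loss` and `instantaneous_reward` are hypothetical, and the squared error is an assumed choice: it is one example of a loss that increases with the magnitude of the control error, with the reward taken as its negative.

```python
def control_error(set_point: float, controlled_variable: float) -> float:
    """Control error as recited: set-point minus controlled variable."""
    return set_point - controlled_variable


def instantaneous_loss(error: float) -> float:
    """Assumed loss: squared error, which grows with the error magnitude."""
    return error * error


def instantaneous_reward(error: float) -> float:
    """Assumed reward: negated loss, which falls as the error magnitude grows."""
    return -instantaneous_loss(error)


if __name__ == "__main__":
    e = control_error(set_point=50.0, controlled_variable=47.0)
    print(e, instantaneous_loss(e), instantaneous_reward(e))
```

Any other monotonic function of the error magnitude (for example, the absolute error) would illustrate the same limitation equally well.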

Step 1: 
The claim recites a PID controller system. Thus, the claim is directed to a product, which falls within a statutory category of invention.

Step 2A Prong one:
The limitation “the machine learning module calculates an instantaneous loss or reward which is an increasing or decreasing, respectively, function of the control error”, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, “calculates” in the context of this claim encompasses a user mentally calculating a loss or reward based on a machine learning model, with the help of pen and paper, as long as the model and the input data set are very simple.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.

Step 2A Prong two:
Besides the abstract ideas, the claim recites the additional limitations of providing controlled variables to the PID controller via a differentiator that also takes in set-points for the controlled variables, where the differentiator outputs a control error to the PID controller and the control error is a difference between the set-points and the controlled variables, and of inputting the controlled variables and the set-points for the controlled variables to the machine learning module. These additional limitations represent mere data gathering (providing or inputting the data) that is necessary for use of the recited judicial exception (“calculating”) and is recited at a high level of generality. These limitations are thus insignificant extra-solution activities. The recited “PID controller system” and “PID controller” are additional elements that implement the system. However, the “PID controller system” and “PID controller” are recited so generically that they represent no more than mere instructions to apply the judicial exceptions on a controller system and a controller. As such, they are nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of a computer. The recited “controlled process module”, “differentiator” and “machine learning module” are likewise additional elements that implement the system. However, they too are recited so generically that they represent no more than mere instructions to apply the judicial exceptions on a controller and a controller system. As such, they are nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of a computer. Even when viewed in combination, these additional limitations and additional elements do not integrate the recited judicial exception into a practical application, and the claim is directed to the judicial exception.

Step 2B: 
The claim as a whole does not amount to significantly more than the recited exception. The claim has the following additional limitations and elements:
1) A PID controller system;
2) a proportional, integral and derivative (PID) controller;
3) a machine learning module;
4) a controlled process module;
5) a differentiator;
6) provides controlled variables to the PID controller via a differentiator that also takes in set-points for the controlled variables and the differentiator outputs a control error to the PID controller, wherein the control error is a difference between the set-points and the controlled variables;
7) the controlled variables and set-points for the controlled variables are input to the machine learning module;
8) a machine learning module having an output connection to the PID controller.
Regarding 1) – 5), as explained previously, the PID controller system, the proportional, integral and derivative (PID) controller, the machine learning module, the controlled process module and the differentiator are at best the equivalent of merely adding the words “apply it” to the judicial exception. Mere instructions to apply an exception cannot provide an inventive concept. Regarding 6) – 7), these limitations, as explained previously, are extra-solution activities, which for purposes of Step 2A Prong Two were considered insignificant. The recitation of a controlled process module that provides controlled variables to the PID controller via a differentiator that also takes in set-points for the controlled variables, with the differentiator outputting a control error to the PID controller, and the recitation that the controlled variables and set-points for the controlled variables are input to the machine learning module, amount to mere data gathering that is recited at a high level of generality and, as disclosed in Badgwell US 20190187631 A1, is also well-known. These limitations therefore remain insignificant extra-solution activities even upon reconsideration. Regarding 6), “the control error is a difference between the set-points and the controlled variables” merely further limits the scope of the abstract ideas or merely states the technical environment of these abstract ideas. Regarding 8), “a machine learning module having an output connection to the PID controller” likewise merely further limits the scope of the abstract ideas or merely states the technical environment of these abstract ideas. Thus, 1) – 8) do not amount to significantly more. Even when considered in combination, these additional elements and limitations represent mere instructions to apply an exception and insignificant extra-solution activity, which do not provide an inventive concept. The claim is not eligible.

Regarding claims 2, 9-10 and 19:

Step 1: Claim 2 recites a PID controller system. Thus, the claim is directed to a product, which falls within a statutory category of invention. Claim 19 recites an auto tuner system. Thus, the claim is likewise directed to a product, which falls within a statutory category of invention.

Step 2A Prong one:
Similar to claim 1, the claims recite additional loss or reward components … and integrated …, which, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components, and which therefore fall within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.

Step 2A Prong two:
Similar to claim 1, besides the abstract ideas, the claims recite receives … and provides …, which are necessary for use of the recited judicial exception and are recited at a high level of generality. These limitations are thus insignificant extra-solution activities. The recited auto tuner system is an additional element that is configured to implement the method steps. However, the auto tuner system is recited so generically that it represents no more than mere instructions to apply the judicial exceptions, or to perform generic functions, on a computer. As such, it is nothing more than an attempt to generally link the use of the judicial exceptions to the technological environment of a computer. Even when viewed in combination, these additional limitations and elements do not integrate the recited judicial exception into a practical application, and the claims are directed to the judicial exception.

Step 2B: 
Similar to claim 1, the additional elements, as explained previously, are at best the equivalent of merely adding the words “apply it” to the judicial exception. Mere instructions to apply an exception cannot provide an inventive concept. The additional limitations, as explained previously, are extra-solution activities, which for purposes of Step 2A Prong Two were considered insignificant. The other recited additional limitations and elements in the claims, for example “control variables from the controlled process module”, either merely further limit the scope of the abstract ideas or merely state the technical environment of these abstract ideas, and thus do not impose any meaningful limits on practicing the abstract idea. The claims as a whole do not amount to significantly more than the recited exception. Even when considered in combination, these additional elements and limitations represent mere instructions to apply an exception and insignificant extra-solution activity, which do not provide an inventive concept. The claims are not eligible.

Claim Objections

Claims 1, 5 and 9 are objected to because of the following informalities: 

Claim 1 recites “A PID controller system” at the beginning of the claim, which uses the acronym “PID” before the acronym is defined. For examination purposes, “A PID controller system” will be construed as “A proportional, integral and derivative (PID) controller system”.

Claim 5 recites “using process state”, which contains a typographical error. For examination purposes, “using process state” will be construed as “using process states”.

Claim 5 recites “control time difference”, which contains a typographical error. For examination purposes, “control time difference” will be construed as “control action time difference”.

Claim 9 recites “A method for auto tuning a PID controller” at the beginning of the claim, which uses the acronym “PID” before the acronym is defined. For examination purposes, “A method for auto tuning a PID controller” will be construed as “A method for auto tuning a proportional, integral and derivative (PID) controller system”.

Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 2-3, 6, 9-10, 12-13 and 16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Claim 2 recites “additional loss or reward components”. The relationship between “additional loss or reward components” and “an instantaneous loss or reward” in claim 1 is not clear. For examination purposes, “additional loss or reward components” will be construed as “additional instantaneous loss or reward components”.

Claim 2 recites “control action absolute time difference”. It is not clear what the “control action” is. For examination purposes, “control action absolute time difference” will be construed as “control action absolute time difference, wherein the control action is input to the machine learning model”.

Claim 3 recites “a control action”, which lacks antecedent basis. For examination purposes, “a control action” will be construed as “the control action”.

Claim 6 recites “controlled process variables”. The relationship between “controlled process variables” and “controlled variables” in claim 1 is not clear. For examination purposes, “controlled process variables” will be construed as “controlled variables”.

Claim 9 recites “inputs to a proportional, integral and derivative (PID) controller”. The relationship between “a proportional, integral and derivative (PID) controller” and “a PID controller” is not clear. For examination purposes, “inputs to a proportional, integral and derivative (PID) controller” will be construed as “inputs to the proportional, integral and derivative (PID) controller”.

Claim 9 recites “a control error value”. The relationship between “a control error value” and “a difference” is not clear. For examination purposes, “a control error value” will be construed as “the difference”.

Claim 10 recites “control action time difference”. It is not clear what the “control action” is. For examination purposes, “control action time difference” will be construed as “control action time difference, wherein the control action is input to the machine learning model”.

Claim 12 recites “a control action”. The relationship between “a control action” and “control action” in claim 10 is not clear. For examination purposes, “a control action” will be construed as “the control action”.

Claim 13 recites “a controlled process input time difference”. The relationship between “a controlled process input” and “control action” in claim 10 is not clear. For examination purposes, “a controlled process input time difference” will be construed as “the control action time difference”.

Claim 16 recites “the reinforcement learning module”, which lacks antecedent basis. For examination purposes, “the reinforcement learning module” will be construed as “a reinforcement learning module”.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1, 3-6 and 9-20 are rejected under 35 U.S.C. 102(a)(1) & (a)(2) as being anticipated by Badgwell US 20190187631 A1.

Regarding claim 1, Badgwell teaches a PID controller system (Fig. 3 [0049] - [0051]) comprising:
a proportional, integral and derivative (PID) controller (Fig. 3 [0050]);
a machine learning module having an output connection to the PID controller (Fig. 3 [0051] learning agent and modules); and
a controlled process module that provides controlled variables to the PID controller via a differentiator that also takes in set-points for the controlled variables and the differentiator outputs a control error to the PID controller, wherein the control error is a difference between the set-points and the controlled variables (Fig. 3 [0049] [0050] [0029] – [0032] controlled process module 360 provides controlled variables to the PID controller via a differentiator that calculates the error of the controlled variable); and
wherein:
the controlled variables and set-points for the controlled variables are input to the machine learning module (Fig. 3 [0051] [0052] the control variables and set-points are provided to the learning agent and modules); and
the machine learning module calculates an instantaneous loss or reward which is an increasing or decreasing, respectively, function of the control error (Fig. 3 [0051] [0037] [0039] a reward is determined; the reward is based on a process controlled variable state related to the size of error relative to the setpoint; the reward strategy can be to assign a reward of +100 – increasing – or a reward of -100 – an instantaneous loss that is decreasing).
Badgwell teaches:
[0049] FIG. 3 schematically shows an example of process controller configuration for implementing a reinforcement learning agent as part of the process controller configuration. In FIG. 3, a controller 360 is part of an overall control system for a reactor 370 including at least one source of controller input (corresponding to a controlled variable) and at least one device for receiving controller output (for controlling a manipulated variable). ... Examples of detectors to provide a source of controller input based on a corresponding controlled variable can include, but are not limited to, a thermocouple, thermometer, or other temperature detector 371, a pressure detector 372, or a detector for product characterization 373. ... A device for receiving controller output can correspond to an actuator, an electrical activator, or another process controller…. An input actuator can correspond to, for example, an actuator 381 associated with a valve to change the valve position. ....

[Image: media_image1.png, greyscale]

[0050] During operation, process controller 360 can receive controller input from a detector (or from a plurality of detectors) 371, 372, or 373…. This controller input can be processed to determine the current value of a controlled variable and to provide an appropriate controller output for control of a device corresponding to a manipulated variable, such as an output signal for an actuator 381 (such as for controlling a valve position) or an output signal for an electrical activator 382 (such as for controlling a duty cycle of a heater). The controller input from a detector (or optionally from another process controller) can be processed using a proportional-integral-derivative (PID) control module 366 to generate the appropriate controller output.
[0051] In FIG. 3, the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366. For example, the controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled variable and/or the value of the controlled variable over a period of time. Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform based on a stored state-action value function 367. The state-action value function can correspond to a plurality of discrete state-action values, a continuous set of state-action values, or a combination thereof. Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters. The modified set of tuning parameters for the proportional, integral, (and optional derivative) terms can then be used by proportional-integral control module 366 for determining the controller output signal to actuator 381 and/or electrical activator 382. At a later point, after one or more additional evaluations of the state by state analysis module 369, a reward can be determined by reward module 368 that corresponds to the combination of state and action that was selected.
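
The tuning loop described in paragraph [0051] can be illustrated with the following sketch, which is not part of Badgwell and is offered only as an aid to reading the passage. It assumes a discrete table of state-action values, an epsilon-greedy selection policy, and a textbook tabular Q-learning update; the quoted passage does not specify any of these. The action names, the increment sizes and the state labels are hypothetical.

```python
import random

# Hypothetical discrete actions: incremental changes to (gain, integral time).
ACTIONS = {
    "gain_up": (0.1, 0.0),
    "gain_down": (-0.1, 0.0),
    "integral_time_up": (0.0, 0.5),
    "integral_time_down": (0.0, -0.5),
    "no_change": (0.0, 0.0),
}


def select_action(q_values, state, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over a stored state-action value table (assumed policy)."""
    if rng.random() < epsilon:
        return rng.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: q_values.get((state, a), 0.0))


def apply_action(tuning, action):
    """Return the tuning parameters after the incremental change for `action`."""
    d_gain, d_integral_time = ACTIONS[action]
    return {
        "gain": tuning["gain"] + d_gain,
        "integral_time": tuning["integral_time"] + d_integral_time,
    }


def update_q(q_values, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update (a textbook rule, assumed for illustration)."""
    best_next = max(q_values.get((next_state, a), 0.0) for a in ACTIONS)
    old = q_values.get((state, action), 0.0)
    q_values[(state, action)] = old + alpha * (reward + gamma * best_next - old)
```

In this sketch the reward passed to `update_q` would be supplied after a later evaluation of the state, mirroring the order of operations in the quoted paragraph.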
[0029] Proportional-Integral-Derivative (PID) controllers are commonly used as process controllers for modifying a manipulated variable in response to a controlled variable. ...
…
[0032] In Equations (1) through (4), a controlled variable refers to a variable that is measured by the PID controller in order to keep the variable near a reference value or reference trajectory. The reference value in turn tracks a setpoint value through a first order filter. … The PID can attempt to control the flowrate relative to a target or setpoint corresponding to the desired flowrate (i.e., the flowrate setpoint). In order to control the flowrate, measurements of the flow rate can be taken, which correspond to the current value of the controlled variable at a given time. The error in the controlled variable would correspond to the difference between the measured value of the flowrate and the reference value. In order to control the flowrate, the PID can change the position of a valve so that the valve is more open or more closed. The PID control algorithm executes at distinct points in time separated by a constant user-defined control interval. The control gain, integral time, and derivative time are tunable parameters that can be used to modify how the PID controller responds to differences between the measured flowrate and the setpoint for the flowrate.
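
As an aid to reading paragraph [0032], the following sketch shows one discrete PID step with the three tunable parameters named there. It is not part of Badgwell: Equations (1) through (4) are not reproduced in this action, so the sketch uses a common textbook form instead, computes the error directly against the setpoint (omitting the first order filter on the reference value), and adopts the setpoint-minus-measurement sign convention. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PID:
    gain: float             # control gain
    integral_time: float    # integral time, in the same units as dt
    derivative_time: float  # derivative time, in the same units as dt
    dt: float               # constant control interval
    _integral: float = 0.0
    _prev_error: float = 0.0

    def step(self, setpoint: float, measurement: float) -> float:
        """Return the controller output for one control interval."""
        error = setpoint - measurement
        self._integral += error * self.dt
        derivative = (error - self._prev_error) / self.dt
        self._prev_error = error
        return self.gain * (
            error
            + self._integral / self.integral_time
            + self.derivative_time * derivative
        )
```

Changing `gain`, `integral_time` or `derivative_time` between calls to `step` changes how strongly the output responds to the same error, which is the sense in which these parameters are tunable.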
[0052] …, the proportional-integral control module 366 can also receive changes to the setpoint for a controlled variable, such as from a setpoint modification module 390. Setpoint modification module 390 can also provide setpoint changes to state analysis module 369. Alternatively, changes to the setpoint can be provided from various other types of components, such as an input provided by a process controller associated with another controlled variable.
[0037] …, suitable states for use by a reinforcement learning agent can include states based on one or more measurements of a controlled variable relative to the setpoint, and may often correspond to states based on a plurality of measurements of a controlled variable relative to a setpoint. Examples of possible states can include states related to the size of error relative to the setpoint; states related to convergence or divergence from the setpoint; states related to first order oscillation around a setpoint, and states related to longer time scale oscillation around a setpoint.

[Image: media_image2.png, greyscale]

[0039] Based on these state definitions, a plurality of rewards can be developed that correspond to the reward for ending an evaluation period in a given state and/or a plurality of states. …, the rewards can be based on a combination of the error standard deviation and values for one or both of the oscillation states. An example of a reward strategy can be to assign a reward of +100 when the system is in a target or desired state. The target or desired state can correspond to, for example, having a small error standard deviation and an amount of oscillation (either first order or second order) that is below pre-defined threshold(s). Any convenient reward strategy can be used. States where unstable oscillations are occurring can be assigned a reward of −100. Other states can be assigned any convenient value.
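
The reward strategy quoted from paragraph [0039] can be sketched as a simple lookup. The sketch is not part of Badgwell. The +100 and -100 values come from the quoted text; the threshold values, the default reward of zero for other states, and the function and parameter names are assumptions made only for illustration.

```python
def reward_for_state(error_std, oscillation, unstable,
                     std_threshold=0.5, oscillation_threshold=0.2,
                     default_reward=0.0):
    """Assign a reward to the state reached at the end of an evaluation period."""
    if unstable:
        # States where unstable oscillations are occurring.
        return -100.0
    if error_std < std_threshold and oscillation < oscillation_threshold:
        # Target state: small error standard deviation and little oscillation.
        return 100.0
    # Other states may be assigned any convenient value (assumed 0.0 here).
    return default_reward
```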

Regarding claim 3, Badgwell further teaches:
a reinforcement learning module that provides the output from the machine learning module to the PID controller (Fig. 3 [0049] – [0051] the reinforcement learning agent provides a modified set of tuning parameters for the PID control module);
a loss or reward calculator having inputs of controlled variables, set-points for controlled variables, and control action, and having an output to the reinforcement learning module (Figs. 2-3 [0046] [0051] [0052] [0057] the control variables, manipulated variables – control action, and set-points are provided to the learning agent and modules to determine the state; the reward is calculated based on the state);
a differentiator having an input of a control action from the PID controller and having an output to the reinforcement learning module (Fig. 3 [0051] [0037] [0039] a reward is determined; the reward is based on a process controlled variable state related to the size of error relative to the setpoint - differentiator); and
a process states augmentation and differentiation module having inputs of process states, controlled variables, and set-points for controlled variables, and having an output to the reinforcement learning module (Figs. 2-3 [0037] [0039] [0046] the control variables, manipulated variables and set points are inputted and augmented; states related to first order oscillation around a setpoint, and states related to longer time scale oscillation around a setpoint are created - differentiation).

Regarding claim 4, Badgwell further teaches an output of the reinforcement learning module provides auto tuning for the PID controller that includes integral action that ensures a steady state offset free tracking of set-points by relevant process variables (Fig. 3 [0049] – [0051] the reinforcement learning agent provides a modified set of tuning parameters for the PID control module).

Regarding claim 5, Badgwell further teaches using process state and control time difference and augmenting differentiated states with control errors before use as data for the reinforcement learning module, are effected ([0037] states related to the size of error relative to the setpoint; states related to convergence or divergence from the setpoint; states related to first order oscillation around a setpoint, and states related to longer time scale oscillation around a setpoint).

Regarding claim 6, Badgwell further teaches a process state is defined as controlled process variables values and time differences thus producing a standard PID linear controller having proportional, integral and derivative actions for each process variable including optimized loop interactions ([0037] states related to the size of error relative to the setpoint; states related to convergence or divergence from the setpoint; states related to first order oscillation around a setpoint, and states related to longer time scale oscillation around a setpoint).

Regarding claim 9, Badgwell teaches the claimed system. Therefore, Badgwell also teaches the method steps for implementing the system.

Regarding claim 10, Badgwell further teaches adding loss or reward components based on a controlled variables time difference or control action time difference ([0037] [0057] states related to first order oscillation around a setpoint, and states related to longer time scale oscillation around a setpoint).

Regarding claims 11-12, Badgwell teaches the claimed system. Therefore, Badgwell also teaches the method steps for implementing the system.
Regarding claim 12, Badgwell further teaches to produce control laws that converge to an optimal control law that is sent to the PID controller to optimize performance of the PID controller ([0051] incremental change in one or more of the tuning parameters – converge to optimal).

Regarding claim 13, Badgwell further teaches reinforcement learning produces integral action when it is configured to optimize a controlled process input time difference; and control action supplied to reinforced learning is a time difference of the control action applied to the controlled process module (Figs. 2-3 [0037] [0046] [0051] [0052] [0057] state related to oscillation of manipulated variables – controlled process input time difference are provided to the learning agent and modules to determine the state, reward is calculated based on the state).

Regarding claims 14-20, Badgwell teaches the claimed PID controller system. Badgwell therefore teaches the auto tuner system for implementing the PID controller system.

Regarding claim 19, Badgwell further teaches the control error is integrated at the PID controller ([0029] – [0032]).

Regarding claim 20, Badgwell further teaches the PID controller and the machine learning module are deployed on the same microcontroller hardware ([0052] learning modules co-located with PID module).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Badgwell as applied to claims 1, 3-6 and 9-20 above, in view of Chad US 5159547 A.

Regarding claim 2, Badgwell does not explicitly teach additional loss or reward components are based on a controlled variables absolute time difference or control action absolute time difference.
Chad teaches additional loss or reward components are based on a controlled variables absolute time difference or control action absolute time difference (column 2 lines 19-23 and lines 45-53, column 4 lines 25-41, the damping ratio of response parameters).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Badgwell to incorporate the teachings of Chad because both are directed to tuning a PID controller. Additional loss or reward components based on a controlled variables absolute time difference or control action absolute time difference would help keep the controlled variable close to the setpoint.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Badgwell as applied to claims 1, 3-6 and 9-20 above, in view of Roychowdhury US 20210182385 A1.

Regarding claim 7, Badgwell further teaches the auto tuning mechanism has a multi-objective loss function consisting of multiple losses (Fig. 2 [0057] objectives of error standard deviation, controlled variable oscillation metric, and manipulated variable oscillation metric - multi-objective loss function) and produces an optimal control law at any time without a restart of an auto tuner needed (Fig. 3 [0049] – [0051] reinforcement learning agent provides a modified set of tuning parameters for the PID control module).
Badgwell does not explicitly teach the multi-objective loss function with a pre-specified non-negative weighting factor where a Q-function has weighting factors as parameters and produces an optimal control law for any values of the weighting factors.
Roychowdhury teaches the multi-objective loss function with a pre-specified non-negative weighting factor where a Q-function has weighting factors as parameters and produces an optimal control law for any values of the weighting factors (Fig. Abstract [0095] [0096] [0063] ACRE automated tuning of a controller based on reinforcement Q-learning, cost function with optimized weight matrices).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Badgwell to incorporate the teachings of Roychowdhury because both are directed to tuning a controller. Using a pre-specified non-negative weighting factor where a Q-function has weighting factors as parameters would help augment the multiple losses.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Badgwell in view of Roychowdhury as applied to claim 7 above, further in view of Hoffmann US 20170261949 A1.

Regarding claim 8, Badgwell further teaches sufficient statistics relevant for an action state value function estimation being collected in a controller and transferring to a computer that provides reinforcement learning results in sending an updated control law back to the controller ([0039] error standard deviation).
Neither Badgwell nor Roychowdhury explicitly teaches the controller is an edge controller.
Hoffmann teaches the controller is an edge controller (Fig. 1 [0078] [0085] [0113] the performance data collected in edge controller sent to machine learning in server).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Badgwell to incorporate the teachings of Hoffmann because they are all directed to tuning a control device. Using an edge controller would help provide state data to a machine learning module in the cloud.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
YASUI US 20210055712 A1 teaches PID parameters adjusted through reinforcement Q-learning.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Tang whose telephone number is (571)272-7437.  The examiner can normally be reached on M-F 7:30-4 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Thomas Lee, can be reached on (571)272-3667.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/M.T./           Examiner, Art Unit 2115



/THOMAS C LEE/           Supervisory Patent Examiner, Art Unit 2115