DETAILED ACTION
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim(s) 1-13 are pending.
Claim(s) 1-13 are rejected.
Priority
Foreign priority:
	Acknowledgment is made of applicant’s claim for foreign priority to application no. JP2019-213293, filed on 11/26/2019. The certified copy has been received.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 01/13/2021, 07/02/2021, 02/18/2022, and 06/14/2022 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Drawings
The drawings filed on 11/04/2020 are found to be acceptable for examination purposes.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
Claim 13:
	Claim 13 does not fall within at least one of the four categories of patent-eligible subject matter, because the subject matter of claim 13 is directed to a transitory form of computer-readable storage medium.
	Non-limiting examples of claims that are not directed to any of the statutory categories include:
Transitory forms of signal transmission (often referred to as "signals per se"), such as a propagating electrical or electromagnetic signal or carrier wave;
	Claim 13 recites “a storage medium having recorded thereon a program for causing a computer to function,” such that the subject matter of claim 13 is directed to a transitory form of computer-readable storage medium. A transitory, propagating signal does not fall within any statutory category. Mentor Graphics Corp. v. EVE-USA, Inc., 851 F.3d 1275, 1294, 112 USPQ2d 1120, 1133 (Fed. Cir. 2017); Nuijten, 500 F.3d at 1356-1357, 84 USPQ2d at 1501-03.
	Therefore, claim 13 does not fall within at least one of the four categories of patent-eligible subject matter.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f):
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites function without reciting sufficient structure, material, or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f), because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are:
“first acquisition unit” in claims 1 and 9-11.
“first learning processing unit” in claims 1 and 11.
“learning processing of a first model” in claim 1.
“first supply unit” in claim 2.
“first recommended control parameter acquisition unit” in claim 2.
“first control unit” in claim 2.
“second acquisition unit” in claim 6.
“second learning processing unit” in claim 6.
“learning processing of a second model” in claim 6.
“second supply unit” in claim 7.
“second recommended control parameter acquisition unit” in claim 7.
“second control unit” in claim 7.

The claim limitations described above use generic placeholders for performing the claimed function, and the generic placeholders are modified by functional language as discussed below:
the generic placeholder “first acquisition unit” is modified by the functional language “for acquiring” and “configured to acquire.”
the generic placeholder “first learning processing unit” is modified by the functional language “for executing,” “configured to set the reward value,” and “configured to increase or decrease the reward value.”
the generic placeholder “learning processing of a first model” is modified by the functional language “configured to output.”
the generic placeholder “first supply unit” is modified by the functional language “for supplying.”
the generic placeholder “first recommended control parameter acquisition unit” is modified by the functional language “for acquiring.”
the generic placeholder “first control unit” is modified by the functional language “for controlling.”
the generic placeholder “second acquisition unit” is modified by the functional language “for acquiring.”
the generic placeholder “second learning processing unit” is modified by the functional language “for executing.”
the generic placeholder “learning processing of a second model” is modified by the functional language “configured to output.”
the generic placeholder “second supply unit” is modified by the functional language “for supplying.”
the generic placeholder “second recommended control parameter acquisition unit” is modified by the functional language “for acquiring.”
the generic placeholder “second control unit” is modified by the functional language “for controlling.”

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f), it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
“first acquisition unit” is interpreted to cover the corresponding structure described in the specification at ¶31-¶32: “The apparatus 4 is configured to perform learning for each device to be controlled 20 (T). The apparatus 4 may be one or plurality of computers and may be composed of a PC or the like. The apparatus 4 has a measurement data acquisition unit 40,”…“The measurement data acquisition unit 40 is one example of a first acquisition unit.”
“first learning processing unit” is interpreted to cover the corresponding structure described in the specification at ¶31 and ¶37: “The apparatus 4 is configured to perform learning for each device to be controlled 20 (T). The apparatus 4 may be one or plurality of computers and may be composed of a PC or the like. The apparatus 4 has” “a learning processing unit 44,”…“The learning processing unit 44 is one example of a first learning processing unit.”
“first supply unit” is interpreted to cover the corresponding structure described in the specification at ¶31 and ¶39: “The apparatus 4 is configured to perform learning for each device to be controlled 20 (T). The apparatus 4 may be one or plurality of computers and may be composed of a PC or the like. The apparatus 4 has” “a supply unit 46,” “a learning processing unit 44,”…“The supply unit 46 is one example of a first supply unit.”
“first recommended control parameter acquisition unit” is interpreted to cover the corresponding structure described in the specification at ¶31 and ¶40: “The apparatus 4 is configured to perform learning for each device to be controlled 20 (T). The apparatus 4 may be one or plurality of computers and may be composed of a PC or the like. The apparatus 4 has” “a recommended control parameter acquisition unit 47, and a control unit 49.”…“The recommended control parameter acquisition unit 47 is one example of a first recommended control parameter acquisition unit.”
“first control unit” is interpreted to cover the corresponding structure described in the specification at ¶31 and ¶41: “The apparatus 4 is configured to perform learning for each device to be controlled 20 (T). The apparatus 4 may be one or plurality of computers and may be composed of a PC or the like. The apparatus 4 has” “a control unit 49.”…“The control unit 49 is one example of a first control unit.”
“second learning processing unit” is interpreted to cover the corresponding structure described in the specification at ¶91 and ¶92: “An apparatus 4A of the system 1A further includes a learning processing unit 44A, a model 45A, a supply unit 46A, a recommended control parameter acquisition unit 47A, and a control unit 49A.”…“The learning processing unit 44A is one example of a second learning processing unit.”
“second supply unit” is interpreted to cover the corresponding structure described in the specification at ¶91 and ¶97: “An apparatus 4A of the system 1A further includes a learning processing unit 44A, a model 45A, a supply unit 46A, a recommended control parameter acquisition unit 47A, and a control unit 49A.”…“The supply unit 46A is one example of a second supply unit.”
“second recommended control parameter acquisition unit” is interpreted to cover the corresponding structure described in the specification at ¶91 and ¶98: “An apparatus 4A of the system 1A further includes a learning processing unit 44A, a model 45A, a supply unit 46A, a recommended control parameter acquisition unit 47A, and a control unit 49A.”…“The recommended control parameter acquisition unit 47A is one example of a second recommended control parameter acquisition unit.”
“second control unit” is interpreted to cover the corresponding structure described in the specification at ¶91 and ¶99: “An apparatus 4A of the system 1A further includes a learning processing unit 44A, a model 45A, a supply unit 46A, a recommended control parameter acquisition unit 47A, and a control unit 49A.”…“The control unit 49A is one example of a second control unit.”

Written description fails to disclose the corresponding structure:
The following claim limitations invoke 35 U.S.C. 112(f):
“learning processing of a first model” in claim 1.
“second acquisition unit” “for acquiring” in claim 6.
“learning processing of a second model” “configured to output” in claim 6.
However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. 
Therefore, claims 1 and 6 are indefinite and are rejected under 35 U.S.C. 112. Please see the 35 U.S.C. 112 section below.

If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f).

Claim 13 is not interpreted under 35 U.S.C. 112(f), because it is rejected under 35 U.S.C. 101:
	Claim 13 is rejected under 35 U.S.C. 101, and thus the 35 U.S.C. 112(f) claim interpretation is not applied to claim 13. Please see the 35 U.S.C. 101 rejection of claim 13 above.

Claim Rejections - 35 USC § 112
35 U.S.C. 112(a)
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1-11 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, at the time the application was filed, had possession of the claimed invention.
Claim 1:
	Claim 1 recites “learning processing of a first model” “configured to output,” which invokes 35 U.S.C. 112(f).
	Since the limitation invokes 35 U.S.C. 112(f), the specification must disclose sufficient structure for the “learning processing of a first model.”
	However, there is insufficient disclosure of the corresponding structure, material, or acts for performing the entire claimed function, and there is no clear linkage between the structure, material, or acts and the function. There is no structure in the disclosure for the “learning processing of a first model.” The disclosure is devoid of any structure of the “learning processing of a first model” that performs the function “configured to output a recommended control parameter indicating the first type of control content recommended for increasing a reward value determined by a preset reward function in response to input of the measurement data” as described in the claims.
	“Learning processing of a first model” is not a structure that performs the claimed function “configured to output a recommended control parameter indicating the first type of control content recommended for increasing a reward value determined by a preset reward function in response to input of the measurement data.”
	From the description in the specification, one of ordinary skill in the art would not understand that the “learning processing of a first model” is a structure that is capable of performing the claimed function “configured to output a recommended control parameter indicating the first type of control content recommended for increasing a reward value determined by a preset reward function in response to input of the measurement data.”
	Appropriate correction is required.

Claims 2-11:
	Based on their dependency on claim 1, claims 2-11 include the same deficiencies as claim 1; therefore, for the same reasons described above for claim 1, claims 2-11 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, at the time the application was filed, had possession of the claimed invention.



Claim 6:
	Claim 6 recites the following limitations that invoke 35 U.S.C. 112(f):
“second acquisition unit” “for acquiring.”
Since the limitation invokes 35 U.S.C. 112(f), the specification must disclose sufficient structure for the “second acquisition unit.”
However, there is insufficient disclosure of the corresponding structure, material, or acts for performing the entire claimed function, and there is no clear linkage between the structure, material, or acts and the function. There is no structure in the disclosure for the “second acquisition unit.”
The disclosure is devoid of any structure of the “second acquisition unit” that performs the function “for acquiring measurement data measured by a sensor,” as described in the claim.
Applicant’s specification, at ¶32 and ¶92, states, “The measurement data acquisition unit 40 is one example of a first acquisition unit”… “in the present variation, the measurement data acquisition unit 40 is also one example of a second acquisition unit and is configured to acquire measurement data included in the learning data used for the learning processing of the model 45A,” such that the claimed “first acquisition unit” and “second acquisition unit” are both defined in the specification as the measurement data acquisition unit 40. Two separate, distinguishable structures for the “first acquisition unit” and the “second acquisition unit” are not defined.
From the description in the specification, one of ordinary skill in the art would not understand that “the measurement data acquisition unit 40” is also a structure for the “second acquisition unit.” Rather, one of ordinary skill in the art would understand that “the measurement data acquisition unit 40” is the structure for the “first acquisition unit,” and since the “second acquisition unit” is different from the “first acquisition unit,” the measurement data acquisition unit 40 cannot also be the structure for the “second acquisition unit.”
There is no separate structure defined in the specification for the “second acquisition unit.”
“learning processing of a second model” “configured to output.”
Since the limitation invokes 35 U.S.C. 112(f), the specification must disclose sufficient structure for the “learning processing of a second model.”
However, there is insufficient disclosure of the corresponding structure, material, or acts for performing the entire claimed function, and there is no clear linkage between the structure, material, or acts and the function. There is no structure in the disclosure for the “learning processing of a second model.” The disclosure is devoid of any structure of the “learning processing of a second model” that performs the function “configured to output a recommended control parameter indicating the second type of control content recommended for increasing the reward value in response to input of the measurement data,” as described in the claim.
“Learning processing of a second model” is not a structure that performs the claimed function “configured to output a recommended control parameter indicating the second type of control content recommended for increasing the reward value in response to input of the measurement data.”
From the description in the specification, one of ordinary skill in the art would not understand that the “learning processing of a second model” is a structure that is capable of performing the claimed function “configured to output a recommended control parameter indicating the second type of control content recommended for increasing the reward value in response to input of the measurement data.”
	Appropriate correction is required.

35 U.S.C. 112(b)

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 1-11 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Written description fails to disclose corresponding structure:
The following claim limitations invoke 35 U.S.C. 112(f):
“learning processing of a first model” in claim 1.
“second acquisition unit” “for acquiring” in claim 6.
“learning processing of a second model” “configured to output” in claim 6.
However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. 

Claim 1:
	Claim 1 recites “learning processing of a first model” “configured to output,” which invokes 35 U.S.C. 112(f).
	Since the limitation invokes 35 U.S.C. 112(f), the specification must disclose sufficient structure for the “learning processing of a first model.”
	However, there is insufficient disclosure of the corresponding structure, material, or acts for performing the entire claimed function, and there is no clear linkage between the structure, material, or acts and the function. There is no structure in the disclosure for the “learning processing of a first model.” The disclosure is devoid of any structure of the “learning processing of a first model” that performs the function “configured to output a recommended control parameter indicating the first type of control content recommended for increasing a reward value determined by a preset reward function in response to input of the measurement data” as described in the claims.
	It is therefore not clear what structure of the “learning processing of a first model” performs the function “configured to output a recommended control parameter indicating the first type of control content recommended for increasing a reward value determined by a preset reward function in response to input of the measurement data.”
	Appropriate correction is required.

Claim 6:
	Claim 6 recites “second acquisition unit” “for acquiring,” which invokes 35 U.S.C. 112(f).
	Since the limitation invokes 35 U.S.C. 112(f), the specification must disclose sufficient structure for the “second acquisition unit.”
	However, there is insufficient disclosure of the corresponding structure, material, or acts for performing the entire claimed function and there is no clear linkage between the structure, material, or acts and the function. There is no structure in the disclosure for “second acquisition unit”. 
	The disclosure is devoid of any structure of “second acquisition unit” that performs the function “for acquiring measurement data measured by a sensor,” as described in the claim. 
	Applicant’s specification, at ¶32 and ¶92, states, “The measurement data acquisition unit 40 is one example of a first acquisition unit”… “in the present variation, the measurement data acquisition unit 40 is also one example of a second acquisition unit and is configured to acquire measurement data included in the learning data used for the learning processing of the model 45A,” such that the claimed “first acquisition unit” and “second acquisition unit” are both defined in the specification as the measurement data acquisition unit 40.
	Two separate, distinguishable structures for the “first acquisition unit” and the “second acquisition unit” are not defined; therefore, it is not clear how the two different units, the “first acquisition unit” and the “second acquisition unit,” can both be described as the same structure, “the measurement data acquisition unit 40.”
	For examination purposes, the structure of the “first acquisition unit” is construed as “the measurement data acquisition unit 40,” and, because no structure is defined for the “second acquisition unit,” the second acquisition unit is construed as any unit that acquires measurement/detected data.
	Appropriate correction is required.

Claim 6:
	Claim 6 recites “learning processing of a second model” “configured to output,” which invokes 35 U.S.C. 112(f).
	Since the limitation invokes 35 U.S.C. 112(f), the specification must disclose sufficient structure for the “learning processing of a second model.”
	However, there is insufficient disclosure of the corresponding structure, material, or acts for performing the entire claimed function and there is no clear linkage between the structure, material, or acts and the function. There is no structure in the disclosure for “learning processing of a second model”. The disclosure is devoid of any structure of “learning processing of a second model” that performs the function “configured to output a recommended control parameter indicating the second type of control content recommended for increasing the reward value in response to input of the measurement data,” as described in the claim.
	“Learning processing of a second model” is not a structure to perform claimed function “configured to output a recommended control parameter indicating the second type of control content recommended for increasing the reward value in response to input of the measurement data.”
	Therefore, it is not clear what the structure for the “learning processing of a second model” is.
	Appropriate correction is required.

Therefore, the claim is indefinite and is rejected under 35 U.S.C. 112(b).
Applicant may:
(a)	Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph;
(b)	Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
(c)	Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either:
(a)	Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or
(b)	Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.




Insufficient antecedent basis or unclear limitations:
Claims 1, 3-5, 8:
	Claim 1 recites “at least one device to be controlled,” and claims 3-5 and 8 recite “each device to be controlled”.
	There is insufficient antecedent basis for these limitations in the claims.
	For examination purposes, the limitation in claim 1 is construed as “at least one device of a plurality of devices to be controlled.”
	For examination purposes, the limitations in claims 3-5 and 8 are construed as “each device of the plurality of devices to be controlled.”
	Appropriate correction is required.

Claim 6:
	Claim 6 recites:
	“a second acquisition unit for acquiring measurement data measured by a sensor, and
	a second learning processing unit for executing, 
	by using learning data including the measurement data acquired by the second acquisition unit and a control parameter indicating a second type of control content of the at least one device to be controlled, 
	a learning processing of a second model configured to output a recommended control parameter indicating the second type of control content recommended for increasing the reward value in response to input of the measurement data,”

	There is insufficient antecedent basis for the limitations “measurement data,” “a sensor,” “learning data,” “a control parameter,” “a learning processing,” and “a recommended control parameter” in the claim.
	For examination purposes, the limitations are construed as:
	“a second acquisition unit for acquiring the measurement data measured by [[a]] the sensor, and
	a second learning processing unit for executing, 
	by using the learning data including the measurement data acquired by the second acquisition unit and a second control parameter indicating a second type of control content of the at least one device to be controlled, 
	a second learning processing of a second model configured to output a second recommended control parameter indicating the second type of control content recommended for increasing the reward value in response to input of the measurement data,”
	Appropriate correction is required.

Claim 7:
	Claim 7 recites “a second recommended control parameter acquisition unit for acquiring the recommended control parameter” in lines 4-6 and “the recommended control parameter acquired by the second recommended control parameter acquisition unit” in lines 7-9.
	There is insufficient antecedent basis for the limitation “the recommended control parameter” in the claim.
	For examination purposes, the limitations are construed as “a second recommended control parameter acquisition unit for acquiring the second recommended control parameter” in lines 4-6 and “the second recommended control parameter acquired by the second recommended control parameter acquisition unit” in lines 7-9.
	Appropriate correction is required.

Claim 11:
	Claim 11 recites “wherein the first acquisition unit is configured to acquire each of a first group of measurement data including at least one type of measurement data and a second group of measurement data including at least one type of measurement data,” in lines 2-4.
	There is insufficient antecedent basis for the limitations “measurement data” and “one type of measurement data” in the claim.
	For examination purposes, the limitations are construed as “wherein the first acquisition unit is configured to acquire each of a first group of the measurement data including at least one type of the measurement data and a second group of the measurement data including the at least one type of the measurement data.”
	Appropriate correction is required.


Claims 2-11:
	Based on their dependency on claim 1, claims 2-11 include the same deficiencies as claim 1; therefore, for the same reasons described above for claim 1, claims 2-11 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1-5, 8-9, 12, and 13 is/are rejected under 35 U.S.C. 102(a)(1)/102(a)(2)  as being anticipated by BADGWELL et al. (US20190187631A1) [hereinafter BADGWELL].
Claim 1:
	Regarding claim 1, BADGWELL discloses, “An apparatus comprising: a first acquisition unit for acquiring measurement data measured by a sensor,” [See the apparatus; see the system acquires measured data from the detectors: “During operation, process controller 360 can receive controller input from a detector (or from a plurality of detectors) 371, 372, or 373,” (¶50)… “the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” (¶51)];
	“a first learning processing unit for executing,”  [See the learning processing unit 350: “the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform based on a stored state-action value function 367.” (¶51)];
	“by using learning data including the measurement data acquired by the first acquisition unit and a control parameter indicating a first type of control content of at least one device of a plurality of devices to be controlled,” [See learning agent uses the acquired measured data (e.g.; from detectors) and type of control content (e.g.; target control command such as selected action to perform based on a stored state-action value function 367) of at least one device that is being controlled using the control content/command/action: “In FIG. 3, the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” “the controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled variable” “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters.” (¶51)];
	“a learning processing of a first model configured to output a recommended control parameter indicating the first type of control content recommended for increasing a reward value determined by a preset reward function in response to input of the measurement data.” [See in the learning processing; a model outputs recommended control parameter including a type of control content (e.g.; model that outputs desired control command to control device in order to achieve reward after evaluation of the selected state) that is used to increase a determined reward value (e.g.; selected action is used to increase reward by 100) based on the input measured data (e.g.; based on the inputs from the detectors): “During operation, process controller 360 can receive controller input from a detector (or from a plurality of detectors) 371, 372, or 373,” (¶50)… “the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” “the controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled variable and/or the value of the controlled variable over a period of time.
Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified,” “The modified set of tuning parameters for the proportional, integral, (and optional derivative) terms can then be used by proportional-integral control module 366 for determining the controller output signal to actuator 381 and/or electrical activator 382.” “after one or more additional evaluations of the state by state analysis module 369, a reward can be determined by reward module 368 that corresponds to the combination of state and action that was selected.” (¶51)… “Based on these state definitions, a plurality of rewards can be developed that correspond to the reward for ending an evaluation period in a given state and/or a plurality of states.” “the rewards can be based on a combination of the error standard deviation and values for one or both of the oscillation states.” “a reward strategy can be to assign a reward of +100 when the system is in a target or desired state.” (¶39)… “when the error was small and the process was not oscillating then the process was in state 0, the best possible state, and a reward of 100 was returned.” (¶57)].
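For illustration only, the learning loop described in the cited passages (¶39, ¶51), in which a state is determined from detector input, an action is selected from a stored state-action value function, a tuning parameter is changed incrementally, and a reward is assigned, can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from BADGWELL: tabular one-step Q-learning is substituted here as one common way to maintain a state-action value function, and every identifier, the two-state classification, the 0.5 error threshold, the gain increments, and the -1 reward for non-target states are hypothetical. Only the +100 reward for the target state mirrors the example quoted from ¶39.

```python
import random

# Hypothetical names throughout; none of these identifiers come from BADGWELL.
# Each action is an assumed incremental change applied to one tuning parameter.
ACTIONS = (-0.1, 0.0, 0.1)


def classify_state(errors, small=0.5):
    """Map recent controller errors to a coarse state (assumed scheme).

    Returns 0 when the mean absolute error is below `small`, otherwise 1.
    """
    mean_abs = sum(abs(e) for e in errors) / len(errors)
    return 0 if mean_abs < small else 1


def reward_for(state):
    """+100 in the target state (as in the ¶39 example); -1 otherwise (assumption)."""
    return 100.0 if state == 0 else -1.0


def select_action(q, state, epsilon, rng):
    """Epsilon-greedy choice of an action index from the state-action value table."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q[state]
    return row.index(max(row))


def update_q(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One-step Q-learning update of the stored state-action values."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# Usage: two states, three actions, all values initially zero.
q_table = [[0.0] * len(ACTIONS) for _ in range(2)]
next_state = classify_state([0.1, -0.2, 0.3])      # small error, so state 0
update_q(q_table, 1, 2, reward_for(next_state), next_state)
best = select_action(q_table, 1, 0.0, random.Random(0))
```

With an empty table, the single update above raises the entry for action index 2 in state 1 to 50.0 under the assumed learning rate of 0.5, after which greedy selection in state 1 returns index 2.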

Claim 2:
	Regarding claim 2, BADGWELL discloses all the elements of claim 1.
	BADGWELL further discloses, “a first supply unit for supplying the measurement data acquired by the first acquisition unit to the first model,” [See the acquired measured/detected data are supplied to the model for further evaluation (e.g.; detected data are input to the model and then the model outputs recommended/selected action after evaluation of the state): “the controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled variable and/or the value of the controlled variable over a period of time. Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform based on a stored state-action value function 367.” “Based on the selected action, the tuning parameters in control module 366 can be modified,” (¶51)];
	“a first recommended control parameter acquisition unit for acquiring the recommended control parameter outputted from the first model in response to the supply of the measurement data to the first model,” [See in response to the supplied measured/detected data, the recommended control parameter is outputted from the model (e.g.; model outputs desired control command to control device in order to achieve reward after evaluation of the selected state); the outputted recommended control parameter is acquired by the controller to perform control: “controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled” “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters. The modified set of tuning parameters for the proportional, integral, (and optional derivative) terms can then be used by proportional-integral control module 366 for determining the controller output signal to actuator 381 and/or electrical activator 382” (¶51)];
	“a first control unit for controlling the at least one device to be controlled by using the recommended control parameter acquired by the first recommended control parameter acquisition unit.” [See the controller uses the acquired control command to control the at least one device (e.g.; control the actuator): “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters. The modified set of tuning parameters for the proportional, integral, (and optional derivative) terms can then be used by proportional-integral control module 366 for determining the controller output signal to actuator 381 and/or electrical activator 382.” (¶51)].

Claim 3:
	Regarding claim 3, BADGWELL discloses all the elements of claim 1.
	BADGWELL further discloses, “wherein each device of the plurality of devices to be controlled is controlled by any feedback control of P control, PI control, PD control, and PID control,” [Examiner notes that claim requires only one of any feedback control of P control, PI control, PD control, and PID control.
	See the device is controlled by feedback control of PID control (i.e.; proportional, integral, or derivative control): “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters. The modified set of tuning parameters for the proportional, integral, (and optional derivative) terms can then be used by proportional-integral control module 366 for determining the controller output signal to actuator 381 and/or electrical activator 382.” (¶51)… “the controller tuning parameters can include at least one of a proportional tuning parameter and a gain parameter, at least one of an integral tuning parameter and an integral time parameter, and optionally at least one of a derivative tuning parameter and a derivative time parameter.” (¶9)].
	“wherein the first type of control content is a target value of the feedback control.” [See the type of control content in the recommended control parameter is a target value (e.g.; control content to achieve desired state): “the proportional-integral control module 366 can also receive changes to the setpoint for a controlled variable, such as from a setpoint modification module 390. Setpoint modification module 390 can also provide setpoint changes to state analysis module 369. Alternatively, changes to the setpoint can be provided from various other types of components, such as an input provided by a process controller associated with another controlled variable.” (¶52)… “assign a reward of +100 when the system is in a target or desired state. The target or desired state can correspond to, for example, having a small error standard deviation and an amount of oscillation (either first order or second order) that is below pre-defined threshold(s).” (¶39)].


Claim 4:
	Regarding claim 4, BADGWELL discloses all the elements of claim 1.
	BADGWELL further discloses, “wherein each device of the plurality of devices to be controlled is controlled by any feedback control of PI control, PD control, and PID control,” [Examiner notes that the claim requires only one of PI control, PD control, and PID control.
	See the device is controlled by feedback control of PID control (i.e.; proportional, integral, or derivative control): “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters. The modified set of tuning parameters for the proportional, integral, (and optional derivative) terms can then be used by proportional-integral control module 366 for determining the controller output signal to actuator 381 and/or electrical activator 382.” (¶51)… “the controller tuning parameters can include at least one of a proportional tuning parameter and a gain parameter, at least one of an integral tuning parameter and an integral time parameter, and optionally at least one of a derivative tuning parameter and a derivative time parameter.” (¶9)].
	“wherein the first type of control content is a piece of identification information of a gain set used for the feedback control among pieces of identification information preassociated with each gain set including a value of a proportional gain and at least one of a value of an integral gain or a value of a derivative gain of the feedback control.” [Examiner notes that control content is an identification information of a gain set where the gain set is any one of 1. A proportional gain and at least one of a value of an integral gain, or 2. a value of a derivative gain of the feedback control.
	See the type of control content is identification information of a gain set including a value of a derivative gain (e.g.; identification of the gain set, such that the corresponding gain is identified based on the selected action, where the gain is a value of a derivative gain): “Based on determining a current state, the agent can select an action to perform.” “because the agent is tuning a PID controller, the actions selected by the agent can correspond to various types of changes in the tuning parameters for the PID controller. Examples of possible actions can include” “changing the magnitude of the controller integral time parameter and/or the integral tuning parameter; or changing the magnitude of the controller derivative time parameter and/or the derivative tuning parameter.” (¶40)].

Claim 5:
	Regarding claim 5, BADGWELL discloses all the elements of claim 1.
	BADGWELL further discloses, “wherein each device of the plurality of devices to be controlled is controlled by any feedback control of PI control, PD control, and PID control,” [Examiner notes that the claim requires only one of PI control, PD control, and PID control.
	See the device is controlled by feedback control of PID control (i.e.; proportional, integral, or derivative control): “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters. The modified set of tuning parameters for the proportional, integral, (and optional derivative) terms can then be used by proportional-integral control module 366 for determining the controller output signal to actuator 381 and/or electrical activator 382.” (¶51)… “the controller tuning parameters can include at least one of a proportional tuning parameter and a gain parameter, at least one of an integral tuning parameter and an integral time parameter, and optionally at least one of a derivative tuning parameter and a derivative time parameter.” (¶9)];
	“wherein the first type of control content is at least one of a value of a proportional gain, a value of an integral gain, or a value of a derivative gain of the feedback control.” [Examiner notes that control content is any one of 1. a value of a proportional gain, 2. a value of an integral gain, or 3. a value of a derivative gain of the feedback control.
	See the type of control content is a value of any one of 1. a proportional gain, 2. an integral gain, or 3. a derivative gain of the feedback control: “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters. The modified set of tuning parameters for the proportional, integral, (and optional derivative) terms can then be used by proportional-integral control module 366 for determining the controller output signal to actuator 381 and/or electrical activator 382.” (¶51)… “the controller tuning parameters can include at least one of a proportional tuning parameter and a gain parameter, at least one of an integral tuning parameter and an integral time parameter, and optionally at least one of a derivative tuning parameter and a derivative time parameter.” (¶9)].
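For illustration only, the role of the proportional, integral, and derivative gain values discussed for claims 3-5 can be sketched with a textbook discrete-time PID law. This is a minimal sketch, not BADGWELL's implementation and not the applicant's: the class name `PID`, its fields, and the rectangular integration and backward-difference derivative are assumptions chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (hypothetical illustration)."""
    kp: float                 # proportional gain
    ki: float = 0.0           # integral gain; 0 disables the integral term
    kd: float = 0.0           # derivative gain; 0 disables the derivative term
    integral: float = 0.0     # running integral of the error
    prev_error: float = 0.0   # error from the previous step

    def step(self, setpoint, measurement, dt):
        """Return the controller output for one sampling interval of length dt."""
        error = setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Usage: full PID versus P-only control (ki and kd left at zero).
pid_output = PID(kp=2.0, ki=1.0, kd=0.5).step(setpoint=1.0, measurement=0.0, dt=1.0)
p_only_output = PID(kp=2.0).step(setpoint=1.0, measurement=0.5, dt=0.1)
```

Setting `ki` or `kd` to zero reduces the same law to PD, PI, or P control, so a change to any one of the three gain values changes the controller output for the same setpoint and measurement.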

Claim 8:
	Regarding claim 8, BADGWELL discloses all the elements of claim 1.
	BADGWELL further discloses, “the first type of control content is an output value of each device of the plurality of devices to be controlled.” [See the type of control content is output value of device (e.g.; determining recommended control parameter using the feedback values outputted from the devices such as the measured feedback data from detectors): “In FIG. 3, the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” “the controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled variable” “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters.” (¶51)].



Claim 9:
	Regarding claim 9, BADGWELL discloses all the elements of claim 1.
	BADGWELL further discloses, “the first acquisition unit is configured to acquire the measurement data indicating a physical quantity that may act as a disturbance to the at least one device to be controlled.” [See the disturbance data of the devices are acquired (e.g.; error related states such as size of error relative to the setpoint; states related to convergence or divergence from the setpoint; states related to first order oscillation around a setpoint, and states related to longer time scale oscillation around a setpoint): “the controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled variable” “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters.” (¶51)… “a plurality of rewards can be developed that correspond to the reward for ending an evaluation period in a given state and/or a plurality of states. In the low dimensional space example described above, the rewards can be based on a combination of the error standard deviation and values for one or both of the oscillation states.” (¶39)… “Examples of possible states can include states related to the size of error relative to the setpoint; states related to convergence or divergence from the setpoint; states related to first order oscillation around a setpoint, and states related to longer time scale oscillation around a setpoint.” (¶37)].

Claim 12:
	Regarding claim 12, BADGWELL discloses, “A method comprising:
a first acquisition step of acquiring measurement data measured by a sensor,” [See the method; see the system acquires measured data from the detectors: “During operation, process controller 360 can receive controller input from a detector (or from a plurality of detectors) 371, 372, or 373,” (¶50)… “the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” (¶51)];
	“a first learning processing step of executing,”  [See the learning processing unit 350: “the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform based on a stored state-action value function 367.” (¶51)];
	“by using learning data including the measurement data acquired by the first acquisition step and a control parameter indicating a first type of control content of at least one device to be controlled” [See learning agent uses the acquired measured data (e.g.; from detectors) and type of control content (e.g.; target control command such as selected action to perform based on a stored state-action value function 367) of at least one device that is being controlled using the control content/command/action: “In FIG. 3, the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” “the controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled variable” “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters.” (¶51)];
	“a learning processing of a first model configured to output a recommended control parameter indicating the first type of control content recommended for increasing a reward value determined by a preset reward function in response to input of the measurement data.” [See in the learning processing; a model outputs recommended control parameter including a type of control content (e.g.; model that outputs desired control command to control device in order to achieve reward after evaluation of the selected state) that is used to increase a determined reward value (e.g.; selected action is used to increase reward by 100) based on the input measured data (e.g.; based on the inputs from the detectors): “During operation, process controller 360 can receive controller input from a detector (or from a plurality of detectors) 371, 372, or 373,” (¶50)… “the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” “the controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled variable and/or the value of the controlled variable over a period of time.
Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified,” “The modified set of tuning parameters for the proportional, integral, (and optional derivative) terms can then be used by proportional-integral control module 366 for determining the controller output signal to actuator 381 and/or electrical activator 382.” “after one or more additional evaluations of the state by state analysis module 369, a reward can be determined by reward module 368 that corresponds to the combination of state and action that was selected.” (¶51)… “Based on these state definitions, a plurality of rewards can be developed that correspond to the reward for ending an evaluation period in a given state and/or a plurality of states.” “the rewards can be based on a combination of the error standard deviation and values for one or both of the oscillation states.” “a reward strategy can be to assign a reward of +100 when the system is in a target or desired state.” (¶39)… “when the error was small and the process was not oscillating then the process was in state 0, the best possible state, and a reward of 100 was returned.” (¶57)].


Claim 13:
	Regarding claim 13, BADGWELL discloses, “A storage medium having recorded thereon a program for causing a computer to function as:” [See storage medium having recorded thereon a program for causing a computer to function: “The learning agent and modules shown in FIG. 3 can” “be implemented as modules that run by executing computer-executable instructions that are in memory associated with a processor.” “Such computer-executable instructions can be stored using computer-readable media. Computer-readable media can be any available media that can be accessed by a processor (or other computing device) and includes both volatile and nonvolatile, removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data.” (¶53)];
	“a first acquisition unit for acquiring measurement data measured by a sensor,” [See the system acquires measured data from the detectors: “During operation, process controller 360 can receive controller input from a detector (or from a plurality of detectors) 371, 372, or 373,” (¶50)… “the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” (¶51)];
	“a first learning processing unit for executing,”  [See the learning processing unit 350: “the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform based on a stored state-action value function 367.” (¶51)];
	“by using learning data including the measurement data acquired by the first acquisition unit and a control parameter indicating a first type of control content of at least one device to be controlled,” [See learning agent uses the acquired measured data (e.g.; from detectors) and type of control content (e.g.; target control command such as selected action to perform based on a stored state-action value function 367) of at least one device that is being controlled using the control content/command/action: “In FIG. 3, the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” “the controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled variable” “Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified, such as by making an incremental change in one or more of the tuning parameters.” (¶51)];
	“a learning processing of a first model configured to output a recommended control parameter indicating the first type of control content recommended for increasing a reward value determined by a preset reward function in response to input of the measurement data.” [See in the learning processing; a model outputs recommended control parameter including a type of control content (e.g.; model that outputs desired control command to control device in order to achieve reward after evaluation of the selected state) that is used to increase a determined reward value (e.g.; selected action is used to increase reward by 100) based on the input measured data (e.g.; based on the inputs from the detectors): “During operation, process controller 360 can receive controller input from a detector (or from a plurality of detectors) 371, 372, or 373,” (¶50)… “the controller input from one or more of detectors 371, 372, or 373 can also be used by a learning agent 350 to modify the tuning parameters for the PID control module 366.” “the controller input from the one or more of detectors 371, 372, or 373 can be used by state analysis module 369 to determine one or more states that are associated with the current value of the controlled variable and/or the value of the controlled variable over a period of time.
Based on the state determined by the state analysis module 369, the learning agent 350 can select an action to perform” “Based on the selected action, the tuning parameters in control module 366 can be modified,” “The modified set of tuning parameters for the proportional, integral, (and optional derivative) terms can then be used by proportional-integral control module 366 for determining the controller output signal to actuator 381 and/or electrical activator 382.” “after one or more additional evaluations of the state by state analysis module 369, a reward can be determined by reward module 368 that corresponds to the combination of state and action that was selected.” (¶51)… “Based on these state definitions, a plurality of rewards can be developed that correspond to the reward for ending an evaluation period in a given state and/or a plurality of states.” “the rewards can be based on a combination of the error standard deviation and values for one or both of the oscillation states.” “a reward strategy can be to assign a reward of +100 when the system is in a target or desired state.” (¶39)… “when the error was small and the process was not oscillating then the process was in state 0, the best possible state, and a reward of 100 was returned.” (¶57)].

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claim(s) 6-7 and 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over BADGWELL, and further in view of KO et al. (US20200041160A1) [hereinafter KO].
Claim 6:
	Regarding claim 6, BADGWELL discloses all the elements of claims 1 and 4, but does not explicitly disclose, “a second acquisition unit for acquiring the measurement data measured by [[a]] the sensor, and a second learning processing unit for executing, by using the learning data including the measurement data acquired by the second acquisition unit and a second control parameter indicating a second type of control content of the at least one device to be controlled, a second learning processing of a second model configured to output a second recommended control parameter indicating the second type of control content recommended for increasing the reward value in response to input of the measurement data, wherein the second type of control content is a target value of the feedback control.”
	However, KO discloses, “a second acquisition unit for acquiring the measurement data measured by [[a]] the sensor,” [See the system acquires measured sensor data (e.g.; measured temperature data): “the processor 180 may acquire second temperature related data D′ through the collecting unit 340 by operating the air conditioning device 330” (¶103)];
	“a second learning processing unit for executing,” [See the second learning processing unit (e.g.; learning processing unit with learning models 310 and 320): “the energy balance learning model 310 and the control learning model 320 may be learned based on the temperature related data (second temperature related data D′) collected from the air conditioning device 330 by the operation of the air conditioning device 330.” (¶101)];
	“by using the learning data including the measurement data acquired by the second acquisition unit and a second control parameter indicating a second type of control content of the at least one device to be controlled,” [See system executes learning processing using the learning data that includes measurement data (e.g.; measured temperature) and a second control parameter indicating a second type of control content of the at least one device to be controlled (e.g.; target control value Q’): “the energy balance learning model 310 and the control learning model 320 may be learned based on the temperature related data (second temperature related data D′) collected from the air conditioning device 330 by the operation of the air conditioning device 330.” (¶101)… “the processor 180 may perform control the learning of the energy balance learning model 310 so as to acquire a second predicted temperature value T′ corresponding to the second temperature related data D′. The processor 180 may update the hyper parameters of the energy balance learning model 310 in order to optimize the second predicted temperature values T′.” (¶106)… “the processor 180 may perform control so as to learn the control learning model 320 to acquire a second control value Q based on the second predicted temperature value T′. The processor 180 may update the control parameter of the control learning model 320 to optimize a control values Q′.” (¶108)…];
	“a second learning processing of a second model configured to output a second recommended control parameter indicating the second type of control content recommended for increasing the reward value in response to input of the measurement data,” [See the learning processing outputs second recommended control parameter indicating the second type of control content (e.g.; desired control command Q), where the recommended control parameter is used for increasing the reward (e.g.; update desired control command Q such that control parameter is used to increase the reward) in response to acquired measured data (e.g.; measured temperature): “The processor 180 may perform control so as to learn the energy balance learning model 310 based on the second temperature related data D′ (S1150).” (¶105)… “the processor 180 may perform control the learning of the energy balance learning model 310 so as to acquire a second predicted temperature value T′ corresponding to the second temperature related data D′.” (¶106)… “the processor 180 may perform control so as to learn the control learning model 320 to acquire a second control value Q based on the second predicted temperature value T′. The processor 180 may update the control parameter of the control learning model 320 to optimize a control values Q′.” (¶108)… “the artificial intelligence unit 120 gives a reward based on a gap between the base line 220 and the output value, thereby acquiring one or more parameters for enabling the output value according to control of the control learning model 320 to most closely follow the base line 220.” (¶196)… “As the gap between the base line 220 and the output value is decreased, the given reward may be increased. The artificial intelligence unit 120 may acquire one or more parameters for maximizing the reward.” (¶198)… “the reward given when the first parameter is used is greater than the reward given when the second parameter is used. In this case, the artificial intelligence unit 120 may acquire the first parameter as the parameter for enabling the output value to most closely follow the base line.” (¶201)];
	“wherein the second type of control content is a target value of the feedback control.” [See second control content is a target value of feedback control (e.g.; target value T’ based on feedback control using temperature measured as feedback): “the energy balance learning model 310 and the control learning model 320 may be learned based on the temperature related data (second temperature related data D′) collected from the air conditioning device 330 by the operation of the air conditioning device 330.” (¶101)… “the processor 180 may perform control the learning of the energy balance learning model 310 so as to acquire a second predicted temperature value T′ corresponding to the second temperature related data D′. The processor 180 may update the hyper parameters of the energy balance learning model 310 in order to optimize the second predicted temperature values T′.” (¶106)… “the processor 180 may perform control so as to learn the control learning model 320 to acquire a second control value Q based on the second predicted temperature value T′.” (¶108)];

	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the capability, taught by KO, of acquiring measured data, executing learning processing using learning data including the measurement data and a second control parameter indicating a second type of control content of the at least one device to be controlled, and outputting a second recommended control parameter indicating the second type of control content recommended for increasing the reward value in response to input of the measurement data, with the apparatus taught by BADGWELL as discussed above. A person of ordinary skill in the field of optimizing control parameters via learning processing would have been motivated to make such a combination in order to change the reward so as to combine various operational goals according to their degree of importance and to acquire an optimal parameter [KO: “the reward is changed according to the position of the gap to variously combine various operational goals according to a degree of importance and to acquire an optimal parameter.” (¶243)].
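	For illustration, the reward relationship quoted from KO (¶196, ¶198, ¶201), in which the reward increases as the gap between the base line and the output value decreases and the parameter giving the larger reward is acquired, may be sketched as follows. This is a minimal sketch only. The names gap_reward, toy_response, and pick_parameter, the use of a mean absolute gap, and the toy first-order temperature response are assumptions made for the example and are not taken from KO, BADGWELL, or the claims.

```python
from typing import List, Sequence


def gap_reward(base_line: Sequence[float], output: Sequence[float]) -> float:
    """Hypothetical reward: the smaller the gap to the base line, the larger the reward."""
    gap = sum(abs(b - o) for b, o in zip(base_line, output)) / len(base_line)
    return -gap  # reward rises toward 0 as the gap shrinks


def toy_response(gain: float, start: float = 40.0,
                 set_value: float = 30.0, steps: int = 3) -> List[float]:
    """Assumed toy plant: the temperature moves toward set_value by `gain` each step."""
    temps = []
    temp = start
    for _ in range(steps):
        temp += gain * (set_value - temp)
        temps.append(temp)
    return temps


def pick_parameter(candidates: Sequence[float], base_line: Sequence[float]) -> float:
    """Acquire the candidate parameter whose output yields the maximum reward."""
    return max(
        candidates,
        key=lambda g: gap_reward(base_line, toy_response(g, steps=len(base_line))),
    )


base_line = [30.0, 30.0, 30.0]  # assumed constant base line at the set value
best_gain = pick_parameter([0.1, 0.5, 0.9], base_line)
```

	In this sketch the candidate whose output follows the base line most closely receives the greatest reward and is selected, which mirrors the comparison between a first parameter and a second parameter described in KO (¶201).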

Claim 7:
	Regarding claim 7, BADGWELL and KO disclose all the elements of claims 1, 4, and 6, but BADGWELL doesn’t explicitly disclose, “a second supply unit for supplying the measurement data acquired by the second acquisition unit to the second model, a second recommended control parameter acquisition unit for acquiring the second recommended control parameter outputted from the second model in response to the supply of the measurement data to the second model, and a second control unit for controlling the at least one device to be controlled by using the second recommended control parameter acquired by the second recommended control parameter acquisition unit.”
	However, KO discloses, “a second supply unit for supplying the measurement data acquired by the second acquisition unit to the second model, a second recommended control parameter acquisition unit for acquiring the second recommended control parameter outputted from the second model in response to the supply of the measurement data to the second model,” [See the measured data (e.g.; temperature data D’) is supplied to the model, and then the second recommended control parameter (e.g.; target control value Q’) outputted from the second model is acquired, where the second recommended control parameter is obtained in response to the measured data: “The second temperature related data D′ may be collected in real time by the collection unit 340 as long as the operation of the air conditioning device 330 continues.” (¶104)…“The processor 180 may perform control so as to learn the energy balance learning model 310 based on the second temperature related data D′ (S1150).” (¶105)… “the processor 180 may perform control the learning of the energy balance learning model 310 so as to acquire a second predicted temperature value T′ corresponding to the second temperature related data D′.” (¶106)… “the processor 180 may perform control so as to learn the control learning model 320 to acquire a second control value Q based on the second predicted temperature value T′. The processor 180 may update the control parameter of the control learning model 320 to optimize a control values Q′.” (¶108)];
	“a second control unit for controlling the at least one device to be controlled by using the second recommended control parameter acquired by the second recommended control parameter acquisition unit.” [See acquired second recommended control parameter (e.g.; target control value Q’) is used by the controller to control the devices (e.g.; control to perform adjustments): “the processor 180 may perform control so as to learn the control learning model 320 to acquire a second control value Q based on the second predicted temperature value T′. The processor 180 may update the control parameter of the control learning model 320 to optimize a control values Q′.” (¶108)… “The processor 180 may adjust the opening of the air conditioning device 330 according to the second control value Q′, and may operate the air conditioning device 330 at the adjusted opening degree (S1170).” (¶109)].
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the above-described teachings of KO with the apparatus taught by BADGWELL as discussed above. A person of ordinary skill in the field of optimizing control parameters via learning processing would have been motivated to make such a combination for the same reasons as described above with respect to claim 6.

Claim 10:
	Regarding claim 10, BADGWELL discloses all the elements of claim 1, but BADGWELL doesn’t explicitly disclose, “wherein the first acquisition unit is configured to acquire the measurement data indicating a consumption of at least one of energy or raw material by a facility including the at least one device to be controlled.”
	However, KO discloses, “wherein the first acquisition unit is configured to acquire the measurement data indicating a consumption of at least one of energy or raw material by a facility including the at least one device to be controlled.” [Examiner notes that the claim requires the measurement data to indicate only one of consumption of energy or consumption of raw material.
	See the energy consumption related data is acquired by a facility including the at least one device to be controlled (e.g.; feedback measured temperature data is acquired to calculate desired control parameter in order to maintain energy consumption using the acquired set baseline to maintain desired power consumption): “in order to” “prevent excessive power consumption, the second base line 910 may be set to a specific temperature. For example, the set value may be 30° C. and the specific temperature may be 40° C.” (¶263)… “air conditioning device operates and is maintained at a constant temperature according to the control value acquired more accurately and quickly based on the predicted temperature value acquired by the energy balance learning model, so that there is a merit of providing a user with a comfortable environment and being capable of efficient energy consumption.” (¶317)… “The energy balance learning model 310 may be learned so as to update hyper parameters based on the temperature related data D and D′ to acquire predicted temperature values T and T′ according to the updated hyper parameters. The temperature related data D and D′ may be temperatures or opening degrees.” “the energy balance learning model 310 may receive at least one among the temperature and the opening degree.” (¶75)… “the predicted temperature values T and T′ output by the energy balance learning model 310 may be provided to the control learning model 320. The control learning model 320 may acquire the control values Q and Q′ by learning the predicted temperature values T and T′. In addition, the control learning model 320 may update control parameters based on the predicted temperature values T and T′.” (¶87)].
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the above-described teachings of KO with the apparatus taught by BADGWELL as discussed above. A person of ordinary skill in the field of optimizing control parameters via learning processing would have been motivated to make such a combination for the same reasons as described above with respect to claim 6.

Claim(s) 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over BADGWELL, and further in view of PARK et al. (US20210335344A1) [hereinafter PARK] and Shimizu et al. (US20180210406A1) [hereinafter Shimizu].
Claim 11:
	Regarding claim 11, BADGWELL discloses all the elements of claim 1, but doesn’t explicitly disclose, “wherein the first acquisition unit is configured to acquire each of a first group of the measurement data including at least one type of the measurement data and a second group of the measurement data including the at least one type of the measurement data.” “wherein the reward function used in the first learning processing unit is configured to: set the reward value to 0 independently of each value of the second group of the measurement data when at least one of the first group of the measurement data does not satisfy a reference condition, and increase or decrease the reward value according to each value of the second group of the measurement data when each of the first group of the measurement data satisfies the reference condition.”
	However, PARK discloses, “wherein the first acquisition unit is configured to acquire each of a first group of the measurement data including at least one type of the measurement data and a second group of the measurement data including the at least one type of the measurement data.” [See the first group of measured data (e.g.; current temperature data in fig. 7) including at least one type of measurement data (e.g.; operational data) and second group of measured data (e.g.; air cleaner measured concentration of dust data in fig. 10) including at least one type of measurement data (e.g.; operational data): “For action “turn-on”, when the current temperature is less than the optimal temperature, a reward of 0 may be given to the air conditioner. This may be because the purpose for operating the air conditioner is achieved and the turned-off state may be maintained.” (¶192)… “Referring to FIG. 10, for action “turn-off”,” “a reward of 0 may be given to the air cleaner when the concentration of dust is greater than the reference value.” (¶208)].
	“wherein the reward function used in the first learning processing unit is configured to: set the reward value to 0 independently of each value of the second group of the measurement data when at least one of the first group of the measurement data does not satisfy a reference condition, and” [See, independently of each value of the second group of the measurement data (e.g.; independent of air cleaner measured concentration of dust data), the system sets the reward value to zero when at least one of the first group of the measurement data does not satisfy a reference condition (e.g.; the temperature does not satisfy the reference condition when, with the air conditioner turned on, the current temperature is less than the optimal temperature, as shown in fig. 7): “For action “turn-on”, when the current temperature is less than the optimal temperature, a reward of 0 may be given to the air conditioner.” (¶192)], but BADGWELL and PARK don’t explicitly disclose, “increase or decrease the reward value according to each value of the second group of the measurement data when each of the first group of the measurement data satisfies the reference condition.”
	However, Shimizu discloses, “increase or decrease the reward value according to each value of the second group of the measurement data when each of the first group of the measurement data satisfies the reference condition.” [Examiner notes that the claim requires only one of increasing or decreasing the reward value.
	See, increasing the reward according to the value of the second group of the measurement data (e.g.; according to the degree) when each of the first group of the measurement data satisfies the reference condition (e.g.; when the measured cycle time in a series of machining operations of the machine 2 is shorter than a prescribed reference value set in advance): “When cycle time in a series of operations (machining operations) of the machine 2 is shorter than a prescribed reference value set in advance, a positive reward is given according to the degree. On the other hand, when the cycle time in the series of operations (machining operations) of the machine 2 is longer than the prescribed reference value set in advance, a negative reward is given according to the degree.” (¶88)].
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined (i) the first group of measured data including at least one type of measurement data and the second group of measured data including at least one type of measurement data, together with the capability of setting the reward value to zero, independently of each value of the second group of the measurement data, when at least one of the first group of the measurement data does not satisfy a reference condition, as taught by PARK, and (ii) the capability of increasing the reward according to the value of the second group of the measurement data when each of the first group of the measurement data satisfies the reference condition, as taught by Shimizu, with the apparatus taught by BADGWELL as discussed above. A person of ordinary skill in the field of optimizing control parameters via learning processing would have been motivated to make such a combination in order to properly control a device based on the status information and characteristic information of the artificial intelligence device [PARK: “properly grasp a device to be controlled based on the status information and characteristic information of the artificial intelligence device.” (¶16)], and in order to learn an optimum override control setting value [Shimizu: “a numerical controller and a machine learning device that perform machine learning to learn an optimum override control setting value.” (¶15)].
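	For illustration, the reward function recited in claim 11 (a reward value of 0 whenever any first-group measurement fails the reference condition, and otherwise a reward value that increases or decreases with the second-group measurements) may be sketched as follows. This is a minimal sketch only. The name gated_reward, the example reference condition (a temperature limit of 40), and the example scoring rule (a reference cycle time of 10 minus the measured value, loosely following Shimizu ¶88) are assumptions made for the example and are not taken from the application or the cited references.

```python
from typing import Callable, Sequence


def gated_reward(
    first_group: Sequence[float],
    second_group: Sequence[float],
    satisfies_reference: Callable[[float], bool],
    score: Callable[[float], float],
) -> float:
    """Hypothetical gated reward in the style of claim 11."""
    # If any first-group measurement fails the reference condition,
    # the reward is 0 regardless of the second-group values.
    if not all(satisfies_reference(v) for v in first_group):
        return 0.0
    # Otherwise the reward rises or falls with each second-group value.
    return sum(score(v) for v in second_group)


# Assumed example condition and scoring rule (not from the record).
def within_limit(temperature: float) -> bool:
    return temperature <= 40.0


def cycle_time_score(cycle_time: float) -> float:
    # Positive when shorter than the assumed reference of 10, negative when longer.
    return 10.0 - cycle_time


blocked = gated_reward([35.0, 45.0], [8.0], within_limit, cycle_time_score)
rewarded = gated_reward([35.0], [8.0], within_limit, cycle_time_score)
penalized = gated_reward([35.0], [12.0], within_limit, cycle_time_score)
```

	In this sketch, blocked is 0 because one first-group value exceeds the assumed limit, while rewarded and penalized differ in sign according to whether the second-group value is shorter or longer than the assumed reference.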

Conclusion
	The prior art made of record and not relied upon, which is considered pertinent to applicant's disclosure, is listed in the PTO-892 Notice of References Cited.

US 20220205666 A1 - Control Method for Air Conditioner, and Device for Air Conditioner and Storage Medium:
	A first reward matrix is constructed according to multiple sets of target operating parameters of an air conditioner, wherein each of the multiple sets of the target operating parameters of the air conditioner at least comprises a target indoor environment temperature, a target outdoor environment temperature, a target setting temperature, a target intermediate temperature of an indoor evaporator, a target intermediate temperature of an outdoor condenser, a first target operating frequency of a compressor, a first target opening degree of an electronic expansion valve and a first target rotating speed of an external fan; a maximum expected benefit of performing a current action in a current state is calculated based on the first reward matrix and a Q-learning algorithm, wherein the current state is represented by a current indoor environment temperature and a current outdoor environment temperature, and the current action is represented by a current operating frequency of the compressor, a current opening degree of the electronic expansion valve and a current rotating speed of the external fan; and target action parameters under the maximum expected benefit are acquired, and operation of the air conditioner is controlled based on second target action parameters, wherein the second target action parameters at least comprise a second target operating frequency of the compressor, a second target opening degree of the electronic expansion valve and a second target rotating speed of the external fan (¶6).

US 20180335758 A1 - Machine learning device, servo control system, and machine learning method:
	A reward output step of outputting a value of a reward in the reinforcement learning on the basis of the deviation included in the state information; and a value function updating step of updating an action-value function on the basis of the value of the reward, the state information, and the action information (¶13).

	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMED SHAFAYET whose telephone number is (571)272-8239. The examiner can normally be reached M-F 8:30 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kenneth M Lo, can be reached at (571)272-9774. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/M.S./
Examiner
Art Unit 2116



/KENNETH M LO/
Supervisory Patent Examiner, Art Unit 2116