DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
	The amendments filed on 2/25/2021 have been entered. Per an interview with the applicant on 4/12/2021, an examiner’s amendment was proposed in order to help further prosecution and place the claims in condition for allowance. The applicant granted permission to make the requested changes (shown below in the Examiner’s Amendment section).

EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in an interview with attorney Eric D. Kirsch on 4/12/2021.

Please amend claims per the amendments filed on 2/25/2021. 
Please cancel dependent claims 13, 14, and 15.
Please further amend claims 1, 11, and 12 as follows:


a control section that controls the target device according to a control condition indicated by recommended control condition data; 
a state acquiring section that acquires state data indicating a state of the facility after the target device has been controlled by the control section; 
a control condition acquiring section that acquires control condition data indicating a control condition of each target device; 
a learning processing section that uses kernel dynamic policy programming to generate learning data including the state data and the control condition data to perform learning processing of a model that outputs recommended control condition data indicating a control condition recommended for each target device in response to input of the state data and distributes the learning processing for the target device among the plurality of agents.

Claim 11 (Currently Amended) A method in which a plurality of agents, which each set one or more devices among a plurality of devices provided in a facility to be target devices, comprising: 
acquiring control condition data indicating a control condition of each target device; 
controlling the target device according to the control condition indicated by recommended control condition data; 
acquiring state data indicating a state of the facility in response to the target device being controlled according to the control condition; and 
using kernel dynamic policy programming to generate learning data including the state data and the control condition data to perform learning processing of a model that outputs recommended 
distributing the learning processing for the target device among the plurality of agents. 

	Claim 12 (Currently Amended) A non-transitory recording medium storing thereon a program that causes one or more computers to function as a plurality of agents that each set one or more devices among a plurality of devices provided in a facility to be target devices, wherein each of the plurality of agents include:
	a state acquiring section that acquires state data indicating a state of the facility;
	a control condition acquiring section that acquires control condition data indicating a control condition of each target device;
	a learning processing section that uses kernel dynamic policy programming to generate learning data including the state data and the control condition data to perform learning processing of a model that outputs recommended control condition data indicating a control condition recommended for each target device in response to input of the state data;
	a control section that controls the target device according to the control condition indicated by the recommended control condition data;
	each state acquiring section acquires the state data in response to the target device being controlled by the control section; and 
	the learning processing for the target device is distributed among the plurality of agents. 

Allowable Subject Matter
The following is an examiner’s statement of reasons for allowance: 


“where the learning processing section for each agent uses kernel dynamic policy programming for the target device.”; in order to overcome the current prior art of record. 

The closest prior art of record is Locke et al. (US PGPUB 20180364654). Locke teaches a building management system reinforced with artificial intelligence learning, in which a plurality of agent controllers perform control functions on assigned equipment in response to the learning and control commands. The agents are capable of sharing the learned knowledge with other agents, thus providing a distributed learning system. Locke is silent, however, on utilizing kernel dynamic policy programming techniques to facilitate the learning and control within the system by the agents. Kernel dynamic policy programming provides an advantage over traditional artificial intelligence techniques in that it allows for a factorial learning process that reduces the computations and devices required for the learning and allows the system to reach equilibrium faster.
Further, an NPL search was performed, and the two closest prior art references of record are Xu, “Kernel-Based Approximate Dynamic Programming for Real-Time Online Learning Control: An Experimental Study” and Cui, “Kernel dynamic policy programming: Applicable reinforcement learning to robot systems with high dimensional states”. Xu discusses the framework (see A. Framework of KDHP) and provides the structure (see Fig. 1) of the theoretical kernel dynamic policy systems. At the time of the paper, this was one of the first applications to a physical control system, namely a double-link inverted pendulum. The next closest prior art, Cui, is an application of kernel dynamic policy programming to controlling the Shadow Dexterous Hand, a humanoid robot hand used for unscrewing bottle caps. Though the cited prior art discusses applications of kernel dynamic policy programming, in each case the application is isolated to a targeted and simple system. In the present application, the applicant has found a way to apply it to a large-scale building management and control system.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHRISTOPHER W CARTER whose telephone number is (469)295-9262.  The examiner can normally be reached on 9-6:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rocio Del Mar Perez-Velez can be reached on 571-270-5935.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact 






/C.W.C./Examiner, Art Unit 2117
/ROCIO DEL MAR PEREZ-VELEZ/Supervisory Patent Examiner, Art Unit 2117