Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is in response to an application filed 11/6/20.
Claims 1-26 are pending.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-9 are rejected under 35 U.S.C. 103 as being unpatentable over “Application of SARSA Learning Algorithm for Reactive Power Control in Power System” by Tousi et al. (Tousi) in view of US 2013/0282189 to Stoupis et al. (Stoupis).

Claim 1: Tousi discloses a method for autonomous voltage control in an electric power system, the method comprising: 
acquiring state information at buses of the electric power system (pg. 1199, col. 2, 6th par. “The state variables in the problem are voltages of some busbars”); 
detecting a state violation from the state information (pg. 1199, col. 2, 4th par. “maintain the magnitude of voltages … inside acceptable ranges”); 
generating a first action setting based on the state violation using a predetermined algorithm by an AI agent (see e.g. Fig. 1) of the electric power system where the state violation occurs (pg. 1200, col. 1, 4th full par. “voltage set point or as reactive power injection”); and 
maintaining a second action setting by an AI agent of the electric power system where no substantial state violation is detected (pg. 1200, col. 1, 8th par. “While the entire state variables lie inside the limits … the system is told to be in final state condition”).

Tousi does not explicitly disclose first and second AI agents assigned to first and second regions.

Stoupis teaches first and second controllers of first and second regions (par. [0037] “partition the global power network model in to one or more local power network models for one or more substations based upon areas of responsibility”, par. [0024] “one or more local grid management applications”). 

It would have been obvious at the time of filing to assign first and second AI agents (see e.g. Tousi Fig. 1) to first and second regions (Stoupis par. [0037] “partition the global power network model in to one or more local power network models”). Those of ordinary skill in the art would have been motivated to do so to provide reactive power control in a distributed power network.
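
For illustration only, the claim 1 limitations as mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Tousi, Stoupis, or the application: the voltage bounds (0.95 and 1.05 per-unit), the region and bus names, and the helpers `detect_violations`, `dispatch`, and `raise_setpoint` are all hypothetical.

```python
# Hypothetical sketch of the claim 1 mapping; not code from any cited reference.
V_MIN, V_MAX = 0.95, 1.05  # assumed acceptable range in per-unit


def detect_violations(bus_voltages):
    """Return the buses whose voltage magnitude lies outside the assumed range."""
    return {bus for bus, v in bus_voltages.items() if v < V_MIN or v > V_MAX}


def dispatch(regions, bus_voltages, current_settings, policy):
    """Generate a new action setting only for regions that contain a violation.

    regions: region name -> list of bus names (an assumed partition)
    current_settings: region name -> current action setting
    policy: callable(region, violating_buses, bus_voltages) -> new action setting
    """
    violations = detect_violations(bus_voltages)
    new_settings = {}
    for region, buses in regions.items():
        violating = sorted(violations.intersection(buses))
        if violating:
            # The agent of the region where the violation occurs generates an action.
            new_settings[region] = policy(region, violating, bus_voltages)
        else:
            # The agent of a region with no violation maintains its setting.
            new_settings[region] = current_settings[region]
    return new_settings


def raise_setpoint(region, violating, bus_voltages):
    # Placeholder policy standing in for a learned agent.
    return {"voltage_setpoint": 1.02}


regions = {"region_1": ["bus1", "bus2"], "region_2": ["bus3"]}
voltages = {"bus1": 0.93, "bus2": 1.00, "bus3": 1.01}
settings = {
    "region_1": {"voltage_setpoint": 1.00},
    "region_2": {"voltage_setpoint": 1.00},
}
result = dispatch(regions, voltages, settings, raise_setpoint)
```

In this sketch only `region_1`, which contains the out-of-range `bus1`, receives a new setting; `region_2` keeps its existing setting.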

Claim 2: Tousi and Stoupis teach the method of claim 1, wherein the state information includes a bus voltage magnitude (Tousi pg. 1199, col. 2, 4th par. “magnitude of voltages”).

Claim 3: Tousi and Stoupis teach the method of claim 2, wherein the bus voltage magnitude is measured by a phasor measurement unit (PMU) or a supervisory control and data acquisition (SCADA) system coupled to the bus (Stoupis par. [0041] “supervisory control and data acquisition (SCADA) system”).

Claim 4: Tousi and Stoupis teach the method of claim 2, wherein the state violation includes the bus voltage magnitude dropping below a predetermined lower bound or rising above a predetermined upper bound (Tousi pg. 1199, col. 2, 4th par. “maintain the magnitude of voltages … inside acceptable ranges”).

Claim 5: Tousi and Stoupis teach the method of claim 1 further comprising executing the first action setting in the electric power system to reduce the state violation (Tousi pg. 1200, col. 1, 4th full par. “voltage set point or as reactive power injection”).

Claim 6: Tousi and Stoupis teach the method of claim 5, wherein the executing the first action setting includes changing a bus voltage of a power generator in the first region (Tousi pg. 1200, col. 1, 4th full par. “voltage set point or as reactive power injection”). 

Claim 7: Tousi and Stoupis teach the method of claim 1, wherein the first region includes two or more geographical zones (Stoupis par. [0028] “partitioned … based upon feeder boundaries”).

Claim 8: Tousi and Stoupis teach the method of claim 1 further comprising adjusting a partition of the electric power system by allocating a first bus from the first region to a third region of the plurality of regions, wherein the first bus is substantially uncontrollable by local resources in the first region and substantially controllable by local resources in the third region (Stoupis par. [0042] “The feeder change power grid event 520”).

Claim 9: Tousi and Stoupis teach the method of claim 8, wherein the adjusting is repeated until all the buses in the first region is controllable by the local resources thereof (Stoupis par. [0037] “based upon areas of responsibility”).
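
For illustration only, the partition adjustment recited in claims 8 and 9 can be sketched as a reallocation of buses between regions. This is a minimal sketch and is not drawn from Stoupis or the application: the function `adjust_partition`, the `controllable` test, and the example data are hypothetical.

```python
# Hypothetical sketch of the claims 8-9 partition adjustment; not from any cited reference.
def adjust_partition(partition, controllable):
    """Move each bus to a region whose local resources can control it.

    partition: region name -> set of bus names
    controllable: callable(bus, region) -> bool (assumed controllability test)
    """
    # Work on a copy so the caller's partition is left unchanged.
    partition = {region: set(buses) for region, buses in partition.items()}
    for region in list(partition):
        for bus in sorted(partition[region]):
            if controllable(bus, region):
                continue
            # Look for another region whose local resources can control this bus.
            target = next(
                (r for r in partition if r != region and controllable(bus, r)),
                None,
            )
            if target is not None:
                partition[region].discard(bus)
                partition[target].add(bus)
    return partition


# Assumed mapping of each bus to the regions able to control it.
control_map = {"bus1": {"region_1"}, "bus2": {"region_3"}, "bus3": {"region_3"}}
partition = {"region_1": {"bus1", "bus2"}, "region_3": {"bus3"}}
adjusted = adjust_partition(partition, lambda bus, region: region in control_map[bus])
```

Here `bus2` is reallocated from `region_1` to `region_3`, after which every bus remaining in `region_1` is controllable by that region's resources.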

Claims 10, 13-20 and 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over “Application of SARSA Learning Algorithm for Reactive Power Control in Power System” by Tousi et al. (Tousi) in view of US 2013/0282189 to Stoupis et al. (Stoupis) in view of “Load Shedding Scheme with Deep Reinforcement Learning to Improve Short-term Voltage Stability” by Zhang et al. (Zhang).

Claim 10: Tousi and Stoupis teach the method of claim 1, wherein the predetermined algorithm is a reinforcement learning (RL) algorithm (Tousi pg. 1198, col. 1, last full par. “SARSA … Reinforcement Learning (RL) algorithm”).

Tousi and Stoupis do not teach wherein the algorithm is a deep reinforcement learning algorithm.

Zhang teaches monitoring and controlling voltage using a deep reinforcement learning (DRL) algorithm (Abstract “scheme against voltage instability with deep reinforcement learning (DRL)”). 

It would have been obvious at the time of filing to use a deep reinforcement learning algorithm. Those of ordinary skill in the art would have been motivated to do so as a known alternative learning algorithm which would have produced only the expected results. 

Claim 13: Tousi discloses a system for autonomous voltage control in an electric power system, the system comprising: 
measurement devices coupled to buses of the electric power system for measuring state information at the buses (pg. 1199, col. 2, 2nd to last full par. “voltages of some busbars”, it would have been understood that this requires some form of measurement device); 
a processor; 
a computer-readable storage medium (pg. 1201, col. 1, 3rd full par. “AMD Athlon 3.2GHz”, it would have been understood that this comprises a processor and storage medium), comprising: 
software instructions executable on the processor to perform operations, including: 
acquiring state information from the measurement devices (pg. 1199, col. 2, 6th par. “The state variables in the problem are voltages of some busbars”); 
detecting a state violation from the state information (pg. 1199, col. 2, 4th par. “maintain the magnitude of voltages … inside acceptable ranges”); 
generating a first action setting based on the state violation using a reinforcement learning (RL) algorithm by an AI agent of the electric power system where the state violation occurs (pg. 1200, col. 1, 4th full par. “voltage set point or as reactive power injection”); and
maintaining a second action setting by an AI agent of the electric power system where no substantial state violation is detected (pg. 1200, col. 1, 8th par. “While the entire state variables lie inside the limits … the system is told to be in final state condition”).

Tousi does not explicitly disclose first and second AI agents assigned to first and second regions.

Stoupis teaches first and second controllers of first and second regions (par. [0037] “partition the global power network model in to one or more local power network models for one or more substations based upon areas of responsibility”, par. [0024] “one or more local grid management applications”). 

It would have been obvious at the time of filing to assign first and second AI agents (see e.g. Tousi Fig. 1) to first and second regions (Stoupis par. [0037] “partition the global power network model in to one or more local power network models”). Those of ordinary skill in the art would have been motivated to do so to provide reactive power control in a distributed power network.

Tousi and Stoupis do not teach using a deep reinforcement learning algorithm.

Zhang teaches monitoring and controlling voltage using a deep reinforcement learning (DRL) algorithm (Abstract “scheme against voltage instability with deep reinforcement learning (DRL)”). 

It would have been obvious at the time of filing to use a deep reinforcement learning algorithm. Those of ordinary skill in the art would have been motivated to do so as a known alternative learning algorithm which would have produced only the expected results. 

Claim 14: Tousi, Stoupis and Zhang teach the system of claim 13, wherein the state information includes a bus voltage magnitude (Tousi pg. 1199, col. 2, 4th par. “magnitude of voltages”).

Claim 15: Tousi, Stoupis and Zhang teach the system of claim 13, wherein the measurement devices includes phasor measurement units (PMU) or a supervisory control and data acquisition (SCADA) system (Stoupis par. [0041] “supervisory control and data acquisition (SCADA) system”).

Claim 16: Tousi, Stoupis and Zhang teach the system of claim 13, wherein the state violation includes a bus voltage magnitude dropping below a predetermined lower bound or rising above a predetermined upper bound (Tousi pg. 1199, col. 2, 4th par. “maintain the magnitude of voltages … inside acceptable ranges”).

Claim 17: Tousi, Stoupis and Zhang teach the system of claim 13 further comprising executing the first action setting in the electric power system to reduce the state violation (Tousi pg. 1200, col. 1, 4th full par. “voltage set point or as reactive power injection”).

Claim 18: Tousi, Stoupis and Zhang teach the system of claim 17, wherein the executing the first action setting includes changing a bus voltage of a power generator in the first region (Tousi pg. 1200, col. 1, 4th full par. “voltage set point or as reactive power injection”).

Claim 19: Tousi, Stoupis and Zhang teach the system of claim 13 further comprising adjusting a partition of the electric power system by allocating a bus from the first region to a third region of the electric power system, wherein the bus is substantially uncontrollable by local resources in the first region, but substantially controllable by local resources in the third region (Stoupis par. [0042] “The feeder change power grid event 520”).

Claim 20: Tousi, Stoupis and Zhang teach the system of claim 19, wherein the adjusting is repeated until all the buses in the first region is controllable by the local resources thereof (Stoupis par. [0037] “based upon areas of responsibility”).

Claim 23: Tousi discloses a method for autonomous voltage control in an electric power system, the method comprising: 
acquiring state information at buses of the electric power system (pg. 1199, col. 2, 6th par. “The state variables in the problem are voltages of some busbars”); 
detecting a state violation from the state information (pg. 1199, col. 2, 4th par. “maintain the magnitude of voltages … inside acceptable ranges”); 
generating a first action setting based on the state violation using a reinforcement learning (RL) algorithm by an AI agent of the electric power system where the state violation occurs (pg. 1200, col. 1, 4th full par. “voltage set point or as reactive power injection”); 
maintaining a second action setting by an AI agent of the electric power system where no substantial state violation is detected (pg. 1200, col. 1, 8th par. “While the entire state variables lie inside the limits … the system is told to be in final state condition”); and 
executing the first action setting in the electric power system to reduce the state violation (pg. 1200, col. 1, 4th full par. “voltage set point or as reactive power injection”).

Tousi does not explicitly disclose first and second AI agents assigned to first and second regions.

Stoupis teaches first and second controllers of first and second regions (par. [0037] “partition the global power network model in to one or more local power network models for one or more substations based upon areas of responsibility”, par. [0024] “one or more local grid management applications”). 

It would have been obvious at the time of filing to assign first and second AI agents (see e.g. Tousi Fig. 1) to first and second regions (Stoupis par. [0037] “partition the global power network model in to one or more local power network models”). Those of ordinary skill in the art would have been motivated to do so to provide reactive power control in a distributed power network.

Tousi and Stoupis do not teach using a deep reinforcement learning algorithm.

Zhang teaches monitoring and controlling voltage using a deep reinforcement learning (DRL) algorithm (Abstract “scheme against voltage instability with deep reinforcement learning (DRL)”). 

It would have been obvious at the time of filing to use a deep reinforcement learning algorithm. Those of ordinary skill in the art would have been motivated to do so as a known alternative learning algorithm which would have produced only the expected results. 

Claim 24: Tousi, Stoupis and Zhang teach the method of claim 23 further comprising adjusting a partition of the electric power system by allocating a first bus from the first region to a third region of the plurality of regions, wherein the first bus is substantially uncontrollable by local resources in the first region and substantially controllable by local resources in the third region (Stoupis par. [0042] “The feeder change power grid event 520”, par. [0037] “based upon areas of responsibility”).

Claims 11, 21 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over “Application of SARSA Learning Algorithm for Reactive Power Control in Power System” by Tousi et al. (Tousi) in view of US 2013/0282189 to Stoupis et al. (Stoupis) in view of “Load Shedding Scheme with Deep Reinforcement Learning to Improve Short-term Voltage Stability” by Zhang et al. (Zhang) in view of US 2016/0048150 to Chiang et al. (Chiang).

Claims 11, 21 and 25: Tousi, Stoupis and Zhang teach claims 10, 13 and 23, wherein the generating the first action setting includes a training process comprising: 
obtaining an initial grid state using a power grid simulator (pg. 1199, col. 1, 1st partial par. “observes the current state st”, pg. 1200, col. 2, last full par. “The power system was simulated”);
determining the state violation based on a deviation by the state information from the initial grid state (pg. 1199, col. 2, 4th par. “maintain the magnitude of voltage in some nodes inside acceptable ranges”); 
generating a first suggested action based on the state violation (pg. 1199, col. 1, 1st partial par. “the agent observes the current state st … and chooses an action at”); 
executing the first suggested action in the power grid simulator to obtain a new grid state (pg. 1199, col. 1, 1st partial par. “As a result the environment goes into a new state st+1”); 
calculating and evaluating with a reward function according to the new grid state (pg. 1199, col. 1, 1st partial par. “the agent receives a reward rs+1”); and
determining if the state violation is solved, wherein if the state violation is solved, the training process obtains a second power flow file at a second time step for another round of training process, and if the state violation is not solved, the training process generates a second suggested action by an updated version of the first AI agent (e.g. pg. 1199, col. 2, Table 1 “Repeat for each step in the episode … If is final state, end the episode (all voltages are inside the boundaries)”).

Tousi, Stoupis and Zhang do not explicitly teach obtaining a first power flow file of the electric power system at a first time step.

Chiang teaches obtaining a power flow file (par. [0102] “reading power flow files”). 

It would have been obvious at the time of filing to obtain a power flow file and obtain the initial grid state from the power flow file (see e.g. Chiang par. [0102] “the power flow file … the initial state of the power network”). Those of ordinary skill in the art would have been motivated to do so as a known source of the necessary information which would have produced only the expected results. 
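
For illustration only, the training process mapped above (observe a state, choose an action, execute it in a simulator, receive a reward, and end the episode once all voltages are inside the limits) can be sketched as a tabular SARSA loop. This is a minimal sketch and is not code from Tousi or Chiang: the five-level voltage discretization, the toy `simulate` function, the hyperparameters, and the list `initial_levels` (standing in for grid states read from successive power flow files) are all hypothetical.

```python
import random

# Hypothetical tabular SARSA sketch; not code from any cited reference.
ACTIONS = (-1, 0, 1)  # lower, hold, or raise a setpoint by one step (assumed)
OK_LEVEL = 2          # assumed discretized level that lies inside the limits
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1


def simulate(level, action):
    """Toy stand-in for a power grid simulator: returns the new level and a reward."""
    new_level = max(0, min(4, level + action))
    reward = 0.0 if new_level == OK_LEVEL else -1.0
    return new_level, reward


def choose(q, level, rng):
    """Epsilon-greedy action selection over the tabular Q-values."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(level, a)])


def train(initial_levels, episodes=200, max_steps=20, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(5) for a in ACTIONS}
    for episode in range(episodes):
        # Each initial level stands in for the grid state of one power flow case.
        level = initial_levels[episode % len(initial_levels)]
        action = choose(q, level, rng)
        for _ in range(max_steps):
            if level == OK_LEVEL:
                break  # violation solved: move on to the next case
            new_level, reward = simulate(level, action)
            new_action = choose(q, new_level, rng)
            # The SARSA update uses the action actually selected in the new state.
            target = reward + GAMMA * q[(new_level, new_action)]
            q[(level, action)] += ALPHA * (target - q[(level, action)])
            level, action = new_level, new_action
    return q


q_table = train(initial_levels=[0, 1, 3, 4])
# Greedy actions for the two levels adjacent to the acceptable level.
greedy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in (1, 3)}
```

In this toy setting the learned greedy action from level 1 raises the setpoint and from level 3 lowers it, since those are the only actions that reach the acceptable level without incurring a penalty.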

Claims 12, 22 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over “Application of SARSA Learning Algorithm for Reactive Power Control in Power System” by Tousi et al. (Tousi) in view of US 2013/0282189 to Stoupis et al. (Stoupis) in view of “Load Shedding Scheme with Deep Reinforcement Learning to Improve Short-term Voltage Stability” by Zhang et al. (Zhang) in view of US 2016/0048150 to Chiang et al. (Chiang) in view of US 2020/0151562 to Pietquin et al. (Pietquin).

Claims 12, 22 and 26: Tousi, Stoupis, Zhang and Chiang teach claims 11, 21 and 25, but do not teach: 
storing grid transition information into a replay buffer of the first AI agent; and
sampling the replay buffer to update the first AI agent.

Pietquin teaches:
storing grid transition information into a replay buffer of the first AI agent (par. [0052] “replay buffer 150 … stores reinforcement learning transitions”, par. [0035] “grid mains power”); and
sampling the replay buffer to update the first AI agent (par. [0055] “for use in training the reinforcement learning system”).

It would have been obvious at the time of filing to store grid transition information in a replay buffer and update the first AI agent (Pietquin par. [0055] “replay buffer 150 for use in training”). Those of ordinary skill in the art would have been motivated to do so as a known means of generating training data which would have produced only the expected results. 
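
For illustration only, storing transitions in a replay buffer and sampling them for training can be sketched as follows. This is a minimal sketch and is not Pietquin's implementation: the class `ReplayBuffer`, its capacity, the transition layout, and uniform sampling without replacement are assumptions.

```python
import random
from collections import deque


# Hypothetical replay buffer sketch; not code from any cited reference.
class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity, seed=None):
        self._items = deque(maxlen=capacity)  # oldest transitions are dropped first
        self._rng = random.Random(seed)

    def store(self, state, action, reward, next_state, done):
        self._items.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Return a uniformly sampled minibatch, drawn without replacement."""
        count = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), count)

    def __len__(self):
        return len(self._items)


buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buffer.store(state=step, action=0, reward=-1.0, next_state=step + 1, done=False)
batch = buffer.sample(2)
```

With a capacity of three, only the three most recent transitions remain available for sampling; an agent update would then be computed from `batch` rather than from the latest transition alone.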

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON D MITCHELL whose telephone number is (571)272-3728. The examiner can normally be reached Monday through Thursday 7:00am - 4:30pm and alternate Fridays 7:00am - 3:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock, can be reached on (571)272-3759. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JASON D MITCHELL/Primary Examiner, Art Unit 2199