DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The Information Disclosure Statement filed 15 August 2019 has been fully considered. A signed copy is attached.
An Examiner’s Amendment has been entered with this office action.
Claims 5 and 6 have been canceled, per the Examiner’s Amendment.
Claims 21-23 have been added, per the Examiner’s Amendment.
Claims 1-4 and 7-23 are allowed, reasons follow. 


Priority

Examiner acknowledges that the instant application is a Continuation of Applications 14/275,933 (now US Patent No. 9,753,441) and 15/660,146 (now US Patent No. 10,423,129) and has been accorded the benefit of the earliest original filing date.

Terminal Disclaimer
The terminal disclaimer filed 12 March 2021 over commonly assigned US Patent No. 10,423,129 is acknowledged.


EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided 
Authorization for this examiner’s amendment was given in an interview with Atty. William O’Sullivan (Reg #59,005) on 10 March 2021.
The application has been amended as follows: 

Claim 1 has been replaced with the following:
A computer-implemented method for automatically controlling a dynamical system in an environment to execute a task defined by a cost function, the dynamical system having a state space and a control space, wherein executing the task comprises following at least one trajectory in a set of possible trajectories, the method comprising:
generating, by one or more computer-based processors, a sequence of Markov Decision Processes (MDPs), the sequence of MDPs comprising one or more MDPs, each MDP defining a subset of possible trajectories in the set of possible trajectories, each possible trajectory comprising a sequence of states in the state space;
computing values of controlling attributes for at least one of the states in at least one of the possible trajectories defined by at least one MDP in the sequence of MDPs based on the cost function; and
generating, based on the computed values of the controlling attributes, at least one feedback control policy for controlling the dynamical system to follow one or more trajectories in the set of possible trajectories during the execution of the task, wherein controlling attributes at each of the states in the at least one MDP comprise at least one of:
(i) an estimate of an optimal cost to complete the task by following a first trajectory starting from the state, or
(ii) (a) an estimate of an optimal cost to complete the task by following a second trajectory starting from the state, (b) a first sequence of at least one control input to be executed at the state to follow the second trajectory, and (c) a failure probability of reaching an undesired state when the dynamical system executes the first sequence of at least one control input, and 
wherein the controlling attributes at each of the states in the at least one MDP further comprise at least one of:
(i) an estimate of a minimal failure probability to complete the task by following a third trajectory starting from the state,
(ii) (a) an estimate of a minimal failure probability to complete the task by following a fourth trajectory starting from the state, (b) a second sequence of at least one control input to be executed at the state to follow the fourth trajectory, and (c) an estimate of a cost to complete the task when the dynamical system executes the second sequence of at least one control input,
(iii) an estimate of an optimal cost to arrive at the state starting from a current state of the dynamical system, or
(iv) data representing at least one of a physical constraint, a logical constraint, or a temporal constraint of trajectories that comprise the state.
	
Claim 5 has been canceled.

Claim 6 has been canceled.

Claim 19 has been replaced with the following:
19. 	A system comprising:
one or more computer-based processors;
one or more non-transitory machine-readable media storing instructions that, when executed by the one or more computer-based processors, cause the one or more computer-based processors to perform operations comprising:
generating a sequence of MDPs, the sequence of MDPs comprising one or more MDPs, each MDP defining a subset of possible trajectories in the set of possible trajectories, each possible trajectory comprising a sequence of states in the state space;
computing values of controlling attributes for at least one of the states in at least one of the possible trajectories defined by at least one MDP in the sequence of MDPs based on the cost function; and
generating, based on the computed values of the controlling attributes, at least one feedback control policy for controlling the dynamical system to follow one or more trajectories in the set of possible trajectories during the execution of the task,
wherein controlling attributes at each of the states in the at least one MDP comprise at least one of:
(i) an estimate of an optimal cost to complete the task by following a first trajectory starting from the state, or
(ii) (a) an estimate of an optimal cost to complete the task by following a second trajectory starting from the state, (b) a first sequence of at least one control input to be executed at the state to follow the second trajectory, and (c) a failure probability of reaching an undesired state when the dynamical system executes the first sequence of at least one control input, and 
wherein the controlling attributes at each of the states in the at least one MDP further comprise at least one of:
(i) an estimate of a minimal failure probability to complete the task by following a third trajectory starting from the state,
(ii) (a) an estimate of a minimal failure probability to complete the task by following a fourth trajectory starting from the state, (b) a second sequence of at least one control input to be executed at the state to follow the fourth trajectory, and (c) an estimate of a cost to complete the task when the dynamical system executes the second sequence of at least one control input,
(iii) an estimate of an optimal cost to arrive at the state starting from a current state of the dynamical system, or
(iv) data representing at least one of a physical constraint, a logical constraint, or a temporal constraint of trajectories that comprise the state.

Claim 20 has been replaced with the following:
20.	(Currently Amended) One or more non-transitory machine-readable media storing instructions that, when executed by one or more computer-based processors, cause the one or more computer-based processors to perform operations comprising:
generating a sequence of MDPs, the sequence of MDPs comprising one or more MDPs, each MDP defining a subset of possible trajectories in the set of possible trajectories, each possible trajectory comprising a sequence of states in the state space;
computing values of controlling attributes for at least one of the states in at least one of the possible trajectories defined by at least one MDP in the sequence of MDPs based on the cost function; and
generating, based on the computed values of the controlling attributes, at least one feedback control policy for controlling the dynamical system to follow one or more trajectories in the set of possible trajectories during the execution of the task, and
wherein controlling attributes at each of the states in the at least one MDP comprise at least one of:
(i) an estimate of an optimal cost to complete the task by following a first trajectory starting from the state, or
(ii) (a) an estimate of an optimal cost to complete the task by following a second trajectory starting from the state, (b) a first sequence of at least one control input to be executed at the state to follow the second trajectory, and (c) a failure probability of reaching an undesired state when the dynamical system executes the first sequence of at least one control input, and 
wherein the controlling attributes at each of the states in the at least one MDP further comprise at least one of:
(i) an estimate of a minimal failure probability to complete the task by following a third trajectory starting from the state,
(ii) (a) an estimate of a minimal failure probability to complete the task by following a fourth trajectory starting from the state, (b) a second sequence of at least one control input to be executed at the state to follow the fourth trajectory, and (c) an estimate of a cost to complete the task when the dynamical system executes the second sequence of at least one control input,
(iii) an estimate of an optimal cost to arrive at the state starting from a current state of the dynamical system, or
(iv) data representing at least one of a physical constraint, a logical constraint, or a temporal constraint of trajectories that comprise the state.


Claim 21 has been added as the following:
21.	A computer-implemented method for automatically controlling a dynamical system in an environment to execute a task defined by a cost function, the dynamical system having a state space and a control space, wherein executing the task comprises following at least one trajectory in a set of possible trajectories, the method comprising:
generating, by one or more computer-based processors, a sequence of Markov Decision Processes (MDPs), the sequence of MDPs comprising one or more MDPs, each MDP defining a subset of possible trajectories in the set of possible trajectories, each possible trajectory comprising a sequence of states in the state space;
computing values of controlling attributes for at least one of the states in at least one of the possible trajectories defined by at least one MDP in the sequence of MDPs based on the cost function; and
generating, based on the computed values of the controlling attributes, at least one feedback control policy for controlling the dynamical system to follow one or more trajectories in the set of possible trajectories during the execution of the task, 
wherein the state space comprises a set of possible states, each of the states comprising at least one of a component related to the dynamical system, a component related to the environment, or a time-dependent component; and wherein the control space comprises a set of control inputs, 
wherein the dynamical system has a dynamic that defines behaviors of the dynamical system given a sequence of at least one control input executed at one of the states in the state space of the dynamical system, 
wherein the dynamic of the dynamical system further defines behaviors of the dynamical systems given disturbances presented at the one of the states of the dynamical system, and
wherein generating the sequence of MDPs comprises:
initializing an empty MDP as a first MDP in the sequence of MDPs; and 
repeatedly constructing a new MDP from a previous MDP in an incremental manner, wherein incrementally constructing each of the new MDPs from the previous MDP comprises:	
constructing one or more boundary states; and
constructing at least one interior state;
wherein constructing the one or more boundary states comprises:
sampling at least one boundary state from a boundary of the state space,
adding the sampled at least one boundary state to the previous MDP, and
initializing values of controlling attributes for the sampled at least one boundary state; 
wherein constructing the at least one interior state comprises:
identifying at least one state from the previous MDP based on a selection criterion,
from the identified at least one state, simulating behaviors of the dynamical system given at least one sequence of at least one control input to obtain at least one interior state,
adding the at least one interior state to the previous MDP, and
initializing values of the controlling attributes for the at least one interior state; and
wherein states in the new MDP comprise the states in the previous MDP, the one or more boundary states, and the at least one interior state.

Claim 22 has been added as the following:

22. 	The computer-implemented method of claim 21, wherein the dynamical system comprises at least one of:  (i) one or more self-driving cars, (ii) one or more semi-autonomous cars, (iii) one or more aircraft, (iv) one or more unmanned aerial vehicles, (v) one or more robotic manipulators,  (vi) one or more steerable needles, (vii) one or more robotic surgical systems, or (viii) any other mechanical system.

Claim 23 has been added as the following:

23. 	The computer-implemented method of claim 21, wherein the task comprises at least one of i) navigating from a first location to a second location in the environment, ii) following a path defined by a set of waypoints, or iii) performing one or more sub-tasks in a predefined temporal order.

(End Examiner’s Amendment)


Allowable Subject Matter
Claims 1-4 and 7-23 are allowed.

The following is an examiner’s statement of reasons for allowance: Budiman et al., US PG-Pub 2011/0172976, teaches the use of a plurality of Markov decision processes in trajectory planning for routing from an origin to a target; Hung et al., US PG-Pub 2010/0138096, teaches risk estimation and ranking of possible paths generated by a sequence of Markov decision processes according to that risk estimation; Anderson et al., US PG-Pub 2007/0094187, teaches computing the risk boundary of a state-sequence projection for path planning; and Dugan et al., US 8,346,694, teaches construction of a Markov model according to boundary states and attributes, then augmenting the Markov model with failure probabilities to weight the decision process to minimize failure. However, none of the references, alone or in reasonable combination, teaches or fairly suggests all of the limitations of the claimed invention, particularly:
(Claim 1) 
wherein the controlling attributes at each of the states in the at least one MDP further comprise at least one of:
(i) an estimate of a minimal failure probability to complete the task by following a third trajectory starting from the state,
(ii) (a) an estimate of a minimal failure probability to complete the task by following a fourth trajectory starting from the state, (b) a second sequence of at least one control input to be executed at the state to follow the fourth trajectory, and (c) an estimate of a cost to complete the task when the dynamical system executes the second sequence of at least one control input,
(iii) an estimate of an optimal cost to arrive at the state starting from a current state of the dynamical system, or
(iv) data representing at least one of a physical constraint, a logical constraint, or a temporal constraint of trajectories that comprise the state.
(Excerpted)
…in combination with the remaining limitations and features of the claimed invention.
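
For illustration only, the two "at least one of" groups of controlling attributes excerpted above can be pictured as a per-state record. The following is a minimal sketch and not part of the claims or the record; the class name, field names, and the helper `satisfies_both_groups` are hypothetical and chosen only to mirror the claim language.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ControllingAttributes:
    """Hypothetical per-state record; field names are illustrative only."""
    # First group: (i) estimate of an optimal cost to complete the task.
    cost_to_go: Optional[float] = None
    # First group: (ii) (cost estimate, control sequence, failure probability).
    cost_policy: Optional[Tuple[float, tuple, float]] = None
    # Second group: (i) estimate of a minimal failure probability.
    min_failure_prob: Optional[float] = None
    # Second group: (ii) (failure-probability estimate, control sequence, cost estimate).
    risk_policy: Optional[Tuple[float, tuple, float]] = None
    # Second group: (iii) estimate of an optimal cost to arrive at the state.
    cost_to_come: Optional[float] = None
    # Second group: (iv) data representing physical, logical, or temporal constraints.
    constraints: Optional[dict] = None


def satisfies_both_groups(attrs: ControllingAttributes) -> bool:
    """Return True when at least one attribute from each group is present."""
    first = attrs.cost_to_go is not None or attrs.cost_policy is not None
    second = any(
        value is not None
        for value in (
            attrs.min_failure_prob,
            attrs.risk_policy,
            attrs.cost_to_come,
            attrs.constraints,
        )
    )
    return first and second
```

Under this sketch, a state carrying only a cost-to-go estimate would not meet the excerpted limitation, while a state carrying a cost-to-go estimate together with, for example, a cost-to-come estimate would.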

Similarly, regarding Claim 21, while the references of record teach many of the limitations and elements of the claimed invention as outlined above, none of the references, alone or in reasonable combination, teaches or fairly suggests all of the limitations of the claimed invention, particularly:
	(Claim 21)
	wherein generating the sequence of MDPs comprises:
initializing an empty MDP as a first MDP in the sequence of MDPs; and 
repeatedly constructing a new MDP from a previous MDP in an incremental manner, wherein incrementally constructing each of the new MDPs from the previous MDP comprises:	
constructing one or more boundary states; and
constructing at least one interior state;
wherein constructing the one or more boundary states comprises:
sampling at least one boundary state from a boundary of the state space,
adding the sampled at least one boundary state to the previous MDP, and
initializing values of controlling attributes for the sampled at least one boundary state; 
wherein constructing the at least one interior state comprises:
identifying at least one state from the previous MDP based on a selection criterion,
from the identified at least one state, simulating behaviors of the dynamical system given at least one sequence of at least one control input to obtain at least one interior state,
adding the at least one interior state to the previous MDP, and
initializing values of the controlling attributes for the at least one interior state; and
wherein states in the new MDP comprise the states in the previous MDP, the one or more boundary states, and the at least one interior state.
	(Excerpted)
	…in combination with the remaining limitations and features of the claimed invention.
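
For illustration only, the incremental construction excerpted above (an empty first MDP, then boundary-state sampling and interior-state simulation at each step) may be sketched as follows. This is a minimal sketch under stated assumptions and not part of the claims or the record: the state space is assumed to be the interval [0, 1] with boundary {0.0, 1.0}, the dynamics and disturbance are a toy additive model, the selection criterion is uniform random choice, and all function names and initial attribute values are hypothetical.

```python
import random
from typing import Dict, List

State = float
MDP = Dict[State, dict]  # state -> controlling attributes (hypothetical layout)


def sample_boundary_state(rng: random.Random) -> State:
    """Sample from the boundary of the assumed state space [0, 1]."""
    return rng.choice([0.0, 1.0])


def simulate(state: State, control: float, rng: random.Random) -> State:
    """Toy dynamics with a small disturbance, clipped to the state space."""
    disturbance = rng.uniform(-0.01, 0.01)
    return min(1.0, max(0.0, state + control + disturbance))


def extend_mdp(previous: MDP, rng: random.Random, controls=(-0.1, 0.1)) -> MDP:
    """Construct a new MDP that contains every state of the previous MDP."""
    new_mdp = dict(previous)
    # Boundary state: sample, add, and initialize its attributes.
    boundary = sample_boundary_state(rng)
    new_mdp.setdefault(boundary, {"cost_to_go": 0.0})
    # Assumed selection criterion: uniform choice among previous states,
    # falling back to the sampled boundary state when the previous MDP is empty.
    source = rng.choice(sorted(previous)) if previous else boundary
    # Interior state: simulate one control input from the selected state.
    interior = simulate(source, rng.choice(controls), rng)
    new_mdp.setdefault(interior, {"cost_to_go": float("inf")})
    return new_mdp


def generate_mdp_sequence(n_steps: int, seed: int = 0) -> List[MDP]:
    """Start from an empty MDP and repeatedly extend the previous one."""
    rng = random.Random(seed)
    sequence: List[MDP] = [{}]
    for _ in range(n_steps):
        sequence.append(extend_mdp(sequence[-1], rng))
    return sequence
```

In this sketch each MDP in the returned sequence is a superset of the one before it, which corresponds to the excerpted limitation that the states in the new MDP comprise the states in the previous MDP together with the newly constructed states.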

Independent claims 19 and 20 recite substantively the same subject matter identified with respect to claim 1 above. Accordingly, mutatis mutandis, these claims are likewise allowable for the above noted reason(s).



The dependent Claims 2-4, 7-18, and 22-23, being definite, fully enabled, further limiting, and dependent upon the independent claim(s), are likewise allowable for at least the above noted reason(s).

It is for these reasons that Applicant’s invention defines over the prior art of record.
	

Conclusion
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSHUA T SANDERS, whose telephone number is (571)272-5591.  The examiner can normally be reached Monday through Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mohammad Ali, can be reached at 571-272-4105.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/J.T.S./Examiner, Art Unit 2119 

/MOHAMMAD ALI/Supervisory Patent Examiner, Art Unit 2119