DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The Information Disclosure Statements filed on 01/22/2021 and 02/23/2021 have been considered. An initialed copy of the Form 1449 is enclosed herewith.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Status of Claims
Claims 1-8 were originally filed on 01/22/2021 and claim priority to EP20157410.0, which was filed on 02/14/2020.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-2 and 7-8 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Van Heukelom et al (US 20200174481 A1) (Hereinafter referred to as Van Heukelom).

Regarding Claim 1, Van Heukelom discloses a method for controlling a robot (See at least Van Heukelom Paragraph 0011, the autonomous vehicle is interpreted as a robot), comprising: 
acquiring sensor data representing an environment of the robot (See at least Van Heukelom Paragraph 0026, sensor data of the environment is captured); 
identifying one or more objects in the environment of the robot from the sensor data (See at least Van Heukelom Paragraph 0027 and Figure 1, objects in the environment are identified using the sensor data); 
associating the robot and each of the one or more objects with a respective agent of a multiagent system (See at least Van Heukelom Paragraphs 0053-0055 and Figure 4, the robot, and objects 310 and 404 are interpreted as respective agents of a multiagent system); 
determining, for each agent of the multiagent system, a quality measure which includes a reward term for a movement action at a position (See at least Van Heukelom Paragraphs 0054-0055 and Figure 4, the predicted trajectories of the objects are interpreted as reward terms for a movement action at a position; See at least Van Heukelom Paragraph 0028 and Figure 1, the robot’s trajectory is also interpreted as a reward term for a movement action at a position), and a coupling term which depends on probabilities of the other agents of the multiagent system occupying the same position as the respective agent at a time (See at least Van Heukelom Paragraphs 0033-0035 and Figure 1, the probability of where the object will be is determined; See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the probability of each object’s position for each time step is combined to make a merged map with aggregated prediction possibilities, which is interpreted as a probability of the agents occupying the same position at the same time);
determining a movement policy of the robot that selects movement actions with a higher value of the quality measure determined for the robot with higher probability than movement actions with a lower value of the quality measure (See at least Van Heukelom Paragraphs 0037-0039 and Figure 1, the trajectory with the lowest probability of collision (i.e., the highest probability of avoiding collision) is determined as the movement policy of the robot); and
controlling the robot according to the movement policy (See at least Van Heukelom Paragraphs 0022 and 0039, the robot is controlled to follow the trajectory).
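
For illustration only, the mapping applied above (per-object position probabilities combined into a merged map, followed by selection of the trajectory with the lowest probability of collision) can be sketched as follows. This is a minimal sketch and not a reproduction of Van Heukelom or of the claimed method: the function names, the grid representation, and the assumption that the objects move independently are all hypothetical.

```python
import numpy as np

def merge_occupancy(maps):
    """Combine per-object occupancy maps into one merged map.

    maps has shape (num_objects, T, H, W) and holds, for each object,
    the probability that it occupies each grid cell at each time step.
    Assumption (illustrative only): objects are independent, so a cell is
    occupied by at least one object with probability 1 - prod(1 - p_i).
    """
    maps = np.asarray(maps, dtype=float)
    return 1.0 - np.prod(1.0 - maps, axis=0)

def collision_probability(merged, trajectory):
    """Probability of meeting an object somewhere along a trajectory.

    trajectory is a list of (row, col) cells, one per time step.
    """
    p_free = 1.0
    for t, (row, col) in enumerate(trajectory):
        p_free *= 1.0 - merged[t, row, col]
    return 1.0 - p_free

def select_trajectory(merged, candidates):
    """Return the index of the lowest-risk candidate and all risks."""
    risks = [collision_probability(merged, traj) for traj in candidates]
    return int(np.argmin(risks)), risks
```

Under these assumptions, a candidate trajectory that stays out of every cell with nonzero merged probability has a risk of zero and is the one selected.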

Regarding Claim 2, Van Heukelom discloses the coupling term is a function of occupancy measures of the other agents, (See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the probability is a function of the position of other agents), wherein, for each of the agents the occupancy measure for a position and a time denotes a likelihood of the agent being in the position at the time (See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the probability of each object’s potential position for each time step is determined).
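
For illustration only, one possible reading of a quality measure made up of a reward term and a coupling term that is a function of the other agents' occupancy measures is sketched below. The additive form, the penalty weight, and all names are assumptions made for this sketch; neither the application nor Van Heukelom is asserted to use this exact formula.

```python
import numpy as np

def coupling_term(occupancy, agent, t, position, weight=1.0):
    """Penalty for sharing a position with the other agents.

    occupancy has shape (num_agents, T, num_positions); entry [j, t, x] is
    the likelihood that agent j is at position x at time t.
    Assumed form (hypothetical): minus a weight times the summed
    occupancy of every agent other than `agent`.
    """
    others = np.delete(occupancy[:, t, position], agent)
    return -weight * float(np.sum(others))

def quality(reward, occupancy, agent, t, position, action, weight=1.0):
    """Reward term for (position, action) plus the coupling term."""
    return float(reward[position, action]) + coupling_term(
        occupancy, agent, t, position, weight
    )
```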

Regarding Claim 7, Van Heukelom discloses a robot controller configured to control a robot (See at least Van Heukelom Paragraph 0080, the controller controls the driving system and other components of the robot), the robot controller configured to: 
acquire sensor data representing an environment of the robot (See at least Van Heukelom Paragraph 0026, sensor data of the environment is captured); 
identify one or more objects in the environment of the robot from the sensor data (See at least Van Heukelom Paragraph 0027 and Figure 1, objects in the environment are identified using the sensor data); 
associate the robot and each of the one or more objects with a respective agent of a multiagent system (See at least Van Heukelom Paragraphs 0053-0055 and Figure 4, the robot, and objects 310 and 404 are interpreted as respective agents of a multiagent system); 
determine, for each agent of the multiagent system, a quality measure which includes a reward term for a movement action at a position (See at least Van Heukelom Paragraphs 0054-0055 and Figure 4, the predicted trajectories of the objects are interpreted as reward terms for a movement action at a position; See at least Van Heukelom Paragraph 0028 and Figure 1, the robot’s trajectory is also interpreted as a reward term for a movement action at a position), and a coupling term which depends on probabilities of the other agents of the multiagent system occupying the same position as the respective agent at a time (See at least Van Heukelom Paragraphs 0033-0035 and Figure 1, the probability of where the object will be is determined; See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the probability of each object’s position for each time step is combined to make a merged map with aggregated prediction possibilities, which is interpreted as a probability of the agents occupying the same position at the same time);
determine a movement policy of the robot that selects movement actions with a higher value of the quality measure determined for the robot with higher probability than movement actions with a lower value of the quality measure (See at least Van Heukelom Paragraphs 0037-0039 and Figure 1, the trajectory with the lowest probability of collision (i.e., the highest probability of avoiding collision) is determined as the movement policy of the robot); and
control the robot according to the movement policy (See at least Van Heukelom Paragraphs 0022 and 0039, the robot is controlled to follow the trajectory).

Regarding Claim 8, Van Heukelom discloses a non-transitory computer-readable medium on which are stored instructions for controlling a robot (See at least Van Heukelom Paragraphs 0114 and 0011, the autonomous vehicle is interpreted as a robot), the instructions, when executed by a computer, causing the computer to perform the following steps: 
acquiring sensor data representing an environment of the robot (See at least Van Heukelom Paragraph 0026, sensor data of the environment is captured); 
identifying one or more objects in the environment of the robot from the sensor data (See at least Van Heukelom Paragraph 0027 and Figure 1, objects in the environment are identified using the sensor data); 
associating the robot and each of the one or more objects with a respective agent of a multiagent system (See at least Van Heukelom Paragraphs 0053-0055 and Figure 4, the robot, and objects 310 and 404 are interpreted as respective agents of a multiagent system); 
determining, for each agent of the multiagent system, a quality measure which includes a reward term for a movement action at a position (See at least Van Heukelom Paragraphs 0054-0055 and Figure 4, the predicted trajectories of the objects are interpreted as reward terms for a movement action at a position; See at least Van Heukelom Paragraph 0028 and Figure 1, the robot’s trajectory is also interpreted as a reward term for a movement action at a position), and a coupling term which depends on probabilities of the other agents of the multiagent system occupying the same position as the respective agent at a time (See at least Van Heukelom Paragraphs 0033-0035 and Figure 1, the probability of where the object will be is determined; See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the probability of each object’s position for each time step is combined to make a merged map with aggregated prediction possibilities, which is interpreted as a probability of the agents occupying the same position at the same time);
determining a movement policy of the robot that selects movement actions with a higher value of the quality measure determined for the robot with higher probability than movement actions with a lower value of the quality measure (See at least Van Heukelom Paragraphs 0037-0039 and Figure 1, the trajectory with the lowest probability of collision (i.e., the highest probability of avoiding collision) is determined as the movement policy of the robot); and
controlling the robot according to the movement policy (See at least Van Heukelom Paragraphs 0022 and 0039, the robot is controlled to follow the trajectory).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3-5 are rejected under 35 U.S.C. 103 as being unpatentable over Van Heukelom in view of Ostafew (US 20210237769 A1) (Hereinafter referred to as Ostafew).

Regarding Claim 3, Van Heukelom discloses the determining of the quality measures includes iteratively determining the quality measures in a plurality of iterations (See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the position and probability of each object is determined for each time step, which is interpreted as iteratively determining the quality measures), wherein each iteration includes a forward pass from an initial time to an end time over a plurality of time steps (See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the time steps are a forward pass from an initial time to an end time)…
Even though Van Heukelom discloses iteratively determining the quality measures over a plurality of time steps, Van Heukelom fails to disclose that each iteration includes… a backward pass from the end time to the initial time over the plurality of time steps.
However, Ostafew discloses a backward pass from the end time to the initial time over the plurality of time steps (See at least Ostafew Paragraphs 0205-0209 and Figure 16, the position of the vehicle is determined by going backwards in time from an end time (t+X) to an initial time (t), which is interpreted as a backward pass).
It would have been obvious to one of ordinary skill to modify the teachings disclosed in Van Heukelom with Ostafew to have the iterations include a backward pass from the end time to the initial time over the plurality of time steps. By using the backward pass from an end time to an initial time, a vehicle can determine where it can clear another vehicle that is heading its way (See at least Ostafew Paragraphs 0205-0209 and Figure 16), which would increase the safety of the system by preventing collisions. 

Regarding Claim 4, Van Heukelom discloses the coupling term is a function of the occupancy measures of the other agents (See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the occupancy measures of the other agents are determined) and the forward pass includes updating, for each agent, the occupancy measure of a next time step by propagating the occupancy measure of a current time step of an agent using the policy of the agent at the current time step (See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the occupancy measures of the other agents are determined for each time step starting from the current time step).
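
For illustration only, propagating an occupancy measure from a current time step to a next time step using an agent's policy can be sketched as below. This is a minimal sketch under stated assumptions (a finite set of positions, a known transition model, and the hypothetical names `forward_pass`, `policy`, and `transition`); it is not taken from the application or from the cited references.

```python
import numpy as np

def forward_pass(initial_occupancy, policy, transition):
    """Propagate one agent's occupancy measure forward in time.

    initial_occupancy: shape (S,), likelihood of each position at time 0.
    policy:            shape (T, S, A), policy[t, s, a] is the probability
                       of choosing action a at position s and time t.
    transition:        shape (S, A, S), transition[s, a, s2] is the
                       probability of reaching s2 from s under action a.
    Returns an array of shape (T + 1, S).
    """
    num_steps = policy.shape[0]
    occupancy = np.zeros((num_steps + 1, initial_occupancy.shape[0]))
    occupancy[0] = initial_occupancy
    for t in range(num_steps):
        # occupancy[t+1, s2] = sum over s, a of
        #   occupancy[t, s] * policy[t, s, a] * transition[s, a, s2]
        occupancy[t + 1] = np.einsum(
            "s,sa,sap->p", occupancy[t], policy[t], transition
        )
    return occupancy
```

If the policy and transition model are proper probability distributions, each row of the result sums to one, so every row remains a distribution over positions.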

Regarding Claim 5, Van Heukelom discloses the coupling term is a function of the occupancy measures of the other agents (See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the occupancy measures of the other agents are determined) and…updating, for each agent, the quality measure and policy of the agent (See at least Van Heukelom Paragraphs 0054-0057 and Figure 4, the quality measures and policies of the agents are updated for each time step)…
Even though Van Heukelom discloses updating the quality measure and policy of each agent, Van Heukelom fails to disclose the backward pass includes updating, for each agent, the quality measure and policy of the agent…at a current time step by using the occupancy measure of the other agents at a next time step.
However, Ostafew discloses the backward pass includes updating the position and path of each agent at a current time step by using the occupancy measure of the other agents at a next time step (See at least Ostafew Paragraphs 0205-0209 and Figure 16, the backward pass updates the movement action of the agent at a current time step (t) by using the occupancy measures of the other vehicle at a next time step (t+X)).
It would have been obvious to one of ordinary skill to modify the teachings disclosed in Van Heukelom with Ostafew to update, for each agent, the quality measure and policy of the agent at a current time step by using the occupancy measure of the other agents at a next time step. By updating the agent at a current time step by using the occupancy measure of the other agents at a next time step, a vehicle can determine where it can clear another vehicle that is heading its way (See at least Ostafew Paragraphs 0205-0209 and Figure 16), which would increase the safety of the system by preventing collisions. 
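
For illustration only, a backward pass that updates a quality measure and a policy at a current time step from the other agents' occupancy at the next time step can be sketched as below. The sketch assumes deterministic transitions, an additive occupancy penalty, and a greedy policy chosen purely for brevity; these choices and all names are hypothetical and are not asserted to match the application, Van Heukelom, or Ostafew.

```python
import numpy as np

def backward_pass(reward, next_state, others_occupancy, weight=1.0):
    """Update quality values and a greedy policy from end time to start.

    reward:           shape (S, A), reward for action a at position s.
    next_state:       integer array, shape (S, A), position reached from s
                      under action a (deterministic, an assumption).
    others_occupancy: shape (T + 1, S), summed occupancy of all other
                      agents at each time step and position.
    Returns q of shape (T, S, A) and policy of shape (T, S).
    """
    num_steps = others_occupancy.shape[0] - 1
    num_states, num_actions = reward.shape
    q = np.zeros((num_steps, num_states, num_actions))
    policy = np.zeros((num_steps, num_states), dtype=int)
    value = np.zeros(num_states)  # value beyond the end time is zero
    for t in range(num_steps - 1, -1, -1):
        # Coupling: penalize landing where others are likely to be at t + 1.
        coupling = -weight * others_occupancy[t + 1][next_state]
        q[t] = reward + coupling + value[next_state]
        policy[t] = np.argmax(q[t], axis=1)
        value = np.max(q[t], axis=1)
    return q, policy
```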

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Van Heukelom in view of Palanisamy et al (US 20190278282 A1) (Hereinafter referred to as Palanisamy).

Regarding Claim 6, Van Heukelom fails to disclose the movement policy is determined such that actions for a system state of the multiagent system are distributed according to a Boltzmann distribution depending on the quality measure determined for the robot.
However, Palanisamy discloses this limitation (See at least Palanisamy Paragraphs 0057 and 0059, the path of objects is predicted and used to plan the path for the vehicle, which is interpreted as movement policy of the multiagent system; See at least Palanisamy Paragraphs 0068-0069 and 0073, the movement policy is determined for the autonomous vehicle using a Boltzmann distribution depending on the reward, and the predicted paths of the objects).
It would have been obvious to one of ordinary skill to modify the teachings disclosed in Van Heukelom with Palanisamy to determine the movement policy based on the Boltzmann distribution depending on the quality measure. By using the Boltzmann distribution, the vehicle can choose the next movement policy/task for the next iteration for each of the N tasks, higher-level state, and the reward (See at least Palanisamy Paragraphs 0068-0069 and 0073). This would allow the vehicle to explore different methods of avoiding obstacles and select the easiest task (See at least Palanisamy Paragraphs 0068-0069 and 0073), which would increase the safety of the system by preventing accidents. 
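
For illustration only, a policy distributed according to a Boltzmann distribution over quality values assigns each action a probability proportional to exp(Q / temperature), so actions with a higher quality value are selected with higher probability. The sketch below is a generic softmax and is not taken from Palanisamy or from the application; the function name and the temperature parameter are assumptions.

```python
import numpy as np

def boltzmann_policy(q_values, temperature=1.0):
    """Action probabilities proportional to exp(q / temperature).

    temperature must be positive. Lower values concentrate probability
    on the best action; higher values approach a uniform distribution.
    """
    q = np.asarray(q_values, dtype=float)
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    weights = np.exp((q - q.max()) / temperature)
    return weights / weights.sum()
```

Two actions with equal quality values receive equal probability, and a difference of one in quality at temperature 1 corresponds to a probability ratio of e.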

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Zhu et al (US 11230297 B2) teaches predicting the probability of the movement of a pedestrian near an autonomous vehicle
Dupuis et al (US 20200198140 A1) teaches motion planning for a robot based on dynamic obstacles
Abrahams (US 20200172098 A1) teaches determining the probability of a moving object’s position at a future time
Nanri et al (US 20200111366 A1) teaches predicting the probability of another vehicle’s movement
Xiao et al (US 20180141544 A1) teaches predicting the trajectory of another vehicle and determining the probability of collision
Obata et al (US 20170210379 A1) teaches predicting the behavior of a user’s vehicle and other vehicles, and determining the possibility of collision between the vehicles


Any inquiry concerning this communication or earlier communications from the examiner should be directed to ESVINDER SINGH whose telephone number is (571)272-7875. The examiner can normally be reached Monday-Friday: 9 am-5 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abby Lin, can be reached at 571-270-3976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/E.S./              Examiner, Art Unit 3664
/BHAVESH V AMIN/              Primary Examiner, Art Unit 3664