Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) was submitted on 12/03/2021. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4, 9, 11, 12, 14, 19 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang, Zhengtao, De Xu, and Min Tan. "Visual measurement and prediction of ball trajectory for table tennis robot." [hereinafter Zhang] in view of Tominaga, Masafumi, Hirotaka Ohta, and Shuji Hashimoto. "Image sequence prediction for remote robot control." [hereinafter Tominaga] further in view of Fasola, Juan, Paul E. Rybski, and M. Veloso. "Fast goal navigation with obstacle avoidance using a dynamic local visual model" [hereinafter Fasola].
Regarding claim 1, Zhang teaches:
(system consists of two cameras and a personal computer… prediction should be finished as soon as possible to leave more time to the robot to finish the hitting task… ping pong paddle moved to a location to hit an incoming ping pong ball; Zhang; II Distributed High-Speed StereoVision System, A. Hardware Architecture) and causing the robotic agent to move the one or more objects to the one or more target locations by repeatedly performing the following (the ball is returned by the paddle fixed on the end of a 5-DOF robot; Zhang; I Introduction, paragraphs 5-6):
	Zhang does not explicitly teach: receiving a current image of a current state of the real-world environment, determining, from the current image, a next sequence of actions to be performed by the robotic agent using a next image prediction neural network that predicts future images based on a current action and an action to be performed by the robotic agent
	Tominaga teaches:
receiving a current image of a current state of the real-world environment, determining, from the current image, a next sequence of actions to be performed by the robotic agent using a next image prediction neural network that predicts future images based on a current action and an action to be performed by the robotic agent (using a predicted image sequence in order to avoid time gaps between transmission and arrival… when operator makes moves or rotations, received images are manipulated via changing and/or scaling to construct a predicted image; Tominaga; III Image Prediction For Remote Control Operation)
It would have been obvious before the effective filing date for a person of ordinary skill in the art to combine the image prediction model taught by Tominaga with the apparatus that controls movement in a robotic agent taught by Zhang. The image prediction taught by Tominaga avoids the time gap between image transmission and arrival (Tominaga; B Image Processing For Remote Control Operation, paragraph 1), thus eliminating possible delays between image capture and movement of the robotic agent taught by Zhang.
Zhang in view of Tominaga [hereinafter Zhang-Tominaga] does not explicitly teach: wherein the next sequence is the sequence of a plurality of candidate sequences that, if performed by the robotic agent starting from when the environment is in the current state, would be most likely to result in the one or more objects being moved to the respective target locations, and directing the robotic agent to perform the next sequence of actions.
Fasola teaches: wherein the next sequence is the sequence of a plurality of candidate sequences that, if performed by the robotic agent starting from when the environment is in the current state, would be most likely to result in the one or more objects being moved to the respective target locations (3 different walking angles evaluated every 300ms… multiple states in navigation algorithm, of which one is chosen at a time to eventually reach the goal; Fasola; Obstacle Avoidance Algorithm, Figure 6), and directing the robotic agent to perform the next sequence of actions (AIBO is directed to follow the algorithm, deciding to take one action after another; Fasola; Figure 6).
It would have been obvious before the effective filing date for a person of ordinary skill in the art to combine the navigation and obstacle avoidance algorithm taught by Fasola with the teachings of Zhang-Tominaga, because the obstacle avoidance taught by Fasola allows the robotic agent taught by Zhang-Tominaga to avoid obstacles, which is a challenging problem for mobile robots (Fasola; Abstract; §1 Introduction, paragraph 1).
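For illustration only, the claimed selection step (choosing, from a plurality of candidate sequences, the one most likely to reach the target) can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from the application, Zhang, Tominaga, or Fasola: the action labels, the `select_next_sequence` helper, and the toy likelihood table are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

Action = str  # hypothetical action label, e.g. "left" or "straight"


def select_next_sequence(
    candidates: Sequence[Tuple[Action, ...]],
    likelihood: Callable[[Tuple[Action, ...]], float],
) -> Tuple[Action, ...]:
    """Return the candidate sequence with the highest estimated likelihood."""
    if not candidates:
        raise ValueError("at least one candidate sequence is required")
    return max(candidates, key=likelihood)


# Toy usage: three candidate headings scored by a made-up likelihood table.
toy_likelihoods = {("left",): 0.2, ("straight",): 0.7, ("right",): 0.1}
best = select_next_sequence(list(toy_likelihoods), toy_likelihoods.__getitem__)
```

In this sketch the likelihood function is supplied by the caller; how such a likelihood would actually be estimated is outside the scope of the example.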
	Regarding claims 11 and 12, Zhang teaches all the limitations and motivations of claim 1 in apparatus and/or product form rather than method form. Therefore, the supporting rationale of the rejection to claim 1 applies equally as well to those elements of claims 11 and 12. Zhang additionally teaches that the method is run on a system including a computer (Zhang; Abstract).
	Regarding claim 2, Zhang teaches: wherein the current image is an image captured by a camera of the robotic agent (smart cameras have computation ability by integrating a digital signal processor and a field-programmable gate array… camera captures and processes images of the ping pong ball in motion; Zhang; II Distributed High-Speed StereoVision System, A. Hardware Architecture).
	Regarding claim 4, Fasola teaches wherein directing the robotic agent to perform the next sequence of actions comprises:
directing the robotic agent to interrupt a current sequence of actions being performed by the robotic agent and to begin performing the next sequence of actions (AIBO switches between goal-navigation mode and contour-following mode… and is directed to turn in place if obstacle is in front, with the possibility of additional instructions; Fasola; §5 Obstacle Avoidance Algorithm, paragraph 2, Figure 6).
	Examiner notes that when AIBO detects an obstacle, the goal-navigating mode of the algorithm is interrupted, and AIBO moves onto the contour-following mode to avoid the obstacle. 
Regarding claim 9, Fasola teaches sampling the candidate sequences from a distribution over possible action sequences (multiple states in the navigation algorithm, and at each step, one of the candidates of steps is selected; Fasola; Obstacle Avoidance Algorithm).
Regarding claim 19, Fasola teaches all the limitations and motivations of claim 9 in apparatus and/or product form rather than method form. Therefore, the supporting rationale of the rejection to claim 9 applies equally as well to those elements of claim 19. Fasola additionally teaches that the method is meant to control the navigation of an autonomous robot, AIBO (Fasola; Abstract).

	Regarding claims 14 and 21, Fasola teaches all the limitations and motivations of claim 4 in apparatus and/or product form rather than method form. Therefore, the supporting rationale of the rejection to claim 4 applies equally as well to those elements of claims 14 and 21. Fasola additionally (Fasola, §3 The Robot Platform).

Claim 3 is rejected under 35 U.S.C. 103 as unpatentable over Zhang in view of Tominaga in view of Fasola [hereinafter Zhang-Tominaga-Fasola] further in view of Hohl, Lukas, et al. "Aibo and Webots: Simulation, wireless remote control and controller transfer" [hereinafter Hohl].
Regarding claim 3, Hohl teaches: further comprising: providing, for presentation to a user, a user interface that allows the user to specify the objects to be moved and the target locations (GUI runs on a host computer and commands can be given to the AIBO robot via the GUI; §3 System Overview, Fig. 3). 

Claims 5-8 and 15-18 are rejected under 35 U.S.C. 103 as unpatentable over [Zhang-Tominaga-Fasola] further in view of Walker, Jacob, Abhinav Gupta, and Martial Hebert. "Dense optical flow prediction from a static image." [hereinafter Walker].
Regarding claim 5, Walker teaches wherein the next image prediction neural network is a recurrent neural network that has been trained to (Neural network based approach to motion prediction; Walker; Abstract, §2 Background, Convolutional Neural Networks):
Walker does not explicitly teach: receive as input at least a current image and an input action, and process the input to generate a next image that is an image of a predicted next state of the environment if the robotic agent performs the input action when the environment is in the current state.
Zhang-Tominaga-Fasola teaches:
receive as input at least a current image and an input action (captured image sequences are sent to the operator side computer… operator makes a forward/backward or rotation movement; Tominaga; III Image Prediction For Remote Control Operation, A. Image Presentation from Omni-directional Camera, B. Image Processing for Remote Control Operation) and process the input to generate a next image that is an image of a predicted next state of the environment if the robotic agent performs the input action when the environment is in the current state (images are manipulated via changing and/or scaling when the operator makes a forward/backward or rotational movement; Tominaga; III Image Prediction For Remote Control Operation).
Zhang-Tominaga-Fasola does not teach: wherein, as part of generating the next image, the recurrent neural network generates a flow map that identifies, for each of a plurality of pixels in the next image, a respective predicted likelihood of the pixel having moved from each of a plurality of pixels in the current image.
However, Walker teaches: wherein, as part of generating the next image, the recurrent neural network generates a flow map that identifies (network similar to standard 7-layer architecture… for every pixel in the image, distribution of motions are predicted with directions and magnitudes, etc.; Figure 2), for each of a plurality of pixels in the next image, a respective predicted likelihood of the pixel having moved from each of a plurality of pixels in the current image (optical flow vectors quantized… probability distribution over flow vectors for each pixel, average of the vectors taken to produce the final prediction output for each pixel; Walker; §3.1 Regression as Classification, Figure 2).
	Examiner notes that Walker teaches the use of flow vectors for each pixel, giving a probability distribution over the directions in which the pixel will move. This is done for every pixel. While Walker's use of flow mapping and flow vectors is chronological (using a current image to predict a future image/frame based on movement probability), it would have been obvious before the effective filing date for a person of ordinary skill in the art to apply the same per-pixel flow-vector method to the image prediction of Tominaga in Zhang-Tominaga-Fasola, which predicts based upon the robotic agent's own movements, but not necessarily the movements of the environment.
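For illustration only, the per-pixel prediction described above (a probability distribution over quantized flow vectors, averaged into a single output vector) can be sketched as follows. This is a minimal sketch under stated assumptions, not Walker's implementation: the three-entry `codebook`, the use of a softmax, and the probability-weighted form of the average are assumptions made for the example.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]  # (dx, dy) displacement in pixels


def softmax(logits: List[float]) -> List[float]:
    """Convert raw per-class scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_flow(logits: List[float], codebook: List[Vec]) -> Vec:
    """Probability-weighted average of the quantized flow vectors for one pixel."""
    probs = softmax(logits)
    dx = sum(p * v[0] for p, v in zip(probs, codebook))
    dy = sum(p * v[1] for p, v in zip(probs, codebook))
    return (dx, dy)


# Hypothetical codebook: no motion, one pixel right, one pixel down.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
flow = expected_flow([0.0, 0.0, 0.0], codebook)  # equal scores for all classes
```

Applying `expected_flow` to every pixel would yield a dense flow field; the example handles a single pixel to keep the arithmetic visible.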
	Regarding claim 6, Walker teaches wherein determining the next sequence of actions comprises:
determining, using flow maps generated by the next image prediction neural network, a respective likelihood for each of the candidate sequences that performance of the actions in the candidate sequence by the robotic agent would result in the objects being moved to the target locations (network finds active elements in the scene and correctly predicts future motion based on context; Walker; Figure 4).
	Examiner notes that the network taught by Walker does not only predict the overall motion of the image, but also individual elements in the scene, which can be mapped to the object being moved in the claim language. For example, in multiple scenes such as the surfing scene on the second row, the directions for the overall wave in addition to the surfing and the crashing waves are predicted. In addition, in the archery scene, the motion of the archer’s hand and the bow are both predicted. The object being moved by the target in the claim language could be mapped to an individual element, such as the bow. And the object that moves the bow could be the human. Both of these elements are successfully predicted by Walker.
	Regarding claim 7, Zhang teaches wherein determining the next sequence of actions comprises:
determining one or more pixels in the current image that depict the one or more objects as currently located in the environment (for ball recognition, pixel values are analyzed to recognize the ball from the background and players; Zhang; III Image-Processing Algorithm, A Preprocessing).
	Regarding claim 8, Walker teaches wherein determining the respective likelihood for a given candidate sequence comprises recursively feeding as input to the neural network the actions in the sequence and the next images generated by the neural network for the actions (a modified Recurrent Neural Network, RNN, is proposed for multi-frame prediction; Walker; §5 Multi-Frame Prediction).
	Examiner notes that the broadest reasonable interpretation of recursively feeding as input to the neural network the actions in the sequence and the next images generated is back propagation. Recurrent Neural Networks inherently exercise back propagation through loops and recurrence. 
	Regarding claims 15-18, Walker teaches all the limitations and motivations of claims 5-8 in apparatus and/or product form rather than method form. Therefore, the supporting rationale of the rejection to claims 5-8 applies equally as well to those elements of claims 15-18. Walker additionally teaches that their research utilized Tesla K40 GPUs (Walker; Acknowledgements).

Claims 10 and 20 are rejected under 35 U.S.C. 103 as unpatentable over [Zhang-Tominaga-Fasola] further in view of Bartels, Chris and Gerard de Haan. "Smoothness constraints in recursive search motion estimation for picture rate conversion." [hereinafter Bartels].
	
	Regarding claim 10, Bartels teaches sampling the candidate sequences comprises: 
performing multiple iterations of sampling using a cross-entropy technique (vector candidate likelihood algorithm may be repeated over the same frame pair until convergence is reached… multiple iterations over the same frame pair improve convergence; Bartels; III RS-ME, Algorithm 1).
Examiner notes that it would have been obvious before the effective filing date for a person of ordinary skill in the art to use the teaching of a neural network image prediction system for navigation and motion control in a robotic agent as taught by Zhang-Tominaga-Fasola with the teaching of a cross-entropy technique as taught by Bartels. Examiner further notes that Bartels teaches that the algorithm can be repeated over the same pixels to improve convergence rather than running it only once. This also takes into account differences in estimation, otherwise known as a cross-entropy technique.
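For illustration only, multiple iterations of sampling with a generic cross-entropy method can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Bartels or from the application: the Gaussian sampling distribution, the `cross_entropy_search` helper, its parameter defaults, and the toy squared-error cost are all hypothetical.

```python
import random
import statistics
from typing import Callable, List


def cross_entropy_search(
    cost: Callable[[List[float]], float],
    horizon: int,
    iterations: int = 5,
    samples: int = 64,
    elites: int = 8,
    seed: int = 0,
) -> List[float]:
    """Iteratively refit a Gaussian over action sequences to the lowest-cost samples."""
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iterations):
        # Sample candidate sequences from the current distribution.
        batch = [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(samples)]
        # Keep the lowest-cost candidates and refit the distribution to them.
        top = sorted(batch, key=cost)[:elites]
        mean = [statistics.fmean([seq[t] for seq in top]) for t in range(horizon)]
        std = [statistics.pstdev([seq[t] for seq in top]) + 1e-6 for t in range(horizon)]
    return mean


# Toy usage: recover a made-up target action sequence.
target = [0.5, -0.25, 1.0]


def toy_cost(seq: List[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(seq, target))


plan = cross_entropy_search(toy_cost, horizon=3)
```

Each iteration narrows the sampling distribution around the best candidates found so far, which is the sense in which the sampling is repeated until convergence in this sketch.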
Regarding claim 20, Bartels teaches all the limitations and motivations of claim 10 in apparatus and/or product form rather than method form. Therefore, the supporting rationale of the rejection to claim 10 applies equally as well to those elements of claim 20. Bartels additionally teaches that the method is meant to target consumer-market embedded devices with limited resources (Bartels; Abstract).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC WU whose telephone number is (571) 272-3380. The examiner can normally be reached Mon-Thu: 0730-1700; Fri: Flex.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Omar Fernandez Rivas, can be reached at (571) 272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center.
/ERIC C WU/Examiner, Art Unit 2128
/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128