DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.  Pursuant to communications filed on 26 November 2019, this is a First Action Non-Final Rejection on the Merits.  Claims 1-20 are currently pending in the instant application.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Regarding claim 1, Applicant provides the claim limitation, “applying the updated classifier to at least a subset of the second sensor data to determine probabilities associated with a plurality of statuses of the robotic apparatus with respect to one or more objects in the environment”; however, it is unclear, based on the currently provided claim language, how the “probabilities associated with a plurality of statuses of the robotic apparatus with respect to one or more objects in the environment” are determined.  Specifically, the claim language provides no link (i.e. correlation) between the claimed “sensor data” and the determination/identification of “a plurality of statuses of the robotic apparatus” and/or “one or more objects in the environment”.  It is therefore unclear how the “updated classifier” applied to “at least a subset of the second sensor data” determines said “probabilities”, and claim 1 is rendered indefinite.  Accordingly, appropriate correction and/or clarification are earnestly solicited.
Regarding claims 2-7, these claims are either directly or indirectly dependent upon independent claim 1, and therefore these claims are also rejected under this section for at least their dependence upon a rejected base claim.  Accordingly, appropriate correction and/or clarification are earnestly solicited.
Regarding claim 8, Applicant provides the claim limitation, “applying the updated classifier to at least a subset of second sensor data indicating at least a portion of the environment to assess at least a status of the robotic device with respect to one or more objects in the environment”; however, it is unclear, based on the currently provided claim language, how “a status of the robotic device with respect to one or more objects in the environment” is assessed.  Specifically, the claim language provides no link (i.e. correlation) between the claimed “sensor data” and the determination/identification of “a status of the robotic device” and/or “one or more objects in the environment”.  It is therefore unclear how the “updated classifier” applied to “at least a subset of second sensor data” assesses said “at least a status of the robotic device”, and claim 8 is rendered indefinite.  Accordingly, appropriate correction and/or clarification are earnestly solicited.
Regarding claims 9-14, these claims are either directly or indirectly dependent upon independent claim 8, and therefore these claims are also rejected under this section for at least their dependence upon a rejected base claim.  Accordingly, appropriate correction and/or clarification are earnestly solicited.
Regarding claim 15, Applicant provides the claim limitation, “apply the updated classifier to at least a subset of second sensor data obtained at a second time, to assess at least a status of a robotic end-effector”; however, it is unclear, based on the currently provided claim language, how “a status of a robotic end-effector” is assessed.  Specifically, the claim language provides no link (i.e. correlation) between the claimed “sensor data” and the determination/identification of “a status of a robotic end-effector” and/or “one or more objects”.  It is therefore unclear how the “updated classifier” applied to “at least a subset of second sensor data” assesses said “at least a status of a robotic end-effector”, and claim 15 is rendered indefinite.  Accordingly, appropriate correction and/or clarification are earnestly solicited.
Regarding claims 16-20, these claims are either directly or indirectly dependent upon independent claim 15, and therefore these claims are also rejected under this section for at least their dependence upon a rejected base claim.  Accordingly, appropriate correction and/or clarification are earnestly solicited.

Examiner notes that the claims have been addressed below in view of the prior art of record, as best understood by the Examiner, in light of the rejections under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, provided herein.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Levine et al (US 2017/0334066 A1, hereinafter Levine).
Regarding claim 1, Levine discloses a computer-implemented method of controlling a robotic apparatus for manipulating objects, comprising: 
at a first point in time, obtaining first sensor data indicating at least a portion of an environment where the robotic apparatus resides (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, block 754, and as further discussed in at least paragraph 0105, wherein “the system identifies an image that captures one or more environmental objects in an environment of the robot”); 
updating a classifier by changing at least one parameter or state of the classifier based on at least a subset of the first sensor data (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, block 756, and further as discussed in at least paragraph 0106, wherein “the system applies the image and the candidate robot movement parameters (e.g., a candidate robot state and candidate action(s)) to a trained neural network.”); 
at a second point in time that succeeds the first point in time, obtaining second sensor data indicating at least a portion of the environment (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, blocks 756-758, and further as discussed in at least paragraphs 0105-0108 and 0110, wherein subsequent to block 754 identifying an image, the method of 700 proceeds to blocks 756-758 and generates another predicted image based on the additional candidate robot movement parameters); 
applying the updated classifier to at least a subset of the second sensor data to determine probabilities associated with a plurality of statuses of the robotic apparatus with respect to one or more objects in the environment (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, and as further discussed in at least paragraphs 0110-0112, wherein the method 700 “may repeat, each time utilizing the predicted image generated in an immediately preceding iteration as the image identified at that iteration of block 754 and applied at that iteration of block 756, thereby enabling predicted images to be generated for each of multiple time steps in the future. In this manner, the predicted images may be utilized to determine motion of object(s) over multiple time steps, conditioned on a current image and based on candidate robot movement parameters.”); 
determining a robotic action based, at least in part, on the probabilities associated with the plurality of statuses (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, block 762 and as further discussed in at least paragraphs 0110-0112, wherein “the system may determine, based on comparing the predicted image to the current image, that the motion caused by the candidate movement is desirable” and further “where multiple predicted images are determined for a candidate movement the system may determine, based on the predicted images and/or the current image, that the motion caused by the candidate movement is desirable.”); and 
causing the robotic apparatus to perform the robotic action (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, block 762 and as further discussed in at least paragraphs 0110-0112, wherein, “Based on determining the motion is desirable, the system may perform the candidate movement by, for example, providing one or more control commands to one or more actuators of the robot to effectuate the candidate movement.”).
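For orientation only, the sequence of steps recited in claim 1 can be sketched as follows. This is a minimal illustration and is not taken from Levine or from Applicant's specification: the class `RecurrentStatusClassifier`, the status labels, the feature vectors, the decay factor, and the 0.6 threshold are all hypothetical, and the "update" is modeled as a change to an internal state of the classifier rather than retraining.

```python
import math

# Hypothetical status labels; the claims do not name specific statuses.
STATUSES = ["none_grasped", "one_grasped", "many_grasped"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class RecurrentStatusClassifier:
    """Toy classifier with fixed weights and an internal state (assumed design)."""

    def __init__(self, weights, decay=0.5):
        self.weights = weights  # one row of feature weights per status
        self.decay = decay
        self.state = [0.0] * len(weights[0])

    def update(self, features):
        # "Updating" here changes the classifier's state, not its weights.
        self.state = [
            self.decay * s + (1.0 - self.decay) * f
            for s, f in zip(self.state, features)
        ]

    def apply(self, features):
        # Combine the stored state with the new data and score each status.
        combined = [s + f for s, f in zip(self.state, features)]
        scores = [sum(w * c for w, c in zip(row, combined)) for row in self.weights]
        return dict(zip(STATUSES, softmax(scores)))


def decide_action(probs, threshold=0.6):
    """Pick an action from the status probabilities (hypothetical rule)."""
    status, p = max(probs.items(), key=lambda kv: kv[1])
    if p < threshold:
        return "continue"  # not confident enough yet; wait for more data
    return "stow" if status == "one_grasped" else "abort"


def control_step(classifier, first_features, second_features):
    classifier.update(first_features)          # first point in time
    probs = classifier.apply(second_features)  # second point in time
    return probs, decide_action(probs)
```

In this sketch the same second observation yields a different action depending on whether the classifier was first updated with the earlier observation, which is the kind of link between sensor data and status probabilities that the rejection under 35 U.S.C. 112(b) finds missing from the claim language.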
Regarding claim 2, Levine further discloses wherein the first sensor data includes a first image and the second sensor data includes a second image (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, and as further discussed in at least paragraphs 0110-0112, wherein the method 700 “may repeat, each time utilizing the predicted image generated in an immediately preceding iteration as the image identified at that iteration of block 754 and applied at that iteration of block 756, thereby enabling predicted images to be generated for each of multiple time steps in the future.”).
Regarding claim 3, Levine further discloses wherein the first image and second image are consecutive images within a sequence of images (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, and as further discussed in at least paragraphs 0110-0112, wherein the method 700 “may repeat, each time utilizing the predicted image generated in an immediately preceding iteration as the image identified at that iteration of block 754 and applied at that iteration of block 756, thereby enabling predicted images to be generated for each of multiple time steps in the future.”).
Regarding claim 4, Levine further discloses wherein the robotic apparatus includes a robotic end-effector used to grasp one or more objects (Figure 1; at least as in paragraphs 0049-0052, wherein one or more robots 180A, 180B each include end effectors 182A, 182B, respectively, for grasping one or more objects).
Regarding claim 5, Levine further discloses wherein the plurality of statuses are defined in accordance with a quantity of objects grasped by the robotic end-effector (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, wherein the robot status includes whether or not the robot end effector is currently holding an object).  Examiner notes that, given the broadest reasonable interpretation of the currently provided claim limitation, “a quantity of objects grasped by the robotic end-effector” may reasonably be construed as a simple determination of whether or not the robotic end effector is holding an object, which is easily identified/determined by the system of Levine through the image recognition, machine learning process shown in at least Figure 7.
Regarding claim 6, Levine further discloses wherein determining the robotic action comprises applying at least one reinforcement learning policy, at least in part, to the probabilities associated with the plurality of statuses (Figures 5-7; at least as in paragraphs 0094-0101, specifically as shown in at least Figures 6A-6B, wherein deep machine learning models are utilized to determine a corresponding robotic action for the robotic apparatus).
Regarding claim 7, Levine further discloses wherein the reinforcement learning policy is further applied to information indicating the second point in time (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, wherein the process 700 is a recursive process, and further wherein the trained neural network may be continuously updated/refined based on the detected sensor data (i.e. image data)).
Regarding claim 8, Levine discloses a non-transitory computer-readable medium storing contents that, when executed by one or more processors (at least as in paragraphs 0018 and 0101), cause the one or more processors to perform acts comprising: 
updating a classifier based on at least a subset of first sensor data indicating at least a portion of an environment where a robotic device (Figures 1 & 8, robot 180A, 180B, 840) resides (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, block 756, and further as discussed in at least paragraph 0106, wherein “the system applies the image and the candidate robot movement parameters (e.g., a candidate robot state and candidate action(s)) to a trained neural network.”); 
applying the updated classifier to at least a subset of second sensor data indicating at least a portion of the environment to assess at least a status of the robotic device with respect to one or more objects in the environment (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, and as further discussed in at least paragraphs 0110-0112, wherein the method 700 “may repeat, each time utilizing the predicted image generated in an immediately preceding iteration as the image identified at that iteration of block 754 and applied at that iteration of block 756, thereby enabling predicted images to be generated for each of multiple time steps in the future. In this manner, the predicted images may be utilized to determine motion of object(s) over multiple time steps, conditioned on a current image and based on candidate robot movement parameters.”); 
determining a robotic action based, at least in part, on the status assessed (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, block 762 and as further discussed in at least paragraphs 0110-0112, wherein “the system may determine, based on comparing the predicted image to the current image, that the motion caused by the candidate movement is desirable” and further “where multiple predicted images are determined for a candidate movement the system may determine, based on the predicted images and/or the current image, that the motion caused by the candidate movement is desirable.”); and 
causing the robotic device to perform the robotic action (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, block 762 and as further discussed in at least paragraphs 0110-0112, wherein, “Based on determining the motion is desirable, the system may perform the candidate movement by, for example, providing one or more control commands to one or more actuators of the robot to effectuate the candidate movement.”).
Regarding claim 9, Levine further discloses wherein the acts further comprise applying the classifier to at least a subset of the first sensor data to assess at least a status of the robotic device prior to applying the updated classifier (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, and as further discussed in at least paragraphs 0110-0112, wherein the method 700 “may repeat, each time utilizing the predicted image generated in an immediately preceding iteration as the image identified at that iteration of block 754 and applied at that iteration of block 756, thereby enabling predicted images to be generated for each of multiple time steps in the future.”).
Regarding claim 10, Levine further discloses wherein the acts comprise further updating the updated classifier based on at least a subset of the second sensor data (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, and as further discussed in at least paragraphs 0110-0112, wherein the method 700 “may repeat, each time utilizing the predicted image generated in an immediately preceding iteration as the image identified at that iteration of block 754 and applied at that iteration of block 756, thereby enabling predicted images to be generated for each of multiple time steps in the future.”).
Regarding claim 11, Levine further discloses wherein the acts further comprise applying the further updated classifier to at least a subset of third sensor data indicating at least a portion of the environment (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, and as further discussed in at least paragraphs 0110-0112, wherein the method 700 “may repeat, each time utilizing the predicted image generated in an immediately preceding iteration as the image identified at that iteration of block 754 and applied at that iteration of block 756, thereby enabling predicted images to be generated for each of multiple time steps in the future.”).
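The repetition of method 700 relied upon above, in which each predicted image is fed back as the input image of the next iteration, can be sketched as a simple rollout loop. This is an illustrative sketch only: `rollout` and `shift_predict` are hypothetical names, and `shift_predict` is a trivial stand-in for the trained neural network of Levine, operating on a one-dimensional list in place of an image.

```python
def rollout(predict, current_image, candidate_params, steps):
    """Generate predicted images for several future time steps.

    Each iteration uses the image predicted in the immediately preceding
    iteration as its input, starting from the current image.
    """
    predicted = []
    image = current_image
    for _ in range(steps):
        image = predict(image, candidate_params)
        predicted.append(image)
    return predicted


def shift_predict(image, params):
    """Stand-in predictor (assumption): shift the 'image' right by params['shift']."""
    k = params["shift"] % len(image)
    return image[-k:] + image[:-k] if k else list(image)
```

For example, `rollout(shift_predict, [1, 0, 0, 0], {"shift": 1}, 3)` returns three predicted frames in which the single marked cell moves one position per time step.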
Regarding claim 12, Levine further discloses wherein determining the robotic action is based further on at least one reinforcement learning policy in accordance with reward values assigned to different status-action pairs (Figures 5-7; at least as in paragraphs 0094-0101, specifically as shown in at least Figures 6A-6B, wherein deep machine learning models are utilized to determine a corresponding robotic action for the robotic apparatus).  Examiner notes that the claimed “reward values assigned to different status-action pairs” are disclosed in Applicant’s specification, wherein “a table of reward values (“rewards”) can be pre-determined to associate with individual robotic actions with each grasping status.”  Therefore, Examiner contends that the claimed “reward” or “reward values” simply identify a corresponding robotic action based on the current status of the robot, which is disclosed in the above referenced sections of Levine.
Regarding claim 13, Levine further discloses wherein the reward values include positive and negative numbers (Figures 5-7; at least as in paragraphs 0094-0101, specifically as shown in at least Figures 6A-6B, wherein deep machine learning models are utilized to determine a corresponding robotic action for the robotic apparatus).  As noted above, with respect to claim 12, Applicant’s claimed “reward values” are simply identifying a corresponding robotic action with the current status of the robot apparatus, and therefore, it would reasonably be construed wherein the claimed “positive number” correlates to a desired/identified robot action, and the claimed “negative number” correlates to one or more actions that are not desired or to be implemented by said robot apparatus.
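By way of illustration only, a pre-determined table of reward values for status-action pairs, containing both positive and negative numbers, might be used as sketched below. The status labels and every numeric value are hypothetical and appear in neither Levine nor Applicant's specification; the action names follow claim 16. The selection rule shown is a plain expected-reward lookup, not a trained reinforcement learning agent.

```python
# Hypothetical reward table: REWARDS[status][action].
# Positive values mark desired actions, negative values undesired ones.
REWARDS = {
    "none_grasped": {"abort": 1.0, "continue": 0.0, "stow": -2.0},
    "one_grasped": {"abort": -1.0, "continue": 0.0, "stow": 2.0},
    "many_grasped": {"abort": 1.0, "continue": 0.0, "stow": -2.0},
}
ACTIONS = ["abort", "continue", "stow"]


def expected_rewards(probs):
    """Weight each action's reward by the probability of each status."""
    return {
        action: sum(p * REWARDS[status][action] for status, p in probs.items())
        for action in ACTIONS
    }


def best_action(probs):
    """Return the action with the highest expected reward."""
    scores = expected_rewards(probs)
    return max(scores, key=scores.get)
```

With status probabilities of 0.1, 0.8 and 0.1 for the three statuses in table order, the expected reward of "stow" is 1.2 and of "abort" is -0.6, so "stow" is selected; with 0.7, 0.2 and 0.1, "abort" is selected instead.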
Regarding claim 14, Levine further discloses wherein the classifier includes at least one convolutional neural network (CNN) and at least one long short-term memory (LSTM) network (Figures 5-7; at least as in paragraphs 0094-0101, specifically as shown in at least Figures 6A-6B, wherein deep machine learning models are utilized to determine a corresponding robotic action for the robotic apparatus, and specifically as in at least paragraph 0095, wherein “in the neural network 600 of FIGS. 6A and 6B, convolutional layer 661, convolutional LSTM layers 672-677, and convolutional layer 662 are utilized to process an image 601 (e.g., a camera captured image in an initial iteration, and a most recently predicted image in subsequent iterations).”).
Regarding claim 15, Levine discloses a system (Figure 8, robot control system 860; at least as in paragraphs 0101 and 0115), comprising: 
one or more processors (at least as in paragraphs 0018 and 0101); and 
memory storing contents that, when executed by the one or more processors (at least as in paragraphs 0018 and 0101), cause the system to: 
update a classifier based on at least a subset of first sensor data obtained at a first time (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, block 756, and further as discussed in at least paragraph 0106, wherein “the system applies the image and the candidate robot movement parameters (e.g., a candidate robot state and candidate action(s)) to a trained neural network.”); 
apply the updated classifier to at least a subset of second sensor data obtained at a second time, to assess at least a status of a robotic end-effector (Figure 1, robotic end effectors 182A, 182B) with respect to one or more objects (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, and as further discussed in at least paragraphs 0110-0112, wherein the method 700 “may repeat, each time utilizing the predicted image generated in an immediately preceding iteration as the image identified at that iteration of block 754 and applied at that iteration of block 756, thereby enabling predicted images to be generated for each of multiple time steps in the future. In this manner, the predicted images may be utilized to determine motion of object(s) over multiple time steps, conditioned on a current image and based on candidate robot movement parameters.”); 
determine a robotic action based, at least in part, on the status assessed (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, block 762 and as further discussed in at least paragraphs 0110-0112, wherein “the system may determine, based on comparing the predicted image to the current image, that the motion caused by the candidate movement is desirable” and further “where multiple predicted images are determined for a candidate movement the system may determine, based on the predicted images and/or the current image, that the motion caused by the candidate movement is desirable.”); and 
cause a robotic device including the robotic end-effector to perform the robotic action (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, block 762 and as further discussed in at least paragraphs 0110-0112, wherein, “Based on determining the motion is desirable, the system may perform the candidate movement by, for example, providing one or more control commands to one or more actuators of the robot to effectuate the candidate movement.”).
Regarding claim 16, Levine further discloses wherein the robotic action includes at least one of (a) abort - stop a current grasp with the robotic end-effector, and retry, (b) continue - wait for third sensor data to be obtained, or (c) stow - stow at least one item grasped by the robotic end-effector (Figures 1 & 5-7; at least as in paragraphs 0001, 0047-0048 and 0101-0112).  Examiner notes that, based on the method(s) provided by Figures 5 & 7 (and corresponding related text), a robot action is determined, and further that said one or more robot actions include the well-known action of “utilize a grasping end effector…to pick up an object from a first location, move the object to a second location, and drop off the object at the second location.”  Examiner therefore contends that at least this well-known teaching of a robotic action for picking an object with the end effector, moving said object to a second location, and subsequently dropping off said object at said second location is construed as a “stow” robotic action.
Regarding claim 17, Levine further discloses wherein the first time precedes the second time (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, and as further discussed in at least paragraphs 0110-0112, wherein the method 700 “may repeat, each time utilizing the predicted image generated in an immediately preceding iteration as the image identified at that iteration of block 754 and applied at that iteration of block 756, thereby enabling predicted images to be generated for each of multiple time steps in the future.”).
Regarding claim 18, Levine further discloses wherein the first time and the second time are two points within a time sequence of sensor data obtained (Figures 5-7; at least as in paragraphs 0080, 0094-0098 and 0101-0112, specifically as in at least Figure 7, and as further discussed in at least paragraphs 0110-0112, wherein the method 700 “may repeat, each time utilizing the predicted image generated in an immediately preceding iteration as the image identified at that iteration of block 754 and applied at that iteration of block 756, thereby enabling predicted images to be generated for each of multiple time steps in the future.”).
Regarding claim 19, Levine further discloses wherein determining the robotic action includes applying a trained reinforcement learning agent (Figures 5-7; at least as in paragraphs 0094-0101, specifically as shown in at least Figures 6A-6B, wherein deep machine learning models are utilized to determine a corresponding robotic action for the robotic apparatus, and specifically as in at least paragraph 0095, wherein “in the neural network 600 of FIGS. 6A and 6B, convolutional layer 661, convolutional LSTM layers 672-677, and convolutional layer 662 are utilized to process an image 601 (e.g., a camera captured image in an initial iteration, and a most recently predicted image in subsequent iterations).”).
Regarding claim 20, Levine further discloses wherein the reinforcement learning agent is trained independently from training of the classifier (Figures 5-7; at least as in paragraphs 0094-0101, specifically as shown in at least Figures 6A-6B, wherein deep machine learning models are utilized to determine a corresponding robotic action for the robotic apparatus, and specifically as in at least paragraph 0095, wherein “in the neural network 600 of FIGS. 6A and 6B, convolutional layer 661, convolutional LSTM layers 672-677, and convolutional layer 662 are utilized to process an image 601 (e.g., a camera captured image in an initial iteration, and a most recently predicted image in subsequent iterations)”).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See attached PTO-892 – Notice of References Cited form.  Examiner additionally notes the following references, which are in the same field of endeavor as the instant invention and also read on many of the currently provided claim limitations:
US 2019/0126472 A1, issued to Tunyasuvunakool et al, which is directed towards reinforcement and imitation learning for a task including a neural network control system for controlling an agent (i.e. robot, autonomous vehicle, etc.) to perform a task in a real-world environment based on both image data and proprioceptive data describing the configuration of the agent.
US 2019/0130216 A1, issued to Tomioka et al, which is directed towards an information processing apparatus and corresponding method that employs machine learning techniques, which may be applied to a robotic manipulator with an end effector, for handling one or more objects detected in an environment, in which said robotic manipulator is located.
US 2017/0106542 A1, issued to Wolf et al, which is directed towards a robotic manipulator for handling one or more objects that employs neural networks and LSTM models for controlling said robotic manipulator.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN L SAMPLE whose telephone number is (571)270-5925. The examiner can normally be reached Monday-Friday 7:00am-4:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Adam Mott, can be reached at 571-270-5376. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JONATHAN L SAMPLE/Primary Examiner, Art Unit 3664