DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Pursuant to communications filed on 07/15/2019, this is a First Action Non-Final Rejection on the Merits. Claims 1-20 are currently pending in the instant application.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/03/2019 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the Examiner.
Examiner's Note
Examiner has cited particular paragraphs and/or column/line numbers or figures in the reference(s) as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. Applicant is respectfully requested, in preparing the responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner. Applicant is reminded 
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-10, 12-18, and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Porter et al. (US 10,766,137), hereinafter “Porter”.
Regarding claims 1, 12 and 20, Porter discloses a system, the associated non-transitory CRM and the associated method (e.g. via an artificial intelligence system for modeling and evaluating robotic success at task performance, as shown in figure 2A), comprising: 

at least one computer hardware processor (fig. 2A: robotic control system 220 having the processor 208 – see col. 8, lines 29-31; col. 8, lines 59-67); and 
at least one non-transitory computer-readable storage medium (fig. 2A: memory 206 and data repository 218 – col. 8, lines 31-43) storing processor-executable instructions that, when executed by the at least one computer hardware processor (220/208), cause the at least one computer hardware processor (220/208) to perform:
   
receiving, from one or more sensors (fig. 2A and 3: observation system 215), sensor data relating to a robot (fig. 2A: robotic system 210 and figs. 3-4: robotic system 110) (col. 7, lines 25-31; col. 7, lines 54-66);

generating, using a statistical model (e.g. machine learning), based on the sensor data, first control information for the robot (210) to accomplish a task (see figs. 3-4 depicting block 304 – see col. 13, lines 58-64 disclosing the reward predictor 236 builds a model of task based on the output of the classifier. As described above, this can be accomplished via machine learning, for example through Bayesian inference or a deep neural network. The resulting reward function can include a number of weighted parameters that influence the level of success achieved during task performance. See col. 7, lines 54-66); 
transmitting, to the robot (210/110), the first control information for execution of the task (col. 7, lines 24-53; col. 8, lines 5-23; col. 10, lines 7-23); and 
receiving, from the robot (210/110), a result of execution of the task (see figs. 3-4 depicting the interaction with robot, particularly Block 402 for evaluation of the result) (see col. 1, lines 49-60; col. 13, lines 22-57; col. 14, line 63 to col. 15, line 13).  
Regarding claims 2 and 13, Porter discloses, wherein the processor-executable instructions cause the at least one computer hardware processor (220/208) to further perform: 
in response to the result of execution of the task being unsuccessful: receiving, from a user, input relating to second control information for the robot to accomplish the task (see col. 6, lines 31-42; col. 10, lines 7-23; col. 10, lines 36-45; col. 13, lines 22-64); 
transmitting, to the robot, the second control information for execution of the task (see col. 6, lines 31-42; col. 10, lines 7-23; col. 10, lines 36-45; col. 13, lines 22-64); 
(see col. 6, lines 31-42; col. 10, lines 7-23; col. 10, lines 36-45; col. 13, lines 22-64); and
updating the statistical model based on the sensor data, the second control information, and the result of execution of the task (see col. 14, lines 42-64 disclosing the update step; col. 10, lines 7-23; col. 10, lines 36-45; col. 13, lines 22-64).  
Regarding claims 3 and 14, Porter discloses, wherein the processor-executable instructions cause the at least one computer hardware processor (220/208) to further perform: 
in response to the result of execution of the task being unsuccessful: 
updating a count of unsuccessful executions of tasks (see col. 15, lines 42-64 disclosing the update); and 
in response to the count of unsuccessful executions exceeding a threshold, receiving, from the user, the input relating to the second control information for the robot to accomplish the task (see col. 14, line 63 to col. 15, line 13 disclosing the threshold; see col. 15, lines 42-64 disclosing the threshold and the update).  
Regarding claims 4 and 15, Porter discloses, wherein the processor-executable instructions cause the at least one computer hardware processor (220/208) to further perform: 
generating, using the statistical model (e.g. machine learning), a confidence value for the first control information; in response to the confidence value not exceeding a confidence threshold, receiving, from the user, the input relating to the second control information for the robot to accomplish the task (see figs. 3-4 depicting block 304 – see col. 13, lines 58-64 disclosing the reward predictor 236 builds a model of task based on the output of the classifier. As described above, this can be accomplished via machine learning, for example through Bayesian inference or a deep neural network. The resulting reward function can include a number of weighted parameters that influence the level of success achieved during task performance. See col. 11, lines 5-24 disclosing the rewards and weighting that are interpreted as similar to the confidence values); and 
in response to the confidence value exceeding the confidence threshold, transmitting, to the robot, the first control information for execution of the task (see figs. 3-4 depicting block 304 – see col. 13, lines 58-64 disclosing the reward predictor 236 builds a model of task based on the output of the classifier. As described above, this can be accomplished via machine learning, for example through Bayesian inference or a deep neural network. The resulting reward function can include a number of weighted parameters that influence the level of success achieved during task performance. See col. 11, lines 5-24 disclosing the rewards and weighting that are interpreted as similar to the confidence values).  
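For illustration only, the decision flow recited in claims 3-4 and 14-15 can be sketched as follows. This is a minimal sketch and not part of the claims or of Porter: the class name FallbackPolicy, its fields, the ask_user callback, and the numeric thresholds are all hypothetical, and combining the failure count and the confidence test in a single check is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class FallbackPolicy:
    """Hypothetical helper: decides whether model or user control is used."""
    confidence_threshold: float  # assumed value, not taken from the record
    failure_threshold: int       # assumed value, not taken from the record
    failure_count: int = 0

    def record_result(self, success: bool) -> None:
        # Claims 3/14: update a count of unsuccessful executions.
        if not success:
            self.failure_count += 1

    def choose_control(
        self,
        first_control: str,
        confidence: float,
        ask_user: Callable[[], str],
    ) -> Tuple[str, str]:
        # Claims 3/14: count exceeding the threshold triggers user input.
        if self.failure_count > self.failure_threshold:
            return ask_user(), "user"
        # Claims 4/15: confidence not exceeding the threshold triggers user input.
        if confidence <= self.confidence_threshold:
            return ask_user(), "user"
        # Otherwise the first control information is transmitted to the robot.
        return first_control, "model"
```

In this sketch the user is consulted only when one of the two thresholds is crossed; otherwise the model-generated control information is used unchanged.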
Regarding claims 5-6 and 16, Porter discloses, wherein the first control information relates to a grasp pose for an end effector of the robot (210/110) (fig. 1A-D; see col. 3, lines 10-35); and wherein the grasp pose comprises a position vector and an orientation vector for the end effector of the robot (see fig. 1A-D; see col. 3, lines 10-35; col. 8, lines 5-16 disclosing the position data of the robotic device; see also col. 12, lines 46-51).  
Regarding claim 7, Porter discloses, wherein the statistical model (e.g. machine learning) comprises a convolutional neural network (see col. 2, line 55 to col. 3, line 9 disclosing the convolutional neural network 125 as shown in fig. 1D) including an input layer, one or more convolution layers, one or more pooling layers, one or more dense layers, and an output layer (see col. 5, lines 1-32 disclosing the layers). 
              
Regarding claims 8-9 and 17, Porter discloses, wherein the result of execution of the task indicates whether execution of the task was successful or unsuccessful (see fig. 3: Block 301 for the successful task – see col. 2, lines 27-55; col. 13, lines 22-46); and wherein the result of execution of the task is based on an indication from a user regarding whether the execution of the task was successful or unsuccessful (see fig. 3: Block 301 for the successful task - see col. 2, lines 27-55; col. 13, lines 22-46).  
Regarding claims 10 and 18, Porter discloses, wherein the task relates to a grasp pose (see fig. 1A-D; see col. 3, lines 10-35; col. 8, lines 5-16 disclosing the position data of the robotic device; see also col. 12, lines 46-51), wherein a torque across an end effector of the robot is measured, and wherein the result of execution of the task is successful or unsuccessful based on whether the measured torque exceeds or does not exceed a torque threshold (see col. 6, lines 15-52 disclosing the torque sensor as a recorded observed data for further learning).  
Allowable Subject Matter
Claims 11 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
1. – US 10,919,152 to Kalouche – Which is directed to a system comprising: an imitation learning engine; an operator system controller coupled to an image capturing device, the operator system controller configured to: process an image of a subject, captured by the image capturing device, using a machine learning algorithm to identify one or more body parts of the subject during execution of a task; and generate, based on the one or more body parts of the subject, body pose information of the subject in the captured image, the body pose information indicating a pose or motion trajectory of the subject in the captured image; and a robotic system controller communicating with the operator system controller over a network, the robotic system controller coupled to a second image capturing device, the robotic system controller configured to: receive one or more images of a robot and/or an environment surrounding the robot, captured by the second image capturing device, during execution of the task; generate one or more pose and/or motion commands by processing the body pose information received from the operator system controller; control one or more actuators of the robot according to the one or more pose and/or motion commands to cause the robot to take a pose or motion trajectory corresponding to the pose or motion trajectory of the subject in the captured 
2. – US 2021/0237266 to Kalashnikov et al. – Which is directed to using large-scale reinforcement learning to train a policy model that can be utilized by a robot in performing a robotic task in which the robot interacts with one or more environmental objects. In various implementations, off-policy deep reinforcement learning is used to train the policy model, and the off-policy deep reinforcement learning is based on self-supervised data collection. The policy model can be a neural network model. Implementations of the reinforcement learning utilized in training the neural network model utilize a continuous-action variant of Q-learning. Through techniques disclosed herein, implementations can learn policies that generalize effectively to previously unseen objects, previously unseen environments, etc.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jaime Figueroa whose telephone number is (571)270-7620.  The examiner can normally be reached on Monday-Friday 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey A. Burke, can be reached at (571)270-3844.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR 




/JAIME FIGUEROA/ Primary Examiner, Art Unit 3664-B