DETAILED ACTION
Notice of Pre-AIA or AIA Status
1.     The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
2.      The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


3.      Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Nagarajan et al., US 10,981,272 in view of Aloimonos et al., US 2016/0221190.

     Regarding claim 1, Nagarajan teaches a computer-implemented method for teaching a robot a task in a cluttered environment (See Fig.2; Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67, which disclose teaching/training a robot task in an environment with multiple objects), comprising: 
      receiving an input (See Fig.2; Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67, which disclose receiving sensor inputs including camera images as well as speech/audio such as a command of “bring me the cup”);
      parsing the input to identify a task and a target object name (See Fig.2; Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67, which disclose identifying the cup as the object and identifying the task associated with the cup, such as grasping, based on the input of the voice command and/or a camera image identifying a cup);
     Nagarajan is silent with respect to receiving a set of time-series images depicting a demonstration of the task; based on the target object name, identifying a target object within the set of time-series images; identifying a timing of at least one physical movement associated with the target object within the set of time-series images; filtering the set of time-series images based on the target object and the timing of the at least one physical movement; and evaluating the filtered set of time-series images to isolate one or more skill parameters associated with performing the task.
     However, in the same field of endeavor, Aloimonos teaches receiving a set of time-series images depicting a demonstration of the task (See [0021] video demonstrations for the robot); based on the target object name, identifying a target object within the set of time-series images (See [0025]-[0026] and [0093], which disclose parsing the input by segmenting the video in time and extracting the objects, tools and movement, and also disclose that audio may additionally be used to identify the object and task involved such that the robot may perform the task); identifying a timing of at least one physical movement associated with the target object within the set of time-series images (See Aloimonos, [0027], [0036], and [0079] segmenting video in time such as changes in grasp/movement); filtering the set of time-series images based on the target object and the timing of the at least one physical movement (See Aloimonos, [0052]-[0057] pixel patch of the frame, frame by frame, based on the object and changes in grasp movements); and evaluating the filtered set of time-series images to isolate one or more skill parameters associated with performing the task (See [0021], [0025]-[0027]; [0031], [0036], [0042], [0066], [0049]-[0058]-[0061], and [0079] classifying the object and action and learning the grasp type for performing the task from segmenting the frames).
      It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Nagarajan to have incorporated the teachings of Aloimonos for the mere benefit of being able to have a robot perform a task in a manner as desired.

      Regarding claim 2, the combination teaches the method of claim 1, wherein the set of time-series images are RGB-D images (See Aloimonos, [0054]).
     Regarding claim 3, the combination teaches the method of claim 1, wherein filtering the set of time-series images further comprises spatially filtering the set of time-series images based on the target object (See Nagarajan, Col.5 line 22 to col.9 line 4; Col.14 line 39 to Col.17 line 67, bounding box in images used to contour the object for shape and size; Aloimonos, [0052]-[0057] pixel patch of the frame based on the object).
     Regarding claim 4, the combination teaches the method of claim 3, wherein spatially filtering the set of time-series images based on the target object further comprises spatially filtering the set of time-series images to identify one or more voxels associated with the target object (See Nagarajan, Col.5 line 22 to col.9 line 4; Col.14 line 39 to Col.17 line 67, bounding box in images used to contour the object for shape and size, as well as three dimensions and three-dimensional coordinates; Aloimonos, [0052]-[0057] pixel patch of the frame based on the object).
     Regarding claim 5, the combination teaches the method of claim 1, further comprising: detecting a plurality of objects within the set of time-series images; and based on the target object name, identifying a target object among the plurality of objects within the set of time-series images (See Nagarajan, Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67 classifying and identifying the objects in the video; Aloimonos, [0021], [0025]-[0026]; [0031], [0042], [0066], [0049]-[0050], [0058]-[0061], and [0089] classifying the objects in the video in order to be able to perform the same task).
      Regarding claim 6, the combination teaches the method of claim 1, further comprising: parsing the input to identify an object attribute and based on the target object name and the object attribute, identifying the target object within the set of time-series images (See Nagarajan, Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67 classification module for classifying and identifying the objects based on at least shape/dimensions/colors; Aloimonos, [0021], [0025]-[0026]; [0031], [0042], [0066], [0049]-[0050], [0058]-[0061], and [0089] classifying objects within the video).
      Regarding claim 7, the combination teaches the method of claim 1, wherein filtering the set of time-series images further comprises temporally filtering the set of time-series images based on the timing of the at least one physical movement (See Nagarajan, Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67, which disclose filtering based at least on before the grasp or after the grasp; Aloimonos, [0027], [0036], and [0079] segmenting video in time such as changes in grasp/movement).
      Regarding claim 8, the combination teaches the method of claim 7, wherein temporally filtering the set of time-series images based on the timing of the at least one physical movement further comprises temporally filtering the set of time-series images to identify one or more voxels associated with times in which a human hand approaches or leaves the target object (See Nagarajan, Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67 grabbing/retrieving a cup; Aloimonos, [0021], [0025]-[0027]; [0031], [0036]-[0037], [0042], [0066], [0049]-[0053], [0058]-[0061], and [0079] knowing which tools to grasp and changing tools would mean the human hand approaches or leaves an object).
      Regarding claim 9, the combination teaches the method of claim 1, wherein the at least one physical movement is associated with one of a grasp task or a release task (See Nagarajan, Col.5 line 22 to col.6 line19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 line 39 to 47; Col.17 lines 47 to 67 multiple grasp command/tasks; Aloimonos, [0062]-[0073] grasp types and hand).
      Regarding claim 10, the combination teaches the method of claim 1, wherein the task is a sequence of tasks (See Aloimonos, [0040]-[0042] and [0073] sequence of commands).
       Regarding claim 11, the combination teaches the method of claim 1, further comprising: encoding at least the one or more skill parameters as a task model (See Nagarajan, Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67 using the task models to execute the grasping commands; Aloimonos, [0021], [0025]-[0026]; [0031], [0042], [0066], [0049]-[0050], [0058]-[0061], and [0089], which disclose the encoded tasks with parameters).
       Regarding claim 12, the combination teaches the method of claim 11, further comprising: decoding the task model to calculate one or more motor commands corresponding to at least the one or more skill parameters for performing the task by a robot (See Nagarajan, Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67 using the task models to execute the grasping commands; Aloimonos, [0021], [0025]-[0026]; [0031], [0042], [0066], [0049]-[0050], [0058]-[0061], and [0089], which disclose the encoded tasks with parameters, whereby a decoding process must occur in order to execute the encoded instructions).
       Regarding claim 13, Nagarajan teaches a system comprising: 
       at least one processor; and at least one memory communicatively coupled to the at least one processor and having computer-executable instructions stored thereon, the computer-executable instructions when executed by the at least one processor (See Fig.2; Col.5 line 22 to col.6 line19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 line 39 to 47; Col.17 lines 47 to 67) causing the system to: 
      receive a verbal cue (See Fig.2; Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.7 line 63; Col.8 line 32 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67, which disclose speech/audio such as a command of “bring me the cup”);
      parse the verbal cue to identify a task and a target object name (See Fig.2; Col.5 line 22 to col.6 line 19; Col.6 line 60 to col.9 line 4; Col.14 lines 39 to 47; Col.17 lines 47 to 67, which disclose identifying the cup as the object and identifying the task associated with the cup, such as grasping, based on the input of the voice command and/or a camera image identifying a cup);
      Nagarajan is silent with respect to receive a set of time-series images depicting a demonstration of the task; detect a plurality of objects within the set of time-series images; based on the target object name, identify a target object from among the plurality of objects within the set of time-series images; identify a timing of at least one physical movement associated with the target object within the set of time-series images; filter the set of time-series images based on the target object and the timing of the at least one physical movement; and evaluate the filtered set of time-series images to identify one or more skill parameters associated with performing the task.
      However, in the same field of endeavor, Aloimonos teaches a system configured to receive a set of time-series images depicting a demonstration of the task (See [0021] video demonstrations for the robot); detect a plurality of objects within the set of time-series images (See Aloimonos, [0021], [0025]-[0026]; [0031], [0042], [0066], [0049]-[0050], [0058]-[0061], and [0089] classifying the objects in the video in order to be able to perform the same task); based on the target object name, identify a target object from among the plurality of objects within the set of time-series images (See [0025]-[0026] and [0093], which disclose parsing the input by segmenting the video in time and extracting the objects, tools and movement, and also disclose that audio may additionally be used to identify the object and task involved such that the robot may perform the task); identify a timing of at least one physical movement associated with the target object within the set of time-series images (See Aloimonos, [0027], [0036], and [0079] segmenting video in time such as changes in grasp/movement); filter the set of time-series images based on the target object and the timing of the at least one physical movement (See Aloimonos, [0052]-[0057] pixel patch of the frame, frame by frame, based on the object and changes in grasp movements); and evaluate the filtered set of time-series images to identify one or more skill parameters associated with performing the task (See [0021], [0025]-[0027]; [0031], [0036], [0042], [0066], [0049]-[0058]-[0061], and [0079] classifying the object and action and learning the grasp type for performing the task from segmenting the frames).
        It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Nagarajan to have incorporated the teachings of Aloimonos for the mere benefit of being able to have a robot perform a task in a manner as desired.

       Regarding claim 14, the claim has been analyzed and rejected for the same reasons set forth in the rejection of claim 2.
       Regarding claim 15, the claim has been analyzed and rejected for the same reasons set forth in the rejection of claim 3.
       Regarding claim 16, the claim has been analyzed and rejected for the same reasons set forth in the rejection of claim 4.
       Regarding claim 17, the claim has been analyzed and rejected for the same reasons set forth in the rejection of claim 7.
      Regarding claim 18, the claim has been analyzed and rejected for the same reasons set forth in the rejection of claim 8.
       Regarding claim 19, the claim has been analyzed and rejected for the same reasons set forth in the rejection of claim 9.
       Regarding claim 20, the claim has been analyzed and rejected for the same reasons set forth in the rejections of claims 1 and 11.

Contact
4.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ricky Chin whose telephone number is 571-270-3753. The examiner can normally be reached on M-F 8:30-6:00.
	If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Benjamin Bruckart, can be reached on 571-272-3982. The fax number for the organization where this application or proceeding is assigned is 703-872-9306. 
	Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
	

/Ricky Chin/
Primary Examiner 
AU 2423
(571) 270-3753
Ricky.Chin@uspto.gov