DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This action is in reply to the response filed on 4/27/2022.
Claims 1, 5, 15, 17, and 21 are amended.
Claims 1-28 are currently pending and have been examined. 
This action is made FINAL.

Response to Arguments
The Amendment filed 4/27/2022 has been entered. Claims 1-28 remain pending in the application. Applicant’s amendments to the Specification and Claims have overcome each and every objection and 112(b) rejection set forth in the Non-Final Office Action mailed 12/27/2021.
The terminal disclaimer was disapproved. This application was filed on or after September 16, 2012.  The person who signed the terminal disclaimer is not the applicant, the patentee or an attorney or agent of record. See 37 CFR 1.321(a) and (b). Please submit a Power of Attorney and resubmit the terminal disclaimer.  No fee is required for filing the terminal disclaimer again. (Note: Power of Attorney (PoA) can be given to a customer number, wherein all practitioners listed under the customer number have PoA. If PoA is given to a list of practitioners by registration number, the list may not comprise more than 10 practitioners, or a separate paper signed by a 37 CFR 1.33(b) party must be in the record identifying which of the practitioners, up to 10, are recognized as having PoA. A representative of the assignee, who is not of record, cannot sign the TD unless it is established that the representative is a party authorized to act on behalf of the assignee.). 
Regarding the double patenting rejection, the amendments to claim 1 have overcome the double patenting rejection set forth in the Non-Final Office Action mailed 12/27/2021. However, a new obviousness-type double patenting rejection of Claim 1 is made in view of U.S. Patent No. 11130236 and relevant prior art (see rejection) since the terminal disclaimer filed 4/27/2022 was disapproved. Additionally, another obviousness-type double patenting rejection is made in view of application 17/018674 (published 3/17/2022) and relevant prior art (see rejection).
The argument on page 14 regarding the use of Boca is unpersuasive. Although the human hand does not physically touch the workpiece in the demonstration of Boca, the operation to be performed on the workpiece is demonstrated by the human hand. Boca therefore teaches the claim limitation “demonstrating the operation on a workpiece by a human hand.” Accordingly, Boca remains the primary reference of the 103 rejections made.
The arguments on pages 14-15, regarding the combination of Boca and Sager for the amended claim 1 limitations are persuasive. Therefore, the rejection has been withdrawn.  However, upon further consideration due to the amendments, a new ground(s) of rejection is made in view of Boca (US 20150314442 A1), Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), and Sager (US 5040056 A).
The arguments on pages 18-19, regarding the combination of Boca, Sager, and Itkowitz for the amended claim 15 limitations are persuasive. Therefore, the rejection has been withdrawn.  However, upon further consideration based on the new amendments, a new ground(s) of rejection is made in view of Boca (US 20150314442 A1), Butler (US 20160307032 A1), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), and Sager (US 5040056 A).
The arguments on pages 19-20, regarding the combination of Boca, Sager, and Itkowitz for the amended claim 17 limitations are persuasive. Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Boca (US 20150314442 A1), Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), and Sager (US 5040056 A).
The argument on pages 20-21 regarding the use of Itkowitz is unpersuasive. Although fiducial markers are located on the key points of a human hand, three-dimensional positions are determined for the key identifiable points on the hand in the images. Accordingly, the Itkowitz reference remains in use for some 103 rejections.
All the dependent claims are also rejected in view of the new prior art rejections made on the parent claims.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-2, 11-14, 17, and 26-28 are rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), and Sager (US 5040056 A).

Regarding Claim 1, 
Boca teaches
A method for programming a robot to perform an operation by human demonstration, said method comprising (“There is described below the use of hand gestures to teach a path to be followed by the industrial robot 12 in performing work on workpiece 14.” [0021]):
demonstrating the operation on a workpiece by a human hand (“the instructions to the robot 12 that will be assembled from the hand gestures from the one or two hands seen by the camera and as described herein the object being pointed to, that is the scene data to create the path and instructions to be followed by the robot 12 when the robot performs work on the workpiece 14. For example, one hand is used to teach a robot target and the other hand is used to generate a grab or drop instruction. It is up to the robot operator to associate a particular hand gesture with a particular instruction.” [0023]; Examiner Interpretation: Despite the human hand not physically touching the workpiece in the demonstration, the operation to be performed on the workpiece is demonstrated by the human hand.);
analyzing camera images of the hand demonstrating the operation on the workpiece, by a computer (“the image of the location pointing hand gesture of step 304 and the associated location on the object are captured by the camera 11 and sent to the computation device 13 for processing. At step 308, the computation device 13 calculates from the image the corresponding location and orientation of the robot tool in the robot scene.” See at least [0034]),
a move step where hand pose and workpiece pose are determined at a plurality of points defining a move path (“At step 310 the calculated location and orientation of the robot tool are sent to the computation device. Query 312 asks if more location points are needed to complete the robot path. Query 312 can be another gesture. If the answer is yes, the method 300 asks at query 314 if there is a need to reposition the camera. If the answer to query 314 is no, then the method 300 returns to step 304 where the operator makes the hand gesture associated with the next location point. While not shown in FIG. 3, if the answer to query 314 is yes, then the method 300 returns to step 302 where the camera is repositioned. If the answer to query 312 is no, then method 300 ends since no more robot path points have to be acquired.” [0039]; Examiner Interpretation: Hand pose also corresponds to workpiece pose (see [0024]).)
generating robot motion commands, based on the demonstration data … to cause the robot to perform the operation on the new workpiece (“creating robot instructions from the gestures by using the gesture context to the scene data from the same image or as additional data or extra processing to calculate/generate robot instructions (step 704 and optional step 706), storing the created instructions (step 708), asking if more created instructions are needed (step 710) and in step 712 sending the created instructions to the robot if no more created instructions are needed and performing in FIG. 7b all of the steps shown in FIG. 7a except the step 712 of sending the created instructions to the robot. The optional step 706 in these flowcharts of providing the scene 3D model to convert the gesture to a robot instruction step 704 is only needed if the scene will be subtracted from the image of the gesture.” [0052]; “In general a robot move instruction has information about the robot tool and coordinate system used for the robot target” [0057]);
and performing the operation on the new workpiece by the robot (“that is the scene data to create the path and instructions to be followed by the robot 12 when the robot performs work on the workpiece 14.” [0023]; “By work is meant those actions performed by a robot such as painting, grinding, polishing, deburring, welding etc. that make a physical change to the workpiece and those interactions that a robot has with a workpiece such as picking up the workpiece from one location and moving it to another location or inserting the workpiece into a specific location that does not physically change the workpiece.” [0003]).

Boca does not explicitly teach
to create demonstration data where the demonstration data defines a pick, move and place operation including a grasping step where hand pose and workpiece pose are determined when the hand grasps the workpiece, … and a place step where the workpiece pose is determined when the workpiece becomes stationary after the move step, 
where the demonstration data includes a hand coordinate frame and a gripper coordinate frame corresponding to the hand coordinate frame, 
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; … and the initial position and orientation of the new workpiece,

However, Yusuke teaches
create demonstration data where the demonstration data defines a pick, move and place operation including a grasping step where hand pose and workpiece pose are determined when the hand grasps the workpiece, a move step …, and a place step where the workpiece pose is determined when the workpiece becomes stationary after the move step … and a gripper coordinate frame corresponding to the hand (“The control unit 16 assumes that the user's two fingers (hereinafter, the user's (human) finger is referred to as a “finger” and distinguished from the finger unit 14 or the finger members 14A and 14B of the robot 10) grips the workpiece. When recognized, the position of the finger in the work coordinate system is recognized (step S33). That is, the control unit 16 recognizes that the finger has gripped the workpiece from the finger position and the workpiece position, and converts the finger position at that time to the workpiece coordinate system. The control unit 16 stores the position in the workpiece coordinate system at this time in the storage unit 17 as gripping position information. Next, the user holding the workpiece with two fingers moves the workpiece to a desired location and releases the workpiece (releases the finger from the workpiece). The control unit 16 recognizes that the finger has moved away from the workpiece based on the image from the camera 15, recognizes the position and posture of the workpiece in the robot coordinate system at that time as the position and posture of the movement destination of the workpiece, is stored (step S34). With the above operation, the control unit 16 stores information on the position of the workpiece in the workpiece coordinate system and the position and posture of the workpiece in the robot coordinate system. As a result, preparation for instructing the work to be performed on the work and having the robot 10 perform the work on the work is completed. 
Note that the position and orientation of the movement destination of the workpiece may be recognized based on the position and orientation of the workpiece when it is recognized that the movement of the workpiece has stopped for a predetermined time or more based on the captured image. In the above embodiment, the gripping position of the workpiece is determined based on the gripping position when gripping the workpiece.” See at least page 3, line 48 to page 4, line 14.),
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca to further include the teachings of Yusuke to quickly and easily teach robots workpiece operations (See at least “problem to be solved” on page 1.).
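For illustration only, the grasp and place detection described in the Yusuke passage quoted above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Yusuke or from the application: the frame layout (keys "hand" and "workpiece"), the function name, and all threshold values are hypothetical.

```python
from math import dist

def segment_demonstration(frames, grasp_dist=0.02, still_eps=0.002, still_frames=3):
    """Return (grasp_index, place_index) for a demonstration.

    frames: list of dicts with hypothetical keys
        "hand":      (x, y, z) position of the tracked fingers
        "workpiece": (x, y, z) position of the workpiece
    grasp_dist, still_eps, still_frames are assumed thresholds.
    """
    # Grasp: first frame in which the fingers are close enough to the workpiece.
    grasp = None
    for i, frame in enumerate(frames):
        if dist(frame["hand"], frame["workpiece"]) < grasp_dist:
            grasp = i
            break
    if grasp is None:
        return None, None

    # Place: after the workpiece has moved, the frame at which it has
    # remained stationary for `still_frames` consecutive frames.
    moved = False
    still = 0
    for i in range(grasp + 1, len(frames)):
        step = dist(frames[i]["workpiece"], frames[i - 1]["workpiece"])
        if step > still_eps:
            moved = True
            still = 0
        elif moved:
            still += 1
            if still >= still_frames:
                return grasp, i
    return grasp, None
```

The workpiece position stored at the returned place index would correspond to the movement destination recorded in step S34 of the quoted passage.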

Yusuke also does not explicitly teach
where the demonstration data includes a hand coordinate frame and a gripper coordinate frame corresponding to the hand coordinate frame, 
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; … and the initial position and orientation of the new workpiece,

However, Kofman teaches
where the demonstration data includes a hand coordinate frame and a gripper coordinate frame corresponding to the hand coordinate frame (“The orientation of the hand of the operator is used to control the orientation of the robot-manipulator end-effector and is computed from the 3-D coordinates of the centroids of the three hand markers as shown in Fig. 3. Firstly, the midpoint of the line segment joining the thumb and index-finger marker centroids, T and I, respectively, is defined as M (Fig. 3(a)). A coordinate system XoYoZo with origin at wrist W is then defined by a translation of the local-site global reference coordinate system XYZ to the wrist [Fig. 3(b)]. Through yaw, pitch, and roll rotations, explained below, the final axes X3Y3Z3 to be used to determine the tool axes of the robot-end-effector are obtained with X3 collinear with WM, WTI coplanar with X3Y3, and T lying in the first quadrant of X3Y3, as shown in Fig. 3(b). The yaw-pitch-roll tool rotation angles are determined directly from the hand rotation angles of WM and TI as follows: yaw rotation α of coordinate system XoYoZo about Zo to X1Y1Z1, pitch rotation β of X1Y1Z1 about Y1 to X2Y2Z2, shown in Fig. 3(c) using -β for clarity, and roll rotation γ of X2Y2Z2 about X2 to X3Y3Z3, as shown in Fig. 3(d).” See at least Pg. 4, Col. 2, lines 1-20; Fig. 3 shows the hand coordinate frame which corresponds to the robot coordinate frame.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca and Yusuke to further include the teachings of Kofman regarding corresponding coordinate frames, to remotely control a robot based on position and orientation of a human operator’s hand in a demonstration of the operation without the restraints of sensors and wires on the human hand. See at least the introduction on Pgs. 1-2.
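For illustration only, the hand coordinate frame construction quoted from Kofman above can be sketched as follows. This is a minimal sketch, not Kofman's implementation: the function and helper names are hypothetical, and the yaw, pitch, and roll angles are extracted under the common Z-Y-X (yaw about Z, then pitch about Y, then roll about X) convention, which is an assumption and may differ in sign from the figures in Kofman.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    n = math.sqrt(_dot(a, a))
    if n == 0.0:
        raise ValueError("degenerate marker configuration")
    return tuple(x / n for x in a)

def hand_frame_ypr(wrist, thumb, index):
    """Return (yaw, pitch, roll) in radians from three 3-D marker centroids."""
    # M is the midpoint of the thumb and index-finger markers.
    mid = tuple((t + i) / 2.0 for t, i in zip(thumb, index))
    # X axis is collinear with the wrist-to-midpoint line WM.
    x_axis = _unit(_sub(mid, wrist))
    # Y axis lies in the plane W-T-I, on the thumb side of X.
    v = _sub(thumb, wrist)
    proj = _dot(v, x_axis)
    y_axis = _unit(tuple(vi - proj * xi for vi, xi in zip(v, x_axis)))
    # Z axis completes a right-handed frame.
    z_axis = _cross(x_axis, y_axis)
    # Z-Y-X Euler angles of the rotation whose columns are x_axis, y_axis, z_axis.
    yaw = math.atan2(x_axis[1], x_axis[0])
    pitch = -math.asin(max(-1.0, min(1.0, x_axis[2])))
    roll = math.atan2(y_axis[2], z_axis[2])
    return yaw, pitch, roll
```

Under these assumptions, the returned angles could be applied directly as tool rotation angles of a gripper coordinate frame that corresponds to the hand coordinate frame.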

Kofman also does not explicitly teach
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; … and the initial position and orientation of the new workpiece,
However, JETTÉ teaches
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper (“In this specific embodiment, the vacuum cup is made of a flexible, resilient material, and the relative distance between the robot and the workpiece held by the robot can vary based on this flexibility and operating conditions. Such variations in the relative distance between a given robot and the workpiece it holds was a source of positioning uncertainty in the reference frame of the robots. This gripper type was found to provide satisfactory gripping capability in the embodiment shown in FIG. 1, but it will be understood that other gripper types can be used in other embodiments. Moreover, more than one gripper, possibly of different gripper types, can be used as the end effector per robot if desired. For instance, a clamp gripper can be used in addition to a vacuum cup for a given robot, or for all robots, for instance. The gripper type or types can vary from one robot to another within a given workpiece holding system embodiment. Indeed, the exact type of gripper can be selected from the following general categories: impactive—e.g. jaws, clamps or claws which physically grasp by direct impact upon the object; ingressive—pins, needles or hackles which physically penetrate the surface of the object (e.g. an aperture or bore of the workpiece); astrictive—forces applied to the objects surface (e.g. by vacuum, magneto- or electroadhesion); and contigutive—requiring direct contact for adhesion to take place (e.g. surface tension or freezing).” [0043-0044])
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca, Yusuke, and Kofman to further include the teachings of JETTÉ so that gripper type can vary as needed for different application requirements (see at least [0043-0044]).

JETTÉ also does not explicitly teach
	analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; … and the initial position and orientation of the new workpiece,
However, Sager teaches
“This invention provides a method and apparatus which uses a vision-equipped robotic system to locate, identify and determine the orientation of objects, and to pick them up and transfer them to a moving or stationary destination.” Col. 1, lines 50-54; Examiner Interpretation: The determined locations and orientations of the objects are initial locations and orientations because they are in that position before being picked up. They are new workpieces since they are different from the demonstrated workpiece.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca, Yusuke, Kofman, and JETTÉ to further include the teachings of Sager to increase the flexibility of the robot to pick up randomly positioned objects (“known methods and apparatus are generally not effective with randomly positioned and randomly oriented objects. This is typically the case with objects that are deposited onto a conveyor belt, such as flat components that are asymmetrical about at least one axis. For these parts, the system must locate them on the moving conveyor belt and also determine their orientation. This requires a relatively sophisticated vision system.” See at least Col. 1, lines 17-24).

Regarding Claim 2,
Modified Boca teaches
	The method according to Claim 1
Boca further teaches
	wherein demonstrating the operation on the workpiece by the human hand and performing the operation on the new workpiece by the robot are both performed in a robotic work cell, and the camera images are taken by a single camera (“There is described below the use of hand gestures to teach a path to be followed by the industrial robot 12 in performing work on workpiece 14. As shown in FIG. 2a, an operator 16 uses hand gestures to point to a location in the robot workspace. The camera, which is a 3D vision sensor 11, is attached to the robot and takes an image of the hand gesture and the relationship of the operator's hand 16a to the workpiece 14. It should be appreciated that the workpiece 14 while shown in FIG. 2a may not be in the view seen by the camera. The workpiece image may have been taken at a different time and as is described below the image of the hand without the workpiece and the workpiece without the hand need to be referenced to a common coordinate system. FIG. 2b shows one example of the image and the relationship of the operator's hand 16a to the workpiece 14.” See at least [0021-0022] and Figs. 2a and 2b; Examiner Interpretation: The demonstration is performed in the robotic work cell because the hand gestures point to a location within the robot workspace. Fig. 2a shows the hand within the workspace of the robot.).

Regarding Claim 11,
Modified Boca teaches
The method according to Claim 1
Boca does not explicitly teach
wherein the new workpiece, before the operation by the robot, rides on a conveyor, and the initial position of the new workpiece is a function of a conveyor position index.
However, Sager teaches
“This invention provides a method and apparatus which uses a vision-equipped robotic system to locate, identify and determine the orientation of objects, and to pick them up and transfer them to a moving or stationary destination. A video camera periodically records images of objects located on a moving conveyor belt. The images are identified and their position and orientation is recorded in a moving conveyor belt coordinate system. The information is transmitted to a motion control device associated with a first robot. The motion control device coordinates the robot with the moving belt coordinate system and instructs the robot's arm to pick up certain objects” See at least Col. 1, lines 50-62; Examiner Interpretation: The use of the moving conveyor belt coordinate system to identify location and orientation is interpreted to be the same as using a function of a conveyor position index.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Sager to increase the flexibility of the robot to pick up randomly positioned objects (“known methods and apparatus are generally not effective with randomly positioned and randomly oriented objects. This is typically the case with objects that are deposited onto a conveyor belt, such as flat components that are asymmetrical about at least one axis. For these parts, the system must locate them on the moving conveyor belt and also determine their orientation. This requires a relatively sophisticated vision system.” See at least Col. 1, lines 17-24).
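For illustration only, an initial workpiece position expressed as a function of a conveyor position index can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Sager or from the application: the function name, the encoder-count index, the millimeters-per-count scale, and the belt direction vector are all hypothetical.

```python
def position_from_conveyor_index(pos_at_capture, index_at_capture, index_now,
                                 mm_per_count, belt_dir=(1.0, 0.0)):
    """Return the (x, y) position of a workpiece in the robot frame, in mm.

    pos_at_capture:   (x, y) of the workpiece when the camera image was taken
    index_at_capture: conveyor position index (assumed encoder count) at capture
    index_now:        current conveyor position index
    mm_per_count:     assumed belt travel per index count
    belt_dir:         assumed unit vector of belt travel in the robot frame
    """
    # Belt travel since the image was captured.
    travel = (index_now - index_at_capture) * mm_per_count
    # The workpiece rides on the belt, so it is displaced by the same travel.
    return (pos_at_capture[0] + travel * belt_dir[0],
            pos_at_capture[1] + travel * belt_dir[1])
```

For example, with these assumed units, a workpiece imaged at (100.0, 50.0) at index 1000 would be located at (225.0, 50.0) when the index reads 1500 and the belt advances 0.25 mm per count along X.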

Regarding Claim 12,
Modified Boca teaches
	The method according to Claim 1
Boca further teaches
wherein generating robot motion commands includes generating commands, by a robot controller having a processor and memory (“At step 514, the identified gesture is stored in the memory of the computation device 13 or in the absence of such a device in the memory of the robot controller 15.” [0050]; “creating robot instructions from the gestures by using the gesture context to the scene data from the same image or as additional data or extra processing to calculate/generate robot instructions (step 704 and optional step 706), storing the created instructions (step 708), asking if more created instructions are needed (step 710) and in step 712 sending the created instructions to the robot if no more created instructions are needed and performing in FIG. 7b all of the steps shown in FIG. 7a except the step 712 of sending the created instructions to the robot. The optional step 706 in these flowcharts of providing the scene 3D model to convert the gesture to a robot instruction step 704 is only needed if the scene will be subtracted from the image of the gesture.” [0052]; “In general a robot move instruction has information about the robot tool and coordinate system used for the robot target” [0057]),
to cause a robot gripper to move to a grasping position and orientation based on … and position and orientation of the gripper relative to the workpiece contained in the demonstration data (“the hand and finger location and orientation can be used to calculate the corresponding location and orientation of the robot tool in the robot scene.” [0038]; “location, orientation and associated action can be sent to the robot individually or all at once at the end of the teaching process; … with the image of the scene the part can be recognized and then the processing of the gesture has to be in relationship to the part; the robot targets can be defined relative to a part coordinate system” [0061-0064]; “there will be other points along the path between the start and stop points at which the robot will perform work such as follow a path, pick up an object, drop an object and a unique gesture will be associated with each of these intermediate points.” [0026]).

Boca does not explicitly teach
based on the initial position and orientation of the new workpiece
However, Sager teaches
to cause a robot gripper to move to a grasping position and orientation based on the initial position and orientation of the new workpiece (“The motion controller will go ahead and direct the robot to pick-up that object after accounting for its movement during the time it takes for the robot to reach it.” Col. 7, lines 14-17; “the motion controller considers the time it would take the robot arm to move from its current position to the location of the object” Col. 7, lines 2-5; “The motion controller considers the object orientation in picking up the object” Col. 7, lines 60-61).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Sager to increase the flexibility of the robot to pick up randomly positioned objects (“known methods and apparatus are generally not effective with randomly positioned and randomly oriented objects. This is typically the case with objects that are deposited onto a conveyor belt, such as flat components that are asymmetrical about at least one axis. For these parts, the system must locate them on the moving conveyor belt and also determine their orientation. This requires a relatively sophisticated vision system.” See at least Col. 1, lines 17-24).
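For illustration only, the Claim 12 limitation of moving the gripper to a grasping pose based on the initial pose of the new workpiece and the demonstrated pose of the gripper relative to the workpiece amounts to a composition of poses, which can be sketched as follows. This is a minimal planar sketch under stated assumptions and is not taken from Boca, Sager, or the application: poses are simplified to (x, y, theta) and both function names are hypothetical.

```python
import math

def relative_pose(workpiece_in_world, gripper_in_world):
    """Pose of the gripper expressed in the workpiece frame (planar case)."""
    wx, wy, wt = workpiece_in_world
    gx, gy, gt = gripper_in_world
    dx, dy = gx - wx, gy - wy
    c, s = math.cos(wt), math.sin(wt)
    # Rotate the world-frame offset into the workpiece frame.
    return (c * dx + s * dy, -s * dx + c * dy, gt - wt)

def compose_pose(workpiece_in_world, gripper_in_workpiece):
    """World pose of the gripper given a workpiece pose and a relative pose."""
    wx, wy, wt = workpiece_in_world
    rx, ry, rt = gripper_in_workpiece
    c, s = math.cos(wt), math.sin(wt)
    theta = wt + rt
    # Normalize the heading to the interval (-pi, pi].
    theta = math.atan2(math.sin(theta), math.cos(theta))
    return (wx + c * rx - s * ry, wy + s * rx + c * ry, theta)
```

In this sketch, the relative pose would be computed once from the demonstration data and then composed with the pose of each new workpiece found in the camera images.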

Regarding Claim 13,
Modified Boca teaches
	The method according to Claim 12
Boca further teaches
wherein generating robot motion commands further includes generating commands causing the robot gripper to move the … workpiece from the grasping position to other positions contained in the demonstration data (“there will be other points along the path between the start and stop points at which the robot will perform work such as follow a path, pick up an object, drop an object and a unique gesture will be associated with each of these intermediate points.” [0026]).
Boca does not explicitly teach
the new workpiece
However, Sager teaches
wherein generating robot motion commands further includes generating commands causing the robot gripper to move the new workpiece from the grasping position to other positions (“This invention provides a method and apparatus which uses a vision-equipped robotic system to locate, identify and determine the orientation of objects, and to pick them up and transfer them to a moving or stationary destination.” Col. 1, lines 50-54).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Sager to increase the flexibility of the robot to pick up randomly positioned objects (“known methods and apparatus are generally not effective with randomly positioned and randomly oriented objects. This is typically the case with objects that are deposited onto a conveyor belt, such as flat components that are asymmetrical about at least one axis. For these parts, the system must locate them on the moving conveyor belt and also determine their orientation. This requires a relatively sophisticated vision system.” See at least Col. 1, lines 17-24).

Regarding Claims 14 and 28,
Modified Boca teaches
	The method according to Claim 12
	The system according to Claim 17
Boca does not explicitly teach
	wherein the robot gripper is a finger-type gripper or a surface gripper using suction or magnetic force.
However, JETTÉ teaches
(“The gripper type or types can vary from one robot to another within a given workpiece holding system embodiment. Indeed, the exact type of gripper can be selected from the following general categories: impactive—e.g. jaws, clamps or claws which physically grasp by direct impact upon the object; ingressive—pins, needles or hackles which physically penetrate the surface of the object (e.g. an aperture or bore of the workpiece); astrictive—forces applied to the objects surface (e.g. by vacuum, magneto- or electroadhesion); and contigutive—requiring direct contact for adhesion to take place (e.g. surface tension or freezing).” [0043-0044])
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of JETTÉ so that gripper type can vary as needed for different application requirements (see at least [0043-0044]).

Regarding Claim 17,
Boca teaches
A system for programming a robot to perform an operation by human demonstration, said system comprising (“There is described below the use of hand gestures to teach a path to be followed by the industrial robot 12 in performing work on workpiece 14.” [0021]):
a camera (“The camera, which is a 3D vision sensor 11, is attached to the robot” [0021]);
an industrial robot (“FIG. 1 shows a block diagram for a robot system with an industrial robot which is used to perform work on a workpiece.” [0008]);
and a robot controller having a processor and memory, said controller being in communication with the robot and receiving images from the camera, said controller being configured to perform steps including (Fig. 1 shows that the robot controller 15 communicates with the robot 12 and that the vision sensor 11 communicates with the computation device 13; “The image is used by computation device 13 to calculate the corresponding location and orientation (robot target) on the part/scene of interest. The robot target is sent to the robot controller 15 or the computation device 13.” [0022]; Examiner Interpretation: The robot controller and computation device of Boca are interpreted together to be a robot controller.):
analyzing camera images of the hand demonstrating the operation on the workpiece (“the image of the location pointing hand gesture of step 304 and the associated location on the object are captured by the camera 11 and sent to the computation device 13 for processing. At step 308, the computation device 13 calculates from the image the corresponding location and orientation of the robot tool in the robot scene.” See at least [0034]; Examiner Interpretation: The location of the robot tool corresponding to the taught locations is the demonstration data. Despite the human hand not physically touching the workpiece in the demonstration, the operation to be performed on the workpiece is demonstrated by the human hand),
a move step where hand pose and workpiece pose are determined at a plurality of points defining a move path (“At step 310 the calculated location and orientation of the robot tool are sent to the computation device. Query 312 asks if more location points are needed to complete the robot path. Query 312 can be another gesture. If the answer is yes, the method 300 asks at query 314 if there is a need to reposition the camera. If the answer to query 314 is no, then the method 300 returns to step 304 where the operator makes the hand gesture associated with the next location point. While not shown in FIG. 3, if the answer to query 314 is yes, then the method 300 returns to step 302 where the camera is repositioned. If the answer to query 312 is no, then method 300 ends since no more robot path points have to be acquired.” [0039]; Examiner Interpretation: Hand pose also corresponds to workpiece pose (see [0024]).)
generating robot motion commands, based on the demonstration data … to cause the robot to perform the operation on the new workpiece (“creating robot instructions from the gestures by using the gesture context to the scene data from the same image or as additional data or extra processing to calculate/generate robot instructions (step 704 and optional step 706), storing the created instructions (step 708), asking if more created instructions are needed (step 710) and in step 712 sending the created instructions to the robot if no more created instructions are needed and performing in FIG. 7b all of the steps shown in FIG. 7a except the step 712 of sending the created instructions to the robot. The optional step 706 in these flowcharts of providing the scene 3D model to convert the gesture to a robot instruction step 704 is only needed if the scene will be subtracted from the image of the gesture.” [0052]; “In general a robot move instruction has information about the robot tool and coordinate system used for the robot target” [0057]);
and performing the operation on the new workpiece by the robot (“that is the scene data to create the path and instructions to be followed by the robot 12 when the robot performs work on the workpiece 14.” [0023]; “By work is meant those actions performed by a robot such as painting, grinding, polishing, deburring, welding etc. that make a physical change to the workpiece and those interactions that a robot has with a workpiece such as picking up the workpiece from one location and moving it to another location or inserting the workpiece into a specific location that does not physically change the workpiece.” [0003]).

Boca does not explicitly teach
to create demonstration data, where the demonstration data defines a pick, move and place operation including a grasping step where hand pose and workpiece pose are determined when the hand grasps the workpiece, … and a place step where the workpiece pose is determined when the workpiece becomes stationary after the move step, 
where the demonstration data includes a hand coordinate frame and a gripper coordinate frame corresponding to the hand coordinate frame, 
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; … and the initial position and orientation of the new workpiece,
However, Yusuke teaches
create demonstration data where the demonstration data defines a pick, move and place operation including a grasping step where hand pose and workpiece pose are determined when the hand grasps the workpiece, a move step …, and a place step where the workpiece pose is determined when the workpiece becomes stationary after the move step … and a gripper coordinate frame corresponding to the hand (“The control unit 16 assumes that the user's two fingers (hereinafter, the user's (human) finger is referred to as a “finger” and distinguished from the finger unit 14 or the finger members 14A and 14B of the robot 10) grips the workpiece. When recognized, the position of the finger in the work coordinate system is recognized (step S33). That is, the control unit 16 recognizes that the finger has gripped the workpiece from the finger position and the workpiece position, and converts the finger position at that time to the workpiece coordinate system. The control unit 16 stores the position in the workpiece coordinate system at this time in the storage unit 17 as gripping position information. Next, the user holding the workpiece with two fingers moves the workpiece to a desired location and releases the workpiece (releases the finger from the workpiece). The control unit 16 recognizes that the finger has moved away from the workpiece based on the image from the camera 15, recognizes the position and posture of the workpiece in the robot coordinate system at that time as the position and posture of the movement destination of the workpiece, is stored (step S34). With the above operation, the control unit 16 stores information on the position of the workpiece in the workpiece coordinate system and the position and posture of the workpiece in the robot coordinate system. As a result, preparation for instructing the work to be performed on the work and having the robot 10 perform the work on the work is completed. 
Note that the position and orientation of the movement destination of the workpiece may be recognized based on the position and orientation of the workpiece when it is recognized that the movement of the workpiece has stopped for a predetermined time or more based on the captured image. In the above embodiment, the gripping position of the workpiece is determined based on the gripping position when gripping the workpiece.” See at least page 3, line 48 to page 4, line 14.),
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca to further include the teachings of Yusuke in order to teach workpiece operations to robots quickly and easily (see at least “problem to be solved” on page 1).

Yusuke also does not explicitly teach
where the demonstration data includes a hand coordinate frame and a gripper coordinate frame corresponding to the hand coordinate frame, 
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; … and the initial position and orientation of the new workpiece,

However, Kofman teaches
where the demonstration data includes a hand coordinate frame and a gripper coordinate frame corresponding to the hand coordinate frame (“The orientation of the hand of the operator is used to control the orientation of the robot-manipulator end-effector and is computed from the 3-D coordinates of the centroids of the three hand markers as shown in Fig. 3. Firstly, the midpoint of the line segment joining the thumb and index-finger marker centroids, T and I, respectively, is defined as M (Fig. 3(a)). A coordinate system XoYoZo with origin at wrist W is then defined by a translation of the local-site global reference coordinate system XYZ to the wrist [Fig. 3(b)]. Through yaw, pitch, and roll rotations, explained below, the final axes X3Y3Z3 to be used to determine the tool axes of the robot-end-effector are obtained with X3 collinear with WM, WTI coplanar with X3Y3, and T lying in the first quadrant of X3Y3, as shown in Fig. 3(b). The yaw-pitch-roll tool rotation angles are determined directly from the hand rotation angles of WM and TI as follows: yaw rotation α of coordinate system XoYoZo about Zo to X1Y1Z1, pitch rotation β of X1Y1Z1 about Y1 to X2Y2Z2, shown in Fig. 3(c) using -β for clarity, and roll rotation γ of X2Y2Z2 about X2 to X3Y3Z3, as shown in Fig. 3(d).” See at least Pg. 4, Col. 2, lines 1-20; Fig. 3 shows the hand coordinate frame which corresponds to the robot coordinate frame.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca and Yusuke to further include the teachings of Kofman regarding corresponding coordinate frames, to remotely control a robot based on position and orientation of a human operator’s hand in a demonstration of the operation without the restraints of sensors and wires on the human hand. See at least the introduction on Pgs. 1-2.
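For illustration only, the frame construction quoted from Kofman (X3 collinear with WM, WTI coplanar with X3Y3, T on the positive-Y3 side) can be sketched directly from the three marker centroids. The sketch below is an assumption-laden simplification and is not taken from Kofman: it builds the final axes by projection and a cross product rather than through the yaw, pitch, and roll sequence, and the function name hand_frame and the tuple representation are hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    norm = math.sqrt(_dot(v, v))
    if norm == 0.0:
        raise ValueError("degenerate marker configuration")
    return tuple(x / norm for x in v)

def hand_frame(wrist, thumb, index):
    """Return unit axes (X3, Y3, Z3) of a hand frame with origin at the wrist W.

    Hypothetical helper: X3 points from W to M, the midpoint of T and I.
    Y3 is the part of the vector W->T that is perpendicular to X3, so that
    W, T, and I lie in the X3-Y3 plane and T has a positive Y3 coordinate.
    Z3 completes a right-handed frame.
    """
    mid = tuple((t + i) / 2.0 for t, i in zip(thumb, index))
    x_axis = _unit(_sub(mid, wrist))
    wrist_to_thumb = _sub(thumb, wrist)
    along_x = _dot(wrist_to_thumb, x_axis)
    y_axis = _unit(tuple(c - along_x * x for c, x in zip(wrist_to_thumb, x_axis)))
    z_axis = _cross(x_axis, y_axis)
    return x_axis, y_axis, z_axis

# Example with made-up marker positions (arbitrary units).
axes = hand_frame(wrist=(0.0, 0.0, 0.0), thumb=(1.0, 1.0, 0.0), index=(1.0, -1.0, 0.0))
```

The three returned axes form the columns of a rotation matrix for the hand, which is the quantity that a corresponding gripper (tool) frame would be aligned to.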

Kofman also does not explicitly teach
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; … and the initial position and orientation of the new workpiece,
However, JETTÉ teaches
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper (“In this specific embodiment, the vacuum cup is made of a flexible, resilient material, and the relative distance between the robot and the workpiece held by the robot can vary based on this flexibility and operating conditions. Such variations in the relative distance between a given robot and the workpiece it holds was a source of positioning uncertainty in the reference frame of the robots. This gripper type was found to provide satisfactory gripping capability in the embodiment shown in FIG. 1, but it will be understood that other gripper types can be used in other embodiments. Moreover, more than one gripper, possibly of different gripper types, can be used as the end effector per robot if desired. For instance, a clamp gripper can be used in addition to a vacuum cup for a given robot, or for all robots, for instance. The gripper type or types can vary from one robot to another within a given workpiece holding system embodiment. Indeed, the exact type of gripper can be selected from the following general categories: impactive—e.g. jaws, clamps or claws which physically grasp by direct impact upon the object; ingressive—pins, needles or hackles which physically penetrate the surface of the object (e.g. an aperture or bore of the workpiece); astrictive—forces applied to the objects surface (e.g. by vacuum, magneto- or electroadhesion); and contigutive—requiring direct contact for adhesion to take place (e.g. surface tension or freezing).” [0043-0044])
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca, Yusuke, and Kofman to further include the teachings of JETTÉ so that gripper type can vary as needed for different application requirements (see at least [0043-0044]).

JETTÉ also does not explicitly teach
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; … and the initial position and orientation of the new workpiece,
However, Sager teaches
“This invention provides a method and apparatus which uses a vision-equipped robotic system to locate, identify and determine the orientation of objects, and to pick them up and transfer them to a moving or stationary destination.” Col. 1, lines 50-54; Examiner Interpretation: The determined locations and orientations of the objects are initial locations and orientations because they are in that position before being picked up. They are new workpieces since they are different from the demonstrated workpiece.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca, Yusuke, Kofman, and JETTÉ to further include the teachings of Sager to increase the flexibility of the robot to pick up randomly positioned objects (“known methods and apparatus are generally not effective with randomly positioned and randomly oriented objects. This is typically the case with objects that are deposited onto a conveyor belt, such as flat components that are asymmetrical about at least one axis. For these parts, the system must locate them on the moving conveyor belt and also determine their orientation. This requires a relatively sophisticated vision system.” See at least Col. 1, lines 17-24).

Regarding Claim 26,
Modified Boca teaches
The system according to Claim 17
Boca does not explicitly teach
wherein the new workpiece, before the operation by the robot, rides on a conveyor, and the initial position of the new workpiece is a function of a conveyor position index.
However, Sager teaches
“This invention provides a method and apparatus which uses a vision-equipped robotic system to locate, identify and determine the orientation of objects, and to pick them up and transfer them to a moving or stationary destination. A video camera periodically records images of objects located on a moving conveyor belt. The images are identified and their position and orientation is recorded in a moving conveyor belt coordinate system. The information is transmitted to a motion control device associated with a first robot. The motion control device coordinates the robot with the moving belt coordinate system and instructs the robot's arm to pick up certain objects” See at least Col. 1, lines 50-62; Examiner Interpretation: The use of the moving conveyor belt coordinate system to identify location and orientation is interpreted to be the same as using a function of a conveyor position index.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Sager to increase the flexibility of the robot to pick up randomly positioned objects (“known methods and apparatus are generally not effective with randomly positioned and randomly oriented objects. This is typically the case with objects that are deposited onto a conveyor belt, such as flat components that are asymmetrical about at least one axis. For these parts, the system must locate them on the moving conveyor belt and also determine their orientation. This requires a relatively sophisticated vision system.” See at least Col. 1, lines 17-24).
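For illustration only, one way to read "the initial position of the new workpiece is a function of a conveyor position index" is sketched below. Neither the claim language nor the quoted passage of Sager gives a formula; the sketch assumes a straight belt, an encoder-style integer index, and a fixed travel distance per index count. The names workpiece_position, mm_per_count, and belt_direction are hypothetical.

```python
def workpiece_position(pos_at_capture, index_at_capture, index_now,
                       mm_per_count, belt_direction=(1.0, 0.0, 0.0)):
    """Estimate the current workpiece position from the conveyor index.

    pos_at_capture: (x, y, z) of the workpiece when the camera image was taken.
    index_at_capture, index_now: conveyor position index at capture and now.
    mm_per_count: assumed belt travel per index count.
    belt_direction: assumed unit vector of belt motion in the robot frame.
    """
    travel = (index_now - index_at_capture) * mm_per_count
    return tuple(p + travel * d for p, d in zip(pos_at_capture, belt_direction))

# Example with made-up numbers: 500 counts at 0.5 mm per count is 250 mm of travel.
current = workpiece_position((100.0, 50.0, 0.0), 1000, 1500, 0.5)
```

Under these assumptions the workpiece orientation is unchanged by belt travel, so only the position needs to be updated before the grasping pose is computed.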

Regarding Claim 27,
Modified Boca teaches
The system according to Claim 17
Boca further teaches
	wherein generating robot motion commands includes generating commands to cause a robot gripper to move to a grasping position and orientation based on the initial position and orientation of the … workpiece and position and orientation of the gripper relative to the workpiece contained in the demonstration data (“the hand and finger location and orientation can be used to calculate the corresponding location and orientation of the robot tool in the robot scene.” [0038]; “location, orientation and associated action can be sent to the robot individually or all at once at the end of the teaching process; … with the image of the scene the part can be recognized and then the processing of the gesture has to be in relationship to the part; the robot targets can be defined relative to a part coordinate system” [0061-0064]; “there will be other points along the path between the start and stop points at which the robot will perform work such as follow a path, pick up an object, drop an object and a unique gesture will be associated with each of these intermediate points.” [0026]),
and generating commands to cause the robot gripper to move the … workpiece from the grasping position to other positions contained in the demonstration data (“there will be other points along the path between the start and stop points at which the robot will perform work such as follow a path, pick up an object, drop an object and a unique gesture will be associated with each of these intermediate points.” [0026]).
Boca does not explicitly teach
the new workpiece
However, Sager teaches
wherein generating robot motion commands includes generating commands to cause a robot gripper to move to a grasping position and orientation based on the initial position and orientation of the new workpiece (“The motion controller will go ahead and direct the robot to pick-up that object after accounting for its movement during the time it takes for the robot to reach it.” Col. 7, lines 14-17; “the motion controller considers the time it would take the robot arm to move from its current position to the location of the object” Col. 7, lines 2-5; “The motion controller considers the object orientation in picking up the object” Col. 7, lines 60-61),
and wherein generating robot motion commands further includes generating commands causing the robot gripper to move the new workpiece from the grasping position to other positions (“This invention provides a method and apparatus which uses a vision-equipped robotic system to locate, identify and determine the orientation of objects, and to pick them up and transfer them to a moving or stationary destination.” Col. 1, lines 50-54).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Sager to increase the flexibility of the robot to pick up randomly positioned objects (“known methods and apparatus are generally not effective with randomly positioned and randomly oriented objects. This is typically the case with objects that are deposited onto a conveyor belt, such as flat components that are asymmetrical about at least one axis. For these parts, the system must locate them on the moving conveyor belt and also determine their orientation. This requires a relatively sophisticated vision system.” See at least Col. 1, lines 17-24).

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), Sager (US 5040056 A), and Itkowitz (US 9901402 B2).

Regarding Claim 3,
Modified Boca teaches
The method according to Claim 2
Boca further teaches
wherein the camera is a three-dimensional camera (“The camera, which is a 3D vision sensor 11” [0021])

Boca does not explicitly teach
which directly captures images and X, Y and Z coordinates of a plurality of identifiable points on the hand in the images.
However, Itkowitz teaches
“a data glove 501 (FIG. 5) or bare hand 502 is used, and fiducial markers 511 are attached to the thumb and index finger of glove 501 (and/or to other digits of the glove) that the surgeon is going to wear and/or directly to the skin of hand 502. Again, redundant markers can be used to accommodate self-occlusion. Fiducial markers also can be placed on other fingers to enable more user interface features through specifically defined hand gestures. The three-dimensional locations of the fiducial markers are computed by triangulation of multiple cameras having a common field of view.” Col. 15, lines 37-47; “Other tracking technologies that are suitable for use include, but are not limited to, inertial tracking, depth camera tracking” Col. 16, lines 18-20.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Itkowitz to allow for remote operation of a robot without the use of additional hardware that would take the focus off the operation (“In this aspect, after being placed in a gesture detection mode of operation, hand tracking controller 130 detects a hand gesture pose, or a hand gesture pose and a hand gesture trajectory. Controller 130 maps hand gesture poses to certain system mode control commands, and similarly controller 130 maps hand gesture trajectories to other system mode control commands. Note that the mapping of poses and trajectories is independent and so this is different from, for example, manual signal language tracking. The ability to generate system commands and to control system 100 using hand gesture poses and hand gesture trajectories, in place of manipulating switches, numerous foot pedals, etc. as in known minimally invasive surgical systems, provides greater ease of use of system 100 for the surgeon. When a surgeon is standing, the use of hand gesture poses and hand gesture trajectories to control system 100 makes it is unnecessary for the surgeon to take the surgeon's eyes off the patient and/or viewing screen and to search for a foot petal or a switch when the surgeon wants to change the system mode. Finally, the elimination of the various switches and foot pedals reduces the floor space required by the minimally invasive teleoperated surgical system.” See at least Col. 11, lines 26-47).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), Sager (US 5040056 A), Itkowitz (US 9901402 B2), and Luo (US 20150100910 A1).

Regarding Claim 4,
Modified Boca teaches
The method according to Claim 2
Boca does not explicitly teach 
wherein the camera is a two-dimensional camera, and where true lengths of a plurality of segments of digits of the human hand have been previously determined using a hand size image analysis step.
However, Luo teaches
“FIG. 14 depicts how the simplified exemplary user's hand or hands may be photographed by the device's camera or other camera, and this image information may be used to refine the default parameters of the biomechanical and/or anatomical model of the user's hand, in accordance with one embodiment of the present invention.” [0204]; “the user may put each hand on background (1400), and take a photo of the hand(s) (1402) with either the computerized device's camera or other camera. This image may then be analyzed, preferably by an image analysis program. The background image will help correct for any image distortions caused by different camera angles, and the like. The user hand image analysis may be done onboard the user's handheld computerized device, but it need not be. In an alternative embodiment, the user may upload one or more images of the hand taken by any imaging device to an external image analyzer, such as a remote internet server. In either event, the image analyzer will analyze the user's skin or hand outline appearance (1404), deduce the most probable lengths one or more bones of the user's hand, such as the user's various finger and thumb bones, and send this data or other data to correct the default biomechanical and/or anatomical model of the user's hand(s) back to the user's computerized device, such as for example during calibration step 906 referenced in FIG. 9 above.” [0205]; Examiner Interpretation: The camera is a 2-D camera that takes standard 2-D photographs.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Luo to improve the accuracy of hand gesture identification and tracking by using a personalized anatomical hand model in place of a default hand model (“In some embodiments, to improve accuracy (that is to replace standard human hand biomechanical and/or anatomical model default parameters with actual user calibration parameters), it will be useful to acquire an image of the user's hands, and to employ various image processing and analysis techniques to analyze this image of the user's one or more hands to better estimate the relative length of the various bones of the user's hands. Indeed, in the event that the user has lost one or more fingers, the system may then use this information to make corresponding changes in its biomechanical and/or anatomical model of the human hand. In other words, the model may include calibration information associated with an image of at least a portion of the hand of the user.” [0203]. Also see at least [0010-0018]).

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), Sager (US 5040056 A), Luo (US 20150100910 A1), Itkowitz (US 9901402 B2), and Fitzgibbon (US 20200226786 A1).

Regarding Claim 5,
Modified Boca teaches
The method according to Claim 4
Boca does not explicitly teach
wherein the hand size image analysis step includes providing an image of the human hand on a fiducial marker grid,
analyzing the image to compute transformations from a marker coordinate system to a screen coordinate system,
processing the image in a neural network convolution layer to identify key points on the human hand in the image,
using the transformations to compute coordinates of the key points in the marker coordinate system,
and calculating the true lengths of the segments of the digits of the human hand.
However, Luo further teaches
The hand size image analysis step includes providing an image of the human hand on a grid of fiducial markers (“FIG. 14 depicts how the simplified exemplary user's hand or hands may be photographed by the device's camera or other camera, and this image information may be used to refine the default parameters of the biomechanical and/or anatomical model of the user's hand, in accordance with one embodiment of the present invention. In acquiring such images, often it is useful to have the system provide a standardized background, such as a series of distance markings, grid, graph paper, and the like (1400) in order to better calibrate the image of the hand and correct for image distortions. This standardized background may additionally include various color, shades of gray, and resolution test targets as well. The background may be conveniently provided by, for example, electronically providing one or more background image sheets (e.g. a jpeg, png, pdf or other image file) for printing on the user's printer.” [0204]; Examiner Interpretation: The use of the standardized background is the same as if a grid of ArUco markers were used.)
and calculating the true lengths of the segments of the digits of the human hand (“the user may put each hand on background (1400), and take a photo of the hand(s) (1402) with either the computerized device's camera or other camera. This image may then be analyzed, preferably by an image analysis program. The background image will help correct for any image distortions caused by different camera angles, and the like. The user hand image analysis may be done onboard the user's handheld computerized device, but it need not be. In an alternative embodiment, the user may upload one or more images of the hand taken by any imaging device to an external image analyzer, such as a remote internet server. In either event, the image analyzer will analyze the user's skin or hand outline appearance (1404), deduce the most probable lengths one or more bones of the user's hand, such as the user's various finger and thumb bones, and send this data or other data to correct the default biomechanical and/or anatomical model of the user's hand(s) back to the user's computerized device, such as for example during calibration step 906 referenced in FIG. 9 above.” [0205]).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Luo to improve the accuracy of hand gesture identification and tracking by using a personalized anatomical hand model in place of a default hand model (“In some embodiments, to improve accuracy (that is to replace standard human hand biomechanical and/or anatomical model default parameters with actual user calibration parameters), it will be useful to acquire an image of the user's hands, and to employ various image processing and analysis techniques to analyze this image of the user's one or more hands to better estimate the relative length of the various bones of the user's hands. Indeed, in the event that the user has lost one or more fingers, the system may then use this information to make corresponding changes in its biomechanical and/or anatomical model of the human hand. In other words, the model may include calibration information associated with an image of at least a portion of the hand of the user.” [0203]. Also see at least [0010-0018]).

Luo also does not explicitly teach
analyzing the image to compute transformations from a marker coordinate system to a screen coordinate system,
processing the image in a neural network convolution layer to identify key points on the human hand in the image,
using the transformations to compute coordinates of the key points in the marker coordinate system,
However, Itkowitz teaches
Analyzing the image to compute transformations from a marker coordinate system to a screen coordinate system (“The three-dimensional reconstruction accuracy relies heavily on the accuracy of camera calibration. Some fiducial markers attached to known locations on the surgeon's console can be used to determine the extrinsic parameters (rotation and translation) of multiple cameras with respect to the surgeon's console. This process can be done automatically. Active fiducial markers can be used for the calibration fiducial markers since such markers are only turned on during a calibration process and before the procedure. During the procedure, the calibration fiducial markers are turned off to avoid confusion with the fiducial markers used to localize the surgeon's hands.” Col. 16, lines 4-15; Examiner Interpretation: The determined rotation and translation of a camera relative to the surgeon's console, which carries fiducial markers at known locations, is interpreted as the transformation from a marker coordinate system to a screen coordinate system.)
And using the transformations to compute coordinates of the key points in the marker coordinate system (“FIG. 7 is an illustration of sensor 212 mounted on forefinger 292B with a location 713 in tracking coordinate system 750, and a sensor 211 mounted on thumb 292A with a location 711 in tracking coordinate system 750.” Col. 17, lines 28-31; Examiner Interpretation: The tracking coordinate system is the same as the marker coordinate system.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Itkowitz to allow for remote operation of a robot without the use of additional hardware that would take the focus off the operation by demonstrating the operation using human hands to a camera  (“In this aspect, after being placed in a gesture detection mode of operation, hand tracking controller 130 detects a hand gesture pose, or a hand gesture pose and a hand gesture trajectory. Controller 130 maps hand gesture poses to certain system mode control commands, and similarly controller 130 maps hand gesture trajectories to other system mode control commands. Note that the mapping of poses and trajectories is independent and so this is different from, for example, manual signal language tracking. The ability to generate system commands and to control system 100 using hand gesture poses and hand gesture trajectories, in place of manipulating switches, numerous foot pedals, etc. as in known minimally invasive surgical systems, provides greater ease of use of system 100 for the surgeon. When a surgeon is standing, the use of hand gesture poses and hand gesture trajectories to control system 100 makes it is unnecessary for the surgeon to take the surgeon's eyes off the patient and/or viewing screen and to search for a foot petal or a switch when the surgeon wants to change the system mode. Finally, the elimination of the various switches and foot pedals reduces the floor space required by the minimally invasive teleoperated surgical system.” See at least Col. 11, lines 26-47).
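For illustration only, the claimed sequence of using a marker-to-screen transformation to express key points in the marker coordinate system and then calculating segment lengths can be sketched as follows. None of the cited references states this computation. The sketch assumes a hand lying flat on the marker grid and a planar affine mapping s = A m + t from marker coordinates m to screen coordinates s; a real camera would generally require a perspective (homography) model. The names screen_to_marker and segment_lengths, and all numeric values, are hypothetical.

```python
import math

def screen_to_marker(point, a, t):
    """Invert the assumed mapping s = A m + t for a 2x2 matrix A and offset t."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("marker-to-screen transform is not invertible")
    dx, dy = point[0] - t[0], point[1] - t[1]
    return ((a[1][1] * dx - a[0][1] * dy) / det,
            (-a[1][0] * dx + a[0][0] * dy) / det)

def segment_lengths(screen_key_points, a, t):
    """Lengths between consecutive key points, measured in marker-grid units."""
    pts = [screen_to_marker(p, a, t) for p in screen_key_points]
    return [math.dist(p, q) for p, q in zip(pts, pts[1:])]

# Example with made-up values: 2 pixels per millimetre, grid origin at pixel (10, 20).
A = ((2.0, 0.0), (0.0, 2.0))
T = (10.0, 20.0)
finger_pixels = [(10.0, 20.0), (10.0, 80.0), (10.0, 120.0)]  # hypothetical key points
lengths_mm = segment_lengths(finger_pixels, A, T)
```

Because the lengths are measured in the marker coordinate system, they do not depend on the pixel scale of the image, which is the sense in which they are "true lengths" in this sketch.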
Itkowitz also does not explicitly teach
processing the image in a neural network convolution layer to identify key points
on the human hand in the image,
However, Fitzgibbon teaches
processing the image in a neural network convolution layer to identify key points (“the trained machine learning system comprises a neural network such as a convolutional neural network (CNN) or other type of neural network. There is a set of specified keypoints with known locations relative to the object and expressed in object coordinates. FIG. 8 is a schematic diagram of an example convolutional neural network architecture which is used in some cases but which is not intended to limit the scope of the technology. The input to the neural network is a frame of sensor data 800 so that the first layer 802 of the neural network comprises a three dimensional array of nodes which holds raw pixel values of the frame of sensor data, such as an image with three color channels. A second layer 804 of the neural network comprises a convolutional layer. It computes outputs of nodes that are connected to local regions in the input layer 802. Although only one convolution layer 804 is shown in FIG. 8, in some examples, there are a plurality of convolution layers connected in series, such as 5 to 20 convolutional layers. A third layer 806 of the neural network comprises a rectified linear unit (RELU) layer which applies an activation function. A fourth layer of the neural network 808 is a pooling layer which computes a downsampling and a fifth layer of the neural network 810 is a fully connected layer to compute a probability map 812 corresponding to the frame of sensor data, where each image element location in the probability map indicates a probability that the image element depicts each of the specified keypoints." [0086])
on the human hand in the image (“FIG. 2A is a schematic diagram of an object, which in this case is a hand 200, where the hand is raised in the air with the palm facing the viewer and with the fingers generally outstretched. The thumb and forefinger are moved towards one another to form a pinch gesture. Four keypoints are indicated as small circles 202, 204, 206, 208. Three of the keypoints, 202, 206, 208 are regular keypoints and one of the keypoints 204 is a floating keypoint. The floating keypoint is at a defined location, such as around two to five centimeters above a center of the back of the hand, where the back of the hand is opposite the palm of the hand. The regular keypoints are at defined locations such as on the knuckle where the little finger joins the hand (see keypoint 202), on the knuckle where the forefinger joins the hand (see keypoint 206), in the center of the wrist where the wrist joins the hand (see keypoint 208).” [0036]).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Fitzgibbon for identifying 3-D positions of keypoints of an object captured by a 2D image (“there is an apparatus for detecting position and orientation of an object. The apparatus comprises a memory storing at least one frame of captured sensor data depicting the object. The apparatus also comprises a trained machine learning system configured to receive the frame of the sensor data and to compute a plurality of two dimensional positions in the frame. Each predicted two dimensional position is a position of sensor data in the frame depicting a keypoint, where a keypoint is a pre-specified 3D position relative to the object. At least one of the keypoints is a floating keypoint depicting a pre-specified position relative to the object, lying inside or outside the object's surface. The apparatus comprises a pose detector which computes the three dimensional position and orientation of the object using the predicted two dimensional positions and outputs the computed three dimensional position and orientation.” See at least [0006]).

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), Sager (US 5040056 A), Luo (US 20150100910 A1), Fitzgibbon (US 20200226786 A1), and Iqbal (US 10929654 B2).

Regarding Claim 6,
Modified Boca teaches
	The method according to Claim 4
Boca does not explicitly teach
	wherein analyzing camera images of the hand demonstrating the operation on the workpiece includes processing the images in a neural network convolution layer to identify key points
	on the human hand in the images,
	performing a Point-n-Perspective calculation using the key points on the human hand in the images 
and the previously determined true lengths of the plurality of segments of the digits of the human hand, and calculating a three-dimensional pose of the plurality of segments.
However, Fitzgibbon teaches
Wherein analyzing camera images of the hand demonstrating the operation on the workpiece includes processing the images in a neural network convolution layer to identify key points (“the trained machine learning system comprises a neural network such as a convolutional neural network (CNN) or other type of neural network. There is a set of specified keypoints with known locations relative to the object and expressed in object coordinates. FIG. 8 is a schematic diagram of an example convolutional neural network architecture which is used in some cases but which is not intended to limit the scope of the technology. The input to the neural network is a frame of sensor data 800 so that the first layer 802 of the neural network comprises a three dimensional array of nodes which holds raw pixel values of the frame of sensor data, such as an image with three color channels. A second layer 804 of the neural network comprises a convolutional layer. It computes outputs of nodes that are connected to local regions in the input layer 802. Although only one convolution layer 804 is shown in FIG. 8, in some examples, there are a plurality of convolution layers connected in series, such as 5 to 20 convolutional layers. A third layer 806 of the neural network comprises a rectified linear unit (RELU) layer which applies an activation function. A fourth layer of the neural network 808 is a pooling layer which computes a downsampling and a fifth layer of the neural network 810 is a fully connected layer to compute a probability map 812 corresponding to the frame of sensor data, where each image element location in the probability map indicates a probability that the image element depicts each of the specified keypoints." [0086])
	on the human hand in the images (“FIG. 2A is a schematic diagram of an object, which in this case is a hand 200, where the hand is raised in the air with the palm facing the viewer and with the fingers generally outstretched. The thumb and forefinger are moved towards one another to form a pinch gesture. Four keypoints are indicated as small circles 202, 204, 206, 208. Three of the keypoints, 202, 206, 208 are regular keypoints and one of the keypoints 204 is a floating keypoint. The floating keypoint is at a defined location, such as around two to five centimeters above a center of the back of the hand, where the back of the hand is opposite the palm of the hand. The regular keypoints are at defined locations such as on the knuckle where the little finger joins the hand (see keypoint 202), on the knuckle where the forefinger joins the hand (see keypoint 206), in the center of the wrist where the wrist joins the hand (see keypoint 208).” [0036]),
	performing a Point-n-Perspective calculation using the key points on the human hand in the images (“The predicted 2D positions are input to the pose detector 104 which computes the pose (i.e. the position and orientation) of the object using the predicted 2D positions. In some cases the pose detector uses a closed form solution 306 to compute the pose such as by using a well-known perspective number point (PnP) algorithm as explained below. In some cases the pose detector uses a optimization 308 to compute the pose. A PnP algorithm takes a plurality, n, of 3D points in a reference frame of the object together with their corresponding 2D image projections. In addition, the PnP algorithm knows the intrinsic camera parameters such as the camera focal length, principal image point and skew parameter. The task of the PnP algorithm is to find the values of the matrix R and vector T (which express the rotation and translation of the object to convert it from object coordinates to world coordinates, and which give the pose of the object in world coordinates i.e. its position and orientation) from the following well known perspective projection model for cameras” [0042-0043])
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Fitzgibbon for identifying 3-D positions of keypoints of an object captured by a 2D image (“there is an apparatus for detecting position and orientation of an object. The apparatus comprises a memory storing at least one frame of captured sensor data depicting the object. The apparatus also comprises a trained machine learning system configured to receive the frame of the sensor data and to compute a plurality of two dimensional positions in the frame. Each predicted two dimensional position is a position of sensor data in the frame depicting a keypoint, where a keypoint is a pre-specified 3D position relative to the object. At least one of the keypoints is a floating keypoint depicting a pre-specified position relative to the object, lying inside or outside the object's surface. The apparatus comprises a pose detector which computes the three dimensional position and orientation of the object using the predicted two dimensional positions and outputs the computed three dimensional position and orientation.” See at least [0006]).
Fitzgibbon also does not explicitly teach
and the previously determined true lengths of the plurality of segments of the digits of the human hand, and calculating a three-dimensional pose of the plurality of segments.
However, Iqbal teaches
and the previously determined true lengths of the plurality of segments of the digits of the human hand, and calculating a three-dimensional pose of the plurality of segments (“the 3D pose reconstruction system 100 may be used to reconstruct the 3D pose from the normalized 2.5D representation of the pose. FIG. 1B illustrates a conceptual diagram of a scaled pose, in accordance with an embodiment. In one embodiment, a scaled pose refers to a scale normalized pose 105 of a hand having a length C for a bone between a pair of keypoints n and m. The scale normalized pose 105 is defined by 2.5D keypoint locations that, when processed by the 3D pose reconstruction unit 110 produces a scale normalized 3D pose {circumflex over (P)}.” See at least Col. 5, lines 26-36 and fig. 1B. Examiner Interpretation: The hand length C is a previously determined true length).
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Iqbal to estimate a 3D pose of a human hand with the use of a single 2D camera for the use of human-computer interaction while reducing the impact of the hand’s appearance variation, complex poses, and self-occlusions (“Estimating a 3D pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is useful for human-computer interaction. Hand pose can be represented by a fixed set of points in 3D space, usually joints, called landmarks or keypoints. Estimating the 3D pose accurately is a difficult task due to the large amounts of appearance variation, self-occlusions, and complexity of articulated hand poses. 3D hand pose estimation escalates the difficulties even further because a depth of each of the hand keypoints also has to be estimated. Conventional techniques for determining locations of the landmarks of a hand in 3D space include one or more of multi-view camera systems, depth sensors, and color markers/gloves. Each of the conventional techniques requires a constrained environment and/or specialized equipment. Furthermore, environmental conditions such as sunlight, occlusions, and complexity of non-rigid hand poses present challenges to landmark detection and determination. There is a need for addressing these issues and/or other issues associated with the prior art.” See at least Col. 1, lines 21-40.)
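
For illustration only, the perspective projection model that the PnP algorithm quoted from Fitzgibbon inverts can be sketched as follows. This is a minimal sketch and is not taken from any cited reference: the function name project_points, the zero-skew intrinsic matrix, and the numeric values are assumptions, and R and T are assumed here to map object coordinates into the camera frame.

```python
import numpy as np

def project_points(points_obj, R, T, K):
    """Project N x 3 object-frame points to N x 2 pixel coordinates.

    R (3x3) and T (3,) are assumed to map object coordinates to the camera
    frame; K (3x3) holds the intrinsic parameters (focal length, principal
    point, skew). A PnP solver recovers R and T given the 3D points, their
    2D projections, and K; this function is only the forward model.
    """
    points_obj = np.asarray(points_obj, dtype=float)
    cam = points_obj @ R.T + T        # rigid transform into the camera frame
    uvw = cam @ K.T                   # apply the camera intrinsics
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide by depth

# Hypothetical intrinsics: 500 px focal length, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
pixels = project_points([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]],
                        np.eye(3), np.array([0.0, 0.0, 2.0]), K)
```

With the object placed 2 units in front of the camera, the object origin lands on the principal point and a point offset 0.1 along X lands 25 pixels to its right.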

Claims 7 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), Sager (US 5040056 A), Itkowitz (US 9901402 B2), and Sun (US 9321176 B1).

Regarding Claims 7 and 18,
Modified Boca teaches
The method according to Claim 1
The system according to Claim 17
Boca does not explicitly teach
wherein analyzing camera images of the hand demonstrating the operation includes identifying locations of a plurality of points on the hand,
including a tip, a base knuckle, and a second knuckle
of each of a thumb
and a forefinger.
However, Itkowitz teaches
Wherein analyzing camera images of the hand demonstrating the operation includes identifying locations of a plurality of points on the hand (Fig. 5 shows a plurality of points on the hand; “a data glove 501 (FIG. 5) or bare hand 502 is used, and fiducial markers 511 are attached to the thumb and index finger of glove 501 (and/or to other digits of the glove) that the surgeon is going to wear and/or directly to the skin of hand 502. Again, redundant markers can be used to accommodate self-occlusion. Fiducial markers also can be placed on other fingers to enable more user interface features through specifically defined hand gestures. The three-dimensional locations of the fiducial markers are computed by triangulation of multiple cameras having a common field of view.” Col. 15, lines 37-47),
including a tip, a base knuckle, and a second knuckle of a forefinger and a tip and base knuckle of a thumb (Fig. 5 shows fiducial markers located on the finger tips, base knuckles, and second knuckles of the index fingers and on the tips and base knuckles of the thumb.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Itkowitz to allow for remote operation of a robot without the use of additional hardware that would take the focus off the operation (“In this aspect, after being placed in a gesture detection mode of operation, hand tracking controller 130 detects a hand gesture pose, or a hand gesture pose and a hand gesture trajectory. Controller 130 maps hand gesture poses to certain system mode control commands, and similarly controller 130 maps hand gesture trajectories to other system mode control commands. Note that the mapping of poses and trajectories is independent and so this is different from, for example, manual signal language tracking. The ability to generate system commands and to control system 100 using hand gesture poses and hand gesture trajectories, in place of manipulating switches, numerous foot pedals, etc. as in known minimally invasive surgical systems, provides greater ease of use of system 100 for the surgeon. When a surgeon is standing, the use of hand gesture poses and hand gesture trajectories to control system 100 makes it is unnecessary for the surgeon to take the surgeon's eyes off the patient and/or viewing screen and to search for a foot petal or a switch when the surgeon wants to change the system mode. Finally, the elimination of the various switches and foot pedals reduces the floor space required by the minimally invasive teleoperated surgical system.” See at least Col. 11, lines 26-47).
Itkowitz also does not explicitly teach
	A second knuckle … of a thumb
However, Sun teaches
	A second knuckle of a thumb (“Let u_h ∈ ℝ^6 denote the vector representing the position and orientation of the center of a human thumb relative to an object coordinate, as shown in FIG. 5. u_h is obtained by a motion capture system from a human demonstration. u_h can be mapped to u_r, the center of the thumb fingertip of the robotic hand, by linear translation.” See at least Col. 7, lines 38-43; Examiner Interpretation: The center of the thumb described by Sun is the second knuckle of the thumb.).
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Sun to teach the robot how to more naturally grasp an object by way of human demonstration (“It is natural for a robot to learn grasp and manipulation skills from humans because humans can handle the dexterous tasks easily. Humans tend to manipulate an object in an optimal way, in terms of stability and energy conservation, by adjusting their motions and contact forces according the object shape and material hardness. The approach in which a robot learns from observing humans grasp objects is called learning from demonstration (LfD). LfD has been a powerful mechanism for a teaching robot new tasks by observing people's demonstrations without any reprogramming. With the learning results, a robot can mimic human motions by reproducing movements similar to the demonstration. The LfD technique avoids a complex mathematic model for hands and objects, and provides useful task information from the demonstrations. The way of demonstration includes guidance on the robot body and execution on the teacher body. Guidance on the robot body avoids correspondence problems but is less intuitive from the teacher's perspective, because the user would lose a first-hand feeling. It also raises difficulties in the human control of a high dimensional motion of the robotic hand with multi-fingers. In contrast, a demonstration performed by a human body is more intuitive, because it requires much less effort than is needed in controlling a robotic hand. Also, humans have good senses on their own muscles and skin.” See at least Col. 1, line 47 to Col. 2, line 4.). It is known that for a human to grasp an object with a single hand, their thumb is used for the most natural grasp.

Regarding Claim 19,
Modified Boca teaches
The system according to Claim 18
Boca further teaches
wherein the camera is a three-dimensional camera (“The camera, which is a 3D vision sensor 11” [0021])
Boca does not explicitly teach
which directly captures images and X, Y and Z coordinates of a plurality of identifiable points on the hand in the images.
However, Itkowitz teaches
	“a data glove 501 (FIG. 5) or bare hand 502 is used, and fiducial markers 511 are attached to the thumb and index finger of glove 501 (and/or to other digits of the glove) that the surgeon is going to wear and/or directly to the skin of hand 502. Again, redundant markers can be used to accommodate self-occlusion. Fiducial markers also can be placed on other fingers to enable more user interface features through specifically defined hand gestures. The three-dimensional locations of the fiducial markers are computed by triangulation of multiple cameras having a common field of view.” Col. 15, lines 37-47; “Other tracking technologies that are suitable for use include, but are not limited to, inertial tracking, depth camera tracking” Col. 16, lines 18-20.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Itkowitz to allow for remote operation of a robot without the use of additional hardware that would take the focus off the operation (“In this aspect, after being placed in a gesture detection mode of operation, hand tracking controller 130 detects a hand gesture pose, or a hand gesture pose and a hand gesture trajectory. Controller 130 maps hand gesture poses to certain system mode control commands, and similarly controller 130 maps hand gesture trajectories to other system mode control commands. Note that the mapping of poses and trajectories is independent and so this is different from, for example, manual signal language tracking. The ability to generate system commands and to control system 100 using hand gesture poses and hand gesture trajectories, in place of manipulating switches, numerous foot pedals, etc. as in known minimally invasive surgical systems, provides greater ease of use of system 100 for the surgeon. When a surgeon is standing, the use of hand gesture poses and hand gesture trajectories to control system 100 makes it is unnecessary for the surgeon to take the surgeon's eyes off the patient and/or viewing screen and to search for a foot petal or a switch when the surgeon wants to change the system mode. Finally, the elimination of the various switches and foot pedals reduces the floor space required by the minimally invasive teleoperated surgical system.” See at least Col. 11, lines 26-47).

Claims 8-9 and 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), Sager (US 5040056 A), Itkowitz (US 9901402 B2), Sun (US 9321176 B1), and Yunde (NPL: “Hand Action Perception for Robot Programming”).

Regarding Claims 8 and 23,
Modified Boca teaches
The method according to Claim 7
The system according to Claim 18
Boca does not explicitly teach
wherein the demonstration data includes, at the grasping step of the operation, position and orientation of the hand coordinate frame, the gripper coordinate frame corresponding to the hand coordinate frame,
and a workpiece coordinate frame.
However, Kofman teaches
Wherein the demonstration data includes, at the grasping step of the operation, position and orientation of the hand coordinate frame, the gripper coordinate frame corresponding to the hand coordinate frame (“The orientation of the hand of the operator is used to control the orientation of the robot-manipulator end-effector and is computed from the 3-D coordinates of the centroids of the three hand markers as shown in Fig. 3. Firstly, the midpoint of the line segment joining the thumb and index-finger marker centroids, T and I, respectively, is defined as M (Fig. 3(a)). A coordinate system XoYoZo with origin at wrist W is then defined by a translation of the local-site global reference coordinate system XYZ to the wrist [Fig. 3(b)]. Through yaw, pitch, and roll rotations, explained below, the final axes X3Y3Z3 to be used to determine the tool axes of the robot-end-effector are obtained with X3 collinear with WM, WTI coplanar with X3Y3, and T lying in the first quadrant of X3Y3, as shown in Fig. 3(b). The yaw-pitch-roll tool rotation angles are determined directly from the hand rotation angles of WM and TI as follows: yaw rotation α of coordinate system XoYoZo about Zo to X1Y1Z1, pitch rotation β of X1Y1Z1 about Y1 to X2Y2Z2, shown in Fig. 3(c) using -β for clarity, and roll rotation γ of X2Y2Z2 about X2 to X3Y3Z3, as shown in Fig. 3(d).” See at least Pg. 4, Col. 2, lines 1-20; Fig. 3 shows the hand coordinate frame which corresponds to the robot coordinate frame.; Fig. 1 shows a human demonstrating a grasp corresponding to the robot which is actually grasping an object.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Kofman regarding corresponding coordinate frames, to remotely control a robot based on position and orientation of a human operator’s hand in a demonstration of the operation without the restraints of sensors and wires on the human hand. See at least the introduction on Pgs. 1-2.

Kofman also does not explicitly teach
and a workpiece coordinate frame.
However, Yunde teaches
A workpiece coordinate frame (The hand and objects in the workspace are tracked, and the hand as well as the objects are given coordinate frames in which transformations between them and the world frame are performed to track their positions and orientations. See at least Pg. 3, Col. 2 and Fig. 4.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Yunde to teach robot operations by a human demonstration manipulating an object so that the same object manipulation can be performed by the robot (“we describe a more general framework of hand action perception using depth image sequences. We aim to build a robot system which should learn not only the initial poses and final destinations of the objects and the order of assembly, but also should perceive how objects move and how they are manipulated. Thus, tracking and understanding both object motion and human hand action will be crucial for a robot to learn a general assembly task. Our system has two multibaseline stereo vision systems with the same configuration located respectively in a human world and a robot world. The human instructor must simply demonstrate the task in front of the vision system in the human world, no dataglove or special markings are necessary. The recorded image sequences are used to recover a depth image sequence for model-based human hand and object tracking to form perceptual data streams. The data streams are segmented and precisely interpreted to create a task sequence of the description of the human hand action and object motion for generating the robot control sequence or reporting what is going on in the workspace. This paper follows the assembly plan from observation (APO) paradigm proposed by Ikeuchi and Suehiro[7].” See at least introduction on Pg. 1.).
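
For illustration only, the yaw, pitch, and roll sequence quoted from Kofman above (rotation about Z, then about the rotated Y axis, then about the twice-rotated X axis) can be composed into a single orientation matrix. This is a minimal sketch, not Kofman's implementation: the function name yaw_pitch_roll_matrix and the right-handed, counterclockwise-positive sign convention are assumptions.

```python
import numpy as np

def yaw_pitch_roll_matrix(yaw, pitch, roll):
    """Compose intrinsic rotations: yaw about Z, pitch about the new Y,
    roll about the new X. Angles are in radians.

    Successive rotations about the moving axes multiply on the right,
    so the combined matrix is Rz(yaw) @ Ry(pitch) @ Rx(roll).
    """
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    return Rz @ Ry @ Rx

# A pure 90 degree yaw turns the frame's X axis onto the original Y axis.
R = yaw_pitch_roll_matrix(np.pi / 2, 0.0, 0.0)
x_axis_after_yaw = R @ np.array([1.0, 0.0, 0.0])
```

The columns of the returned matrix are the final X3, Y3, and Z3 axes expressed in the starting frame, which is the form a gripper orientation command would typically consume.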

Regarding Claims 9 and 24,
Modified Boca teaches
The method according to Claim 8
The system according to Claim 23
Boca further teaches
for the plurality of points in the move step of the operation (“At step 310 the calculated location and orientation of the robot tool are sent to the computation device. Query 312 asks if more location points are needed to complete the robot path. Query 312 can be another gesture. If the answer is yes, the method 300 asks at query 314 if there is a need to reposition the camera. If the answer to query 314 is no, then the method 300 returns to step 304 where the operator makes the hand gesture associated with the next location point. While not shown in FIG. 3, if the answer to query 314 is yes, then the method 300 returns to step 302 where the camera is repositioned. If the answer to query 312 is no, then method 300 ends since no more robot path points have to be acquired.” [0039]),
Boca does not explicitly teach
wherein the demonstration data further includes positions of the hand coordinate frame and the workpiece coordinate frame
and position and orientation of the workpiece coordinate frame for the place step of the operation.
However, Yunde teaches
wherein the demonstration data further includes positions of the hand coordinate frame and the workpiece coordinate frame (The hand and objects in the workspace are tracked, and the hand as well as the objects are given coordinate frames in which transformations between them and the world frame are performed to track their positions and orientations. See at least Pg. 3, Col. 2 and Fig. 4.)
for intermediate steps of the operation (“The robot system should learn not only the initial poses and destinations of the objects and the orders of assembly, but also should perceive how objects move for replicating the task. For this purpose, the system has to perceive the trajectories of all objects with orientations at each point in the sequence for forming perceptual data streams. This is sufficient for simple tasks and structured environment. In general, the system has to track not only objects but also the human hand in able to robustly and completely understand a given task for the purpose of efficiently replicating the task, Otherwise, an APO approach will encounter difficulties, especially with partial occlusion case and motions such as screw-turning a bolt. After tracking, the perceptual data streams of each object and each part of the hand are created for interpretation” See at least Pg. 2, Col. 2.),
and position and orientation of the workpiece coordinate frame for the place step of the operation (“The frame kem at which the hand stops manipulating the object occurs when the hand is departing from the manipulated object.” See at least Pg. 5, Col. 2, lines 12-15.; Examiner Interpretation: The frame kem is the workpiece coordinate frame which has a known position and orientation.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Yunde to teach robot operations by a human demonstration manipulating an object so that the same object manipulation can be performed by the robot (“we describe a more general framework of hand action perception using depth image sequences. We aim to build a robot system which should learn not only the initial poses and final destinations of the objects and the order of assembly, but also should perceive how objects move and how they are manipulated. Thus, tracking and understanding both object motion and human hand action will be crucial for a robot to learn a general assembly task. Our system has two multibaseline stereo vision systems with the same configuration located respectively in a human world and a robot world. The human instructor must simply demonstrate the task in front of the vision system in the human world, no dataglove or special markings are necessary. The recorded image sequences are used to recover a depth image sequence for model-based human hand and object tracking to form perceptual data streams. The data streams are segmented and precisely interpreted to create a task sequence of the description of the human hand action and object motion for generating the robot control sequence or reporting what is going on in the workspace. This paper follows the assembly plan from observation (APO) paradigm proposed by Ikeuchi and Suehiro[7].” See at least introduction on Pg. 1.). 

Claims 10 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), Sager (US 5040056 A), Itkowitz (US 9901402 B2), Sun (US 9321176 B1), Yunde (NPL: “Hand Action Perception for Robot Programming”), and Pham (NPL: “A proposal of extracting of motion primitives by analyzing tracked data of hand motion from human demonstration”).

Regarding Claims 10 and 25,
Modified Boca teaches
The method according to Claim 8
The system according to Claim 23
Boca does not explicitly teach
wherein the hand coordinate frame has an origin at a point midway between the base knuckles of the thumb and forefinger,
a Z axis passing through a point midway between the tips of the thumb and forefinger,
and a Y axis normal to a plane containing the thumb and forefinger.
However, Pham teaches
Wherein the hand coordinate frame has an origin at a point midway between the base knuckles of the thumb and forefinger (Fig. 4 shows the origin of the hand frame being located at (B), between the base of the thumb and forefinger; “We define three important points including: the center point of red color area on the part between the index finger and the thumb, denoted by point B” See Pg. 2, Col. 2, lines 27-30),
and a Y axis normal to a plane containing the thumb and forefinger (Fig. 4 shows an axis perpendicular to the plane containing the thumb and forefinger with the orange axis pointing towards the top left of the page that corresponds to the red axis pointing upwards on the hand in the first image.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Pham to allow for robot programming by demonstration with the use of an inexpensive sensor that can use hand motion data to help the robot execute actions in new situations where the gripper orientation may need to be different than a previous situation to successfully complete the task. See at least the section “Overview of method” on Pg. 2.
Pham also does not explicitly teach
a Z axis passing through a point midway between the tips of the thumb and forefinger,
However, Kofman teaches
 A Z axis passing through a point midway between the tips of the thumb and forefinger (Fig. 3 shows the hand coordinate frame with the X axis passing through the midway point (M) between the thumb (T) and index finger (I) fingertips. See at least Pg. 4, Col. 2 and Fig. 2(b) and Fig. 3.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Kofman to remotely control a robot based on position and orientation of a human operator’s hand in a demonstration of the operation without the restraints of sensors and wires on the human hand. See at least the introduction on Pgs. 1-2.
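For illustration only, the hand coordinate frame recited in Claims 10 and 25 can be sketched as follows. This is a minimal sketch, not the method of the application or of any cited reference. The function name `hand_frame`, the use of the two fingertips plus the origin to approximate the plane containing the thumb and forefinger, and the sign convention chosen for the Y axis are all assumptions made for this example.

```python
import numpy as np

def hand_frame(thumb_base, thumb_tip, index_base, index_tip):
    """Return (origin, R) for a hand frame; columns of R are the X, Y, Z axes.

    Hypothetical helper: inputs are 3D points (any consistent unit).
    """
    thumb_base, thumb_tip, index_base, index_tip = (
        np.asarray(p, dtype=float)
        for p in (thumb_base, thumb_tip, index_base, index_tip)
    )
    # Origin: midway between the base knuckles of the thumb and forefinger.
    origin = 0.5 * (thumb_base + index_base)
    # Z axis: from the origin through the midpoint of the two fingertips.
    tip_mid = 0.5 * (thumb_tip + index_tip)
    z = tip_mid - origin
    z /= np.linalg.norm(z)
    # Y axis: normal to the plane through the origin and both fingertips
    # (assumed approximation of the thumb/forefinger plane; sign is arbitrary).
    n = np.cross(index_tip - origin, thumb_tip - origin)
    y = n - np.dot(n, z) * z  # remove any numerical component along Z
    y /= np.linalg.norm(y)
    # X axis completes a right-handed frame.
    x = np.cross(y, z)
    return origin, np.column_stack((x, y, z))
```

The sketch fails with a division by zero if the fingertips and the origin are collinear, in which case the plane, and therefore the Y axis, is undefined.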

Claim 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), Sager (US 5040056 A), Sun (US 9321176 B1), Itkowitz (US 9901402 B2), and Luo (US 20150100910 A1).

Regarding Claim 20,
Modified Boca teaches
The system according to Claim 18
Boca does not explicitly teach 
wherein the camera is a two-dimensional camera, and where true lengths of a plurality of segments of digits of the human hand have been previously determined using a hand size image analysis step.
However, Luo teaches
“FIG. 14 depicts how the simplified exemplary user's hand or hands may be photographed by the device's camera or other camera, and this image information may be used to refine the default parameters of the biomechanical and/or anatomical model of the user's hand, in accordance with one embodiment of the present invention.” [0204]; “the user may put each hand on background (1400), and take a photo of the hand(s) (1402) with either the computerized device's camera or other camera. This image may then be analyzed, preferably by an image analysis program. The background image will help correct for any image distortions caused by different camera angles, and the like. The user hand image analysis may be done onboard the user's handheld computerized device, but it need not be. In an alternative embodiment, the user may upload one or more images of the hand taken by any imaging device to an external image analyzer, such as a remote internet server. In either event, the image analyzer will analyze the user's skin or hand outline appearance (1404), deduce the most probable lengths one or more bones of the user's hand, such as the user's various finger and thumb bones, and send this data or other data to correct the default biomechanical and/or anatomical model of the user's hand(s) back to the user's computerized device, such as for example during calibration step 906 referenced in FIG. 9 above.” [0205]; Examiner Interpretation: The camera is a 2-D camera that takes standard 2-D photographs.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Luo to improve the accuracy of hand gesture identification and tracking by using a personalized anatomical hand model in place of a default hand model (“In some embodiments, to improve accuracy (that is to replace standard human hand biomechanical and/or anatomical model default parameters with actual user calibration parameters), it will be useful to acquire an image of the user's hands, and to employ various image processing and analysis techniques to analyze this image of the user's one or more hands to better estimate the relative length of the various bones of the user's hands. Indeed, in the event that the user has lost one or more fingers, the system may then use this information to make corresponding changes in its biomechanical and/or anatomical model of the human hand. In other words, the model may include calibration information associated with an image of at least a portion of the hand of the user.” [0203]. Also see at least [0010-0018]).

Claim 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), Sager (US 5040056 A), Sun (US 9321176 B1), Itkowitz (US 9901402 B2), Luo (US 20150100910 A1), and Fitzgibbon (US 20200226786 A1).

Regarding Claim 21,
Modified Boca teaches
The system according to Claim 20
Boca does not explicitly teach
wherein the hand size image analysis step includes providing an image of the human hand on a fiducial marker grid,
analyzing the image to compute transformations from a marker coordinate system to a screen coordinate system,
processing the image in a neural network convolution layer to identify key points
on the human hand in the image,
using the transformations to compute coordinates of the key points in the marker coordinate system,
and calculating the true lengths of the segments of the digits of the human hand.
However, Luo further teaches
The hand size image analysis step includes providing an image of the human hand on a grid of fiducial markers (“FIG. 14 depicts how the simplified exemplary user's hand or hands may be photographed by the device's camera or other camera, and this image information may be used to refine the default parameters of the biomechanical and/or anatomical model of the user's hand, in accordance with one embodiment of the present invention. In acquiring such images, often it is useful to have the system provide a standardized background, such as a series of distance markings, grid, graph paper, and the like (1400) in order to better calibrate the image of the hand and correct for image distortions. This standardized background may additionally include various color, shades of gray, and resolution test targets as well. The background may be conveniently provided by, for example, electronically providing one or more background image sheets (e.g. a jpeg, png, pdf or other image file) for printing on the user's printer.” [0204]; Examiner Interpretation: The use of the standardized background is the same as if a grid of ArUco markers were used.)
and calculating the true lengths of the segments of the digits of the human hand (“the user may put each hand on background (1400), and take a photo of the hand(s) (1402) with either the computerized device's camera or other camera. This image may then be analyzed, preferably by an image analysis program. The background image will help correct for any image distortions caused by different camera angles, and the like. The user hand image analysis may be done onboard the user's handheld computerized device, but it need not be. In an alternative embodiment, the user may upload one or more images of the hand taken by any imaging device to an external image analyzer, such as a remote internet server. In either event, the image analyzer will analyze the user's skin or hand outline appearance (1404), deduce the most probable lengths one or more bones of the user's hand, such as the user's various finger and thumb bones, and send this data or other data to correct the default biomechanical and/or anatomical model of the user's hand(s) back to the user's computerized device, such as for example during calibration step 906 referenced in FIG. 9 above.” [0205]).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Luo to improve the accuracy of hand gesture identification and tracking by using a personalized anatomical hand model in place of a default hand model (“In some embodiments, to improve accuracy (that is to replace standard human hand biomechanical and/or anatomical model default parameters with actual user calibration parameters), it will be useful to acquire an image of the user's hands, and to employ various image processing and analysis techniques to analyze this image of the user's one or more hands to better estimate the relative length of the various bones of the user's hands. Indeed, in the event that the user has lost one or more fingers, the system may then use this information to make corresponding changes in its biomechanical and/or anatomical model of the human hand. In other words, the model may include calibration information associated with an image of at least a portion of the hand of the user.” [0203]. Also see at least [0010-0018]).

Luo also does not explicitly teach
analyzing the image to compute transformations from a marker coordinate system to a screen coordinate system,
processing the image in a neural network convolution layer to identify key points
on the human hand in the image,
using the transformations to compute coordinates of the key points in the marker coordinate system,
However, Itkowitz teaches
Analyzing the image to compute transformations from a marker coordinate system to a screen coordinate system (“The three-dimensional reconstruction accuracy relies heavily on the accuracy of camera calibration. Some fiducial markers attached to known locations on the surgeon's console can be used to determine the extrinsic parameters (rotation and translation) of multiple cameras with respect to the surgeon's console. This process can be done automatically. Active fiducial markers can be used for the calibration fiducial markers since such markers are only turned on during a calibration process and before the procedure. During the procedure, the calibration fiducial markers are turned off to avoid confusion with the fiducial markers used to localize the surgeon's hands.” Col. 16, lines 4-15; Examiner Interpretation: The determined rotation and translation of a camera from the surgeons console with fiducial markers at known locations is the transformation from a marker coordinate system to a screen coordinate system.)
And using the transformations to compute coordinates of the key points in the marker coordinate system (“FIG. 7 is an illustration of sensor 212 mounted on forefinger 292B with a location 713 in tracking coordinate system 750, and a sensor 211 mounted on thumb 292A with a location 711 in tracking coordinate system 750.” Col. 17, lines 28-31; Examiner Interpretation: The tracking coordinate system is the same as the marker coordinate system.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Itkowitz to allow for remote operation of a robot without the use of additional hardware that would take the focus off the operation by demonstrating the operation using human hands to a camera  (“In this aspect, after being placed in a gesture detection mode of operation, hand tracking controller 130 detects a hand gesture pose, or a hand gesture pose and a hand gesture trajectory. Controller 130 maps hand gesture poses to certain system mode control commands, and similarly controller 130 maps hand gesture trajectories to other system mode control commands. Note that the mapping of poses and trajectories is independent and so this is different from, for example, manual signal language tracking. The ability to generate system commands and to control system 100 using hand gesture poses and hand gesture trajectories, in place of manipulating switches, numerous foot pedals, etc. as in known minimally invasive surgical systems, provides greater ease of use of system 100 for the surgeon. When a surgeon is standing, the use of hand gesture poses and hand gesture trajectories to control system 100 makes it is unnecessary for the surgeon to take the surgeon's eyes off the patient and/or viewing screen and to search for a foot petal or a switch when the surgeon wants to change the system mode. Finally, the elimination of the various switches and foot pedals reduces the floor space required by the minimally invasive teleoperated surgical system.” See at least Col. 11, lines 26-47).
Itkowitz also does not explicitly teach
processing the image in a neural network convolution layer to identify key points
on the human hand in the image,
However, Fitzgibbon teaches
processing the image in a neural network convolution layer to identify key points (“the trained machine learning system comprises a neural network such as a convolutional neural network (CNN) or other type of neural network. There is a set of specified keypoints with known locations relative to the object and expressed in object coordinates. FIG. 8 is a schematic diagram of an example convolutional neural network architecture which is used in some cases but which is not intended to limit the scope of the technology. The input to the neural network is a frame of sensor data 800 so that the first layer 802 of the neural network comprises a three dimensional array of nodes which holds raw pixel values of the frame of sensor data, such as an image with three color channels. A second layer 804 of the neural network comprises a convolutional layer. It computes outputs of nodes that are connected to local regions in the input layer 802. Although only one convolution layer 804 is shown in FIG. 8, in some examples, there are a plurality of convolution layers connected in series, such as 5 to 20 convolutional layers. A third layer 806 of the neural network comprises a rectified linear unit (RELU) layer which applies an activation function. A fourth layer of the neural network 808 is a pooling layer which computes a downsampling and a fifth layer of the neural network 810 is a fully connected layer to compute a probability map 812 corresponding to the frame of sensor data, where each image element location in the probability map indicates a probability that the image element depicts each of the specified keypoints." [0086])
on the human hand in the image (“FIG. 2A is a schematic diagram of an object, which in this case is a hand 200, where the hand is raised in the air with the palm facing the viewer and with the fingers generally outstretched. The thumb and forefinger are moved towards one another to form a pinch gesture. Four keypoints are indicated as small circles 202, 204, 206, 208. Three of the keypoints, 202, 206, 208 are regular keypoints and one of the keypoints 204 is a floating keypoint. The floating keypoint is at a defined location, such as around two to five centimeters above a center of the back of the hand, where the back of the hand is opposite the palm of the hand. The regular keypoints are at defined locations such as on the knuckle where the little finger joins the hand (see keypoint 202), on the knuckle where the forefinger joins the hand (see keypoint 206), in the center of the wrist where the wrist joins the hand (see keypoint 208).” [0036]).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Fitzgibbon for identifying 3-D positions of keypoints of an object captured by a 2-D image (“there is an apparatus for detecting position and orientation of an object. The apparatus comprises a memory storing at least one frame of captured sensor data depicting the object. The apparatus also comprises a trained machine learning system configured to receive the frame of the sensor data and to compute a plurality of two dimensional positions in the frame. Each predicted two dimensional position is a position of sensor data in the frame depicting a keypoint, where a keypoint is a pre-specified 3D position relative to the object. At least one of the keypoints is a floating keypoint depicting a pre-specified position relative to the object, lying inside or outside the object's surface. The apparatus comprises a pose detector which computes the three dimensional position and orientation of the object using the predicted two dimensional positions and outputs the computed three dimensional position and orientation.” See at least [0006]).
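For illustration only, the geometric steps recited in Claim 21 (a marker-to-screen transformation, key point coordinates in the marker coordinate system, and true segment lengths) can be sketched as follows. This is a hedged sketch under stated assumptions and is not taken from Luo, Itkowitz, or Fitzgibbon. It assumes the hand lies flat on a planar marker grid, that lens distortion is negligible, and that the key point pixel locations are already available from some detector. The names `fit_homography`, `apply_homography`, and `segment_lengths` are hypothetical, and a planar homography fitted by the direct linear transform is one assumed way to express the transformation.

```python
import numpy as np

def fit_homography(src, dst):
    """Fit a 3x3 homography H mapping src (N,2) to dst (N,2), N >= 4, by DLT."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] is non-zero (non-degenerate view)

def apply_homography(H, pts):
    """Map (N,2) points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack((pts, np.ones(len(pts))))
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def segment_lengths(marker_pts_mm, marker_pts_px, keypoints_px):
    """Lengths between consecutive key points, in marker-grid units (mm)."""
    H = fit_homography(marker_pts_mm, marker_pts_px)           # marker -> screen
    kp_mm = apply_homography(np.linalg.inv(H), keypoints_px)   # screen -> marker
    return np.linalg.norm(np.diff(kp_mm, axis=0), axis=1)
```

In this sketch `keypoints_px` is ordered along one digit (for example base knuckle, second knuckle, tip), so each returned value is the length of one segment of that digit.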

Claim 22 is/are rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), Sager (US 5040056 A), Sun (US 9321176 B1), Itkowitz (US 9901402 B2), Luo (US 20150100910 A1), Fitzgibbon (US 20200226786 A1), and Iqbal (US 10929654 B2).

Regarding Claim 22,
Modified Boca teaches
	The system according to Claim 20
Boca does not explicitly teach
	wherein analyzing camera images of the hand demonstrating the operation on the workpiece includes processing the images in a neural network convolution layer to identify key points
	on the human hand in the images,
	performing a Point-n-Perspective calculation using the key points on the human hand in the images 
and the previously determined true lengths of the plurality of segments of the digits of the human hand, and calculating a three-dimensional pose of the plurality of segments.
However, Fitzgibbon teaches
Wherein analyzing camera images of the hand demonstrating the operation on the workpiece includes processing the images in a neural network convolution layer to identify key points (“the trained machine learning system comprises a neural network such as a convolutional neural network (CNN) or other type of neural network. There is a set of specified keypoints with known locations relative to the object and expressed in object coordinates. FIG. 8 is a schematic diagram of an example convolutional neural network architecture which is used in some cases but which is not intended to limit the scope of the technology. The input to the neural network is a frame of sensor data 800 so that the first layer 802 of the neural network comprises a three dimensional array of nodes which holds raw pixel values of the frame of sensor data, such as an image with three color channels. A second layer 804 of the neural network comprises a convolutional layer. It computes outputs of nodes that are connected to local regions in the input layer 802. Although only one convolution layer 804 is shown in FIG. 8, in some examples, there are a plurality of convolution layers connected in series, such as 5 to 20 convolutional layers. A third layer 806 of the neural network comprises a rectified linear unit (RELU) layer which applies an activation function. A fourth layer of the neural network 808 is a pooling layer which computes a downsampling and a fifth layer of the neural network 810 is a fully connected layer to compute a probability map 812 corresponding to the frame of sensor data, where each image element location in the probability map indicates a probability that the image element depicts each of the specified keypoints." [0086])
	on the human hand in the images (“FIG. 2A is a schematic diagram of an object, which in this case is a hand 200, where the hand is raised in the air with the palm facing the viewer and with the fingers generally outstretched. The thumb and forefinger are moved towards one another to form a pinch gesture. Four keypoints are indicated as small circles 202, 204, 206, 208. Three of the keypoints, 202, 206, 208 are regular keypoints and one of the keypoints 204 is a floating keypoint. The floating keypoint is at a defined location, such as around two to five centimeters above a center of the back of the hand, where the back of the hand is opposite the palm of the hand. The regular keypoints are at defined locations such as on the knuckle where the little finger joins the hand (see keypoint 202), on the knuckle where the forefinger joins the hand (see keypoint 206), in the center of the wrist where the wrist joins the hand (see keypoint 208).” [0036]),
	performing a Point-n-Perspective calculation using the key points on the human hand in the images (“The predicted 2D positions are input to the pose detector 104 which computes the pose (i.e. the position and orientation) of the object using the predicted 2D positions. In some cases the pose detector uses a closed form solution 306 to compute the pose such as by using a well-known perspective number point (PnP) algorithm as explained below. In some cases the pose detector uses a optimization 308 to compute the pose. A PnP algorithm takes a plurality, n, of 3D points in a reference frame of the object together with their corresponding 2D image projections. In addition, the PnP algorithm knows the intrinsic camera parameters such as the camera focal length, principal image point and skew parameter. The task of the PnP algorithm is to find the values of the matrix R and vector T (which express the rotation and translation of the object to convert it from object coordinates to world coordinates, and which give the pose of the object in world coordinates i.e. its position and orientation) from the following well known perspective projection model for cameras” [0042-0043])
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Fitzgibbon for identifying 3-D positions of keypoints of an object captured by a 2-D image (“there is an apparatus for detecting position and orientation of an object. The apparatus comprises a memory storing at least one frame of captured sensor data depicting the object. The apparatus also comprises a trained machine learning system configured to receive the frame of the sensor data and to compute a plurality of two dimensional positions in the frame. Each predicted two dimensional position is a position of sensor data in the frame depicting a keypoint, where a keypoint is a pre-specified 3D position relative to the object. At least one of the keypoints is a floating keypoint depicting a pre-specified position relative to the object, lying inside or outside the object's surface. The apparatus comprises a pose detector which computes the three dimensional position and orientation of the object using the predicted two dimensional positions and outputs the computed three dimensional position and orientation.” See at least [0006]).
Fitzgibbon also does not explicitly teach
and the previously determined true lengths of the plurality of segments of the digits of the human hand, and calculating a three-dimensional pose of the plurality of segments.
However, Iqbal teaches
and the previously determined true lengths of the plurality of segments of the digits of the human hand, and calculating a three-dimensional pose of the plurality of segments (“the 3D pose reconstruction system 100 may be used to reconstruct the 3D pose from the normalized 2.5D representation of the pose. FIG. 1B illustrates a conceptual diagram of a scaled pose, in accordance with an embodiment. In one embodiment, a scaled pose refers to a scale normalized pose 105 of a hand having a length C for a bone between a pair of keypoints n and m. The scale normalized pose 105 is defined by 2.5D keypoint locations that, when processed by the 3D pose reconstruction unit 110 produces a scale normalized 3D pose {circumflex over (P)}.” See at least Col. 5, lines 26-36 and fig. 1B. Examiner Interpretation: The hand length C is a previously determined true length).
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Iqbal to estimate a 3D pose of a human hand with the use of a single 2D camera for the use of human-computer interaction while reducing the impact of the hand’s appearance variation, complex poses, and self-occlusions (“Estimating a 3D pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is useful for human-computer interaction. Hand pose can be represented by a fixed set of points in 3D space, usually joints, called landmarks or keypoints. Estimating the 3D pose accurately is a difficult task due to the large amounts of appearance variation, self-occlusions, and complexity of articulated hand poses. 3D hand pose estimation escalates the difficulties even further because a depth of each of the hand keypoints also has to be estimated. Conventional techniques for determining locations of the landmarks of a hand in 3D space include one or more of multi-view camera systems, depth sensors, and color markers/gloves. Each of the conventional techniques requires a constrained environment and/or specialized equipment. Furthermore, environmental conditions such as sunlight, occlusions, and complexity of non-rigid hand poses present challenges to landmark detection and determination. There is a need for addressing these issues and/or other issues associated with the prior art.” See at least Col. 1, lines 21-40.)
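For illustration only, the perspective projection model on which a PnP calculation relies can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Fitzgibbon, Iqbal, or the application. It assumes a pinhole camera with a known intrinsic matrix `K` and no lens distortion, and it treats `R` and `T` as the rotation and translation from object coordinates to camera coordinates. The names `project_points` and `reprojection_error` are hypothetical.

```python
import numpy as np

def project_points(K, R, T, pts_obj):
    """Project (N,3) object-frame points to (N,2) pixel coordinates."""
    pts_obj = np.asarray(pts_obj, dtype=float)
    cam = pts_obj @ np.asarray(R, dtype=float).T + np.asarray(T, dtype=float)
    uvw = cam @ np.asarray(K, dtype=float).T
    return uvw[:, :2] / uvw[:, 2:3]  # divide by depth (assumed positive)

def reprojection_error(K, R, T, pts_obj, pts_img):
    """Root-mean-square pixel distance between projected and observed points."""
    diff = project_points(K, R, T, pts_obj) - np.asarray(pts_img, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))
```

A PnP solver searches for the `R` and `T` that drive this error toward zero. In the context of Claim 22, the object-frame points would be placed using the previously determined true segment lengths, and the image points would be the key points detected on the hand; that pairing is an assumption of this sketch.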

Claims 15-16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Boca (US 20150314442 A1) in view of Butler (US 20160307032 A1), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), JETTÉ (US 20190210217 A1), and Sager (US 5040056 A).

Regarding Claim 15,
Boca teaches
A method for programming a robot to perform an operation by human demonstration, said method comprising (“There is described below the use of hand gestures to teach a path to be followed by the industrial robot 12 in performing work on workpiece 14.” [0021]):
demonstrating the operation on a workpiece by a human hand (“the instructions to the robot 12 that will be assembled from the hand gestures from the one or two hands seen by the camera and as described herein the object being pointed to, that is the scene data to create the path and instructions to be followed by the robot 12 when the robot performs work on the workpiece 14. For example, one hand is used to teach a robot target and the other hand is used to generate a grab or drop instruction. It is up to the robot operator to associate a particular hand gesture with a particular instruction.” [0023]; Examiner Interpretation: Despite the human hand not physically touching the workpiece in the demonstration, the operation to be performed on the workpiece is demonstrated by the human hand.);
analyzing camera images of the hand demonstrating the operation on the workpiece to create demonstration data (“the image of the location pointing hand gesture of step 304 and the associated location on the object are captured by the camera 11 and sent to the computation device 13 for processing. At step 308, the computation device 13 calculates from the image the corresponding location and orientation of the robot tool in the robot scene.” See at least [0034]; Examiner Interpretation: The location of the robot tool corresponding to the taught locations is the demonstration data.),
determining a position and orientation of the gripper (“the hand and finger location and orientation can be used to calculate the corresponding location and orientation of the robot tool in the robot scene.” [0038]; “location, orientation and associated action can be sent to the robot individually or all at once at the end of the teaching process; … with the image of the scene the part can be recognized and then the processing of the gesture has to be in relationship to the part; the robot targets can be defined relative to a part coordinate system” [0061-0064]);
generating robot motion commands, by a robot controller having a processor and memory, based on the demonstration data … to cause the robot to perform the operation on the new workpiece (“At step 514, the identified gesture is stored in the memory of the computation device 13 or in the absence of such a device in the memory of the robot controller 15.” [0050]; “creating robot instructions from the gestures by using the gesture context to the scene data from the same image or as additional data or extra processing to calculate/generate robot instructions (step 704 and optional step 706), storing the created instructions (step 708), asking if more created instructions are needed (step 710) and in step 712 sending the created instructions to the robot if no more created instructions are needed and performing in FIG. 7b all of the steps shown in FIG. 7a except the step 712 of sending the created instructions to the robot. The optional step 706 in these flowcharts of providing the scene 3D model to convert the gesture to a robot instruction step 704 is only needed if the scene will be subtracted from the image of the gesture.” [0052]; “In general a robot move instruction has information about the robot tool and coordinate system used for the robot target” [0057]),
and the position and orientation of the gripper relative to the workpiece contained in the demonstration data (“the hand and finger location and orientation can be used to calculate the corresponding location and orientation of the robot tool in the robot scene.” [0038]; “location, orientation and associated action can be sent to the robot individually or all at once at the end of the teaching process; … with the image of the scene the part can be recognized and then the processing of the gesture has to be in relationship to the part; the robot targets can be defined relative to a part coordinate system” [0061-0064]);
and performing the operation on the new workpiece by the robot (“that is the scene data to create the path and instructions to be followed by the robot 12 when the robot performs work on the workpiece 14.” [0023]; “By work is meant those actions performed by a robot such as painting, grinding, polishing, deburring, welding etc. that make a physical change to the workpiece and those interactions that a robot has with a workpiece such as picking up the workpiece from one location and moving it to another location or inserting the workpiece into a specific location that does not physically change the workpiece.” [0003]).

Boca does not explicitly teach
including analyzing image pixel data of the hand, from a camera, to identify tip, base knuckle and second knuckle points on a thumb and forefinger of the hand, applying pixel depth data from the camera to compute three-dimensional (3D) coordinates of the points on the thumb and forefinger identified in the pixel data, 
using the 3D coordinates of the points on the thumb and forefinger to compute a hand coordinate frame and a corresponding gripper coordinate frame, 
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper,
gripper coordinate frame
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece, … and the initial position and orientation of the new workpiece,
including adjusting the initial position and orientation of the new workpiece based on a conveyor position index;
including motion commands causing the gripper to move to a grasping position and orientation based on the initial position and orientation of the new workpiece

However, Butler teaches
analyzing image pixel data of the hand, from a camera, to identify tip, base knuckle and second knuckle points on a thumb and forefinger of the hand, applying pixel depth data from the camera to compute three-dimensional (3D) coordinates of the points on the thumb and forefinger identified in the pixel data (“An example provides an image processing method comprising receiving, from an infrared (IR) camera, a signal encoding an IR image including a plurality of IR pixels, each IR pixel specifying one or more IR parameters of that IR pixel, identifying, in the IR image, IR-skin pixels that image a human hand, for each IR-skin pixel, estimating a depth of a human hand portion imaged by that IR-skin pixel based on the IR parameters of that IR-skin pixel, and deriving a skeletal hand model including a plurality of hand joints, each hand joint defined with three independent position coordinates inferred from the estimated depths of each human hand portion. In such an example, deriving the skeletal hand model alternatively or additionally includes assembling a depth map including a depth for each IR-skin pixel, and identifying, based on one or both of the IR-skin pixels and the depth map, one or more anatomical features of the human hand. In such an example, deriving the skeletal hand model alternatively or additionally includes estimating, based on one or both of the anatomical features and the depth map, a position of a joint of the human hand.” [0084]; See fig. 5 for the identified tip, base knuckle and second knuckle points on a thumb and forefinger of the hand.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca to further include the teachings of Butler to use human hand gestures for control of a computer application (see at least [0013]).
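For illustration only, the step of applying pixel depth data to obtain 3D coordinates of an identified hand point can be sketched with a standard pinhole camera back-projection. This is a minimal sketch under stated assumptions, not the method of Butler or of the application: the function name and the intrinsic parameters fx, fy, cx, and cy are hypothetical and do not appear in any cited reference.

```python
def pixel_to_3d(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with a measured depth into camera coordinates.

    Assumes an ideal pinhole camera: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. All names here are illustrative.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# A fingertip pixel at the principal point lies on the optical axis.
tip_3d = pixel_to_3d(320, 240, 1.0, 600.0, 600.0, 320.0, 240.0)
```

Each identified point (tip, base knuckle, second knuckle) would be converted the same way, one pixel at a time.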

Butler also does not explicitly teach
using the 3D coordinates of the points on the thumb and forefinger to compute a hand coordinate frame and a corresponding gripper coordinate frame, 
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper,
gripper coordinate frame
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece, … and the initial position and orientation of the new workpiece,
including adjusting the initial position and orientation of the new workpiece based on a conveyor position index;
including motion commands causing the gripper to move to a grasping position and orientation based on the initial position and orientation of the new workpiece

However, Kofman teaches
A hand and gripper coordinate frame and using the 3D coordinates of the points on the thumb and forefinger to compute a hand coordinate frame and a corresponding gripper coordinate frame (Fig. 3 shows the hand coordinate frame with the X axis passing through the midway point (M) between the thumb (T) and index finger (I) fingertips. This coordinate frame corresponds to the robot coordinate frame. See at least Pg. 4, Col. 2 and Fig. 2(b) and Fig. 3).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca and Butler to further include the teachings of Kofman in order to remotely control a robot based on the position and orientation of a human operator’s hand in a demonstration of the operation, without the restraints of sensors and wires on the human hand. See at least the introduction on Pgs. 1-2.
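For illustration only, the hand coordinate frame geometry relied upon above (X axis from the wrist through the midpoint M of the thumb and index fingertips, with the wrist, thumb, and index points lying in the XY plane and the thumb on the positive-Y side) can be sketched as follows. This is a minimal sketch under stated assumptions: it builds the axes directly with cross products rather than through the yaw, pitch, and roll rotations Kofman describes, and the function and variable names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(v):
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("degenerate hand points: zero-length axis")
    return tuple(c / n for c in v)


def hand_frame(wrist, thumb, index):
    """Return (origin, x_axis, y_axis, z_axis) from three 3D hand points."""
    mid = tuple((t + i) / 2.0 for t, i in zip(thumb, index))  # point M
    x_axis = _unit(_sub(mid, wrist))                  # X along W -> M
    z_axis = _unit(_cross(x_axis, _sub(thumb, wrist)))  # normal to plane W, T, I
    y_axis = _cross(z_axis, x_axis)                   # thumb has positive Y
    return wrist, x_axis, y_axis, z_axis


origin, x_axis, y_axis, z_axis = hand_frame((0, 0, 0), (1, 1, 0), (1, -1, 0))
```

A corresponding gripper frame could then be assigned the same origin and axes, or a fixed offset from them, depending on the gripper used.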

Kofman also does not explicitly teach
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper,
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece, … and the initial position and orientation of the new workpiece,
including adjusting the initial position and orientation of the new workpiece based on a conveyor position index;
including motion commands causing the gripper to move to a grasping position and orientation based on the initial position and orientation of the new workpiece
However, JETTÉ teaches
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper (“In this specific embodiment, the vacuum cup is made of a flexible, resilient material, and the relative distance between the robot and the workpiece held by the robot can vary based on this flexibility and operating conditions. Such variations in the relative distance between a given robot and the workpiece it holds was a source of positioning uncertainty in the reference frame of the robots. This gripper type was found to provide satisfactory gripping capability in the embodiment shown in FIG. 1, but it will be understood that other gripper types can be used in other embodiments. Moreover, more than one gripper, possibly of different gripper types, can be used as the end effector per robot if desired. For instance, a clamp gripper can be used in addition to a vacuum cup for a given robot, or for all robots, for instance. The gripper type or types can vary from one robot to another within a given workpiece holding system embodiment. Indeed, the exact type of gripper can be selected from the following general categories: impactive—e.g. jaws, clamps or claws which physically grasp by direct impact upon the object; ingressive—pins, needles or hackles which physically penetrate the surface of the object (e.g. an aperture or bore of the workpiece); astrictive—forces applied to the objects surface (e.g. by vacuum, magneto- or electroadhesion); and contigutive—requiring direct contact for adhesion to take place (e.g. surface tension or freezing).” [0043-0044])
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca, Butler, and Kofman to further include the teachings of JETTÉ so that gripper type can vary as needed for different application requirements (see at least [0043-0044]).

JETTÉ also does not explicitly teach
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece, … and the initial position and orientation of the new workpiece,
including adjusting the initial position and orientation of the new workpiece based on a conveyor position index;
including motion commands causing the gripper to move to a grasping position and orientation based on the initial position and orientation of the new workpiece
However, Sager teaches
Analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece and generating robot motion commands based on the initial position and orientation of the new workpiece (“This invention provides a method and apparatus which uses a vision-equipped robotic system to locate, identify and determine the orientation of objects, and to pick them up and transfer them to a moving or stationary destination.” Col. 1, lines 50-54; Examiner Interpretation: The determined locations and orientations of the objects are initial locations and orientations because they are in that position before being picked up. They are new workpieces since they are different from the demonstrated workpiece.).
Adjusting the initial position and orientation of the new workpiece based on a conveyor position index (“A video camera periodically records images of objects located on a moving conveyor belt. The images are identified and their position and orientation is recorded in a moving conveyor belt coordinate system. The information is transmitted to a motion control device associated with a first robot. The motion control device coordinates the robot with the moving belt coordinate system and instructs the robot's arm to pick up certain objects” See at least Col. 1, lines 50-62; “In determining whether an object has moved past the pick-up window or has not yet moved into the pick-up window, the motion controller considers the time it would take the robot arm to move from its current position to the location of the object and the distance the object would travel on the belt during that time. In other words, an object that is in the pick-up window when the robot is prepared to pick it up, may move out of the pick-up window by the time the robot can reach it. The motion controller considers that movement and will not attempt to pick it up.” Col. 7, lines 1-11; Examiner Interpretation: The moving conveyor belt coordinate system is the conveyor position index and adjusting the initial position and orientation is done by accounting for the movement of the objects).
Motion commands causing the gripper to move to a grasping position and orientation based on the initial position and orientation of the new workpiece (“The motion controller will go ahead and direct the robot to pick-up that object after accounting for its movement during the time it takes for the robot to reach it.” See at least Col. 7, lines 14-17; “the motion controller considers the time it would take the robot arm to move from its current position to the location of the object” Col. 7, lines 2-5; “The motion controller considers the object orientation in picking up the object” Col. 7, lines 60-61).
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of Boca, Butler, Kofman, and JETTÉ to further include the teachings of Sager to increase the flexibility of the robot to pick up randomly positioned objects (“known methods and apparatus are generally not effective with randomly positioned and randomly oriented objects. This is typically the case with objects that are deposited onto a conveyor belt, such as flat components that are asymmetrical about at least one axis. For these parts, the system must locate them on the moving conveyor belt and also determine their orientation. This requires a relatively sophisticated vision system.” See at least Col. 1, lines 17-24).
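For illustration only, the conveyor compensation described in the Sager passages (accounting for the distance an object travels on the belt during the time the arm needs to reach it, and declining to pick an object that would leave the pick-up window) can be sketched as follows. This is a minimal sketch under stated assumptions: a constant belt speed along one axis is assumed, and all function names, parameters, and numeric values are hypothetical rather than taken from Sager.

```python
def predict_pick_x(x_at_detection, belt_speed, elapsed, reach_time):
    """Predict belt-direction position when the gripper arrives.

    x_at_detection: position along the belt when the image was taken
    belt_speed:     assumed constant belt speed (distance per second)
    elapsed:        seconds since the image was taken
    reach_time:     estimated seconds for the arm to reach the object
    """
    return x_at_detection + belt_speed * (elapsed + reach_time)


def should_pick(x_predicted, window_start, window_end):
    """Pick only if the object will still be inside the pick-up window."""
    return window_start <= x_predicted <= window_end


x_pred = predict_pick_x(0.10, 0.2, 1.0, 0.5)
pick = should_pick(x_pred, 0.3, 0.6)
```

The orientation of the object is unchanged by straight-line belt travel in this sketch, so only the position is adjusted.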

Regarding Claim 16,
Modified Boca teaches
The method according to Claim 15
Boca further teaches
	wherein demonstrating the operation on the workpiece by the human hand and performing the operation on the new workpiece by the robot are both performed in a robotic work cell, and the camera provides the camera images of the hand demonstrating the operation on the workpiece (“There is described below the use of hand gestures to teach a path to be followed by the industrial robot 12 in performing work on workpiece 14. As shown in FIG. 2a, an operator 16 uses hand gestures to point to a location in the robot workspace. The camera, which is a 3D vision sensor 11, is attached to the robot and takes an image of the hand gesture and the relationship of the operator's hand 16a to the workpiece 14. It should be appreciated that the workpiece 14 while shown in FIG. 2a may not be in the view seen by the camera. The workpiece image may have been taken at a different time and as is described below the image of the hand without the workpiece and the workpiece without the hand need to be referenced to a common coordinate system. FIG. 2b shows one example of the image and the relationship of the operator's hand 16a to the workpiece 14.” See at least [0021-0022] and figs. 2a and 2b.; Examiner Interpretation: The demonstration is performed in the robotic workcell when hand gestures are pointed at a location within the robotic workspace. From Fig. 2a, you can see the hand within the workspace of the robot. Despite the human hand not physically touching the workpiece in the demonstration, the operation to be performed on the workpiece is demonstrated by the human hand.).

Boca does not explicitly teach
and the camera images of the new workpiece
However, Sager teaches
“This invention provides a method and apparatus which uses a vision-equipped robotic system to locate, identify and determine the orientation of objects, and to pick them up and transfer them to a moving or stationary destination. A video camera periodically records images of objects located on a moving conveyor belt.” Col. 1, lines 50-56; Examiner Interpretation: The determined locations and orientations of the objects are initial locations and orientations because they are in that position before being picked up. They are new workpieces since they are different from the demonstrated workpiece.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified Boca to further include the teachings of Sager to increase the flexibility of the robot to pick up randomly positioned objects (“known methods and apparatus are generally not effective with randomly positioned and randomly oriented objects. This is typically the case with objects that are deposited onto a conveyor belt, such as flat components that are asymmetrical about at least one axis. For these parts, the system must locate them on the moving conveyor belt and also determine their orientation. This requires a relatively sophisticated vision system.” See at least Col. 1, lines 17-24).

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claim 1 is rejected on the ground of obviousness type nonstatutory double patenting as being unpatentable over claim 7 of U.S. Patent No. 11130236 in view of Yusuke (translated JP2015044257A), Kofman (IDS: “Teleoperation of a robot manipulator using a vision-based human-robot interface”), and JETTÉ (US 20190210217 A1).
Claim | This application’s claim elements | Claim elements of patent no. 11130236
1 | A method for programming a robot to perform an operation by human demonstration, said method comprising: | (claim 1) generate and optimize a movement path of the robot based on the extracted movement path of the fingers or the arms of the human
1 | demonstrating the operation on a workpiece by a human hand; analyzing camera images of the hand demonstrating the operation on the workpiece, | (claim 1) process time-varying images of a first workpiece and fingers or arms of a human working on the first workpiece,
1 | by a computer, | (claim 1) apparatus comprising: a processor
1 | to create demonstration data, | (claim 1) and thereby extract a movement path of the fingers or the arms of the human; (claim 7) the processor is configured to extract positions of the feature points of the second workpiece at predetermined intervals, wherein the robot controller is configured to: update an equation of motion of each feature point at the predetermined intervals based on the extracted positions of the feature points; calculate a position or posture of the second workpiece based on each feature point position calculated from the corresponding updated equation of motion;
1 | analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; | (claim 1) generate a transform function for transformation from the first workpiece to a second workpiece based on feature points of the first workpiece and feature points of the second workpiece, the second workpiece being worked on by a robot; (claim 3) The robot movement teaching apparatus according to claim 1, further comprising: a second visual sensor configured to capture an image of the second workpiece; and wherein the processor is configured to extract the feature points of the second workpiece from the image obtained by the second visual sensor. (claim 7) wherein the processor is configured to extract positions of the feature points of the second workpiece at predetermined intervals, wherein the robot controller is configured to: update an equation of motion of each feature point at the predetermined intervals based on the extracted positions of the feature points; calculate a position or posture of the second workpiece based on each feature point position calculated from the corresponding updated equation of motion;
1 | generating robot motion commands, based on the demonstration data and the initial position and orientation of the new workpiece, to cause the robot to perform the operation on the new workpiece; | (claim 1) and generate and optimize a movement path of the robot based on the extracted movement path of the fingers or the arms of the human and based on the generated transform function, wherein optimization of the movement path of the robot comprises minimizing a value of a function of a movement time for the robot to move along the generated movement path and an acceleration of each shaft of the robot. (claim 7) and control the robot based on the calculated position or posture of the second workpiece
1 | and performing the operation on the new workpiece by the robot. | (claim 7) cause the robot to follow the second workpiece.


Patent 11130236 does not explicitly teach
where the demonstration data defines a pick, move and place operation including a grasping step where hand pose and workpiece pose are determined when the hand grasps the workpiece, and a place step where the workpiece pose is determined when the workpiece becomes stationary after the move step, 
where the demonstration data includes a hand coordinate frame and a gripper coordinate frame corresponding to the hand coordinate frame, 
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper;
However, Yusuke teaches
create demonstration data where the demonstration data defines a pick, move and place operation including a grasping step where hand pose and workpiece pose are determined when the hand grasps the workpiece, a move step …, and a place step where the workpiece pose is determined when the workpiece becomes stationary after the move step … and a gripper coordinate frame corresponding to the hand (“The control unit 16 assumes that the user's two fingers (hereinafter, the user's (human) finger is referred to as a “finger” and distinguished from the finger unit 14 or the finger members 14A and 14B of the robot 10) grips the workpiece. When recognized, the position of the finger in the work coordinate system is recognized (step S33). That is, the control unit 16 recognizes that the finger has gripped the workpiece from the finger position and the workpiece position, and converts the finger position at that time to the workpiece coordinate system. The control unit 16 stores the position in the workpiece coordinate system at this time in the storage unit 17 as gripping position information. Next, the user holding the workpiece with two fingers moves the workpiece to a desired location and releases the workpiece (releases the finger from the workpiece). The control unit 16 recognizes that the finger has moved away from the workpiece based on the image from the camera 15, recognizes the position and posture of the workpiece in the robot coordinate system at that time as the position and posture of the movement destination of the workpiece, is stored (step S34). With the above operation, the control unit 16 stores information on the position of the workpiece in the workpiece coordinate system and the position and posture of the workpiece in the robot coordinate system. As a result, preparation for instructing the work to be performed on the work and having the robot 10 perform the work on the work is completed. 
Note that the position and orientation of the movement destination of the workpiece may be recognized based on the position and orientation of the workpiece when it is recognized that the movement of the workpiece has stopped for a predetermined time or more based on the captured image. In the above embodiment, the gripping position of the workpiece is determined based on the gripping position when gripping the workpiece.” See at least page 3, line 48 to page 4, line 14.).
	 It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of patent 11130236 to further include the teachings of Yusuke to quickly and easily teach robots workpiece operations (See at least “problem to be solved” on page 1.).

Yusuke also does not explicitly teach
where the demonstration data includes a hand coordinate frame and a gripper coordinate frame corresponding to the hand coordinate frame, 
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper;
However, Kofman teaches
where the demonstration data includes a hand coordinate frame and a gripper coordinate frame corresponding to the hand coordinate frame (“The orientation of the hand of the operator is used to control the orientation of the robot-manipulator end-effector and is computed from the 3-D coordinates of the centroids of the three hand markers as shown in Fig. 3. Firstly, the midpoint of the line segment joining the thumb and index-finger marker centroids, T and I, respectively, is defined as M (Fig. 3(a)). A coordinate system XoYoZo with origin at wrist W is then defined by a translation of the local-site global reference coordinate system XYZ to the wrist [Fig. 3(b)]. Through yaw, pitch, and roll rotations, explained below, the final axes X3Y3Z3 to be used to determine the tool axes of the robot-end-effector are obtained with X3 collinear with WM, WTI coplanar with X3Y3, and T lying in the first quadrant of X3Y3, as shown in Fig. 3(b). The yaw-pitch-roll tool rotation angles are determined directly from the hand rotation angles of WM and TI as follows: yaw rotation α of coordinate system XoYoZo about Zo to X1Y1Z1, pitch rotation β of X1Y1Z1 about Y1 to X2Y2Z2, shown in Fig. 3(c) using -β for clarity, and roll rotation γ of X2Y2Z2 about X2 to X3Y3Z3, as shown in Fig. 3(d).” See at least Pg. 4, Col. 2, lines 1-20; Fig. 3 shows the hand coordinate frame which corresponds to the robot coordinate frame.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of patent 11130236 and Yusuke to further include the teachings of Kofman regarding corresponding coordinate frames, to remotely control a robot based on position and orientation of a human operator’s hand in a demonstration of the operation without the restraints of sensors and wires on the human hand. See at least the introduction on Pgs. 1-2.

Kofman also does not explicitly teach
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper;
However, JETTÉ teaches
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper (“In this specific embodiment, the vacuum cup is made of a flexible, resilient material, and the relative distance between the robot and the workpiece held by the robot can vary based on this flexibility and operating conditions. Such variations in the relative distance between a given robot and the workpiece it holds was a source of positioning uncertainty in the reference frame of the robots. This gripper type was found to provide satisfactory gripping capability in the embodiment shown in FIG. 1, but it will be understood that other gripper types can be used in other embodiments. Moreover, more than one gripper, possibly of different gripper types, can be used as the end effector per robot if desired. For instance, a clamp gripper can be used in addition to a vacuum cup for a given robot, or for all robots, for instance. The gripper type or types can vary from one robot to another within a given workpiece holding system embodiment. Indeed, the exact type of gripper can be selected from the following general categories: impactive—e.g. jaws, clamps or claws which physically grasp by direct impact upon the object; ingressive—pins, needles or hackles which physically penetrate the surface of the object (e.g. an aperture or bore of the workpiece); astrictive—forces applied to the objects surface (e.g. by vacuum, magneto- or electroadhesion); and contigutive—requiring direct contact for adhesion to take place (e.g. surface tension or freezing).” [0043-0044])
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of patent 11130236, Yusuke, and Kofman to further include the teachings of JETTÉ so that gripper type can vary as needed for different application requirements (see at least [0043-0044]).

Claim 1 is provisionally rejected on the ground of obviousness type nonstatutory double patenting as being unpatentable over claim 13 (filed 9/11/2020) of copending Application No. 17/018674 in view of claim 10 (filed 9/11/2020) and Yusuke (translated JP2015044257A), Boca (US 20150314442 A1), JETTÉ (US 20190210217 A1), and Sager (US 5040056 A).

Claim | This application’s claim elements | Claim elements application 17/018674
1 | A method for programming a robot to perform an operation by human demonstration, said method comprising: | (Claim 12) A method for programming a robot to perform an operation by human demonstration
1 | demonstrating the operation on a workpiece by a human hand; analyzing camera images of the hand demonstrating the operation on the workpiece, by a computer, to create demonstration data, | (Claim 12) demonstrating the operation on workpieces by a human using both hands; analyzing camera images of the hands demonstrating the operation on the workpieces, by a computer, to create demonstration data
1 | where the demonstration data defines a pick, move | (Claim 12) generating robot motion commands, based on the demonstration data (Claim 13) wherein the demonstration data includes, at a grasping step of the operation, position and orientation of a hand coordinate frame, a gripper coordinate frame corresponding to the hand coordinate frame, and a workpiece coordinate frame.
1 | where the demonstration data includes a hand coordinate frame and a gripper coordinate frame corresponding to the hand coordinate frame, | (Claim 13) wherein the demonstration data includes, at a grasping step of the operation, position and orientation of a hand coordinate frame, a gripper coordinate frame corresponding to the hand coordinate frame, and a workpiece coordinate frame.
1 | generating robot motion commands, based on the demonstration data | (Claim 12) generating robot motion commands, based on the demonstration data, to cause the robot to perform the operation on the workpieces
1 | and performing the operation on the new workpiece by the robot. | (Claim 12) and performing the operation on the workpiece by the robot.


17/018674 does not explicitly teach
	place operation
	hand pose and workpiece pose are determined when the hand grasps the workpiece, a move step where hand pose and workpiece pose are determined at a plurality of points defining a move path, and a place step where the workpiece pose is determined when the workpiece becomes stationary after the move step,
	where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper;
	analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; generating robot motion commands, based on … the initial position and orientation of the new workpiece
However, Claim 10 teaches
place operation (“and the gripper poses and workpiece positions and poses are used by the robot teaching program to create workpiece pick-up and placement instructions for a robot.” See claim 10)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of 17/018674 to further include the teachings of claim 10 so that a picked-up workpiece can be set down.

17/018674 claim 10 also does not explicitly teach
hand pose and workpiece pose are determined when the hand grasps the workpiece, a move step where hand pose and workpiece pose are determined at a plurality of points defining a move path, and a place step where the workpiece pose is determined when the workpiece becomes stationary after the move step,
	where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper;
	analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; generating robot motion commands, based on … the initial position and orientation of the new workpiece
However, Yusuke teaches
create demonstration data where the demonstration data defines a pick, move and place operation including a grasping step where hand pose and workpiece pose are determined when the hand grasps the workpiece, a move step …, and a place step where the workpiece pose is determined when the workpiece becomes stationary after the move step … and a gripper coordinate frame corresponding to the hand (“The control unit 16 assumes that the user's two fingers (hereinafter, the user's (human) finger is referred to as a “finger” and distinguished from the finger unit 14 or the finger members 14A and 14B of the robot 10) grips the workpiece. When recognized, the position of the finger in the work coordinate system is recognized (step S33). That is, the control unit 16 recognizes that the finger has gripped the workpiece from the finger position and the workpiece position, and converts the finger position at that time to the workpiece coordinate system. The control unit 16 stores the position in the workpiece coordinate system at this time in the storage unit 17 as gripping position information. Next, the user holding the workpiece with two fingers moves the workpiece to a desired location and releases the workpiece (releases the finger from the workpiece). The control unit 16 recognizes that the finger has moved away from the workpiece based on the image from the camera 15, recognizes the position and posture of the workpiece in the robot coordinate system at that time as the position and posture of the movement destination of the workpiece, is stored (step S34). With the above operation, the control unit 16 stores information on the position of the workpiece in the workpiece coordinate system and the position and posture of the workpiece in the robot coordinate system. As a result, preparation for instructing the work to be performed on the work and having the robot 10 perform the work on the work is completed. 
Note that the position and orientation of the movement destination of the workpiece may be recognized based on the position and orientation of the workpiece when it is recognized that the movement of the workpiece has stopped for a predetermined time or more based on the captured image. In the above embodiment, the gripping position of the workpiece is determined based on the gripping position when gripping the workpiece.” See at least page 3, line 48 to page 4, line 14.),
	 It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of 17/018674 to further include the teachings of Yusuke to quickly and easily teach robots workpiece operations (See at least “problem to be solved” on page 1.).

Yusuke also does not explicitly teach
where hand pose and workpiece pose are determined at a plurality of points defining a move path
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper;
	analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; generating robot motion commands, based on … the initial position and orientation of the new workpiece
However, Boca teaches
a move step where hand pose and workpiece pose are determined at a plurality of points defining a move path (“At step 310 the calculated location and orientation of the robot tool are sent to the computation device. Query 312 asks if more location points are needed to complete the robot path. Query 312 can be another gesture. If the answer is yes, the method 300 asks at query 314 if there is a need to reposition the camera. If the answer to query 314 is no, then the method 300 returns to step 304 where the operator makes the hand gesture associated with the next location point. While not shown in FIG. 3, if the answer to query 314 is yes, then the method 300 returns to step 302 where the camera is repositioned. If the answer to query 312 is no, then method 300 ends since no more robot path points have to be acquired.” [0039]; Examiner Interpretation: Hand pose also corresponds to workpiece pose (see [0024]).)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of 17/018674 and Yusuke to further include the teachings of Boca to easily train a robot motion with basic hand demonstrations (see at least [0004]).

Boca also does not explicitly teach
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper;
	analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; generating robot motion commands, based on … the initial position and orientation of the new workpiece
However, JETTÉ teaches
where the gripper coordinate frame represents a gripper type selected from a group including a finger-type gripper and a vacuum-type gripper (“In this specific embodiment, the vacuum cup is made of a flexible, resilient material, and the relative distance between the robot and the workpiece held by the robot can vary based on this flexibility and operating conditions. Such variations in the relative distance between a given robot and the workpiece it holds was a source of positioning uncertainty in the reference frame of the robots. This gripper type was found to provide satisfactory gripping capability in the embodiment shown in FIG. 1, but it will be understood that other gripper types can be used in other embodiments. Moreover, more than one gripper, possibly of different gripper types, can be used as the end effector per robot if desired. For instance, a clamp gripper can be used in addition to a vacuum cup for a given robot, or for all robots, for instance. The gripper type or types can vary from one robot to another within a given workpiece holding system embodiment. Indeed, the exact type of gripper can be selected from the following general categories: impactive—e.g. jaws, clamps or claws which physically grasp by direct impact upon the object; ingressive—pins, needles or hackles which physically penetrate the surface of the object (e.g. an aperture or bore of the workpiece); astrictive—forces applied to the objects surface (e.g. by vacuum, magneto- or electroadhesion); and contigutive—requiring direct contact for adhesion to take place (e.g. surface tension or freezing).” [0043-0044])
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of 17/018674, Yusuke, and Boca to further include the teachings of JETTÉ so that gripper type can vary as needed for different application requirements (see at least [0043-0044]).

JETTÉ also does not explicitly teach
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; generating robot motion commands, based on … the initial position and orientation of the new workpiece
However, Sager teaches
analyzing camera images of a new workpiece to determine an initial position and orientation of the new workpiece; generating robot motion commands, based on … the initial position and orientation of the new workpiece (“This invention provides a method and apparatus which uses a vision-equipped robotic system to locate, identify and determine the orientation of objects, and to pick them up and transfer them to a moving or stationary destination.” Col. 1, lines 50-54; Examiner Interpretation: The determined locations and orientations of the objects are initial locations and orientations because the objects are in that position before being picked up. They are new workpieces since they are different from the demonstrated workpiece.)
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of 17/018674, Yusuke, Boca, and JETTÉ to further include the teachings of Sager to increase the flexibility of the robot to pick up randomly positioned objects (“known methods and apparatus are generally not effective with randomly positioned and randomly oriented objects. This is typically the case with objects that are deposited onto a conveyor belt, such as flat components that are asymmetrical about at least one axis. For these parts, the system must locate them on the moving conveyor belt and also determine their orientation. This requires a relatively sophisticated vision system.” See at least Col. 1, lines 17-24).

Claim 7 is provisionally rejected on the ground of obviousness type nonstatutory double patenting as being unpatentable over claims 13 (filed 9/11/2020) of copending Application No. 17/018674 in view of claims 7 and 10 (filed 9/11/2020) and Yusuke (translated JP2015044257A), Boca (US 20150314442 A1), JETTÉ (US 20190210217 A1), and Sager (US 5040056 A).

Regarding Claim 7,
Modified 17/018674 teaches
The method according to Claim 1
17/018674 Claims 12 and 13 do not teach
wherein analyzing camera images of the hand demonstrating the operation includes identifying locations of a plurality of points on the hand, including a tip, a base knuckle and a second knuckle of each of a thumb and a forefinger.
However, Claim 7 of 17/018674 teaches
Claim: 7
This application’s claim elements: wherein analyzing camera images of the hand demonstrating the operation includes identifying locations of a plurality of points on the hand, including a tip, a base knuckle and a second knuckle of each of a thumb and a forefinger.
Claim elements of application 17/018674: (Claim 1) analyzing the sub-images by the second neural network to determine three-dimensional (3D) coordinates of a plurality of key points on the left and right hands; (Claim 7) The method according to Claim 1 wherein the plurality of key points on the left and right hands include thumb tips, thumb knuckles, finger tips and finger knuckles.

	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to modify the teachings of modified 17/018674 to further include the teachings of claim 7 of 17/018674 so that precise hand gestures, which a human can readily perform, can be used to control the robot precisely.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Handa (US 20210086364 A1) is pertinent because it discusses a teleoperated robot hand involving corresponding hand and robot coordinate frames.

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Karston G Evans whose telephone number is (571)272-8480. The examiner can normally be reached Mon-Fri 9:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abby Lin, can be reached at (571)270-3976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/K.G.E./Examiner, Art Unit 3664
/ABBY Y LIN/Supervisory Patent Examiner, Art Unit 3664