United States Patent and Trademark Office
    
        
            
                                
            
        
    

Commissioner for Patents
United States Patent and Trademark Office
P.O. Box 1450
Alexandria, VA 22313-1450
www.uspto.gov











BEFORE THE PATENT TRIAL AND APPEAL BOARD


Application Number: 16/038,248
Filing Date: July 18, 2018
Appellant(s): Matthew Amacker et al.



__________________
Bryan L. Walker (Reg. No. 65,342)
For Appellant


EXAMINER’S ANSWER





This is in response to the appeal brief filed 03/01/2021.

(1) Grounds of Rejection to be Reviewed on Appeal
Every ground of rejection set forth in the Office action dated November 17, 2020 from which the appeal is taken is being maintained by the examiner except for the grounds of rejection (if any) listed under the subheading “WITHDRAWN REJECTIONS.”  New grounds of rejection (if any) are provided under the subheading “NEW GROUNDS OF REJECTION.”

The following ground(s) of rejection are applicable to the appealed claims.
Claims 1, 3-4, 8-13 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Hoffman et al. (9,283,674; IDS) in view of Grossmann et al. (2014/0064607).



Summary of the teachings of the references for the independent claims
The following summarizes the teachings of the references that are relevant to the features of independent claims 1, 8 and 16.

Teaching from Hoffman
Hoffman teaches that a method of operating a robot includes electronically receiving images and augmenting the images by overlaying a representation of the robot on the images.  The robot representation includes user-selectable portions.  The method includes electronically displaying the augmented images and receiving an indication of a selection of at least one user-selectable portion of the robot representation.  The method also includes electronically displaying an intent to command the selected at least one user-selectable portion of the robot representation, receiving an input representative of a user interaction with at least one user-selectable portion, and issuing a command to the robot based on the user interaction. (Hoffman: Abstract).
Referring to FIGS. 3A and 3B, in some implementations, a mobile robot 300 includes a robot body 310 (or chassis) that defines a forward drive direction F. The robot 300 also includes a drive system 315 and a sensor system 500, each supported by the robot body 310 and in communication with a controller 400 that coordinates operation and movement of the robot 300.  (Hoffman: c.9 L.31-37).
Therefore, the controller 400 controls the operation and movement of the robot 300 with a drive system 315 and a sensor system 500.
In some implementations, the robot 200 includes a controller 400 in communication with the drive system 215, 315 and any arm(s) 250, 250a, 250b and head(s) 260, 360 or gripper(s) 270 mounted on the arm(s) 250, 250a, 250b.  The controller 400 may issue drive commands to one or more motors driving the main tracks 220 and the flipper tracks 240, 240a, 240b or the legs 330.  Moreover, the controller 400 may issue rotational commands to a flipper motor to rotate the flippers 230 about the drive axis 15.  The controller 400 may include one or more computer processors and associated memory systems.  The controller 400 of the robot 200 may include a communication system 482, which includes, for example, a radio to communicate with a remote OCU 100 to receive commands and issue status and/or navigation information. (Hoffman: c.10 L.66-67 and c.11 L.1-12).
Thus, the controller controls the operation of the drive system, arm, head and gripper(s) by issuing commands to the motors of the respective drive system, arm, head and gripper.
In some implementations, the sensor system 500 includes an array of proximity sensors, one or more cameras 218, 219, 262 (e.g., stereo cameras, visible light camera, infrared camera, thermography, etc.), and/or one or more 3-D imaging sensors (e.g., volumetric point cloud imaging device) in communication with the controller 400 and arranged in one or more zones or portions of the robot 200 for detecting any nearby or intruding obstacles.  The proximity sensors may be converging infrared (IR) emitter-sensor elements, sonar sensors, and/or ultrasonic sensors that provide a signal to the controller 400 when an object 114 is within a given range of the robot 200.  If any of the sensors has a limited field of view, the controller 400 or the sensor system 500 can actuate the sensor in a side-to-side scanning manner to create a relatively wider field of view to perform robust ODOA. (Hoffman: c.24 L.36-51).
In some implementations, reasoning or control software, executable on a processor (e.g., of the robot controller 400), uses a combination of algorithms executed using various data types generated by the sensor system 500.  The reasoning software processes the data collected from the sensor system 500 and outputs data for making navigational decisions on where the robot 200 can move without colliding with an obstacle, for example.  By accumulating imaging data over time of the robot's surroundings, the reasoning software can in turn apply effective methods to selected segments of the sensed image(s) to improve simultaneous localization and mapping (SLAM). (Hoffman: c.24 L.51-62).
Therefore, the motors of the drive system, …, and gripper and the sensors, cameras and imaging system are controlled with commands from the controller 400. 

Teaching from Grossmann
Grossmann teaches methods, systems, and computer program products to warp a depth map into alignment with an image, where the image sensor (e.g., camera) responsible for the image and depth sensor responsible for an original depth map are separated in space. In an embodiment, the warping of the depth map may be started before the original depth map has been completely read. Moreover, data from the warped depth map may be made available to an application before the entire warped depth map has been completely generated. Such a method and system may improve the speed of the overall process and/or reduce memory requirements.  (Grossmann: Abstract).
Depth maps and images together may constitute the primary input of many applications, such as video surveillance, video games (e.g. the Microsoft Kinect), hand gesture interpretation and other applications that take input unobtrusively from an un-instrumented user. Other related applications that take depth maps and images as input may include those that analyze the 3D environment around a sensor, for instance for autonomous control of a robot or vehicle or a safety monitoring system. (Grossmann: [0002]).
In some cases, the design of such applications may be easier if the depth map and image are registered or aligned, in the sense that the depth map is, or made to appear to be, produced by a depth sensor that is placed at the same physical location as the imaging sensor that produced the image.  When this is the case, the pixels of the depth map may be put into correspondence with the pixels of the image, and vice-versa. (Grossmann: [0003]).
Furthermore, Fig. 1 illustrates the results of warping a depth map according to an embodiment.  An original depth map is shown at (a), and an image taken with a nearby camera is shown at (b).  The result of directly overlaying the depth map (a) over the image (b) is shown at (c).  Note the misalignment between the depth map and image.  The overlay of a warped version of depth map (a) on the image (b) is shown at (d).  Note the smaller gaps and other small artifacts, but also that the depth map and image are relatively well aligned. (Grossmann: [0023] L.2-9).
Therefore, a warped depth map aligned with an image is generated with a reduced memory requirement.  The warped depth map overlaying an image is taken as a virtual representation of the object.
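For illustration only, the registration of a depth map to an image described in Grossmann ([0003] and Fig. 1) can be sketched as follows. This is a minimal sketch under a pinhole-camera assumption and is not Grossmann's disclosed implementation: the function name `warp_depth_to_image`, the intrinsic matrices `K_depth` and `K_image`, and the rotation `R` and translation `t` between the two sensors are hypothetical, and the sketch processes the whole depth map at once rather than starting before the depth map has been completely read.

```python
import numpy as np

def warp_depth_to_image(depth, K_depth, K_image, R, t, out_shape):
    """Re-project a depth map so it appears taken from the image sensor.

    All parameter names are illustrative assumptions, not taken from the references.
    depth: (h, w) array of depths in the depth sensor frame; 0 means no reading.
    K_depth, K_image: 3x3 pinhole intrinsic matrices of the two sensors.
    R, t: rotation (3x3) and translation (3,) from depth frame to image frame.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    valid = depth > 0
    z = depth[valid].astype(float)

    # Back-project each valid depth pixel to a 3-D point in the depth frame.
    pix = np.stack([u[valid], v[valid], np.ones_like(z)], axis=0)
    pts = (np.linalg.inv(K_depth) @ pix) * z

    # Move the points into the image sensor frame and project them.
    pts_img = R @ pts + np.asarray(t, dtype=float).reshape(3, 1)
    proj = K_image @ pts_img
    front = proj[2] > 0
    x = np.rint(proj[0, front] / proj[2, front]).astype(int)
    y = np.rint(proj[1, front] / proj[2, front]).astype(int)
    zc = pts_img[2, front]

    # Keep the nearest depth per target pixel; unfilled pixels stay 0 (gaps).
    warped = np.full(out_shape, np.inf)
    inside = (x >= 0) & (x < out_shape[1]) & (y >= 0) & (y < out_shape[0])
    np.minimum.at(warped, (y[inside], x[inside]), zc[inside])
    warped[np.isinf(warped)] = 0.0
    return warped
```

Pixels that receive no depth sample remain zero, which corresponds to the small gaps noted in the warped overlay of Grossmann's Fig. 1(d).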

(2) Response to Argument
The appellant’s arguments have been fully considered but they are not persuasive.

A.	Claims 1, 9, and 16 Are Not Rendered Obvious By Hoffman in view of Grossmann
R1.	The appellant argued (regarding claims 1, 9 and 16) on p. 7 para. 2 that “Independent claim 1 recites language to, inter alia, "receive input data selecting the virtual representation of the object." The Final Office Action issued November 17, 2020 ("Final Office Action") on pp. 8-9 alleges that Hoffman discloses "receive input data selecting the virtual representation of the object". Hoffman col. 13 lines 65-67 is cited to provide that a user may select an object 114 in FIGS. 4J-M and thereby instruct the robot to pick up the object.”
The examiner respectfully disagrees.  The examiner applied Hoffman to teach the feature “receive input data selecting the virtual representation of the object (e.g., the user 10 touches the object 114 twice; Hoffman: c.13 L.67 and c.14 L.1. The controller 400 calculates the resolved motion of the robot 200 to get to the object 114 from its current position and grab the object 114.  Resolved motion is the process of detailing a task (e.g., a movement of the robot 200) into discrete motions.  For example, when the user 10 manipulates a robot representation 130 to grab an object 114 located at a distance from the robot 200, the robot 200 executes the task while avoiding walls, objects 114, and falling down a cliff (e.g., stairs).  The resolved motion algorithms consider input from the sensor system 500 to evaluate how the robot 200 will move from a current robot location to the destination.  ...  Therefore, a user 10, looking at a robot 200 in motion, sees a robot 200 moving across a room seamlessly and effortlessly. Hoffman: c.14 L.2-18.  Therefore, object 114 is selected, as seen in Fig. 4J, and a resolved motion is calculated by which the robot 200 can move to and grab the object 114 from its current position.  Image information of object 114 and sensor system 500 input to the resolved motion algorithm are received to determine the continuous motion to the object 114)”.
It is obvious that the selected object 114 can be reached by the robot moving across a room seamlessly and effortlessly from the current robot position.  If there are obstacles, such as walls or stairs, between the object 114 and the current robot position and the obstacles cannot be avoided, the object 114 cannot be reached and grabbed, and thus cannot be selected.
Therefore, the selected object 114 is represented as a combination of image data of the object and an indication (the resolved motion measure) that the object can be reached and grabbed by the robot from its current position.  The combined data is taken as a virtual representation of the object 114.

R2.	The appellant argued on p. 8 para. 1 lines 1-10 that “claim 1 receives both visual and spatial data for the object. Specifically, claim 1 also recites "receive data from a visual detection sensor and a spatial detection sensor of the robot with respect to an object within an environment of the robot". Appellant notes, for example, that the gripper projection 116 in FIG. 4I, while itself an overlay, only relates to a location of where the robot's arm will be grabbing (i.e., an overlain cursor). Hoffman does not provide for any selectable overlain data for the object 114 generated by spatial data obtained about the object 114 itself. Thus, there is no virtual representation of the object to select. To interpret Hoffman as somehow receiving input data selecting the virtual representation of the object would stretch Hoffman beyond what it fairly teaches or suggests. For at least this reason, reversal of this rejection is respectfully requested.”
The examiner respectfully disagrees.  The examiner applied the reference of Hoffman to teach the feature “receive data (e.g., the user 10 touches the object 114 twice; Hoffman: c.13 L.67 and c.14 L.1 and Fig. 4J) from a visual detection sensor (e.g., The camera(s) 218, 219 may capture images and/or video of the robot environment for navigating the robot 200 and/or performing specialized tasks. Hoffman: c.9 L.28-30. The robot 200 may move the arm 250, 250a to position the head camera 262, 362 to provide a better view of the front of the robot 200 so the user 10 may better control the robot 200 due to better visibility of what is around the robot 200.  Hoffman: c.10 L.61-65. The gripper 270 includes a camera 272, the augmented image 121 includes a video feed from the gripper camera 272 displayed in a separate camera window 150.  The separate camera window 150 shows the video feedback as the gripper 270 is approaching the object 114.  The camera window 150 may display the projected target 116 in the video feed.  Hoffman: c.16 L.39-44) and a spatial detection sensor of the robot (e.g., The sensor system 500 may include one or more types of sensors supported by the robot body 210, which may include obstacle detection obstacle avoidance (ODOA) sensors, communication sensors, navigation sensors, etc. For example, these sensors may include, but are not limited to, proximity sensors, contact sensors, three-dimensional (3D) imaging/depth map sensors, a camera (e.g., visible light, infrared camera and/or stereo camera), sonar, radar, LIDAR (Light Detection And Ranging, which can entail optical remote sensing that measures properties of scattered light to find range and/or other information of a distant target), LADAR (Laser Detection and Ranging), etc. Hoffman: c.23 L.55-67) with respect to an object within an environment of the robot (e.g., The controller 400 calculates the resolved motion of the robot 200 to get to the object 114 from its current position and grab the object 114.  Resolved motion is the process of detailing a task (e.g., a movement of the robot 200) into discrete motions.  For example, when the user 10 manipulates a robot representation 130 to grab an object 114 located at a distance from the robot 200, the robot 200 executes the task while avoiding walls, objects 114, and falling down a cliff (e.g., stairs).  The resolved motion algorithms consider input from the sensor system 500 to evaluate how the robot 200 will move from a current robot location to the destination.  ...  Therefore, a user 10, looking at a robot 200 in motion, sees a robot 200 moving across a room seamlessly and effortlessly. Hoffman: c.14 L.2-18.  Therefore, object 114 is selected, as seen in Fig. 4J, and a resolved motion is calculated by which the robot 200 can move to and grab the object 114 from its current position)”.
Therefore, image data of object 114 (Fig. 4J of Hoffman) is received from the gripper camera, and a (resolved motion) distance obtained from input from the sensor system 500 is used to determine whether the robot can reach and grab the object 114 across the room seamlessly and effortlessly.
NOTE:  “The sensor system 500 may include one or more types of sensors supported by the robot body 210, which may include obstacle detection obstacle avoidance (ODOA) sensors, communication sensors, navigation sensors, etc. For example, these sensors may include, but are not limited to, proximity sensors, contact sensors, three-dimensional (3D) imaging/depth map sensors, a camera (e.g., visible light, infrared camera and/or stereo camera), sonar, radar, LIDAR (Light Detection And Ranging, which can entail optical remote sensing that measures properties of scattered light to find range and/or other information of a distant target), LADAR (Laser Detection and Ranging), etc.” (Hoffman: c.23 L.55-67).  
Therefore, the sensors of the sensor system 500 are taken as spatial sensors.  The combination of image data and resolved motion (distance calculated from the resolved motion algorithm) from sensor system 500 is taken as a virtual representation of object 114 (a representation of visual image and spatial information).
Alternatively, the feature is disclosed by the combined teaching of Hoffman and Grossmann.  “Referring to FIGS. 2A, 2B, and 4J, in some examples, where the gripper 270 includes a camera 272, the augmented image 121 includes a video feed from the gripper camera 272 displayed in a separate camera window 150.  The separate camera window 150 shows the video feedback as the gripper 270 is approaching the object 114.  The camera window 150 may display the projected target 116 in the video feed.” (Hoffman: c.16 L.38-44 and Fig. 4J).  “The teleoperation software 101 displays on the display 110 a gripper projection 116 of the gripper position on the floor and indicates a height of the gripper 270 from the floor 17.  Therefore, if the user 10 lowers the tongs 274a, 274b simultaneously by swiping his/her fingers towards the south from position B to position A in the first and second quadrants 110a, 110b, the robot 200 lowers its gripper 270 towards the gripper projection 116.” (Hoffman: c.15 L.56-63).  Therefore, the video in window 150 facilitates the grabbing of the object 114.
From the teaching of Grossmann, the window 150 can be a warped depth map overlaid on the image of object 114 (Grossmann: Fig. 1) so that it can be determined how the tongs 274a and 274b can grab the object 114 without colliding with the floor 17 (of Hoffman).  “Disclosed herein are methods, systems, and computer program products to warp a depth map into alignment with an image, where the image sensor (e.g., camera) responsible for the image and depth sensor responsible for an original depth map are separated in space. In an embodiment, the warping of the depth map may be started before the original depth map has been completely read. Moreover, data from the warped depth map may be made available to an application before the entire warped depth map has been completely generated. Such a method and system may improve the speed of the overall process and/or reduce memory requirements.” (Grossmann: [0021]).
Therefore, the image information and depth (spatial) information of an object are combined by overlaying the depth information over the image information of the object and this is taken as a virtual representation of the object.
Thus, the image sensor (camera) is used to obtain the image (video) and a depth sensor is used to obtain the depth map of the object 114 and the floor 17.
Therefore, in both scenarios, the camera is the visual detection sensor and the depth sensor (of Grossmann) and sensor system 500 (of Hoffman) are taken as the spatial detection sensor.
Regarding the feature “virtual representation of the object to select”, please see R1 above.

R3.	The appellant argued on p. 8 para. 2 and p. 9 para. 1 that “Claim 1 additionally recites language to, inter alia, "output instructions for the robot to interact with the object within the environment of the robot based upon the selection of the virtual representation of the object." The Final Office Action on pp. 9-10 relies upon similar portions of Hoffman to allegedly teach or suggest "output instructions for the robot to interact with the object within the environment of the robot based upon the selection of" and does not cite, although presumably relies upon, Grossman to allegedly teach or suggest "the virtual representation of the object". The Final Office Action on pp. 10-11 cites Grossman ¶¶ 0002-03 and 0023 for similar features in claim 1 to allegedly provide an overlay representation presentation. However, the combination of Hoffman and Grossman is itself deficient. Hoffman only provides for a user touch-selecting the visual representation of an object, and is silent regarding an overlay-based object selection. Moreover, even if Grossmann discloses a type of overlay, it similarly fails to teach or suggest overlay-based object selection.”
The examiner respectfully disagrees.  The examiner applied the reference of Hoffman to teach the feature “output instructions for the robot to interact with the object within the environment of the robot based upon the selection (The user 10 may command the robot 200 by manipulating the robot representation 130 and/or other inputs (e.g., touch gestures) on the OCU 100 and then view a preview (e.g., a simulation) of what the robot 200 should do, if the command(s) is/are executed on the robot 200.  Hoffman: c.17 L.66-67 and c.18 L.1-4.  The user 10 touches the object 114 twice, indicating that the robot 200 should go to the object 114 to grab it. The controller 400 calculates the resolved motion of the robot 200 to get to the object 114 from its current position and grab the object 114.  Resolved motion is the process of detailing a task (e.g., a movement of the robot 200) into discrete motions.  Hoffman: c.13 L.67 and c.14 L.1-6. The resolved motion algorithm produces a continuous motion that connects a starting point of the robot 200 to a destination point of the robot 200 while avoiding collisions with objects 114.  Therefore, a user 10, looking at a robot 200 in motion, sees a robot 200 moving across a room seamlessly and effortlessly. Hoffman: c.14 L.13-18.  It is obvious that when the user 10 grabs the object to move from the starting point to the destination point of the robot 200, the information is sent to the robot such that the controller can calculate the resolved motion of the robot 200 to move the object 114 from the starting point to the destination point) of the virtual representation of the object (e.g., the image of object 114 clicked on the screen of Fig. 4J and a resolved motion that enables the robot to move from its current position to object 114 seamlessly and effortlessly; Hoffman: c.14 L.1-18)”.
Therefore, input from the sensor system 500 to the resolved motion algorithm is used to calculate the resolved motion, producing a continuous motion that connects a starting point of the robot 200 to a destination point of the robot while avoiding collisions with objects 114.
Regarding the feature “overlay-based object selection”, please see R2 – Alternative (the combined teaching of Hoffman and Grossmann), where the virtual representation is taken as a depth (spatial) image overlaid on image data of an object.

B.	Claim 8 Is Not Rendered Obvious By Hoffman in view of Grossmann
R4.	The appellant argued (regarding claim 8) on p. 10 para. 1 lines 1-9 that “Hoffman col. 18 lines 19-21 explains that the "preview" (i.e., rendered flythrough) view 192a is rendered based upon a "robot map" 184, layout map 182, stored image data, or telepresence software 101. However, each of these options appears to be based upon pre-existing data previously obtained, rather than the virtual representation obtained by the robot's own spatial sensor as recited in claim 8, based upon "a virtual representation of the object by visually imposing the data received from the spatial detection sensor upon the data received from the visual detection sensor" as recited in claim 1 from which it depends. Thus, Hoffman does not teach or fairly suggest that the preview view 192a is based upon spatial data obtained by the robot 200.”
The examiner respectfully disagrees.  The examiner applied the reference of Hoffman to teach the feature “receive mapping data from the robot depicting the virtual representation of the environment of the robot obtained by the robot's own spatial sensor (e.g., Referring to FIGS. 5A-6B, in some implementations, the teleoperation software application 101 displays a hybrid three-dimensional image map 180 (hybrid map) on the display 110 as part of the augmented image 121.  The hybrid map 180 may be a combination of the remote view 120 displayed on the display 110 and a layout map 182, such as the 2-D, top-down map displayed in the map view 184.  In some examples, the layout map 182 is a map stored in memory on the controller 400, the cloud 22, or downloaded to the robot 200 via the network 20.  Hoffman: c.17 L.19-28.  FIG. 5A illustrates a remote video feed view 120 that a user 10 may see when the robot 200 is positioned in a hallway.  FIG. 5B illustrates an augmented hybrid map 180 displayable on the display 110, in which the layout map 182 is partially overlaid and modified to fit the remote view 120, indicating the room numbers and/or room types of the areas in the field of view of the robot 200.  The user 10 may maneuver and manipulate the robot 200 based on the hybrid map 180, which may also be a part of the augmented image 121.  Hoffman: c.17 L.29-37 and Figs. 5A-6B.  Therefore, OCU 100 displays the three-dimensional image map 180 received from the robot 200.  As discussed in R1 above, the object 114 is in one of the five rooms, Room 1 to Room 5, and its position is obtained from layout map 182.  Once the location of object 114, for example Room 3, is obtained, the robot calculates the resolved motion using input from the sensor system 500 with reference to the map to avoid obstacles such as walls, stairs, etc.).”
The robot in Figs. 5A-6B is in a hallway environment, and the sensor system 500 continues to provide data of Room 3 in order to update the resolved motion and facilitate the navigation of the robot from the position of icon 16 to that of icon 16a and on to Room 3.
Therefore, the virtual representation of the environment (Room 3) is obtained from both a camera (image of Room 3) and sensor system 500 (input data to obtain resolved motion for navigation).
Regarding the features of “a virtual representation of the object by visually imposing the data received from the spatial detection sensor upon the data received from the visual detection sensor” as recited in claim 1, please see R1 and R2 above for details.

R5.	The appellant argued on p. 11 para. 1 lines 1-8 that “As explained col. 18 lines 48-54, a virtual robot icon 16a may be used to show a projected camera field of view 322a. As can clearly be seen in FIG. 6F, the current camera field of view 322 relates to the robot icon 16, whereas the virtual robot icon 16a and the related projected camera field of view 322a is not being obtained by the robot. Thus the robot 200 (represented by robot icon 16) is not utilizing its camera and its spatial sensor "output, within the virtual representation of the environment of the robot, an indication of the action that the robot will perform" in which the virtual representation is obtained from "a spatial detection sensor of the robot" as recited in claim 1.”
The examiner disagreed respectfully.  The examiner applied the reference of Hoffman to teach the feature “output, within the virtual representation of the environment of the robot, an indication of the action that the robot will perform (e.g., In some implementations, as the user 10 drives the robot 200 along a corridor, the user 10 may invoke the preview mode by selecting the preview button 190 on the display 110.  For example, at a location 50 feet away from a turn in the corridor, the user 10 may invoke the preview button 190, causing the generation of a preview view 192a and stopping further movement of the robot 200 along the corridor.  The user 10 may continue, however, to virtually move the robot 200 in a preview mode.  The OCU display 110 may display the preview view 192a (e.g., a 3-D model) of the same corridor at the same position. Hoffman: c.18 L.55-65 and Fig. 6E, where the camera view 150 continues to display the remote view 120 from the robot camera 218, 219, 262, 362 in a picture-in-picture window.  Furthermore, the resolved motion algorithm produces a continuous motion that connects a starting point of the robot 200 (position 16) to a destination point of the robot 200 (position 16a) while avoiding collisions with objects 114.  Therefore, a user 10, looking at a robot 200 in motion, sees a robot 200 moving across a room seamlessly and effortlessly.  Hoffman: c.14 L.13-18.  As discussed in R4 above, as the robot 200 is navigating from icon 16 to icon 16a to a destination of Room 3, the current camera field of view changes to a view of Room 3 where the object 114 is located and the robot 200 will perform grabbing for object 114)”
Therefore, resolved motion is provided for navigating to Room 3 (environment) as well as for navigating to object 114 (object), provided that the input data of the sensor system 500 for Room 3 and object 114 are supplied to the resolved motion algorithm.

C.	Claim 10 Is Not Rendered Obvious By Hoffman in view of Grossmann
R6.	The appellant remarked (regarding claim 10) on p. 12 paras. 1-2 and p. 13 para. 1 L.1-3 that “Dependent claim 10 recites, inter alia, "the display device is configured to output a point cloud representation of the environment of the robot." The Final Office Action does not rely upon Grossmann and on p. 18 relies upon Hoffman col. 17 lines 19-28 and FIGS. 5A-6B to allegedly disclose a hybrid three-dimensional image map 180, as shown here in FIG. 5B.  Hoffman col. 17 lines 19-28 explains that the hybrid three-dimensional image map 180 depicted in FIG. 5B may utilize a combination of a remote view 120 (i.e., camera view) and a layout map 182 based on a map view 184, which is shown here in FIG. 6B.  The layout map 182 and/or map view 184 appears to be based upon pre-existing data that the robot icon 16 has not yet encountered, rather than a virtual representation obtained by the robot's own spatial sensor as recited in claim 10”.
As discussed in R4 above, the “virtual representation of the environment” is disclosed as follows: when the robot 200 is in the hallway in Figs. 5A-6B and navigates from icon 16 through icon 16a to Room 3, the combined information of Room 3 is the view of Room 3 and the resolved motion of Room 3.
Hoffman further teaches that “The sensor system 500 includes an array of proximity sensors, one or more cameras 218, 219, 262 (e.g., stereo cameras, visible light camera, infrared camera, thermography, etc.), and/or one or more 3-D imaging sensors (e.g., volumetric point cloud imaging device) in communication with the controller 400 and arranged in one or more zones or portions of the robot 200 for detecting any nearby or intruding obstacles.” (Hoffman: c.24 L.36-43).  As the sensor system 500 provides point cloud data to the resolved motion algorithm, point cloud resolved motion information of Room 3 is provided to robot 200 to facilitate its navigation.  Hence, the “virtual representation of the environment” becomes “point cloud representation of the environment”.

R7.	The appellant argued (regarding claim 10) on p. 13 para. 1 that “The layout map 182 and/or map view 184 appears to be based upon pre-existing data that the robot icon 16 has not yet encountered, rather than a virtual representation obtained by the robot's own spatial sensor as recited in claim 10, based upon "a virtual representation of the object by visually imposing spatial data received from the [robot's] second sensor upon visual data received from the first sensor" as recited in claim 9 from which it depends. Thus, Hoffman does not teach or fairly suggest that the hybrid three-dimensional image map 180 is based upon spatial data obtained by the robot 200. Accordingly, Hoffman fails to teach or fairly suggest the features recited in claim 10. Since Hoffmann is solely relied upon to allegedly disclose these claim features, the combination of Hoffmann and Grossmann is necessarily deficient as well.”
The examiner respectfully disagrees.  The feature "a virtual representation of the object by visually imposing spatial data received from the [robot's] second sensor upon visual data received from the first sensor" recited in claim 9 is similar to the feature “a virtual representation of the object by visually imposing the data received from the spatial detection sensor upon the data received from the visual detection sensor” recited in claim 1; please see R1 and R2 above for details on the teaching of these features, apart from the two sensors.  As can be seen from R2, the camera is the first sensor and the sensor system 500 is the second sensor.

D.	Claim 11 Is Not Rendered Obvious By Hoffman in view of Grossmann
R8.	The appellant remarked (regarding claim 11) on p. 14 para. 1 that “Dependent claim 11 recites, inter alia, "wherein the display device is further configured to generate an augmented reality view on the display device by utilizing the point cloud representation to overlay point cloud data and a label for an object displayed on the display device." The Final Office Action does not rely upon Grossmann and on p. 19 relies upon Hoffman col. 17 lines 19-28 and FIGS. 5A-6B to allegedly disclose a hybrid three-dimensional image map 180, again shown here in FIG. 5B.”
As discussed in R4, the “virtual representation of the environment” is disclosed as that when the robot 200 in the hallway in Figs. 5A-6B, the navigation from icon 16 through 
Hoffman further teaches that “The sensor system 500 includes an array of proximity sensors, one or more cameras 218, 219, 262 (e.g., stereo cameras, visible light camera, infrared camera, thermography, etc.), and/or one or more 3-D imaging sensors (e.g., volumetric point cloud imaging device) in communication with the controller 400 and arranged in one or more zones or portions of the robot 200 for detecting any nearby or intruding obstacles.” (Hoffman: c.24 L.36-43).  As the sensor system 500 provides point cloud data to the resolved motion algorithm, point cloud resolved motion information of Room 3 is provided to the robot 200 to facilitate its navigation.  Hence, the combined information of Room 3 is the image view of Room 3 plus the resolved motion point clouds, and this is interpreted as the “point cloud representation of the environment”.  Since the combined information of Room 3 adds the resolved motion point cloud to the image view of Room 3, it serves, in other words, to “generate an augmented reality view on the display device by utilizing the point cloud representation to overlay point cloud data and a label for an object displayed on the display device”.

R9.	The appellant argued (regarding claim 11) on p. 14 para. 2 and p. 15 para. 1 that “The layout map 182 and/or map view 184 appears to be based upon pre-existing data that the robot icon 16 has not yet encountered, rather than utilizing the point cloud representation to overlay point cloud data as recited in claim 11, based upon "a virtual representation of the object by visually imposing spatial data received from the [robot's] second sensor upon visual data received from the first sensor'' as recited in claim 9 from which it ultimately depends. Thus, Hoffman does not teach or fairly suggest that the hybrid three-dimensional image map 180 is based upon point cloud data obtained by the robot 200. Accordingly, Hoffman fails to teach or fairly suggest the features recited in claim 11. Since Hoffmann is solely relied upon to allegedly disclose these claim features, the combination of Hoffmann and Grossmann is necessarily deficient as well.”
The examiner respectfully disagrees.  The features of "a virtual representation of the object by visually imposing spatial data received from the [robot's] second sensor upon visual data received from the first sensor" recited in claim 9 are the same as the features of “a virtual representation of the object by visually imposing the data received from the spatial detection sensor upon the data received from the visual detection sensor” recited in claim 1.  Please see R1 and R2 above for details on the teaching of these features.  It can be seen from R2 that the camera is the first sensor and the sensor system 500 is the second sensor.

For the above reasons, it is believed that the rejections should be sustained.
Respectfully submitted,
/SING-WAI WU/Primary Examiner, Art Unit 2611
Conferees:
/KEE M TUNG/Supervisory Patent Examiner, Art Unit 2611
/MARK K ZIMMERMAN/Supervisory Patent Examiner, Art Unit 2619
Requirement to pay appeal forwarding fee.  In order to avoid dismissal of the instant appeal in any application or ex parte reexamination proceeding, 37 CFR 41.45 requires payment of an appeal forwarding fee within the time permitted by 37 CFR 41.45(a), unless appellant had timely paid the fee for filing a brief required by 37 CFR 41.20(b) in effect on March 18, 2013.