Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claim Objections
Claims 7 and 18 are objected to because of the following informalities:
Claim 7 recites “computation” in line 1. This appears to be a misspelling of “computing”.
Claim 18 recites “comprising,” instead of “comprising:” in line 1.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
1.	Claims 1, 8, 9, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Smith et al., U.S. Patent No. 9088787 (“Smith”) in view of Gauglitz et al., U.S. Patent Application Publication No. 20160358383 (“Gauglitz”).
Regarding independent claim 1, Smith teaches a remote assistance system (Fig. 5), comprising:
a wearable visual enhancement device at a first location (col.6, lines 47-51 “FIG. 5 illustrates a block diagram of an HWD remote assistant system for network-based collaboration, training and/or maintenance in accordance with an embodiment. At a first location, User A 40 uses a wearable/mobile computing device 20.”) configured to:
scan a scene in a real world in a forward field-of-view of a first user (col.6, lines 47-53 “FIG. 5 illustrates a block diagram of an HWD remote assistant system for network-based collaboration, training and/or maintenance in accordance with an embodiment. At a first location, User A 40 uses a wearable/mobile computing device 20. With the device 20, User A scans an object 35 or scene using a 3D scanning device 22 and collects the scene data”), 
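generate sensor data associated with one or more objects in the scene (col.4, lines 53-67 “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model. The depth/RGB transmission block 34 may communicate sensed depth/RGB data to the 3D modeling simulation engine 32. The depth/RGB transmission block 34 may include data associated with a 3D model of a real-world view of a scene through the lens of the HWD device 100 (as shown in FIG. 4). The scene may include at least one object. By way of a non-limiting example, a depth/RGB sensor 50 (illustrated in FIG. 3A) may include an ASUS.RTM. Xtion sensor.”), and transmit the sensor data (col.6, lines 47-59 “FIG. 5 illustrates a block diagram of an HWD remote assistant system for network-based collaboration, training and/or maintenance in accordance with an embodiment. At a first location, User A 40 uses a wearable/mobile computing device 20. With the device 20, User A scans an object 35 or scene using a 3D scanning device 22 and collects the scene data. The captured 3D model may be communicated to a second location, such as over a network 45, to a computing device 60, or processor. In an embodiment, the captured 3D model of the object 35 may be communicated to a user B 50. The captured 3D model 65 may be received by the computing device 60 and displayed to the user B 50 via the display of the computing device 60”); and 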
a computing system at a second location (col.6, lines 53-56 “The captured 3D model may be communicated to a second location, such as over a network 45, to a computing device 60, or processor. In an embodiment, the captured 3D model of the object 35 may be communicated to a user B 50”) configured to: 
receive the sensor data (col.6, lines 53-59 “The captured 3D model may be communicated to a second location, such as over a network 45, to a computing device 60, or processor. In an embodiment, the captured 3D model of the object 35 may be communicated to a user B 50. The captured 3D model 65 may be received by the computing device 60 and displayed to the user B 50 via the display of the computing device 60.”),
a 3D scene including 3D models of the one or more objects (col.6, lines 53-59 “The captured 3D model may be communicated to a second location, such as over a network 45, to a computing device 60, or processor. In an embodiment, the captured 3D model of the object 35 may be communicated to a user B 50. The captured 3D model 65 may be received by the computing device 60 and displayed to the user B 50 via the display of the computing device 60.”), 
receive, via input by a second user, a mark associated with one of the 3D models, and transmit information that identifies the mark to the wearable visual enhancement device (col.6, lines 59-67-col.7, lines 1-5 “The computing device 60 may be configured to mark or annotate the received 3D model. By way of non-limiting example, User B 50 may use a mouse, keyboard or other user interface to enter marking(s) or annotation(s) directly on the captured 3D model 65. For illustrative purposes, a "1" has been marked on a surface of the object 35. The computing device 60 may allow the user viewpoint of the 3D model to be manipulated (e.g., rotated or moved). The computing device 60 may allow the viewpoint of the 3D model to be Changed. By way of non-limiting example, the 3D model may appear smaller or larger when the viewpoint is changed. The marked or annotated 3D model may be communicated (Step 625) from the second location back to the first location through network 45.”), wherein the wearable visual enhancement device is further configured to display the mark adjacent to the object corresponding to the one of the 3D models (col.7, lines 6-23 “The received marked or annotated 3D model may be received by the wearable/mobile computing device 20 and displayed on display 23. The mark(s) or annotation(s) may be fixed to the location on the image entered by the user B 50. By way of non-limiting example, the mark(s) or annotation(s) add a layer at a location entered on the image. Furthermore, the wearable/mobile computing device 20 may be configured to mark or annotate the 3D model, via marking/annotation module 27, before sending the 3D model to the User B 50. By way of non-limiting example, User A 40, using a user interface 29, enters a mark or annotation for the user B 50. The marking/annotation module 27 may allow the user to enter textual mark(s) over the 3D model. The marking/annotation module 27 may allow the user A 40 to enter textual annotations. The user B 50 may enter free-form markings on the surface of the objects, text relative to a point on the surface of an object, and pre-defined shapes relative to a point on the surface of an object”). Smith is understood to be silent on the remaining limitations of claim 1.
In the same field of endeavor, Gauglitz teaches a computing system at a second location (¶0041 “FIG. 2 shows an overview of the system architecture 200, ... the remote user's system may be running on a commodity PC with Ubuntu. Since device hardware (camera and display), network communication, real-time processing, and background tasks are involved, both systems employ a host of components and threads.”) configured to: generate a 3D scene including 3D models of the one or more objects (¶0050 “A 3D surface model is constructed on the fly from the live video stream and from associated camera poses. Keyframes were selected based on a set of heuristics (good tracking quality, low device movement, minimum time interval & translational distance between keyframes), then detect and describe features in the new frame using SIFT. Four closest existing keyframes were chosen and matched against their features (one frame at a time) via an approximate nearest neighbor algorithm and collect matches that satisfy the epipolar constraint (which is known due to the received camera poses) within some tolerance as tentative 3D points. If a feature has previously been matched to features from other frames, we check for mutual epipolar consistency of all observations and merge them into a single 3D point if possible; otherwise, the two 3D points remain as competing hypotheses”; ¶0068 “The renderer renders the scene using the 3D model, the continually updated keyframes, the incoming live camera frame (including live camera pose), the virtual camera pose, and the annotations”), receive, via input by a second user, a mark associated with one of the 3D models (¶0066 “The remote user sets a marker by simply left-clicking into the view (irrespective if "live" or "decoupled"). The depth of the marker is derived from the 3D model, presuming that the user wants to mark things on physical surfaces rather than in mid-air.”), and transmit information that identifies the mark to the wearable visual enhancement device (¶0064 “In addition to being able to control the viewpoint, the remote user can set and remove virtual annotations. Annotations are saved in 3D world coordinates, are shared with the local user's mobile device via the network, and immediately appear in all views of the world correctly anchored to their 3D world position (cf. FIGS. 1 and 3).”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith with the generation of a 3D scene by a computing system at a second location as taught by Gauglitz, because this modification would enable the remote user to navigate the scene and to create annotations in it that are then sent back and visualized to the local user in AR (¶0118 of Gauglitz).
Thus, the combination of Smith and Gauglitz teaches a remote assistance system, comprising: a wearable visual enhancement device at a first location configured to: scan a scene in a real world in a forward field-of-view of a first user, generate sensor data associated with one or more objects in the scene, and transmit the sensor data; and a computing system at a second location configured to: receive the sensor data, generate a 3D scene including 3D models of the one or more objects, receive, via input by a second user, a mark associated with one of the 3D models, and transmit information that identifies the mark to the wearable visual enhancement device, wherein the wearable visual enhancement device is further configured to display the mark adjacent to the object corresponding to the one of the 3D models.
Regarding claim 8, Smith and Gauglitz teach the remote assistance system of claim 1, wherein the computing system is further configured to adjust a virtual perception of the second user in the 3D scene in response to users inputs from the second user (col.6, lines 64-67-col.7, lines 1-5 of Smith “For illustrative purposes, a "1" has been marked on a surface of the object 35. The computing device 60 may allow the user viewpoint of the 3D model to be manipulated (e.g., rotated or moved). The computing device 60 may allow the viewpoint of the 3D model to be Changed. By way of non-limiting example, the 3D model may appear smaller or larger when the viewpoint is changed. The marked or annotated 3D model may be communicated (Step 625) from the second location back to the first location through network 45.”; ¶0060-0061 of Gauglitz “The user can also zoom into and out of the view with the scroll wheel. Zooming is implemented as a change of the virtual camera's field of view (rather than dollying) to avoid having to deal with corrections for parallax or occlusions from objects behind the original camera position. [0061] The present subject matter provides click to change viewpoint capabilities. When the user right-clicks into the view, we compute the 3D hit point, and subsequently find the camera whose optical axis is closest to this point (which may be the current camera as well). This camera is transitioned and yaw and pitch adapted such that the new view centers on the clicked-upon point. This allows the user to quickly center on a nearby point as well as quickly travel to a faraway point with a single click.”). In addition, the same motivation is used as the rejection for claim 1.
Regarding independent claim 9, Smith teaches a method for remote assistance, comprising: 
scanning, by a wearable visual enhancement device at a first location, a scene in a real world in a forward field-of-view of a first user (col.6, lines 47-53 “FIG. 5 illustrates a block diagram of an HWD remote assistant system for network-based collaboration, training and/or maintenance in accordance with an embodiment. At a first location, User A 40 uses a wearable/mobile computing device 20. With the device 20, User A scans an object 35 or scene using a 3D scanning device 22 and collects the scene data”); 
generating, by the wearable visual enhancement device, sensor data associated with one or more objects in the scene (col.4, lines 53-67 “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model. The depth/RGB transmission block 34 may communicate sensed depth/RGB data to the 3D modeling simulation engine 32. The depth/RGB transmission block 34 may include data associated with a 3D model of a real-world view of a scene through the lens of the HWD device 100 (as shown in FIG. 4). The scene may include at least one object. By way of a non-limiting example, a depth/RGB sensor 50 (illustrated in FIG. 3A) may include an ASUS.RTM. Xtion sensor.”; col.7, lines 23-28 “Additionally, as either the HWD device 100 or the object are moved, the computing system may maintain alignment of the mark or annotation at the fixed location on the object. Though not illustrated, alignment may be maintained with an alignment system or device, such as but not limited to inertial measurement unit (IMU).)”);
generating a 3D scene including 3D models of the one or more objects (col.6, lines 53-59 “The captured 3D model may be communicated to a second location, such as over a network 45, to a computing device 60, or processor. In an embodiment, the captured 3D model of the object 35 may be communicated to a user B 50. The captured 3D model 65 may be received by the computing device 60 and displayed to the user B 50 via the display of the computing device 60.”);
receiving, via input to the computing system by a second user, a mark associated with one of the 3D models (col.6, lines 59-67-col.7, lines 1-5 “The computing device 60 may be configured to mark or annotate the received 3D model. By way of non-limiting example, User B 50 may use a mouse, keyboard or other user interface to enter marking(s) or annotation(s) directly on the captured 3D model 65. For illustrative purposes, a "1" has been marked on a surface of the object 35. The computing device 60 may allow the user viewpoint of the 3D model to be manipulated (e.g., rotated or moved). The computing device 60 may allow the viewpoint of the 3D model to be Changed. By way of non-limiting example, the 3D model may appear smaller or larger when the viewpoint is changed. The marked or annotated 3D model may be communicated (Step 625) from the second location back to the first location through network 45.”); 
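transmitting, by the computing system, information that identifies the mark to the wearable visual enhancement device (col.6, lines 59-67-col.7, lines 1-5 “The computing device 60 may be configured to mark or annotate the received 3D model. By way of non-limiting example, User B 50 may use a mouse, keyboard or other user interface to enter marking(s) or annotation(s) directly on the captured 3D model 65. For illustrative purposes, a "1" has been marked on a surface of the object 35. The computing device 60 may allow the user viewpoint of the 3D model to be manipulated (e.g., rotated or moved). The computing device 60 may allow the viewpoint of the 3D model to be Changed. By way of non-limiting example, the 3D model may appear smaller or larger when the viewpoint is changed. The marked or annotated 3D model may be communicated (Step 625) from the second location back to the first location through network 45.”); and displaying, by the wearable visual enhancement device, the mark adjacent to the object corresponding to the one of the 3D models (col.7, lines 6-23 “The received marked or annotated 3D model may be received by the wearable/mobile computing device 20 and displayed on display 23. The mark(s) or annotation(s) may be fixed to the location on the image entered by the user B 50. By way of non-limiting example, the mark(s) or annotation(s) add a layer at a location entered on the image. Furthermore, the wearable/mobile computing device 20 may be configured to mark or annotate the 3D model, via marking/annotation module 27, before sending the 3D model to the User B 50. By way of non-limiting example, User A 40, using a user interface 29, enters a mark or annotation for the user B 50. The marking/annotation module 27 may allow the user to enter textual mark(s) over the 3D model. The marking/annotation module 27 may allow the user A 40 to enter textual annotations. The user B 50 may enter free-form markings on the surface of the objects, text relative to a point on the surface of an object, and pre-defined shapes relative to a point on the surface of an object”). Smith is understood to be silent on the remaining limitations of claim 9.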
In the same field of endeavor, Gauglitz teaches generating, by a computing system at a second location, a 3D scene including 3D models of the one or more objects (¶0050 “A 3D surface model is constructed on the fly from the live video stream and from associated camera poses. Keyframes were selected based on a set of heuristics (good tracking quality, low device movement, minimum time interval & translational distance between keyframes), then detect and describe features in the new frame using SIFT. Four closest existing keyframes were chosen and matched against their features (one frame at a time) via an approximate nearest neighbor algorithm and collect matches that satisfy the epipolar constraint (which is known due to the received camera poses) within some tolerance as tentative 3D points. If a feature has previously been matched to features from other frames, we check for mutual epipolar consistency of all observations and merge them into a single 3D point if possible; otherwise, the two 3D points remain as competing hypotheses”; ¶0068 “The renderer renders the scene using the 3D model, the continually updated keyframes, the incoming live camera frame (including live camera pose), the virtual camera pose, and the annotations”), receiving, via input to the computing system by a second user, a mark associated with one of the 3D models (¶0066 “The remote user sets a marker by simply left-clicking into the view (irrespective if "live" or "decoupled"). The depth of the marker is derived from the 3D model, presuming that the user wants to mark things on physical surfaces rather than in mid-air. Pressing the space bar removes”), transmitting, by the computing system, information that identifies the mark to the wearable visual enhancement device (¶0064 “In addition to being able to control the viewpoint, the remote user can set and remove virtual annotations. Annotations are saved in 3D world coordinates, are shared with the local user's mobile device via the network, and immediately appear in all views of the world correctly anchored to their 3D world position (cf. FIGS. 1 and 3).”). In addition, the same motivation is used as the rejection for claim 1.
Thus, the combination of Smith and Gauglitz teaches a method for remote assistance, comprising: scanning, by a wearable visual enhancement device at a first location, a scene in a real world in a forward field-of-view of a first user; generating, by the wearable visual enhancement device, sensor data associated with one or more objects in the scene; generating, by a computing system at a second location, a 3D scene including 3D models of the one or more objects; receiving, via input to the computing system by a second user, a mark associated with one of the 3D models; transmitting, by the computing system, information that identifies the mark to the wearable visual enhancement device; and displaying, by the wearable visual enhancement device, the mark adjacent to the object corresponding to the one of the 3D models.
Regarding claim 16, Smith and Gauglitz teach the method of claim 9, further comprising adjusting, by the computing system, a virtual perception of the second user in the 3D scene in response to users inputs from the second user (col.6, lines 64-67-col.7, lines 1-5 of Smith “For illustrative purposes, a "1" has been marked on a surface of the object 35. The computing device 60 may allow the user viewpoint of the 3D model to be manipulated (e.g., rotated or moved). The computing device 60 may allow the viewpoint of the 3D model to be Changed. By way of non-limiting example, the 3D model may appear smaller or larger when the viewpoint is changed. The marked or annotated 3D model may be communicated (Step 625) from the second location back to the first location through network 45.”).
2.	Claims 2-3 and 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Smith et al., U.S. Patent No. 9088787 (“Smith”) in view of Gauglitz et al., U.S. Patent Application Publication No. 20160358383 (“Gauglitz”), further in view of Naimark, U.S. Patent Application Publication No. 2013/0218461 (“Naimark”).
Regarding claim 2, Smith and Gauglitz teach the remote assistance system of claim 1, wherein the wearable visual enhancement device includes a camera configured to collect color information of a color image of the scene, a depth camera configured to collect distance information of a depth image of the scene (col. 4, lines 52-59 of Smith “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model”; col.5, lines 47-56 “The 3D-MSE may include, by way of non-limiting example, KinFu by Microsoft.RTM. which is an open-source application configured to provide 3D visualization and interaction. The 3D-MSE may process live depth data from a camera/sensor and create a Point Cloud and 3D models for real-time visualization and interaction. A graphics processing unit (GPU) 62, such as by way of non-limiting example a CUDA graphics processing unit, may be used to execute the open-source application. A visualization tool kit 63 may be provided.”), and an inertial measurement unit (IMU) configured to collect velocity of the wearable visual enhancement device (col.7, lines 23-28 of Smith “Additionally, as either the HWD device 100 or the object are moved, the computing system may maintain alignment of the mark or annotation at the fixed location on the object. Though not illustrated, alignment may be maintained with an alignment system or device, such as but not limited to inertial measurement unit (IMU).)”). Both Gauglitz and Smith are understood to be silent on the remaining limitations of claim 2.
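In the same field of endeavor, Naimark teaches an inertial measurement unit (IMU) configured to collect acceleration and angular velocity of the device (¶0018 “The inertial motion unit (IMU) 113 is a sensing system composed of several inertial sensors which includes an accelerometer, gyroscope, and magnetometers. In other embodiments, additional sensing systems are used which also provide information about movement of the mobile device 102 in space. The IMU 113 provides inertial motion parameters to the software 111. The IMU 113 is rigidly attached to the mobile device 102 and thereby provides an indication of the movement of the entire system and is used to determine the pose of the system relative to the real world objects 103A viewed by the camera 112. The inertial parameters provided by the IMU 113 include linear acceleration, angular velocity and gyroscopic orientation with respect to the ground. In one embodiment the sensors include at least 3 accelerometers and 3 gyroscopes mounted orthogonally. As such, the sensors provide six degrees of freedom--3 translation-related values and 3 rotation-related values”).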
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith, as modified by the generation of a 3D scene by a computing system at a second location of Gauglitz, to use an inertial motion unit (IMU) that is a sensing system composed of several inertial sensors including an accelerometer, gyroscope, and magnetometers, as taught by Naimark, because this modification would provide six degrees of freedom (3 translation-related values and 3 rotation-related values) and determine the pose of the system relative to the real world objects viewed by the camera (¶0018 of Naimark).
Thus, the combination of Smith, Gauglitz and Naimark teaches wherein the wearable visual enhancement device includes a camera configured to collect color information of a color image of the scene, a depth camera configured to collect distance information of a depth image of the scene, and an inertial measurement unit (IMU) configured to collect acceleration and angular velocity of the wearable visual enhancement device.
Regarding claim 3, Smith, Gauglitz and Naimark teach the remote assistance system of claim 2, wherein the wearable visual enhancement device includes a tracker configured to generate degree of freedom (DoF) information at least partially based on the acceleration and angular velocity (col.7, lines 23-28 of Smith “Additionally, as either the HWD device 100 or the object are moved, the computing system may maintain alignment of the mark or annotation at the fixed location on the object. Though not illustrated, alignment may be maintained with an alignment system or device, such as but not limited to inertial measurement unit (IMU).)”; ¶0018 of Naimark “The inertial motion unit (IMU) 113 is a sensing system composed of several inertial sensors which includes an accelerometer, gyroscope, and magnetometers. In other embodiments, additional sensing systems are used which also provide information about movement of the mobile device 102 in space. The IMU 113 provides inertial motion parameters to the software 111. The IMU 113 is rigidly attached to the mobile device 102 and thereby provides an indication of the movement of the entire system and is used to determine the pose of the system relative to the real world objects 103A viewed by the camera 112. The inertial parameters provided by the IMU 113 include linear acceleration, angular velocity and gyroscopic orientation with respect to the ground. In one embodiment the sensors include at least 3 accelerometers and 3 gyroscopes mounted orthogonally. As such, the sensors provide six degrees of freedom--3 translation-related values and 3 rotation-related values”). In addition, the same motivation is used as the rejection for claim 2.
Regarding claim 10, Smith and Gauglitz teach the method of claim 9, further comprising: collecting, by a camera of the wearable visual enhancement device, color information of a color image of the scene; collecting, by a depth camera of the wearable visual enhancement device, distance information of a depth image of the scene (col. 4, lines 52-59 of Smith “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model”; col.5, lines 47-56 “The 3D-MSE may include, by way of non-limiting example, KinFu by Microsoft.RTM. which is an open-source application configured to provide 3D visualization and interaction. The 3D-MSE may process live depth data from a camera/sensor and create a Point Cloud and 3D models for real-time visualization and interaction. A graphics processing unit (GPU) 62, such as by way of non-limiting example a CUDA graphics processing unit, may be used to execute the open-source application. A visualization tool kit 63 may be provided.”); and collecting, by an inertial measurement unit (IMU), velocity of the wearable visual enhancement device (col.7, lines 23-28 “Additionally, as either the HWD device 100 or the object are moved, the computing system may maintain alignment of the mark or annotation at the fixed location on the object. Though not illustrated, alignment may be maintained with an alignment system or device, such as but not limited to inertial measurement unit (IMU).)”). In addition, the same motivation is used as the rejection for claim 2. Both Smith and Gauglitz are understood to be silent on the remaining limitations of claim 10.
In the same field of endeavor, Naimark teaches an inertial measurement unit (IMU) configured to collect acceleration and angular velocity of the device (¶0018 “The inertial motion unit (IMU) 113 is a sensing system composed of several inertial sensors which includes an accelerometer, gyroscope, and magnetometers. In other embodiments, additional sensing systems are used which also provide information about movement of the mobile device 102 in space. The IMU 113 provides inertial motion parameters to the software 111. The IMU 113 is rigidly attached to the mobile device 102 and thereby provides an indication of the movement of the entire system and is used to determine the pose of the system relative to the real world objects 103A viewed by the camera 112. The inertial parameters provided by the IMU 113 include linear acceleration, angular velocity and gyroscopic orientation with respect to the ground. In one embodiment the sensors include at least 3 accelerometers and 3 gyroscopes mounted orthogonally. As such, the sensors provide six degrees of freedom--3 translation-related values and 3 rotation-related values”). In addition, the same motivation is used as the rejection for claim 2.
Thus, the combination of Smith, Gauglitz and Naimark teaches further comprising: collecting, by a camera of the wearable visual enhancement device, color information of a color image of the scene; collecting, by a depth camera of the wearable visual enhancement device, distance information of a depth image of the scene; and collecting, by an inertial measurement unit (IMU), acceleration and angular velocity of the wearable visual enhancement device.
Regarding claim 11, Smith, Gauglitz and Naimark teach the method of claim 10, further comprising generating, by a tracker, degree of freedom (DoF) information at least partially based on the acceleration and angular velocity (col.7, lines 23-28 of Smith “Additionally, as either the HWD device 100 or the object are moved, the computing system may maintain alignment of the mark or annotation at the fixed location on the object. Though not illustrated, alignment may be maintained with an alignment system or device, such as but not limited to inertial measurement unit (IMU).)”; ¶0018 of Naimark “The inertial motion unit (IMU) 113 is a sensing system composed of several inertial sensors which includes an accelerometer, gyroscope, and magnetometers. In other embodiments, additional sensing systems are used which also provide information about movement of the mobile device 102 in space. The IMU 113 provides inertial motion parameters to the software 111. The IMU 113 is rigidly attached to the mobile device 102 and thereby provides an indication of the movement of the entire system and is used to determine the pose of the system relative to the real world objects 103A viewed by the camera 112. The inertial parameters provided by the IMU 113 include linear acceleration, angular velocity and gyroscopic orientation with respect to the ground. In one embodiment the sensors include at least 3 accelerometers and 3 gyroscopes mounted orthogonally. As such, the sensors provide six degrees of freedom--3 translation-related values and 3 rotation-related values” ) In addition, the same motivation is used as the rejection for claim 2.
3.	Claims 4, 6, 12, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Smith et al., U.S. Patent No. 9088787 (“Smith”) in view of Gauglitz et al., U.S. Patent Application Publication No. 20160358383 (“Gauglitz”), further in view of Naimark, U.S. Patent Application Publication No. 2013/0218461 (“Naimark”), further in view of Xue et al., U.S. Patent Application Publication No. 2020/0098186 (“Xue”).
Regarding claim 4, Smith, Gauglitz and Naimark teach the remote assistance system of claim 3, wherein the wearable visual enhancement device includes a first communication unit configured to transmit the color information of the color image, and the distance information of the depth image to the computing system at the second location (col.4, lines 52-67 of Smith “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model. The depth/RGB transmission block 34 may communicate sensed depth/RGB data to the 3D modeling simulation engine 32. The depth/RGB transmission block 34 may include data associated with a 3D model of a real-world view of a scene through the lens of the HWD device 100 (as shown in FIG. 4). The scene may include at least one object. By way of a non-limiting example, a depth/RGB sensor 50 (illustrated in FIG. 3A) may include an ASUS.RTM. Xtion sensor.”; col.6, lines 47-59 of Smith “FIG. 5 illustrates a block diagram of an HWD remote assistant system for network-based collaboration, training and/or maintenance in accordance with an embodiment. At a first location, User A 40 uses a wearable/mobile computing device 20. With the device 20, User A scans an object 35 or scene using a 3D scanning device 22 and collects the scene data. The captured 3D model may be communicated to a second location, such as over a network 45, to a computing device 60, or processor. In an embodiment, the captured 3D model of the object 35 may be communicated to a user B 50. The captured 3D model 65 may be received by the computing device 60 and displayed to the user B 50 via the display of the computing device 60.”).
In the same field of endeavor, Xue teaches wherein the wearable visual enhancement device includes a first communication unit configured to transmit the DoF information to the computing system at the second location (¶0148 of Xue “From the XR server 900, compressed rendered frame video stream is provided to the HMD 910. From the HMD 910, pose information, including, for example, head location, orientation, and 6 -DoF information is provided to the XR server 900 for rendering frames. The downlink traffic from the XR server includes two video frames, for example, up to 300 KB per frame for each eye, every 16.7 ms if a 60 frames-per-second rate is maintained.”).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith, as combined with the generation of a 3D scene by a computing system at a second location of Gauglitz and the inertial motion unit (IMU) 113 of Naimark (a sensing system composed of several inertial sensors including an accelerometer, gyroscope, and magnetometers), to transmit the DoF information to a computing system at the second location as seen in Xue, because this modification would allow the server to render one or more frames for display based on the received pose information (¶0137 of Xue).
Thus, the combination of Smith, Gauglitz, Naimark and Xue teaches wherein the wearable visual enhancement device includes a first communication unit configured to transmit the DoF information, the color information of the color image, and the distance information of the depth image to the computing system at the second location.
Regarding claim 6, Smith, Gauglitz, Naimark and Xue teach the remote assistance system of claim 4, wherein the computing system includes a second communication unit configured to receive the DoF information, the color information, and the distance information (col.6, lines 53-59 of Smith “The captured 3D model may be communicated to a second location, such as over a network 45, to a computing device 60, or processor. In an embodiment, the captured 3D model of the object 35 may be communicated to a user B 50. The captured 3D model 65 may be received by the computing device 60 and displayed to the user B 50 via the display of the computing device 60.”; ¶0043 of Gauglitz “Under the hood, the system runs a SLAM system and sends the tracked camera pose along with the encoded live video stream to the remote system.”; ¶0049 of Gauglitz “The network module receives the data stream from the local user's device, sends the incoming video data on to the decoder, and finally notifies the main module when a new frame (decoded image data+meta-data) is available”; ¶0148 of Xue “From the XR server 900, compressed rendered frame video stream is provided to the HMD 910. From the HMD 910, pose information, including, for example, head location, orientation, and 6 -DoF information is provided to the XR server 900 for rendering frames. The downlink traffic from the XR server includes two video frames, for example, up to 300 KB per frame for each eye, every 16.7 ms if a 60 frames-per-second rate is maintained.”). In addition, the same motivation is used as in the rejection of claim 4.
Regarding claim 12, Smith, Gauglitz and Naimark teach the method of claim 11, further comprising transmitting, by a first communication unit, the color information of the color image, and the distance information of the depth image to the computing system at the second location (col.4, lines 52-67 of Smith “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model. The depth/RGB transmission block 34 may communicate sensed depth/RGB data to the 3D modeling simulation engine 32. The depth/RGB transmission block 34 may include data associated with a 3D model of a real-world view of a scene through the lens of the HWD device 100 (as shown in FIG. 4). The scene may include at least one object. By way of a non-limiting example, a depth/RGB sensor 50 (illustrated in FIG. 3A) may include an ASUS.RTM. Xtion sensor.”; col.6, lines 47-59 of Smith “FIG. 5 illustrates a block diagram of an HWD remote assistant system for network-based collaboration, training and/or maintenance in accordance with an embodiment. At a first location, User A 40 uses a wearable/mobile computing device 20. With the device 20, User A scans an object 35 or scene using a 3D scanning device 22 and collects the scene data. The captured 3D model may be communicated to a second location, such as over a network 45, to a computing device 60, or processor. In an embodiment, the captured 3D model of the object 35 may be communicated to a user B 50”). Smith, Gauglitz and Naimark are silent on transmitting the DoF information.
In the same field of endeavor, Xue teaches transmitting, by a first communication unit, the DoF information to the computing system at the second location (¶0148 of Xue “From the XR server 900, compressed rendered frame video stream is provided to the HMD 910. From the HMD 910, pose information, including, for example, head location, orientation, and 6 -DoF information is provided to the XR server 900 for rendering frames. The downlink traffic from the XR server includes two video frames, for example, up to 300 KB per frame for each eye, every 16.7 ms if a 60 frames-per-second rate is maintained.”). In addition, the same motivation is used as in the rejection of claim 4.
Thus, the combination of Smith, Gauglitz, Naimark and Xue teaches further comprising transmitting, by a first communication unit, the DoF information, the color information of the color image, and the distance information of the depth image to the computing system at the second location.
Regarding claim 14, Smith, Gauglitz, Naimark and Xue teach the method of claim 12, further comprising receiving, by a second communication unit, the DoF information, the color information, and the distance information (col.6, lines 53-59 of Smith “The captured 3D model may be communicated to a second location, such as over a network 45, to a computing device 60, or processor. In an embodiment, the captured 3D model of the object 35 may be communicated to a user B 50. The captured 3D model 65 may be received by the computing device 60 and displayed to the user B 50 via the display of the computing device 60.”; ¶0043 of Gauglitz “Under the hood, the system runs a SLAM system and sends the tracked camera pose along with the encoded live video stream to the remote system.”; ¶0049 of Gauglitz “The network module receives the data stream from the local user's device, sends the incoming video data on to the decoder, and finally notifies the main module when a new frame (decoded image data+meta-data) is available”; ¶0148 of Xue “From the XR server 900, compressed rendered frame video stream is provided to the HMD 910. From the HMD 910, pose information, including, for example, head location, orientation, and 6 -DoF information is provided to the XR server 900 for rendering frames. The downlink traffic from the XR server includes two video frames, for example, up to 300 KB per frame for each eye, every 16.7 ms if a 60 frames-per-second rate is maintained.”). In addition, the same motivation is used as in the rejection of claim 4.
4.	Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Smith et al, U.S Patent No.9088787 (“Smith”) in view of Gauglitz et al, U.S Patent Application Publication No. 20160358383 (“Gauglitz”) further in view of Naimark, U.S Patent Application Publication No. 2013/0218461 (“Naimark”) further in view of THUDOR, WO2019/055389 (“THUDOR”) further in view of Marlatt et al, U.S Patent Application Publication No. 2015/0201198 (“Marlatt”)
Regarding claim 5, Smith, Gauglitz and Naimark teach the remote assistance system of claim 3, wherein the wearable visual enhancement device further includes an image integration unit configured to combine the color information of the color image, the distance information of the depth image, and the DoF information (col.4, lines 52-67 of Smith “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model. The depth/RGB transmission block 34 may communicate sensed depth/RGB data to the 3D modeling simulation engine 32. The depth/RGB transmission block 34 may include data associated with a 3D model of a real-world view of a scene through the lens of the HWD device 100 (as shown in FIG. 4). The scene may include at least one object. By way of a non-limiting example, a depth/RGB sensor 50 (illustrated in FIG. 3A) may include an ASUS.RTM. Xtion sensor.”, where Smith teaches the color information of the color image and the distance information of the depth image; col.7, lines 23-28 of Smith “Additionally, as either the HWD device 100 or the object are moved, the computing system may maintain alignment of the mark or annotation at the fixed location on the object. Though not illustrated, alignment may be maintained with an alignment system or device, such as but not limited to inertial measurement unit (IMU).”; ¶0018 of Naimark “The inertial motion unit (IMU) 113 is a sensing system composed of several inertial sensors which includes an accelerometer, gyroscope, and magnetometers. In other embodiments, additional sensing systems are used which also provide information about movement of the mobile device 102 in space. The IMU 113 provides inertial motion parameters to the software 111. The IMU 113 is rigidly attached to the mobile device 102 and thereby provides an indication of the movement of the entire system and is used to determine the pose of the system relative to the real world objects 103A viewed by the camera 112. The inertial parameters provided by the IMU 113 include linear acceleration, angular velocity and gyroscopic orientation with respect to the ground. In one embodiment the sensors include at least 3 accelerometers and 3 gyroscopes mounted orthogonally. As such, the sensors provide six degrees of freedom--3 translation-related values and 3 rotation-related values”, which teaches the DoF information; ¶0043 of Gauglitz “Under the hood, the system runs a SLAM system and sends the tracked camera pose along with the encoded live video stream to the remote system. The local user's system receives information about annotations from the remote system and uses this information together with the live video to render the augmented view.”, which teaches encoding the live video stream along with the tracked camera pose, which is considered as combining information into a frame). In addition, the same motivation is used as in the rejection of claim 2.
Smith and Naimark teach the color information of the color image, the distance information of the depth image, and the DoF information. Gauglitz teaches combining information into a frame. However, Smith, Gauglitz and Naimark are understood to be silent on combining the color information of the color image, the distance information of the depth image, and the DoF information that share a timestamp into a frame.
In the same field of endeavor, THUDOR teaches an image integration unit configured to combine the color information of the color image, the distance information of the depth image, and the DoF information (see abstract “a sequence of three-dimension scenes is encoded as a video by an encoder and transmitted to a decoder which retrieves the sequence of 3D scenes. Points of a 3D scene visible from a determined point of view are encoded as a color image in a first track of the stream in order to be decodable independently from other tracks of the stream. The color image is compatible with a three degrees of freedom rendering. Depth information and depth and color of residual points of the scene are encoded in separate tracks of the stream and are decoded only in case the decoder is configured to decode the scene for a volumetric rendering.”)
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith, as combined with the generation of a 3D scene by a computing system at a second location of Gauglitz and the inertial motion unit (IMU) 113 of Naimark (a sensing system composed of several inertial sensors including an accelerometer, gyroscope, and magnetometers), to encode depth information and the depth and color of the scene as seen in THUDOR, because this modification would carry data representative of a volumetric scene that can be encoded at once and decoded either as a 3DoF video or as a volumetric video (3DoF+ or 6DoF), and would require a smaller amount of data than the Multiview+ Depth (MDV) standard encoding (col.2, lines 30-33 of THUDOR). Smith, Gauglitz, Naimark and THUDOR are understood to be silent on the remaining limitations of claim 5.
However, Marlatt teaches wherein the device further includes an image integration unit configured to combine the information of the image that share a timestamp into a frame (¶0039 “In order to avoid the need for synchronization of frames between different streams on the client, and as described herein, it is possible to synchronize the frames of the different encodings on the camera, wrap all frames with the same UTC timestamp into a container frame, and transmit a single stream of container frames to the client. A video source device, such as, for example, a camera, generates source video comprising source frames. The camera applies a UTC timestamp to each source frame ("source frame timestamp"). The video source device generates multiple encodings of each source frame, each of which is distinguished from the other encodings by using at least one different encoding parameter. Because each of the source frame encodings is generated from the same source frame, they all share the same timestamp. The video source device generates a container frame from the source frame encodings sharing a common timestamp. The video source device appends a timestamp ("container timestamp") to a header of the container frame ("container frame header") that is identical to the timestamps of the various source frame encodings”)
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith, as combined with the generation of a 3D scene by a computing system at a second location of Gauglitz, the inertial motion unit (IMU) 113 of Naimark (a sensing system composed of several inertial sensors including an accelerometer, gyroscope, and magnetometers), and the encoding of depth information and the depth and color of the scene of THUDOR, to generate a container frame from the source frame encodings sharing a common timestamp as seen in Marlatt, because this modification would synchronize the frames of the different encodings on the camera (¶0039 of Marlatt).
Thus, the combination of Smith, Gauglitz, Naimark, THUDOR and Marlatt teaches wherein the wearable visual enhancement device further includes an image integration unit configured to combine the color information of the color image, the distance information of the depth image, and the DoF information that share a timestamp into a frame.
5.	Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Smith et al, U.S Patent No.9088787 (“Smith”) in view of Gauglitz et al, U.S Patent Application Publication No. 20160358383 (“Gauglitz”) further in view of Naimark, U.S Patent Application Publication No. 2013/0218461 (“Naimark”) further in view of Xue et al, U.S Patent Application Publication No. 2020/0098186 (“Xue”) further in view of THUDOR, WO2019/055389 (“THUDOR”)
Regarding claim 7, Smith, Gauglitz, Naimark and Xue teach the remote assistance system of claim 6, wherein the computation system includes a 3D model generator configured to generate the 3D scene based on the received information (¶0049 of Gauglitz “The network module receives the data stream from the local user's device, sends the incoming video data on to the decoder, and finally notifies the main module when a new frame (decoded image data+meta-data) is available.”; ¶0050 of Gauglitz “A 3D surface model is constructed on the fly from the live video stream and from associated camera poses. Keyframes were selected based on a set of heuristics (good tracking quality, low device movement, minimum time interval & translational distance between keyframes), then detect and describe features in the new frame using SIFT. Four closest existing keyframes were chosen and matched against their features (one frame at a time) via an approximate nearest neighbor algorithm and collect matches that satisfy the epipolar constraint (which is known due to the received camera poses) within some tolerance as tentative 3D points. If a feature has previously been matched to features from other frames, we check for mutual epipolar consistency of all observations 
In the same field of endeavor, THUDOR teaches wherein the computation system includes a 3D model generator configured to generate the 3D scene based on the received DoF information, the color information, and the distance information (col.7, lines 15-30 “According to the present principles, a decoding method implemented in a decoder is disclosed. The decoder obtains a stream encoded according the present encoding method from a source, for example a memory or a network interface. The stream comprises at least two elements of syntax, a first element of syntax carrying data representative of a 3D scene for a 3DoF rendering. In an embodiment, this first element of syntax comprises a color image encoded according to a projection mapping of points of the 3D scene to the color image from a determined point of view. The at least one second element of syntax of the stream carries data required by a volumetric renderer to render the 3D scene in 3DoF+ or 6DoF mode. The decoder decodes the first color image from the first element of syntax of the stream. In case the decoder is configured to decode the stream for a 3DoF rendering, the decoder provides a further circuit, for example to a renderer or to a format converter with the decoded data from the first element of syntax of the stream. In case the decoder is configured to decode the stream in a volumetric mode (i.e. 3DoF+ or 6DoF), the decoder decodes data embedded in the at least one second element of syntax and 
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith, as combined with the generation of a 3D scene by a computing system at a second location of Gauglitz, the inertial motion unit (IMU) 113 of Naimark (a sensing system composed of several inertial sensors including an accelerometer, gyroscope, and magnetometers), and the transmission of the DoF information to a computing system at the second location of Xue, to encode and decode depth information and the depth and color of the scene as seen in THUDOR, because this modification would carry data representative of a volumetric scene that can be encoded at once and decoded either as a 3DoF video or as a volumetric video (3DoF+ or 6DoF), and would require a smaller amount of data than the Multiview+ Depth (MDV) standard encoding (col.2, lines 30-33 of THUDOR).
Thus, the combination of Smith, Gauglitz, Naimark, Xue and THUDOR teaches wherein the computation system includes a 3D model generator configured to generate the 3D scene based on the received DoF information, the color information, and the distance information.
Regarding claim 15, Smith, Gauglitz, Naimark and Xue teach the method of claim 14. The remainder of claim 15 is similar in scope to claim 7 and is therefore rejected under the same rationale.
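6.	Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Smith et al, U.S Patent No.9088787 (“Smith”) in view of Gauglitz et al, U.S Patent Application Publication No. 20160358383 (“Gauglitz”) further in view of Naimark, U.S Patent Application Publication No. 2013/0218461 (“Naimark”) further in view of Xue et al, U.S Patent Application Publication No. 2020/0098186 (“Xue”) further in view of THUDOR, WO2019/055389 (“THUDOR”) further in view of Marlatt et al, U.S Patent Application Publication No. 2015/0201198 (“Marlatt”)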
Regarding claim 13, Smith, Gauglitz, Naimark and Xue teach the method of claim 12, further comprising combining, by an image integration unit, the color information of the color image, the distance information of the depth image, and the DoF information (col.4, lines 52-67 of Smith “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model. The depth/RGB transmission block 34 may communicate sensed depth/RGB data to the 3D modeling simulation engine 32. The depth/RGB transmission block 34 may include data associated with a 3D model of a real-world view of a scene through the lens of the HWD device 100 (as shown in FIG. 4). The scene may include at least one object. By way of a non-limiting example, a depth/RGB sensor 50 (illustrated in FIG. 3A) may include an ASUS.RTM. Xtion sensor.”, where Smith teaches the color information of the color image and the distance information of the depth image; col.7, lines 23-28 of Smith “Additionally, as either the HWD device 100 or the object are moved, the computing system may maintain alignment of the mark or annotation at the fixed location on the object. Though not illustrated, alignment may be maintained with an alignment system or device, such as but not limited to inertial measurement unit (IMU).”; ¶0018 of Naimark “The inertial motion unit (IMU) 113 is a sensing system composed of several inertial sensors which includes an accelerometer, gyroscope, and magnetometers. In other embodiments, additional sensing systems are used which also provide information about movement of the mobile device 102 in space. The IMU 113 provides inertial motion parameters to the software 111. The IMU 113 is rigidly attached to the mobile device 102 and thereby provides an indication of the movement of the entire system and is used to determine the pose of the system relative to the real world objects 103A viewed by the camera 112. The inertial parameters provided by the IMU 113 include linear acceleration, angular velocity and gyroscopic orientation with respect to the ground. In one embodiment the sensors include at least 3 accelerometers and 3 gyroscopes mounted orthogonally. As such, the sensors provide six degrees of freedom--3 translation-related values and 3 rotation-related values”; ¶0148 of Xue “From the XR server 900, compressed rendered frame video stream is provided to the HMD 910. From the HMD 910, pose information, including, for example, head location, orientation, and 6 -DoF information is provided to the XR server 900 for rendering frames. The downlink traffic from the XR server includes two video frames, for example, up to 300 KB per frame for each eye, every 16.7 ms if a 60 frames-per-second rate is maintained.”; ¶0043 of Gauglitz “Under the hood, the system runs a SLAM system and sends the tracked camera pose along with the encoded live video stream to the remote system. The local user's system receives information about annotations from the remote system and uses this information together with the live video to render the augmented view.”)
Smith, Naimark and Xue teach the color information of the color image, the distance information of the depth image, and the DoF information. Gauglitz teaches combining information into a frame. However, Smith, Gauglitz, Naimark and Xue are understood to be silent on combining the color information of the color image, the distance information of the depth image, and the DoF information that share a timestamp into a frame.
In the same field of endeavor, THUDOR teaches combining, by an image integration unit, the color information of the color image, the distance information of the depth image, and the DoF information (see abstract “a sequence of three-dimension scenes is encoded as a video by an encoder and transmitted to a decoder which retrieves the sequence of 3D scenes. Points of a 3D scene visible from a determined point of view are encoded as a color image in a first track of the stream in order to be decodable independently from other tracks of the stream. The color image is compatible with a three degrees of freedom rendering. Depth information and depth and color of residual points of the scene are encoded in separate tracks of the stream and are decoded only in case the decoder is configured to decode the scene for a volumetric rendering.”)
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith and generating 3D scene by a computing system at a 
However, Marlatt teaches further comprising combining, by an image integration unit, the information of the  image that share a timestamp into a frame (¶0039 “In order to avoid the need for synchronization of frames between different streams on the client, and as described herein, it is possible to synchronize the frames of the different encodings on the camera, wrap all frames with the same UTC timestamp into a container frame, and transmit a single stream of container frames to the client. A video source device, such as, for example, a camera, generates source video comprising source frames. The camera applies a UTC timestamp to each source frame ("source frame timestamp"). The video source device generates multiple encodings of each source frame, each of which is distinguished from the other encodings by using at least one different encoding parameter. Because each of the source frame encodings is generated from the same source frame, they all share the same timestamp. The video source device generates a container frame from the source frame encodings sharing a common timestamp. The video source device appends a timestamp ("container timestamp") to a header of the container frame ("container frame header") that is identical to the timestamps of the various source frame encodings”)
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith, as combined with the generation of a 3D scene by a computing system at a second location of Gauglitz, the inertial motion unit (IMU) 113 of Naimark (a sensing system composed of several inertial sensors including an accelerometer, gyroscope, and magnetometers), the transmission of the DoF information to a computing system at the second location of Xue, and the encoding of depth information and the depth and color of the scene of THUDOR, to generate a container frame from the source frame encodings sharing a common timestamp as seen in Marlatt, because this modification would synchronize the frames of the different encodings on the camera (¶0039 of Marlatt).
Thus, the combination of Smith, Gauglitz, Naimark, Xue, THUDOR and Marlatt teaches further comprising combining, by an image integration unit, the color information of the color image, the distance information of the depth image, and the DoF information that share a timestamp into a frame.
7.	Claims 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Smith et al, U.S Patent No.9088787 (“Smith”) in view of Naimark, U.S Patent Application Publication No. 2013/0218461 (“Naimark”)
Regarding independent claim 17, Smith teaches a wearable visual enhancement device (col.4, lines 4-7 “Head Wearable Display ("HWD") Remote , comprising, 
a camera configured to collect color information of a color image of a scene (col. 4, lines 52-59 “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model.”),
a depth camera configured to collect distance information of a depth image of the scene (col. 4, lines 52-59 “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model”; col.5, lines 47-56 “The 3D-MSE may include, by way of non-limiting example, KinFu by Microsoft.RTM. which is an open-source application configured to provide 3D visualization and interaction. The 3D-MSE may process live depth data from a camera/sensor and create a Point Cloud and 3D models for real-time visualization and interaction. A graphics processing unit (GPU) 62, such as by way of non-limiting example a CUDA graphics processing unit, may be used to execute the open-source application. A visualization tool kit 63 may be provided.”),
an inertial measurement unit (IMU) configured to collect velocity of the wearable visual enhancement device (col.7, lines 23-28 “Additionally, as either the HWD device 100 or the object are moved, the computing system may maintain alignment of the mark or annotation at the fixed location on the object. Though not illustrated, alignment may be maintained with an alignment system or device, such as but not limited to inertial measurement unit (IMU).”),
a near eye display (col.8, lines 4-8 “By way of non-limiting example, the wearable/mobile computing device 20 may be an HMD device where the display 23 may be mounted in the HMD device. The HMD device may include see-through lenses with augmented reality (AR) overlay capability.”), a processor, and a non-transitory computer readable medium that store instructions, when executed by the processor (col.14, lines 1-5 “In view of the above, a non-transitory processor readable storage medium is provided. The storage medium may comprise an executable computer program product which further comprises a computer software code that, when executed on a processor”), causes the processor to:
scan a scene in a real world in a forward field-of-view of a first user by the camera and the depth camera (col. 4, lines 52-59 “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model”; col.5, lines 47-56 “The 3D-MSE may include, by way of non-limiting example, KinFu by Microsoft.RTM. which is an open-source application configured to provide 3D visualization and interaction. The 3D-MSE may process live depth data from a camera/sensor and create a Point Cloud and 3D models for real-time visualization and interaction. A graphics processing unit (GPU) 62, such as by way of non-limiting example a CUDA graphics processing unit, may be used to execute the open-source application. A visualization tool kit 63 may be provided.”; col.6, lines 47-53 “FIG. 5 illustrates a block diagram of an HWD remote assistant system for network-based collaboration, training and/or maintenance in accordance with an embodiment. At a first location, User A 40 uses a wearable/mobile computing device 20. With the device 20, User A scans an object 35 or scene using a 3D scanning device 22 and collects the scene data”), 
generate sensor data associated with one or more objects in the scene by the inertial measurement unit (IMU) (col.4, lines 53-67 “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model. The depth/RGB transmission block 34 may communicate sensed depth/RGB data to the 3D modeling simulation engine 32. The depth/RGB transmission block 34 may include data associated with a 3D model of a real-world view of a scene through the lens of the HWD device 100 (as shown in FIG. 4). The scene may include at least one object. By way of a non-limiting example, a depth/RGB sensor 50 (illustrated in FIG. 3A) may include an ASUS.RTM. Xtion sensor.”),
and transmit the sensor data to a computing system at a second location (col.6, lines 47-59 “FIG. 5 illustrates a block diagram of an HWD remote assistant system for network-based collaboration, training and/or maintenance in accordance with an embodiment. At a first location, User A 40 uses a wearable/mobile computing device 20. With the device 20, User A scans an object 35 or scene using a 3D scanning device 22 and collects the scene data. The captured 3D model may be communicated to a second location, such as over a network 45, to a computing device 60, or processor. In an embodiment, the captured 3D model of the object 35 may be communicated to a user B 50. The captured 3D model 65 may be received by the computing device 60 and displayed to the user B 50 via the display of the computing device 60”);
receive, from the computing system at the second location, a mark associated with a first object in the scene (col.6, lines 59-67-col.7, lines 1-5 “The computing device 60 may be configured to mark or annotate the received 3D model. By way of non-limiting example, User B 50 may use a mouse, keyboard or other user interface to enter marking(s) or annotation(s) directly on the captured 3D model 65. For illustrative purposes, a "1" has been marked on a surface of the object 35. The computing device 60 may allow the user viewpoint of the 3D model to be , and display the mark adjacent to the first object by the near-eye display (col.7, lines 6-23 “The received marked or annotated 3D model may be received by the wearable/mobile computing device 20 and displayed on display 23. The mark(s) or annotation(s) may be fixed to the location on the image entered by the user B 50. By way of non-limiting example, the mark(s) or annotation(s) add a layer at a location entered on the image. Furthermore, the wearable/mobile computing device 20 may be configured to mark or annotate the 3D model, via marking/annotation module 27, before sending the 3D model to the User B 50. By way of non-limiting example, User A 40, using a user interface 29, enters a mark or annotation for the user B 50. The marking/annotation module 27 may allow the user to enter textual mark(s) over the 3D model. The marking/annotation module 27 may allow the user A 40 to enter textual annotations. The user B 50 may enter free-form markings on the surface of the objects, text relative to a point on the surface of an object, and pre-defined shapes relative to a point on the surface of an object”).
Smith is understood to be silent on the remaining limitations of claim 17.
Naimark teaches an inertial measurement unit (IMU) configured to collect acceleration and angular velocity of the device (¶0018 “The inertial motion unit (IMU) 113 is a sensing system composed of several inertial sensors which includes an accelerometer, gyroscope, and magnetometers. In other embodiments, additional generate sensor data associated with one or more objects in the scene by the inertial measurement unit (IMU) (¶0018 “The inertial motion unit (IMU) 113 is a sensing system composed of several inertial sensors which includes an accelerometer, gyroscope, and magnetometers. In other embodiments, additional sensing systems are used which also provide information about movement of the mobile device 102 in space. The IMU 113 provides inertial motion parameters to the software 111. The IMU 113 is rigidly attached to the mobile device 102 and thereby provides an indication of the movement of the entire system and is used to determine the pose of the system relative to the real world objects 103A viewed by the camera 112. The inertial parameters provided by the IMU 113 include linear acceleration, angular velocity and gyroscopic orientation with respect to the ground. In one embodiment the sensors include at least 3 accelerometers and 3 gyroscopes mounted orthogonally. As such, the sensors provide six degrees of freedom--3 translation-related values and 3 rotation-related values”)

Thus, the combination of Smith and Naimark teaches a wearable visual enhancement device, comprising: a camera configured to collect color information of a color image of a scene, a depth camera configured to collect distance information of a depth image of the scene, an inertial measurement unit (IMU) configured to collect acceleration and angular velocity of the wearable visual enhancement device, a near-eye display, a processor, and a non-transitory computer readable medium that stores instructions that, when executed by the processor, cause the processor to: scan a scene in a real world in a forward field-of-view of a first user by the camera and the depth camera, generate sensor data associated with one or more objects in the scene by the inertial measurement unit (IMU), and transmit the sensor data to a computing system at a second location; receive, from the computing system at the second location, a mark associated with a first object in the scene, and display the mark adjacent to the first object by the near-eye display.
the wearable visual enhancement device of claim 17, wherein the instructions further cause the processor to generate degree of freedom (DoF) information at least partially based on the acceleration and angular velocity (col.7, lines 23-28 of Smith “Additionally, as either the HWD device 100 or the object are moved, the computing system may maintain alignment of the mark or annotation at the fixed location on the object. Though not illustrated, alignment may be maintained with an alignment system or device, such as but not limited to inertial measurement unit (IMU).”; ¶0018 of Naimark “The inertial motion unit (IMU) 113 is a sensing system composed of several inertial sensors which includes an accelerometer, gyroscope, and magnetometers. In other embodiments, additional sensing systems are used which also provide information about movement of the mobile device 102 in space. The IMU 113 provides inertial motion parameters to the software 111. The IMU 113 is rigidly attached to the mobile device 102 and thereby provides an indication of the movement of the entire system and is used to determine the pose of the system relative to the real world objects 103A viewed by the camera 112. The inertial parameters provided by the IMU 113 include linear acceleration, angular velocity and gyroscopic orientation with respect to the ground. In one embodiment the sensors include at least 3 accelerometers and 3 gyroscopes mounted orthogonally. As such, the sensors provide six degrees of freedom--3 translation-related values and 3 rotation-related values”). In addition, the same motivation applies as in the rejection of claim 17.
8.	Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Smith et al, U.S Patent No.9088787 (“Smith”) in view of Naimark, U.S Patent 
Regarding claim 19, Smith and Naimark teach the wearable visual enhancement device of claim 18, wherein the wearable visual enhancement device includes a first communication unit configured to transmit the DoF information, the color information of the color image, and the distance information of the depth image to the computing system at the second location (col.4, lines 52-67 “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model. The depth/RGB transmission block 34 may communicate sensed depth/RGB data to the 3D modeling simulation engine 32. The depth/RGB transmission block 34 may include data associated with a 3D model of a real-world view of a scene through the lens of the HWD device 100 (as shown in FIG. 4). The scene may include at least one object. By way of a non-limiting example, a depth/RGB sensor 50 (illustrated in FIG. 3A) may include an ASUS.RTM. Xtion sensor.”; col.6, lines 47-59 “FIG. 5 illustrates a block diagram of an HWD remote assistant system for network-based collaboration, training and/or maintenance in accordance with an embodiment. At a first location, User A 40 uses a wearable/mobile computing device 20. With the device 20, User A scans an object 35 or scene using a 3D scanning device 22 and collects the scene data. The captured 3D model may be 
In the same field of endeavor, Xue teaches wherein the wearable visual enhancement device includes a first communication unit configured to transmit the DoF information to the computing system at the second location (¶0148 “From the XR server 900, compressed rendered frame video stream is provided to the HMD 910. From the HMD 910, pose information, including, for example, head location, orientation, and 6 -DoF information is provided to the XR server 900 for rendering frames. The downlink traffic from the XR server includes two video frames, for example, up to 300 KB per frame for each eye, every 16.7 ms if a 60 frames-per-second rate is maintained.”)
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith, as combined with the inertial motion unit of Naimark that provides linear acceleration, angular velocity and gyroscopic orientation, to transmit the DoF information to a computing system at the second location as seen in Xue, because this modification would allow the server to render one or more frames for display based on the received pose information (¶0137 of Xue).
Thus, the combination of Smith, Naimark and Xue teaches wherein the wearable visual enhancement device includes a first communication unit configured to transmit the DoF information, the color information of the color image, and the distance information of the depth image to the computing system at the second location.
9.	Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Smith et al, U.S Patent No.9088787 (“Smith”) in view of Naimark, U.S Patent Application Publication No. 2013/0218461 (“Naimark”) further in view of THUDOR, WO2019/055389 (“THUDOR”) further in view of Marlatt et al, U.S Patent Application Publication No. 2015/0201198 (“Marlatt”).
Regarding claim 20, Smith and Naimark teach the wearable visual enhancement device of claim 18, the color information of the color image, the distance information of the depth image, and the DoF information (col.4, lines 52-67 of Smith “Additionally, the HWD device is configured to perform depth/Red-Blue-Green ("RGB") sensing via a depth/RGB sensor of an object viewed by User A, depth/RGB transmission 34 to User B, and audio transmission 36a from User A to User B. The depth/RGB sensor provides raw color and depth date of an object at a certain level of discretization size, from which the 3D-modeling simulation engine (3D-MSE) 52 (as illustrated in FIG. 2 creates the 3D model. The depth/RGB transmission block 34 may communicate sensed depth/RGB data to the 3D modeling simulation engine 32. The depth/RGB transmission block 34 may include data associated with a 3D model of a real-world view of a scene through the lens of the HWD device 100 (as shown in FIG. 4). The scene may include at least one object. By way of a non-limiting example, a depth/RGB sensor 50 (illustrated in FIG. 3A) may include an ASUS.RTM. Xtion sensor.” where Smith teaches the color information of the color image, the distance information of the depth image; col.7, lines 23-28 of Smith “Additionally, as either the HWD device As such, the sensors provide six degrees of freedom--3 translation-related values and 3 rotation-related values” where Naimark teaches the DoF information). Smith and Naimark are silent on the remaining limitations of claim 20.
In the same field of endeavor, THUDOR teaches wherein the instructions further cause the processor to combine the color information of the color image, the distance information of the depth image, and the DoF information (see abstract “a sequence of three-dimension scenes is encoded as a video by an encoder and transmitted to a decoder which retrieves the sequence of 3D scenes. Points of a 3D scene visible from a determined point of view are encoded as a color image in a first .”)
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith, as combined with the inertial motion unit (IMU) 113 of Naimark, a sensing system composed of several inertial sensors including an accelerometer, gyroscope, and magnetometers, to encode the depth and color information of the scene as seen in THUDOR, because this modification would carry data representative of a volumetric scene that can be encoded at once and decoded either as a 3DoF video or as a volumetric video (3DoF+ or 6DoF) and would require a smaller amount of data than the Multiview+ Depth (MDV) standard encoding (col.2, lines 30-33 of THUDOR). Smith, Naimark and THUDOR are understood to be silent on the remaining limitations of claim 20.
However, Marlatt teaches wherein the instructions further cause the processor to combine the information of the image that share a timestamp into a frame (¶0039 “In order to avoid the need for synchronization of frames between different streams on the client, and as described herein, it is possible to synchronize the frames of the different encodings on the camera, wrap all frames with the same UTC timestamp into a container frame, and transmit a single stream of container frames to the client. A video source device, such as, for example, a camera, generates source Because each of the source frame encodings is generated from the same source frame, they all share the same timestamp. The video source device generates a container frame from the source frame encodings sharing a common timestamp. The video source device appends a timestamp ("container timestamp") to a header of the container frame ("container frame header") that is identical to the timestamps of the various source frame encodings”)
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method for providing a remotely created augmented reality image of Smith, as combined with the inertial motion unit (IMU) 113 of Naimark, a sensing system composed of several inertial sensors including an accelerometer, gyroscope, and magnetometers, and with the encoding of depth and color information of the scene as seen in THUDOR, to generate a container frame from the source frame encodings sharing a common timestamp as seen in Marlatt, because this modification would synchronize the frames of the different encodings on the camera (¶0039 of Marlatt).
Thus, the combination of Smith, Naimark, THUDOR and Marlatt teaches wherein the instructions further cause the processor to combine the color information of the color image, the distance information of the depth image, and the DoF information that share a timestamp into a frame.



Contact


Any inquiry concerning this communication or earlier communications from the examiner should be directed to SARAH LE whose telephone number is (571)270-7842.  The examiner can normally be reached on Monday: 8AM-4:30PM EST, Tuesday: 8 AM-3:30PM EST, Wednesday: 8AM-2:30PM EST, Thursday and Friday off.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Zimmerman can be reached on 571-272-7653.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 




/SARAH LE/
Primary Examiner, Art Unit 2619