Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 7, 9-11, and 13 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication 2013/0271625 A1 (hereinafter Gruber) in view of “KinectFusion: Real-Time Dense Surface Mapping and Tracking” by Richard A. Newcombe, et al. (hereinafter Newcombe) in view of “Real-Time Volumetric 3D Capture of Room-Sized Scenes for Telepresence” by Andrew Maimone, et al. (hereinafter Maimone) in view of “KinectFusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera” by Shahram Izadi (hereinafter Izadi) in view of U.S. Patent Application 2012/0194517 A1 (hereinafter Molyneaux).
Regarding claim 1, the limitation “A system comprising … a portable display device comprising a display unit for displaying a virtual object to a user and a photographing unit that photographs a predetermined real space … wherein the portable display device renders the virtual object in a superimposed fashion on the predetermined real space, viewed by the user via the display unit” is taught by Gruber 
The limitation “the portable display device is configured to: store point cloud data, obtained in advance, of real objects located in the predetermined real space, wherein the point cloud data constitutes three-dimensional shape elements each having three-dimensional position information” is taught by Gruber in view of Newcombe (Gruber, e.g. paragraph 20, describes performing surface reconstruction using the acquired depth data, stored in a voxel volume, and further incorporates Newcombe by reference, cited as an exemplary surface reconstruction technique which can be used.  Further, Newcombe, section 3.2, describes the depth data as depth maps from each captured image, indicating that each depth map corresponds to a set of points, i.e. point cloud data.  The surface reconstruction based on the point cloud data is obtained in advance, i.e. the surface reconstruction is generated prior to performing the virtual object rendering over the video of the real scene, as described in paragraph 34.  It is additionally noted that the specification, e.g. paragraph 13, as well as dependent claim 4, indicates that the three-dimensional shape elements may be meshes or voxels.)
The limitation(s) “a server … image acquisition devices that acquire images individually from a plurality of fixed points where a region is photographed in the predetermined real space … store a table in which two-dimensional position information of each pixel of the images acquired by the image acquisition devices is associated with the point cloud data” is not explicitly taught by Gruber (Gruber, paragraph 20, indicates the use of Newcombe’s KinectFusion technique, and further indicates that other well-known reconstruction techniques could be used, but does not explicitly mention the use of reconstruction techniques relying on fixed acquisition devices.)  However, this limitation is taught by Maimone in view of Izadi (Maimone describes an improvement to KinectFusion to support dynamic scenes with fixed camera arrays (e.g. abstract, section 2, paragraph 4), which includes using fixed precalibrated camera poses (section 4.2, paragraph 7).  Maimone cites Izadi (section 4.2, paragraph 1), which describes the same KinectFusion system described by Newcombe (Izadi, section GPU Implementation, paragraph 1, referring to Newcombe as disclosing a full formulation of the method).  Maimone stores the position of each camera in the array relative to the scene (section 4.2, paragraph 7), which is used with the depth map for each camera to calculate vertices in the global coordinate system for each pixel u of the camera for a given point in time (Izadi, section Depth Map Conversion), i.e. the vertex map stores the 3D coordinate of each pixel, where the 3D coordinates are associated with the voxel grid elements (Izadi, section Volumetric Integration, paragraph 3).  Izadi indicates this is performed using the depth map of each camera, i.e. Di(p) in the pseudocode of listings 1 and 2 refers to the camera’s depth map, and in Maimone’s system, the volume integration step relies on the depth map values for each camera, as shown in figure 3.  
The depth map, as discussed above and indicated by Newcombe, section 3.2, provides calibrated depth measurements at each image pixel, which is a two-dimensional table storing position information of the 3D shape element represented by the corresponding pixel, i.e. the depth/distance from the camera viewpoint to the real object in the scene being captured by the Kinect cameras.)
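The per-pixel association discussed above can be illustrated with a minimal sketch.  The sketch assumes a simple pinhole camera model and a fixed, precalibrated camera pose; the function name, the intrinsic values, and the sample depth map are hypothetical and are not taken from Izadi, Newcombe, or Maimone.  It shows only how a depth map can yield a table keyed by two-dimensional pixel position whose entries are three-dimensional points in a global coordinate system.

```python
# Illustrative sketch only: back-project a depth map into a per-pixel table of
# 3D points. Intrinsics (fx, fy, cx, cy) and the camera pose are assumed values.

def backproject(depth, fx, fy, cx, cy, rotation, translation):
    """Return {(u, v): (x, y, z)} mapping each valid pixel to a global 3D point."""
    table = {}
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d <= 0:  # no depth measurement at this pixel
                continue
            # Camera-space vertex: d * K^-1 * [u, v, 1]^T for a pinhole camera.
            cam = ((u - cx) * d / fx, (v - cy) * d / fy, d)
            # Fixed, precalibrated pose: global = R * cam + t.
            world = tuple(
                sum(rotation[i][j] * cam[j] for j in range(3)) + translation[i]
                for i in range(3)
            )
            table[(u, v)] = world
    return table


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
depth_map = [[2.0, 0.0], [2.0, 2.0]]  # hypothetical depths; 0.0 marks a missing sample
pixel_to_point = backproject(depth_map, 1.0, 1.0, 0.5, 0.5, IDENTITY, [0.0, 0.0, 1.0])
```

In this sketch the resulting dictionary plays the role of a two-dimensional table: each key is a pixel position and each value is the position of the corresponding three-dimensional shape element.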
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gruber’s augmented reality system to incorporate Maimone’s fixed camera array surface reconstruction technique in order to allow Gruber’s system to interact with a larger dynamically tracked volume (Maimone, section 3, paragraph 1 indicates support for a 26.5 cubic meter capture volume, whereas Newcombe, section 6, paragraph 3, suggests a practical limit of 7 cubic meter capture volume for the original KinectFusion system), as well as support dynamic color changes in the tracked volume (Maimone, section 4.2, paragraphs 7-11).  In the combination, Gruber’s mobile device would use Maimone’s surface reconstruction based on fixed camera arrays substituted for Newcombe and Izadi’s original KinectFusion technique (Gruber, paragraph 20), including determining dynamic color values as taught by Maimone, and performing photometric registration as taught by Gruber (paragraph 34), and could access the surface reconstruction data from Maimone’s processing device (section 3, paragraph 1), i.e. a server comprising the claimed three-dimensional-space data and two-dimensional table(s).
The limitation “wherein the point cloud data is higher precision data than the images acquired by the image acquisition devices” is implicitly taught by Gruber in view of Maimone (Maimone, section 4.2, paragraph 10, indicates that the volume resolution is typically lower than the color resolution, i.e. depth camera color resolution.  Maimone uses the term “typical”, such that one of ordinary skill in the art would understand that it is common or normal for the volume resolution to be lower than the color resolution, but also that the reverse relationship would be permissible, i.e. an atypical relationship is a possible implementation variant.  Further, Molyneaux, describing an analogous system for generating a 3D model of a user’s environment for use in an augmented reality system, e.g. abstract, paragraphs 21-30, including the use of multiple fixed depth cameras similar to Maimone, e.g. paragraphs 64-68, teaches generating a dense depth model having higher resolution, i.e. precision, than the depth camera resolution prior to interactive gameplay, e.g. paragraphs 22, 27.  Molyneaux also teaches that the dense depth model can provide additional advantages, e.g. paragraphs 29, 30, 51-61.)
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gruber’s augmented reality system, incorporating Maimone’s fixed camera array surface reconstruction technique, to use a high resolution model constructed in a pre-processing step as taught by Molyneaux, which in turn allows for additional application improvements as taught by Molyneaux, e.g. paragraphs 51-61.
The limitation “map, using the table, color information of one or more pixels of the images acquired by the image acquisition devices to the three-dimensional shape elements of the point cloud data; update the color information of the three-dimensional shape elements based on changes in the color information of each pixel of the images acquired by the image acquisition devices” is taught by Gruber in view of Maimone (Maimone, section 4.2, paragraphs 8-11 describe coloring the surface reconstruction model through three techniques, two of which rely on voxel coloring, where color data is stored as a weighted average in each voxel, and discarded when scene changes are detected.  That is, Maimone’s system determines the color of each voxel according to a weighted average of the corresponding pixels from the color images of the Kinect cameras, including discarding the accumulated color, such that the color information is also being updated on the basis of changes in colors in the captured images over time.  Furthermore, as discussed above, the volume integration relies on the depth maps produced by each Kinect camera, e.g. Maimone, section 4.2, Izadi, listings 1 and 2, Newcombe, section 3.2, corresponding to the claimed table(s), and the voxel colors are updated based on the initial and updated volume integration, Maimone, section 4.2, figure 3, and therefore the color information is mapped and updated based on the depth maps of each Kinect camera, i.e. the claimed two-dimensional table(s).)
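The weighted-average voxel coloring discussed above can be illustrated with a minimal sketch.  The change-detection criterion, the threshold, and the weight cap below are assumptions chosen for illustration; Maimone's actual criteria are not reproduced here, and the class name is hypothetical.

```python
# Illustrative sketch only: per-voxel color kept as a weighted running average
# and discarded when a new observation differs strongly from the stored color.
# CHANGE_THRESHOLD and MAX_WEIGHT are assumed values, not from the references.

CHANGE_THRESHOLD = 60.0
MAX_WEIGHT = 8


class VoxelColor:
    def __init__(self):
        self.rgb = (0.0, 0.0, 0.0)
        self.weight = 0

    def update(self, observed):
        """Blend one observed pixel color (r, g, b) into the voxel's color."""
        if self.weight > 0:
            diff = max(abs(o - c) for o, c in zip(observed, self.rgb))
            if diff > CHANGE_THRESHOLD:
                self.weight = 0  # scene change assumed: discard accumulated color
        w = self.weight
        self.rgb = tuple((c * w + o) / (w + 1) for c, o in zip(self.rgb, observed))
        self.weight = min(w + 1, MAX_WEIGHT)
        return self.rgb


voxel = VoxelColor()
first = voxel.update((100, 100, 100))   # initial observation
second = voxel.update((110, 100, 100))  # small change: averaged in
third = voxel.update((250, 0, 0))       # large change: accumulated color discarded
```

Under these assumptions, a small change in the observed pixel color shifts the stored voxel color gradually, while a large change replaces it outright.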
The limitation “determine a position of the portable display device and a photographing direction of the photographing unit” is taught by Gruber (paragraph 28, “As illustrated, the surface reconstruction from 218 in FIG. 2 is used to obtain a pose estimation 220 and a ray cast 221 for the camera relative to the environment with six degrees of freedom (6 DOF) for each camera frame. The surface reconstruction from 218 serves as a three-dimensional (3D) reconstruction of the environment that is used to track the pose thereby eliminating the need for any prior knowledge of the environment such as a 2D planar tracking target, a previously created 3D tracking model or a fiducial marker. Pose estimation 220 using a 3D reconstruction is well known by those skilled in the art and the present disclosure is not limited to any particular technique to obtain the pose estimation.”)
The limitation “generate virtual illumination information for the virtual object to be rendered based on the color information and three-dimensional position information of the three-dimensional shape elements” is taught by Gruber (paragraph 3, “Estimated lighting conditions for the environment are generated based on the surface reconstruction and the illumination data. For example, the surface reconstruction may be used to compute the possible radiance transfer, which may be compressed, e.g., using spherical harmonic basis functions, and used in the lighting conditions estimation. A virtual object may then be rendered based on the lighting conditions. Differential rendering may be used with lighting solutions from the surface reconstruction of the environment and a second surface reconstruction of the environment combined with the virtual object.”  Gruber describes estimation of lighting conditions in the real environment using the surface reconstruction in paragraphs 21-26, and using the estimated lighting conditions to render a virtual object for compositing onto the video frames captured by the camera in paragraphs 27-33.)
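The use of spherical harmonic basis functions mentioned in the quoted passage can be illustrated with a minimal sketch.  The sketch projects sampled radiance onto the first four real spherical harmonic basis functions; it assumes the sample directions are uniformly distributed over the sphere, uses hypothetical function names, and is not Gruber's radiance transfer computation.

```python
import math

# Illustrative sketch only: project sampled radiance onto the first four real
# spherical harmonic (SH) basis functions. Sample directions are assumed to be
# unit vectors distributed uniformly over the sphere.


def sh_basis(x, y, z):
    """Band 0 and band 1 real SH basis values for a unit direction."""
    c0 = 0.5 * math.sqrt(1.0 / math.pi)
    c1 = 0.5 * math.sqrt(3.0 / math.pi)
    return (c0, c1 * y, c1 * z, c1 * x)


def project_radiance(samples):
    """samples: list of ((x, y, z), radiance). Returns four SH coefficients."""
    coeffs = [0.0, 0.0, 0.0, 0.0]
    for (x, y, z), radiance in samples:
        for k, basis_value in enumerate(sh_basis(x, y, z)):
            coeffs[k] += radiance * basis_value
    scale = 4.0 * math.pi / len(samples)  # Monte Carlo weight for uniform samples
    return [c * scale for c in coeffs]


AXES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
uniform_light = project_radiance([(d, 1.0) for d in AXES])
overhead_light = project_radiance([(d, 1.0 if d == (0, 0, 1) else 0.0) for d in AXES])
```

With uniform radiance only the constant coefficient is nonzero, while light arriving from a single direction produces a nonzero directional coefficient, which is the kind of compact lighting description the quoted passage refers to.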
The limitation “render the virtual object on the display unit based on the position of the display device, the photographing direction of the photographing unit, and the virtual illumination information” is taught by Gruber (Paragraphs 27-33 describe using the surface reconstruction, the SH coefficients and estimated lighting conditions to render the virtual object over the video frame on the basis of the pose of the camera relative to environment (paragraphs 28, 29, 31) and the SH coefficients and estimated lighting conditions (paragraphs 30, 32, 33).  As noted above, Gruber’s device includes a display unit for displaying the composited video (paragraphs 14, 34).)
Regarding claim 2, the limitation “wherein the server or the portable display device generates, as virtual illumination information for the virtual object to be rendered, virtual illumination information on the individual faces of a virtual polyhedron accommodating the virtual object to be rendered based on the color information and three-dimensional position information of the three-dimensional shape elements” is taught by Gruber (paragraph 34, “The pose of the camera with respect to the environment may be determined using well known vision based tracking techniques based on the surface reconstruction. … Illumination data of the environment is generated from at least one video frame (308), e.g., as illustrated by illumination 216 in FIG. 2. The illumination data may be generated by converting at least one video frame into intensity components and color components and using the intensity components to produce the illumination data. … The estimated lighting conditions in the environment are generated in each video frame based on the surface reconstruction and the illumination data (310), e.g., as illustrated by light estimation 226 in FIG. 2. The lighting conditions may be estimated by generating a radiance transfer for the environment based on the surface reconstruction and generating a compressed transfer function of the radiance transfer, e.g., by projecting the radiance transfer into spherical harmonics basis functions. The light conditions may then be estimated using the compressed transfer function of the radiance transfer and the illumination data to estimate the lighting conditions. A virtual object is rendered over the video frames based on pose and the lighting conditions (312), e.g., as illustrated in FIG. 3.”  Gruber describes estimation of lighting conditions based on the surface reconstruction model and color information of the model of real environment in paragraphs 21-26, and using the estimated lighting conditions to render a virtual object for compositing onto the video frames captured by the camera in paragraphs 27-33, which includes determining illumination values for the polygonal surfaces of the virtual object model (paragraphs 29, 31 especially).)
Regarding claims 7, 9-11, the limitations are similar to those treated in the above rejection(s) and are met by the references as discussed in claim 1 above.
Regarding claim 13, the limitation “wherein the image acquisition devices are stationary cameras disposed around the predetermined real space” is taught by Gruber in view of Maimone and Molyneaux (Both Maimone, e.g. section 3, paragraph 1, and Molyneaux, e.g. figure 8, paragraph 64, teach that the image acquisition devices are cameras disposed around the predetermined real space.)

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication 2013/0271625 A1 (hereinafter Gruber) in view of “KinectFusion: Real-Time Dense Surface Mapping and Tracking” by Richard A. Newcombe, et al. (hereinafter Newcombe) in view of “Real-Time Volumetric 3D Capture of Room-Sized Scenes for Telepresence” by Andrew Maimone, et al. (hereinafter Maimone) in view of “KinectFusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera” by Shahram Izadi (hereinafter Izadi) in view of U.S. Patent Application 2012/0194517 A1 (hereinafter Molyneaux) as applied to claim 1 above, and further in view of “Global Localization from Monocular SLAM on a Mobile Phone” by Jonathan Ventura, et al. (hereinafter Ventura).
Regarding claim 3, the limitation “wherein the portable display device further includes position and orientation sensors” is taught by Gruber (paragraph 34, “The pose of the camera with respect to the environment may be determined using well known vision based tracking techniques based on the surface reconstruction. The pose may also be determined, e.g., using an image and a known model of the real-world. If desired, additional data may be used to assist in determining the pose, such as inertial sensor data from, e.g., accelerometers, gyroscopes, magnetometers, etc.”)
The limitation “wherein the server or the portable display device considers a position of the display device and a photographing direction of the photographing unit, acquired by the position and orientation sensors, as a provisional user environment, obtains, from the server or the portable display device, the three-dimensional shape elements that can be photographed by the photographing unit at positions and in directions within a predetermined range from the provisional user environment, and determines the position of the display device and the photographing direction of the photographing unit based on the color information and three-dimensional position information of the obtained three-dimensional shape elements and the photographed image of the predetermined real space” is not explicitly taught by Gruber (paragraph 34 mentions the use of position and orientation sensors to aid in pose determination, but does not disclose details thereof.)  However, this limitation is suggested by Ventura (Ventura describes a system for global localization of a handheld device for the purpose of augmented reality (e.g. abstract), which functions by having the client begin point cloud scene tracking and providing sensor based location information to a server for registering the client device coordinate system to a global coordinate system (sections 3, 4).  Further, the server performs this operation (section 5.4) by dividing the global point cloud into cells which are further divided by orientation, and one of the cells is selected according to the client’s sensor measured position, and a direction slice according to the sensor measured direction, and data from the slice is used to perform the accurate registration as described in section 4.  Additionally Ventura describes using the system indoors (section 6).)
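The cell and direction-slice selection discussed above can be illustrated with a minimal sketch.  The cell size, the number of direction slices, and all names are assumptions chosen for illustration and are not taken from Ventura; the sketch shows only how a sensor-measured position and direction could select the subset of a global point cloud used for registration.

```python
import math

# Illustrative sketch only: select a cell of a global point cloud from a
# sensor-measured position, and a direction slice from a sensor-measured heading.
CELL_SIZE = 10.0  # assumed cell edge length
NUM_SLICES = 8    # assumed number of direction slices per cell


def select_slice(x, y, heading_deg):
    """Return ((cell_x, cell_y), slice_index) for a provisional position and heading."""
    cell = (math.floor(x / CELL_SIZE), math.floor(y / CELL_SIZE))
    slice_index = int((heading_deg % 360.0) // (360.0 / NUM_SLICES))
    return cell, slice_index


def candidate_points(index, x, y, heading_deg):
    """Look up the stored points for the selected cell and direction slice."""
    return index.get(select_slice(x, y, heading_deg), [])


# Hypothetical index: keys are (cell, slice_index), values are point identifiers.
point_index = {((1, -1), 2): ["p1", "p2"]}
nearby = candidate_points(point_index, 12.5, -3.0, 100.0)
```

Under these assumptions, only the points stored for the selected cell and slice are passed to the registration step, which is how a provisional sensor-based pose narrows the data used for the accurate pose determination.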
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gruber’s augmented reality system, incorporating Maimone’s fixed camera array surface reconstruction technique, using a high resolution model constructed in a pre-processing step as taught by Molyneaux, to further incorporate Ventura’s localization technique in order to extend the supported volume of Maimone’s surface reconstruction to include multiple rooms while still efficiently performing the localization, i.e. each room could be a cell partition (section 5.4, paragraph 2), and multiple rooms could be monitored by Maimone’s fixed camera arrays for surface reconstruction, such that Ventura’s localization technique would allow a mobile device to quickly determine its pose with respect to the multiple room model.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication 2013/0271625 A1 (hereinafter Gruber) in view of “KinectFusion: Real-Time Dense Surface Mapping and Tracking” by Richard A. Newcombe, et al. (hereinafter Newcombe) in view of “Real-Time Volumetric 3D Capture of Room-Sized Scenes for Telepresence” by Andrew Maimone, et al. (hereinafter Maimone) in view of “KinectFusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera” by Shahram Izadi (hereinafter Izadi) in view of U.S. Patent Application 2012/0194517 A1 (hereinafter Molyneaux) further in view of “Global Localization from Monocular SLAM on a Mobile Phone” by Jonathan Ventura, et al. (hereinafter Ventura).
Regarding claim 6, the limitations are similar to those treated in the above rejection(s) and are met by the references as discussed in claims 1 and 3 above.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication 2013/0271625 A1 (hereinafter Gruber) in view of “KinectFusion: Real-Time Dense Surface Mapping and Tracking” by Richard A. Newcombe, et al. (hereinafter Newcombe) in view of “Real-Time Volumetric 3D Capture of Room-Sized Scenes for Telepresence” by Andrew Maimone, et al. (hereinafter Maimone) in view of “KinectFusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera” by Shahram Izadi (hereinafter Izadi) in view of U.S. Patent Application 2012/0194517 A1 (hereinafter Molyneaux) as applied to claim 1 above, and further in view of “Online Structure Analysis for Real-Time Indoor Scene Reconstruction” by Yizhong Zhang, et al. (hereinafter Zhang).
Regarding claim 4, the limitation “wherein the three-dimensional shape elements are meshes constituted of polygons created based on point cloud data, obtained in advance, of the real objects located in the predetermined real space” is not explicitly taught by Gruber in view of Newcombe, Izadi, and Maimone (Maimone, e.g. section 4, indicates the 3D elements are voxels created on the basis of point cloud data (i.e. Newcombe and Izadi’s KinectFusion) rather than meshes of polygons.) However, this limitation is taught by Zhang (Zhang describes an improvement to KinectFusion which supplements the voxel data by adding planes and isolated objects defined by recovered surface meshes, and which also improves robustness and accuracy of the reconstruction (e.g. abstract, section 1.2, sections 3-6).)
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gruber’s augmented reality system, incorporating Maimone’s fixed camera array surface reconstruction technique, using a high resolution model constructed in a pre-processing step as taught by Molyneaux, to further incorporate Zhang’s KinectFusion improvement in order to improve the robustness and accuracy of the reconstruction and supplement the voxel data with mesh objects and planar elements, and also because it is an improvement of the KinectFusion reconstruction technique that is the basis of the surface reconstruction used in the references.

Claims 5 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication 2013/0271625 A1 (hereinafter Gruber) in view of “KinectFusion: Real-Time Dense Surface Mapping and Tracking” by Richard A. Newcombe, et al. (hereinafter Newcombe) in view of “Real-Time Volumetric 3D Capture of Room-Sized Scenes for Telepresence” by Andrew Maimone, et al. (hereinafter Maimone) in view of “KinectFusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera” by Shahram Izadi (hereinafter Izadi) in view of U.S. Patent Application 2012/0194517 A1 (hereinafter Molyneaux) in view of U.S. Patent Application 2016/0210787 (hereinafter Chu).
Regarding claim 5, the limitations are similar to those treated in the above rejection(s) and are met by the references as discussed in claim 1 above, except that there is no server and all of the parts are elements of the display device, whereas in the claim 1 rejection, Maimone’s server is a separate device from Gruber’s mobile display device.  However, this limitation is suggested by Chu (Chu describes a similar system for performing augmented reality using a surface reconstruction from a fixed camera array (e.g. abstract, paragraphs 21, 22, 25), and suggests that using a single device for performing computing operations and display/capture operations (paragraph 21, “In one example, the combination of the computer 20 and the display device 10 is a desktop computer, a workstation or a notebook computer. In another example, the combination of the computer 20 and the display device 10 is a mobile phone or a tablet computer.”) is an obvious alternative embodiment to the use of a separate computer and display device (“In still another example, the computer 20 and the display device 10 are two independent devices. For example, the computer 20 is the mobile phone or computer of one user, and the display device 10 is the mobile phone or computer of another user.”))
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gruber’s augmented reality system, incorporating Maimone’s fixed camera array surface reconstruction technique, using a high resolution model constructed in a pre-processing step as taught by Molyneaux, to use a single device for computing and display operations as suggested by Chu in an analogous system, which would have the benefits of reduced system complexity and potentially reduced system cost, due to using one device instead of two.
Regarding claim 8, the limitations are similar to those treated in the above rejection(s) and are met by the references as discussed in claim 1 above, except that the operations are all performed by the server rather than the display device, and all of the parts are elements of the display device, and in the claim 1 rejection, Maimone’s server is a separate device from Gruber’s mobile display device, with each performing some of the claimed operations.  However, this limitation is taught by Chu (Chu describes an analogous system for performing augmented reality using a surface reconstruction from a fixed camera array (e.g. abstract, paragraphs 21, 22, 25), and teaches that in a system using two devices for performing computing operations and display/capture operations, the first device may perform all of the computing operations and the second device may merely perform display and capture operations (“In still another example, the computer 20 and the display device 10 are two independent devices. For example, the computer 20 is the mobile phone or computer of one user, and the display device 10 is the mobile phone or computer of another user.”))
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gruber’s augmented reality system, incorporating Maimone’s fixed camera array surface reconstruction technique, using a high resolution model constructed in a pre-processing step as taught by Molyneaux, to use the first (server) device for computing operations and the second (mobile) device for display operations as taught by Chu in an analogous augmented reality system, which would further have the benefit of allowing a stationary computer to perform the processing for a mobile device, whereby the stationary computer could be selected for processing capability without being limited by the weight and power requirements of a mobile device, i.e. one of ordinary skill in the art would know that a battery powered handheld mobile device must be light enough to carry and is limited by battery life, while a desktop computer is not subject to these requirements, allowing for use of heavier processing equipment with a higher power requirement to achieve a higher processing capability for a given budget.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication 2013/0271625 A1 (hereinafter Gruber) in view of “KinectFusion: Real-Time Dense Surface Mapping and Tracking” by Richard A. Newcombe, et al. (hereinafter Newcombe) in view of “Real-Time Volumetric 3D Capture of Room-Sized Scenes for Telepresence” by Andrew Maimone, et al. (hereinafter Maimone) in view of “KinectFusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera” by Shahram Izadi (hereinafter Izadi) in view of U.S. Patent Application 2012/0194517 A1 (hereinafter Molyneaux) as applied to claim 1 above, and further in view of U.S. Patent Application Publication 2014/0160235 A1 (hereinafter Norland).
Regarding claim 12, the limitation “a three-dimensional laser scanner configured to acquire a 360 degree measurement of the predetermined real space, wherein the point cloud data is created using the 3D laser scanner simultaneously with the images acquired by the image acquisition devices” is partially taught by Gruber in view of Maimone and Molyneaux (As discussed in the claim 1 rejection above, Maimone teaches creating the point cloud data by capturing images of the real space, and Molyneaux teaches the advantages of creating a high resolution model in a pre-processing step using the fixed depth cameras.  Maimone’s system operates based on simultaneous capture from all the input devices, i.e. each of the Kinect sensors provides input at approximately the same time, which is used to create/update the voxels in Maimone’s surface reconstruction.  Further, Molyneaux, paragraph 64, suggests that the fixed depth cameras may rely on different technologies including laser range finder technology, but neither Maimone nor Molyneaux explicitly teaches that one of the depth cameras is a 3D laser scanner configured to acquire a 360 degree measurement of the real space.)  However, this limitation is suggested in view of Norland (Norland describes a hybrid monitoring system having stationary viewpoint PTZ cameras and rotating line and/or laser scanners capturing 360 degree panorama images of an observation area.)
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gruber’s augmented reality system, incorporating Maimone’s fixed camera array surface reconstruction technique, using a high resolution model constructed in a pre-processing step as taught by Molyneaux, to use Norland’s 360 degree rotating laser scanner as one of the depth cameras in Maimone’s fixed camera array surface reconstruction technique, because Molyneaux suggests that the fixed cameras may have different depth sensing technologies, and Norland suggests an observation area can be monitored using both stationary viewpoint cameras, analogous to Maimone’s Kinect sensors, together with a rotating laser scanner for capturing a color and range panorama.  In the combination, Norland’s rotating laser scanner(s) would provide color and depth data to be incorporated into the model in combination with the Kinect sensor input captured simultaneously to perform Maimone’s surface reconstruction.  

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication 2013/0271625 A1 (hereinafter Gruber) in view of “KinectFusion: Real-Time Dense Surface Mapping and Tracking” by Richard A. Newcombe, et al. (hereinafter Newcombe) in view of “Real-Time Volumetric 3D Capture of Room-Sized Scenes for Telepresence” by Andrew Maimone, et al. (hereinafter Maimone) in view of “KinectFusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera” by Shahram Izadi (hereinafter Izadi) in view of U.S. Patent Application 2012/0194517 A1 (hereinafter Molyneaux) as applied to claim 1 above, and further in view of U.S. Patent Application Publication 2013/0141421 A1 (hereinafter Mount).
Regarding claim 14, the limitation “wherein the portable display device is a head mounted display that provides the user with a mixed reality space comprising the predetermined real space and a virtual space comprising the virtual object; and wherein the virtual illumination information is used by the head mounted display to determine global illumination changes in the mixed reality space based on changes in a state of light in the predetermined real space” is partially taught by Gruber (As noted in the claim 1 rejection above, Gruber describes, e.g. paragraphs 27-33, using the surface reconstruction, the SH coefficients and estimated lighting conditions to render the virtual object over the video frame on the basis of the pose of the camera relative to environment (paragraphs 28, 29, 31) and the SH coefficients and estimated lighting conditions (paragraphs 30, 32, 33), where rendering includes accounting for both diffuse and specular reflection based on the lighting conditions in order to support global illumination techniques (paragraphs 22, 23, 27, 29, 30).  Also as noted above, Gruber’s device includes camera(s) capturing the environment, a processor performing the system operations, and a display unit for displaying the composited video (paragraphs 14, 15, 34).  However, Gruber does not explicitly teach that the mobile device could be a head mounted display device.)  However, this limitation is suggested in view of Mount (Mount describes a head mounted display device used for presenting augmented reality, having both an integrated see-through display device for each eye, and camera(s) for capturing depth images of the environment, e.g. abstract, paragraphs 10-16, 30.)
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gruber’s augmented reality system, incorporating Maimone’s fixed camera array surface reconstruction technique, using a high resolution model constructed in a pre-processing step as taught by Molyneaux, to use Mount’s head mounted display device as the mobile device of Gruber’s system, because Gruber indicates, e.g. paragraph 15, that any compatible device capable of performing the necessary functions can be used as the mobile device, and Mount’s head mounted display device is intended for use in an augmented reality system such as Gruber’s, and therefore could be used as Gruber’s mobile device.

Response to Arguments
Applicant's arguments filed 4/27/21 have been fully considered but they are not persuasive.
Applicant’s remarks note that the rejection addresses Maimone’s discussion of what is typical, but do not appear to provide any reasoning to contradict the rejection’s analysis of what is implied by the use of the word “typical”.  Applicant further asserts that Maimone fails to make any mention of the resolution of “point cloud data” or “image data” as found in “images acquired by the image acquisition devices”.  On the contrary, Maimone is discussing “color resolution”, which corresponds to the image data acquired from images captured by the depth cameras, i.e. the source of the color information used in Maimone’s system.  As noted in the rejection, color resolution is depth camera color resolution.  Applicant’s remarks do not dispute this, and instead point out that the claim uses different terminology, without suggesting any reason why Maimone’s color resolution does not correspond to the resolution of the claimed “images acquired by the image acquisition devices”.  Therefore, this argument cannot be considered persuasive.
Applicant asserts that Molyneaux makes no mention of using “higher resolution” data alongside “lower resolution data”, asserting that the “dense depth model” fails to disclose the claimed limitation.  However, the point cloud data used to create the dense depth model includes far more points than would be provided by individual depth scans, meaning it corresponds to point cloud data that is higher precision or resolution than the image data captured by a single depth camera.  Therefore, contrary to Applicant’s assertion, Molyneaux does teach that the dense depth model is higher precision data than the images acquired by the individual image acquisition devices.
Applicant asserts that the combination of references suggested by the rejection would still fail to render the independent claims obvious because one of ordinary skill in the art would have no motivation to supply the missing elements without using Applicant’s disclosure as a guide.  Applicant’s remarks do not acknowledge or otherwise dispute the motivations given by the rejection, and therefore cannot be considered persuasive.
More specifically, as noted in the 1/28/21 Office Action, page 5, one of ordinary skill in the art would have been motivated to incorporate Maimone’s fixed camera array surface reconstruction technique in order to allow Gruber’s system to interact with a larger dynamically tracked volume, as well as supporting dynamic color changes therein.  Rather than explain what is unreasonable about this combination or motivation, Applicant provides irrelevant commentary pointing out that Gruber does not discuss point cloud data.  Applicant does not provide any reason why the specifically provided motivation is unreasonable, or any reason why one of ordinary skill in the art would be unable to perform the proposed modification.  Therefore, Applicant’s remarks cannot be considered persuasive.
Further, as noted in the 1/28/21 Office Action, page 6, one of ordinary skill in the art would have been motivated to use a high resolution model constructed in a pre-processing step as taught by Molyneaux, because Molyneaux describes specific advantages provided by the high resolution model.  Rather than explain what is unreasonable about this combination, Applicant asserts that Molyneaux is directed to a “gaming console system” whereas Gruber is focused on a mobile device used for augmented reality.  However, contrary to Applicant’s assertion, Molyneaux also indicates that those skilled in the art will realize the system may be implemented using a mobile device, e.g. paragraph 106, and further indicates that the purpose is also for augmented reality applications, e.g. paragraph 25.  Therefore, as Applicant’s remarks do not dispute the actual cited motivation from Molyneaux relied upon in the rejection, and Applicant’s assertion that Molyneaux is not directed to an augmented reality application using a mobile device is explicitly contradicted by Molyneaux and otherwise lacks any cited support, Applicant’s remarks cannot be considered persuasive.


Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT BADER whose telephone number is (571)270-3335.  The examiner can normally be reached Monday through Friday, 10-6.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Zimmerman, can be reached at 571-272-7653.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ROBERT BADER/Primary Examiner, Art Unit 2619