DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Applicant's amendments filed on 30 December 2021 have been entered.  Claims 1-5, 7-9, 11-15, 19, and 23 have been amended.  No claims have been canceled.  Claim 26 has been added.  Claims 1-26 are still pending in this application, with claims 1, 14, 19, and 23 being independent.

Response to Arguments
Applicant’s arguments with respect to claims 1-26 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 10-15, 19, 20, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Rohaly et al. (US Pub. 2011/0043613), hereinafter Rohaly, in view of Rosenbaum (US Pub. 2018/0114363).
Regarding claim 1, Rohaly discloses an apparatus comprising: at least one memory (Paragraph [0036]: system 100 may include a computer-usable or computer-readable medium. The computer-usable medium 118 may include one or more memory chips (or other chips, such as a processor, that include memory), optical disks, magnetic disks or other magnetic media, and so forth. The computer-usable medium 118 may in various embodiments include removable memory (such as a USB device, tape drive, external hard drive, and so forth), remote storage (such as network attached storage), volatile or non-volatile computer memory, and so forth. The computer-usable medium 118 may contain computer-readable instructions for execution by the computer 108 to perform the various processes described herein. The computer-usable medium 118 may also, or instead, store data received from the camera 102, store a three-dimensional model of the object 104, store computer code for rendering and display, and so forth); instructions (Paragraph [0036]: system 100 may include a computer-usable or computer-readable medium. The computer-usable medium 118 may include one or more memory chips (or other chips, such as a processor, that include memory), optical disks, magnetic disks or other magnetic media, and so forth. The computer-usable medium 118 may in various embodiments include removable memory (such as a USB device, tape drive, external hard drive, and so forth), remote storage (such as network attached storage), volatile or non-volatile computer memory, and so forth. The computer-usable medium 118 may contain computer-readable instructions for execution by the computer 108 to perform the various processes described herein. 
The computer-usable medium 118 may also, or instead, store data received from the camera 102, store a three-dimensional model of the object 104, store computer code for rendering and display, and so forth); processor circuitry to execute the instructions to: determine a three-dimensional (3D) position of an object detected within a first image of an environment and within a second image of the environment, the first image captured by a first camera in a first location relative to the environment, the second image captured by a second camera in a second location relative to the environment different than the first location (Fig. 5; Fig. 7; Paragraph [0046]: FIG. 5 illustrates a sequence of images captured from a moving camera. In the sequence 500, a camera 502, which may include, for example, any of the cameras 102 described above, may capture an image of an object 504 from a number of different positions 506a-506e along a camera path 507. While the camera path 507 is depicted as a continuous curvilinear path which represents the physical path of a camera, it will be understood that analytically the camera path 507 may be represented by discrete, straight line transformations along with associated rotations in three-dimensional space; Paragraph [0051]: As shown in step 712, three-dimensional measurements may be obtained from the image data. In general, this may include processing image sets or the like to obtain disparity data across a processing mesh of the camera, and further processing the disparity data to obtain a three-dimensional surface reconstruction. In one embodiment, the disparity data encodes depth information, and may be employed to recover a three-dimensional measurement using a camera model or the like to relate disparity data to depth information for each pixel of the processing mesh. This step 712 may be repeated for each individual measurement (e.g., image set) obtained by the camera. 
As a result, a three-dimensional measurement or reconstruction may be obtained for each camera pose along a camera path); generate a 3D model of the environment and the object based on the first image and the second image (Fig. 7; Paragraphs [0051]-[0052]: shown in step 712, three-dimensional measurements may be obtained from the image data. In general, this may include processing image sets or the like to obtain disparity data across a processing mesh of the camera…shown in step 714, a three-dimensional model may be constructed from the individual three-dimensional measurements obtained in step 712. Where the three-dimensional measurements of the surface of the object overlap, these three-dimensional measurements may be registered to one another using any of a variety of known techniques. As a result, the camera path from pose to pose may be recovered, and the three-dimensional measurements from each pose may be combined into a full three-dimensional model of scanned regions of the surface of the object); detect a difference between the 3D position of the object and the 3D model (Fig. 7; Paragraph [0048]: one of the two-dimensional measurements, such as the first measurement 602, may be projected onto the three-dimensional model using available spatial information (e.g., the camera position and orientation). The resulting projection may then be backprojected to the second camera pose using warping or other deformation techniques to obtain an expected measurement at the second camera position. In the case of a side channel two-dimensional image or the like, the expected measurement may be a corresponding image expected in the center channel or another side channel. 
By adapting the three-dimensional measurement from this image pair to reduce or minimize an error between the actual and expected measurements in an overlapping area of the object, the three-dimensional measurement may be refined for that camera position to more accurately represent a surface of the object); and automatically modify the 3D model based on the difference (Fig. 7; Paragraph [0048]: one of the two-dimensional measurements, such as the first measurement 602, may be projected onto the three-dimensional model using available spatial information (e.g., the camera position and orientation). The resulting projection may then be backprojected to the second camera pose using warping or other deformation techniques to obtain an expected measurement at the second camera position. In the case of a side channel two-dimensional image or the like, the expected measurement may be a corresponding image expected in the center channel or another side channel. By adapting the three-dimensional measurement from this image pair to reduce or minimize an error between the actual and expected measurements in an overlapping area of the object, the three-dimensional measurement may be refined for that camera position to more accurately represent a surface of the object; Paragraph [0054]: the projected measurement may be backprojected from the three-dimensional model to another channel of the camera, which may be the center channel or another side channel of the camera described above. The projected result from step 716 may be backprojected using any suitable techniques to obtain a synthetic view of the measurement from one camera channel as it should appear from the other camera channel, based upon the spatial relationship between the projected result, the three-dimensional model, and the position and rotation of the other channel. 
It will be appreciated that if there were no errors in the initial measurement, this synthetic view would exactly correspond to the actual two-dimensional image obtained from the other channel. However, in a high-speed processing pipeline such as that described above, an initial three-dimensional model may fail to accurately capture surface details for any number of reasons (lower resolution processing, absence of global surface data such as the completed three-dimensional model, etc.). Thus it is expected that in a practical system there may be variations between a synthesized view (based on observations from a different position) and an actual view. Backprojection may be accomplished, for example, by warping or otherwise deforming the projected result based upon the three-dimensional model and camera pose information for respective measurements. By processing these synthesized image sets to obtain disparity data, and further backprojecting the synthesized disparity data through the camera model, a backprojected result may be obtained that represents a synthesized or expected version of the three-dimensional measurement from the second camera position).
	Rohaly does not explicitly disclose the 3D position defined relative to a real-world 3D space corresponding to the environment in which the object is located.
	However, Rosenbaum teaches generating a 3D model from multiple images and comparison between model and actual positions (Abstract; Paragraph [0049]), further comprising the 3D position defined relative to a real-world 3D space corresponding to the environment in which the object is located (Fig. 5; Paragraph [0040]: Any suitable approach can be used for scanning the physical environmental in order to generate scanned environmental features for one or more 3D virtual objects. In some approaches, the user manipulates or physically positions one or more user devices, such as user device 102a, in order to allow environmental scanner 212 to capture different perspectives of the environment. For example, the user may adjust the angle, rotation, or orientation of a user device with respect to the environment as a whole and/or with respect to a region or corresponding real world object the user wishes to scan; Paragraph [0055]: user device 102a may report its location (e.g., GPS coordinates) to server 108 (e.g., via application 110), and a set of one or more of reference objects 232 and/or object attributes 234 may be downloaded to the user device based on the location (and/or other contextual parameters)). Rosenbaum teaches that this will provide a more accurate surface reconstruction or 3D model (Paragraph [0018]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rohaly with the features of the 3D position defined relative to a real-world 3D space corresponding to the environment in which the object is located as taught by Rosenbaum so as to provide a more accurate surface reconstruction or 3D model as presented by Rosenbaum.
Regarding claim 2, Rohaly, in view of Rosenbaum teaches the apparatus as defined in claim 1, Rohaly discloses wherein the 3D position of the object defines at least one of a placement, an orientation, or a shape of the object within the real-world 3D space corresponding to the environment (Paragraph [0028]: the camera 102 may include a plurality of apertures including a center aperture positioned along a center optical axis of a lens that provides a center channel for the camera 102, along with any associated imaging hardware. In such embodiments, the center channel may provide a conventional video image of the scanned subject matter, while a number of axially offset channels yield image sets containing disparity information that can be employed in three-dimensional reconstruction of a surface. In other embodiments, a separate video camera and/or channel may be provided to achieve the same result, i.e., a video of an object corresponding temporally to a three-dimensional scan of the object, preferably from the same perspective, or from a perspective having a fixed, known relationship to the perspective of the camera 102. The camera 102 may also, or instead, include a stereoscopic, triscopic or other multi-camera or other configuration in which a number of cameras or optical paths are maintained in fixed relation to one another to obtain two-dimensional images of an object from a number of different perspectives. The camera 102 may include suitable processing for deriving a three-dimensional point cloud from an image set or a number of image sets, or each two-dimensional image set may be transmitted to an external processor such as contained in the computer 108 described below. In other embodiments, the camera 102 may employ structured light, laser scanning, direct ranging, or any other technology suitable for acquiring three-dimensional data, or two-dimensional data that can be resolved into three-dimensional data).
Regarding claim 3, Rohaly, in view of Rosenbaum teaches the apparatus as defined in claim 1, Rohaly discloses wherein the processor circuitry is to execute the instructions to generate the 3D model independently of determining the 3D position of the object (Fig. 6; Paragraph [0047]: A camera model 610 for the camera may be used to relate the disparity field 606 to the three-dimensional reconstruction 612 of the surface of the object 610 as measured from a camera pose. While a center channel image may conveniently be used as the reference for the camera pose of the resulting three-dimensional reconstruction 612, this is not required and may not, in certain systems be available as a reference in any event. The three-dimensional reconstruction 612 can be stitched to other such three-dimensional measurements using camera path information or the like to obtain a three-dimensional model 620 of the object 601).
Regarding claim 10, Rohaly, in view of Rosenbaum teaches the apparatus as defined in claim 1, Rohaly discloses wherein the first image is captured at a same time as the second image (Fig. 5; Paragraph [0039]: it will be understood that, while FIG. 2 depicts one embodiment of an optical system 200, numerous variations are possible. One salient feature of the optical system related to the discussion below is the use of a center optical channel that captures conventional video or still images at one of the sensors 214b concurrent with various offset data (at, e.g., 214a and 214c) used to capture three-dimensional measurements).
Regarding claim 11, Rohaly, in view of Rosenbaum teaches the apparatus as defined in claim 1, Rohaly discloses wherein the processor circuitry is to execute the instructions to automatically modify the 3D model in response to the difference satisfying a confidence threshold and to disregard the difference when the difference does not satisfy the confidence threshold (Paragraph [0056]: a new motion-based reconstruction for some or all of the scan data may be performed using the refined three-dimensional measurements in place of the initial three-dimensional measurements to recover a camera path used to relate the individual measurements to a global coordinate system. In another aspect, this process may be repeated to obtain iterative refinements in the three-dimensional model, e.g., for a predetermined number of iterations, or until a predetermined error threshold is reached, or until no further refinement is obtained from a previous iteration, and so forth, as well as various combinations of these. Iterations may be performed locally (e.g., on specific regions where errors are large) or globally (e.g., for every overlapping region between camera positions), or some combinations of these).
Regarding claim 12, Rohaly, in view of Rosenbaum teaches the apparatus as defined in claim 1, Rohaly discloses wherein the processor circuitry is to execute the instructions to determine the 3D position of the object based on a first calibration matrix for the first camera and a second calibration matrix for the second camera (Paragraph [0067]: may be a set of candidate camera poses, each including a rotation and a translation (or position) referenced to a world coordinate system. There may also be a set of measured frame-to-frame camera motions, each including a rotation and a translation between poses. A measured camera motion may be referenced in the coordinate system of one camera pose. An example set of three key frames may be considered with an origin "O" and three other points "A", "B", and "C", each of the points having a position in a three-dimensional space. In addition to the position of these points, a camera at each of these points may have a different orientation. Therefore, between each of these points is a translation, meaning a change in position, and a rotation, meaning a change in orientation. The translation and rotation values comprise the motion parameters. The relationship between a point, X, expressed in the world coordinate system as XO and the same point expressed in the A coordinate system, XA may be expressed…ROA is the rotation taking points from the world to the A coordinate system. TOA is the translation of the world coordinate system to the A coordinate system. It should be understood that symbols X and T may represent a vector, rather than a scalar, e.g. where X includes x, y, and z coordinate values. Further, it should be understood that symbol R may represent a matrix).
Regarding claim 13, Rohaly, in view of Rosenbaum teaches the apparatus as defined in claim 1, Rohaly discloses wherein the first image is a first frame in a video stream captured by the first camera, and the processor circuitry is to execute the instructions to determine the 3D position of the object within the first image based on a previous image captured by the first camera before the first image, the previous image corresponding to a different frame in the video stream (Paragraph [0032]: video-based scanning system, real time more specifically refers to processing within the time between frames of video data, which may vary according to specific video technologies between about fifteen frames per second and about thirty frames per second. More generally, processing capabilities of the computer 108 may vary according to the size of the object 104, the speed of image acquisition, and the desired spatial resolution of three-dimensional points; Paragraph [0042]: high-accuracy processing controller 324 may provide images or frames to the high-accuracy processing pipeline 350. Separate image sets may have two-dimensional image registration performed by a two-dimensional image registration module 352. Based on the results of the two-dimensional image registration a three-dimensional point cloud or other three-dimensional representation may be generated by a three-dimensional point cloud generation module 354. The three-dimensional point clouds from individual image sets may be connected using a three-dimensional stitching module 356. Global motion optimization, also referred to herein as global path optimization or global camera path optimization, may be performed by a global motion optimization module 357 in order to reduce errors in the resulting three-dimensional model 358. In general, the path of the camera as it obtains the image frames may be calculated as a part of the three-dimensional reconstruction process. 
In a post-processing refinement procedure, the calculation of camera path may be optimized--that is, the accumulation of errors along the length of the camera path may be minimized by supplemental frame-to-frame motion estimation with some or all of the global path information. Based on global information such as individual frames of data in the image store 322, the high-speed three-dimensional model 340, and intermediate results in the high-accuracy processing pipeline 350, the high-accuracy model 370 may be processed to reduce errors in the camera path and resulting artifacts in the reconstructed model. As a further refinement, a mesh may be projected onto the high-speed model by a mesh projection module 360).
Regarding claim 14, Rohaly discloses a system comprising: a first camera to capture a first image of an environment, the first camera in a first location relative to the environment (Fig. 5; Paragraph [0046]: FIG. 5 illustrates a sequence of images captured from a moving camera. In the sequence 500, a camera 502, which may include, for example, any of the cameras 102 described above, may capture an image of an object 504 from a number of different positions 506a-506e along a camera path 507. While the camera path 507 is depicted as a continuous curvilinear path which represents the physical path of a camera, it will be understood that analytically the camera path 507 may be represented by discrete, straight line transformations along with associated rotations in three-dimensional space); a second camera to capture a second image of the environment, the second camera in a second location different than the first location (Fig. 5; Paragraph [0046]: FIG. 5 illustrates a sequence of images captured from a moving camera. In the sequence 500, a camera 502, which may include, for example, any of the cameras 102 described above, may capture an image of an object 504 from a number of different positions 506a-506e along a camera path 507. While the camera path 507 is depicted as a continuous curvilinear path which represents the physical path of a camera, it will be understood that analytically the camera path 507 may be represented by discrete, straight line transformations along with associated rotations in three-dimensional space); at least one processor (Paragraph [0036]: system 100 may include a computer-usable or computer-readable medium. The computer-usable medium 118 may include one or more memory chips (or other chips, such as a processor, that include memory), optical disks, magnetic disks or other magnetic media, and so forth. 
The computer-usable medium 118 may in various embodiments include removable memory (such as a USB device, tape drive, external hard drive, and so forth), remote storage (such as network attached storage), volatile or non-volatile computer memory, and so forth. The computer-usable medium 118 may contain computer-readable instructions for execution by the computer 108 to perform the various processes described herein. The computer-usable medium 118 may also, or instead, store data received from the camera 102, store a three-dimensional model of the object 104, store computer code for rendering and display, and so forth); and memory including instructions that, when executed by the at least one processor, cause the at least one processor (Paragraph [0036]: system 100 may include a computer-usable or computer-readable medium. The computer-usable medium 118 may include one or more memory chips (or other chips, such as a processor, that include memory), optical disks, magnetic disks or other magnetic media, and so forth. The computer-usable medium 118 may in various embodiments include removable memory (such as a USB device, tape drive, external hard drive, and so forth), remote storage (such as network attached storage), volatile or non-volatile computer memory, and so forth. The computer-usable medium 118 may contain computer-readable instructions for execution by the computer 108 to perform the various processes described herein. The computer-usable medium 118 may also, or instead, store data received from the camera 102, store a three-dimensional model of the object 104, store computer code for rendering and display, and so forth) to: determine a three-dimensional (3D) position of an object detected within the first image and within the second image (Fig. 5; Fig. 7; Paragraph [0046]: FIG. 5 illustrates a sequence of images captured from a moving camera. 
In the sequence 500, a camera 502, which may include, for example, any of the cameras 102 described above, may capture an image of an object 504 from a number of different positions 506a-506e along a camera path 507. While the camera path 507 is depicted as a continuous curvilinear path which represents the physical path of a camera, it will be understood that analytically the camera path 507 may be represented by discrete, straight line transformations along with associated rotations in three-dimensional space; Paragraph [0051]: As shown in step 712, three-dimensional measurements may be obtained from the image data. In general, this may include processing image sets or the like to obtain disparity data across a processing mesh of the camera, and further processing the disparity data to obtain a three-dimensional surface reconstruction. In one embodiment, the disparity data encodes depth information, and may be employed to recover a three-dimensional measurement using a camera model or the like to relate disparity data to depth information for each pixel of the processing mesh. This step 712 may be repeated for each individual measurement (e.g., image set) obtained by the camera. As a result, a three-dimensional measurement or reconstruction may be obtained for each camera pose along a camera path); generate a 3D model of the environment, including the object, based on the first image and the second image (Fig. 7; Paragraphs [0051]-[0052]: shown in step 712, three-dimensional measurements may be obtained from the image data. In general, this may include processing image sets or the like to obtain disparity data across a processing mesh of the camera…shown in step 714, a three-dimensional model may be constructed from the individual three-dimensional measurements obtained in step 712. 
Where the three-dimensional measurements of the surface of the object overlap, these three-dimensional measurements may be registered to one another using any of a variety of known techniques. As a result, the camera path from pose to pose may be recovered, and the three-dimensional measurements from each pose may be combined into a full three-dimensional model of scanned regions of the surface of the object); detect a difference between the 3D position of the object and the 3D model; and automatically modify the 3D model based on the difference (Fig. 7; Paragraph [0048]: one of the two-dimensional measurements, such as the first measurement 602, may be projected onto the three-dimensional model using available spatial information (e.g., the camera position and orientation). The resulting projection may then be backprojected to the second camera pose using warping or other deformation techniques to obtain an expected measurement at the second camera position. In the case of a side channel two-dimensional image or the like, the expected measurement may be a corresponding image expected in the center channel or another side channel. By adapting the three-dimensional measurement from this image pair to reduce or minimize an error between the actual and expected measurements in an overlapping area of the object, the three-dimensional measurement may be refined for that camera position to more accurately represent a surface of the object; Paragraph [0054]: the projected measurement may be backprojected from the three-dimensional model to another channel of the camera, which may be the center channel or another side channel of the camera described above. The projected result from step 716 may be backprojected using any suitable techniques to obtain a synthetic view of the measurement from one camera channel as it should appear from the other camera channel, based upon the spatial relationship between the projected result, the three-dimensional model, and the position and rotation of the other channel. 
It will be appreciated that if there were no errors in the initial measurement, this synthetic view would exactly correspond to the actual two-dimensional image obtained from the other channel. However, in a high-speed processing pipeline such as that described above, an initial three-dimensional model may fail to accurately capture surface details for any number of reasons (lower resolution processing, absence of global surface data such as the completed three-dimensional model, etc.). Thus it is expected that in a practical system there may be variations between a synthesized view (based on observations from a different position) and an actual view. Backprojection may be accomplished, for example, by warping or otherwise deforming the projected result based upon the three-dimensional model and camera pose information for respective measurements. By processing these synthesized image sets to obtain disparity data, and further backprojecting the synthesized disparity data through the camera model, a backprojected result may be obtained that represents a synthesized or expected version of the three-dimensional measurement from the second camera position).
	Rohaly does not explicitly disclose the 3D position defined relative to a real-world 3D space corresponding to the environment, the object to be within the environment.
	However, Rosenbaum teaches generating a 3D model from multiple images and comparison between model and actual positions (Abstract; Paragraph [0049]), further comprising the 3D position defined relative to a real-world 3D space corresponding to the environment, the object to be within the environment (Fig. 5; Paragraph [0040]: Any suitable approach can be used for scanning the physical environmental in order to generate scanned environmental features for one or more 3D virtual objects. In some approaches, the user manipulates or physically positions one or more user devices, such as user device 102a, in order to allow environmental scanner 212 to capture different perspectives of the environment. For example, the user may adjust the angle, rotation, or orientation of a user device with respect to the environment as a whole and/or with respect to a region or corresponding real world object the user wishes to scan; Paragraph [0055]: user device 102a may report its location (e.g., GPS coordinates) to server 108 (e.g., via application 110), and a set of one or more of reference objects 232 and/or object attributes 234 may be downloaded to the user device based on the location (and/or other contextual parameters)). Rosenbaum teaches that this will provide a more accurate surface reconstruction or 3D model (Paragraph [0018]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rohaly with the features of the 3D position defined relative to a real-world 3D space corresponding to the environment, the object to be within the environment as taught by Rosenbaum so as to provide a more accurate surface reconstruction or 3D model as presented by Rosenbaum.
Regarding claim 15, Rohaly, in view of Rosenbaum teaches the system as defined in claim 14, Rohaly discloses wherein the instructions cause the at least one processor to generate the 3D model independently of the at least one processor determining the 3D position of the object (Fig. 6; Paragraph [0047]: A camera model 610 for the camera may be used to relate the disparity field 606 to the three-dimensional reconstruction 612 of the surface of the object 610 as measured from a camera pose. While a center channel image may conveniently be used as the reference for the camera pose of the resulting three-dimensional reconstruction 612, this is not required and may not, in certain systems be available as a reference in any event. The three-dimensional reconstruction 612 can be stitched to other such three-dimensional measurements using camera path information or the like to obtain a three-dimensional model 620 of the object 601).
Regarding claim 19, the limitations of this claim substantially correspond to the limitations of claim 1 (except for the non-transitory computer readable medium, which is disclosed by Rohaly, Paragraph [0036]: system 100 may include a computer-usable or computer-readable medium. The computer-usable medium 118 may include one or more memory chips (or other chips, such as a processor, that include memory), optical disks, magnetic disks or other magnetic media, and so forth. The computer-usable medium 118 may in various embodiments include removable memory (such as a USB device, tape drive, external hard drive, and so forth), remote storage (such as network attached storage), volatile or non-volatile computer memory, and so forth. The computer-usable medium 118 may contain computer-readable instructions for execution by the computer 108 to perform the various processes described herein. The computer-usable medium 118 may also, or instead, store data received from the camera 102, store a three-dimensional model of the object 104, store computer code for rendering and display, and so forth); thus they are rejected on similar grounds.
Regarding claim 20, the limitations of this claim substantially correspond to the limitations of claim 3; thus they are rejected on similar grounds.
Regarding claim 23, the limitations of this claim substantially correspond to the limitations of claim 1; thus they are rejected on similar grounds.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Rohaly, in view of Rosenbaum, and further in view of Marin et al. (US Pub. 2018/0047208), hereinafter Marin.
Regarding claim 4, Rohaly, in view of Rosenbaum teaches the apparatus as defined in claim 1.
	Rohaly, in view of Rosenbaum does not explicitly disclose wherein the object in the first image of the environment is detected using a deep learning object detection model. 
	However, Marin teaches generating a 3D model from multiple images and comparison between model and actual positions (Abstract; Paragraph [0152]), further including wherein the object in the first image of the environment is detected using a deep learning object detection model (Paragraphs [0156]-[0159]: defect detection may be implemented using a convolutional neural network (CNN). The CNN may extract feature vectors from the 3D models of the defective and defect-free objects. The resulting feature vectors may be used to train a machine learning algorithm to classify the various types of defects observed in the objects). Marin teaches that this will assist in identifying discrepancies and thus creating more accurate models (Paragraph [0108]; Paragraphs [0156]-[0159]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rohaly, in view of Rosenbaum with the features of wherein the object in the first image of the .

Claims 5-8, 16, 17, 21, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Rohaly, in view of Rosenbaum, and further in view of Hyllus et al. (US Pub. 2018/0218507), hereinafter Hyllus.
Regarding claim 5, Rohaly, in view of Rosenbaum teaches the apparatus as defined in claim 1.
	Rohaly, in view of Rosenbaum does not explicitly disclose wherein, in response to the difference indicating a portion of the object is missing from the 3D model, the processor circuitry is to execute the instructions to modify the 3D model by adding the missing portion. 
	However, Hyllus teaches 3D model from multiple images (Abstract), further comprising wherein, in response to the difference indicating a portion of the object is missing from the 3D model, the 3D model generator is to modify the 3D model by adding the missing portion (Fig. 1; Fig. 2; Paragraph [0005]: method for generating 3D body models from scanned data is described in [2]. A plurality of points clouds obtained from a scanner are aligned and a set of 3D data points obtained by the initial alignment are brought into precise registration with a mean body surface derived from the point clouds. Then an existing mesh-type body model template is fit to the set of 3D data points. The template model can be used to fill in missing detail where the geometry is hard to reconstruct
Regarding claim 6, Rohaly, in view of Rosenbaum, and further in view of Hyllus teaches the apparatus as defined in claim 5, Hyllus discloses wherein the missing portion corresponds to an entirety of the object (Fig. 1; Fig. 2; Paragraph [0005]: plurality of points clouds obtained from a scanner are aligned and a set of 3D data points obtained by the initial alignment are brought into precise registration with a mean body surface derived from the point clouds. Then an existing mesh-type body model template is fit to the set of 3D data points. The template model can be used to fill in missing detail where the geometry is hard to reconstruct).
Regarding claim 7, Rohaly, in view of Rosenbaum, and further in view of Hyllus teaches the apparatus as defined in claim 5, wherein the processor circuitry is to execute the instructions to add the missing portion by: generating a point cloud representation of the missing portion based on a model of the object (Hyllus: Fig. 1; Fig. 2; Paragraph [0005]: plurality of points clouds obtained from a scanner are aligned and a set of 3D data points obtained by the initial alignment are brought into precise registration with a mean body surface derived from the point clouds. Then an existing mesh-type body model template is fit to the set of 3D data points. The template model can be used to fill in missing detail where the geometry is hard to reconstruct); positioning the point cloud representation of the missing portion within the 3D model at a location corresponding to the 3D position of the object (Hyllus: Paragraphs [0027]-[0028]: coarsely aligning the dummy mesh model with the point cloud further comprises determining a prominent spot in the point cloud and adapting an orientation of the dummy mesh model relative to the point cloud based on the position of the prominent spot. The prominent spot may be determined automatically of specified by a user input and constitutes an efficient solution for adapting the orientation of the dummy mesh model. One example of a suitable prominent spot is the top point of the ear on the helix, i.e. the outer rim of the ear…coarsely aligning the dummy mesh model with the point cloud further comprises determining a characteristic line in the point cloud and adapting at least one of a scale of the dummy mesh model and a position of the dummy mesh model relative to the point cloud based on the characteristic line. For example, the characteristic line in the point cloud is determined by detecting edges in the point cloud. For this purpose a depth map associated with the point cloud may be used. Characteristic lines, e.g. edges, are relatively easy to detect in the point cloud data. As such, they are well suited for adjusting the scale and the position of the dummy mesh model relative to the point cloud data); and coloring the point cloud representation of the missing portion based on pixel color information from at least one of the first image or the second image (Rosenbaum: Fig. 3A-4B; Paragraph [0048]: Reference object identifier 216 may also determine or identify one or more of object attributes 234 based on the scanned environmental features generated by scan translator 214. Object attributes 234 can include a library, collection, or catalogue of textures, colors, sounds, movements, animations, decals, 3D riggings (animation rigging), and the like. In some cases, scan augmenter 206 extracts one or more of the object attributes 234 from one or more of reference objects 232 or other 3D virtual objects and incorporates them into the collection).
Regarding claim 8, Rohaly, in view of Rosenbaum, and further in view of Hyllus teaches the apparatus as defined in claim 5, wherein the object corresponds to a person, the processor circuitry is to execute the instructions to determine the 3D position of the object by determining positions of body parts of the person based on a deep learning skeleton detection model, and the missing portion of the object corresponds to a missing body part of the person (Rohaly: Paragraph [0007]: method may include using the refined camera path and the refined measurement to refine the three-dimensional model. The three-dimensional model may be a point cloud or a polygonal mesh. The object may be human dentition; Hyllus: Paragraph [0029]: fitting the dummy mesh model of the object to the point cloud through an elastic transformation of the coarsely aligned dummy mesh model comprises determining a border line of the object in the point cloud and attracting vertices of the dummy mesh model that are located outside of the object as defined by the border line towards the border line. Preferably, in order to reduce the computational burden, a 2D projection of the point cloud and the border line is used for determining if a vertex of the dummy mesh model is located outside of the object. A border line is relatively easy to detect in the point cloud data. However, the user may be asked to specify additional constraints, or such additional constraints may be determined using machine-learning techniques and a database; Paragraph [0054]: the HRTF has to be computed individually before creating a personalized binaural system. In HRTF computation, the ear shape is the most important part of the human body and the 3D model of the ear should be of better quality than the one for the head and the shoulder).
Regarding claim 16, Rohaly, in view of Rosenbaum teaches the system as defined in claim 14.
	Rohaly, in view of Rosenbaum does not explicitly disclose wherein the instructions further cause the at least one processor to, in response to the difference indicating a portion of the object is missing from the 3D model, modify the 3D model by adding the missing portion. 
	However, Hyllus teaches 3D model from multiple images (Abstract), further comprising wherein the instructions further cause the at least one processor to, in response to the difference indicating a portion of the object is missing from the 3D model, modify the 3D model by adding the missing portion (Fig. 1; Fig. 2; Paragraph [0005]: method for generating 3D body models from scanned data is described in [2]. A plurality of points clouds obtained from a scanner are aligned and a set of 3D data points obtained by the initial alignment are brought into precise registration with a mean body surface derived from the point clouds. Then an existing mesh-type body model template is fit to the set of 3D data points. The template model can be used to fill in missing detail where the geometry is hard to reconstruct). Hyllus teaches that this will allow for filling in missing detail where the geometry is hard to reconstruct (Paragraph [0005]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rohaly, in view of Rosenbaum with the features wherein the instructions further cause the at least one processor to, in response to the difference indicating a portion of the object is missing from the 3D model, 
Regarding claim 17, Rohaly, in view of Rosenbaum, and further in view of Hyllus teaches the system as defined in claim 16, wherein the instructions cause the at least one processor to add the missing portion by: generating a point cloud representation of the missing portion based on a model of the object (Hyllus: Fig. 1; Fig. 2; Paragraph [0005]: plurality of points clouds obtained from a scanner are aligned and a set of 3D data points obtained by the initial alignment are brought into precise registration with a mean body surface derived from the point clouds. Then an existing mesh-type body model template is fit to the set of 3D data points. The template model can be used to fill in missing detail where the geometry is hard to reconstruct); positioning the point cloud representation of the missing portion within the 3D model at a location corresponding to the 3D position of the object (Hyllus: Paragraphs [0027]-[0028]: coarsely aligning the dummy mesh model with the point cloud further comprises determining a prominent spot in the point cloud and adapting an orientation of the dummy mesh model relative to the point cloud based on the position of the prominent spot. The prominent spot may be determined automatically of specified by a user input and constitutes an efficient solution for adapting the orientation of the dummy mesh model. One example of a suitable prominent spot is the top point of the ear on the helix, i.e. the outer rim of the ear…coarsely aligning the dummy mesh model with the point cloud further comprises determining a characteristic line in the point cloud and adapting at least one of a scale of the dummy mesh model and a position of the dummy mesh model relative to the point cloud based on the characteristic line. For example, the characteristic line in the point cloud is determined by detecting edges in the point cloud. For this purpose a depth map associated with the point cloud may be used. Characteristic lines, e.g. edges, are relatively easy to detect in the point cloud data. As such, they are well suited for adjusting the scale and the position of the dummy mesh model relative to the point cloud data); and coloring the point cloud representation of the Rosenbaum: Fig. 3A-4B; Paragraph [0048]: Reference object identifier 216 may also determine or identify one or more of object attributes 234 based on the scanned environmental features generated by scan translator 214. Object attributes 234 can include a library, collection, or catalogue of textures, colors, sounds, movements, animations, decals, 3D riggings (animation rigging), and the like. In some cases, scan augmenter 206 extracts one or more of the object attributes 234 from one or more of reference objects 232 or other 3D virtual objects and incorporates them into the collection).
Regarding claim 21, the limitations of this claim substantially correspond to the limitations of claim 5; thus they are rejected on similar grounds.
Regarding claim 24, the limitations of this claim substantially correspond to the limitations of claim 5; thus they are rejected on similar grounds.

Claims 9, 18, 22, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Rohaly, in view of Rosenbaum, and further in view of Peterson (US Pub. 2016/0065912).
Regarding claim 9, Rohaly, in view of Rosenbaum teaches the apparatus as defined in claim 1.
	Rohaly, in view of Rosenbaum does not explicitly disclose wherein, in response to the determination that a region of the 3D model corresponding to the position of the object includes a duplicate portion of the object, the processor circuitry is to execute the instructions to modify the 3D model by removing the duplicate portion. 
	However, Peterson teaches generating a 3D model from multiple images and comparison between model and actual positions (Paragraph [0066]), further comprising wherein, in response to the determination that a region of the 3D model corresponding to the position of the object includes a duplicate portion of the object, the processor circuitry is to  the post-processing module 174 can modify an image to remove duplicate (or other excess) pixel data…the post-processing module 174 can determine, based upon the comparison of the image 250 with the model 190, that portions of the object 100 represented by some of the imaged lines 252a through 252f are duplicates of portions of the object 100 that are represented by others of the imaged lines 252a through 252f. The post-processing module 174 can then adjust the image 250 appropriately, such that duplicate image data can be reduced or removed and the appropriate, re-scaled aspect ratio (or other aspect) is obtained. For example, using one or more of various techniques, the post-processing module 174 can delete duplicate pixel data (or merge duplicative portion of the imaged lines 252a through 252f together), such that an updated image 254 (see FIG. 9B) exhibits a length 256 and a width 258 (and a corresponding aspect ratio) that appropriately correlate to the model). Peterson teaches that this will allow for removal of excess, unwanted data to appropriately correlate to the model (Paragraphs [0086]-[0087]). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rohaly, in view of Rosenbaum with the features of wherein, in response to the determination that a region of the 3D model corresponding to the position of the object includes a duplicate portion of the object, the processor circuitry is to execute the instructions to modify the 3D model by removing the duplicate portion as taught by Peterson so as to appropriately correlate to the model as presented by Peterson.
Regarding claim 18, Rohaly, in view of Rosenbaum teaches the system as defined in claim 16.
	Rohaly, in view of Rosenbaum does not explicitly disclose wherein the instructions further cause the at least one processor to, in response to determining that a region of the 3D model corresponding to the position of the object includes a duplicate portion of the object, modify the 3D model by removing the duplicate portion. 
 the post-processing module 174 can modify an image to remove duplicate (or other excess) pixel data…the post-processing module 174 can determine, based upon the comparison of the image 250 with the model 190, that portions of the object 100 represented by some of the imaged lines 252a through 252f are duplicates of portions of the object 100 that are represented by others of the imaged lines 252a through 252f. The post-processing module 174 can then adjust the image 250 appropriately, such that duplicate image data can be reduced or removed and the appropriate, re-scaled aspect ratio (or other aspect) is obtained. For example, using one or more of various techniques, the post-processing module 174 can delete duplicate pixel data (or merge duplicative portion of the imaged lines 252a through 252f together), such that an updated image 254 (see FIG. 9B) exhibits a length 256 and a width 258 (and a corresponding aspect ratio) that appropriately correlate to the model). Peterson teaches that this will allow for removal of excess, unwanted data to appropriately correlate to the model (Paragraphs [0086]-[0087]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rohaly, in view of Rosenbaum with the features of wherein the instructions further cause the at least one processor to, in response to determining that a region of the 3D model corresponding to the position of the object includes a duplicate portion of the object, modify the 3D model by removing the duplicate portion as taught by Peterson so as to appropriately correlate to the model as presented by Peterson.
Regarding claim 22, the limitations of this claim substantially correspond to the limitations of claim 9; thus they are rejected on similar grounds.
Regarding claim 25, the limitations of this claim substantially correspond to the limitations of claim 9; thus they are rejected on similar grounds.

Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over Rohaly, in view of Rosenbaum, and further in view of Dal Mutto et al. (US Pub. 2019/0096135), hereinafter Dal Mutto.
Regarding claim 26, Rohaly, in view of Rosenbaum teaches the apparatus as defined in claim 1.
	Rohaly, in view of Rosenbaum does not explicitly disclose wherein the object is a foreground object, the environment includes background objects distinct from the foreground object, and the 3D model includes the background objects.
	However, Dal Mutto teaches generating a 3D model from multiple images and comparison between model and actual positions (Abstract; Paragraphs [0018]-[0020]), further comprising wherein the object is a foreground object, the environment includes background objects distinct from the foreground object, and the 3D model includes the background objects (Fig. 3; Paragraph [0108]: portions of the scene that are closer to the depth camera are shown in yellow and portions of the scene that are farther away are shown in blue. Accordingly, the boot and the table are shown generally in yellow, while the background, including a person standing in the background, are shown in shades of blue. The object of interest can be separated from the background by removing pixels that have a depth greater than a threshold (e.g., removing the blue pixels in the images shown in the bottom row of FIG. 3) and by also removing the planar surface at the bottom of the remaining model). Dal Mutto teaches that this will allow the user to visualize information in the correct three-dimensional location, aligned with the object in the real world (Paragraph [0238]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Rohaly, in view of Rosenbaum with the features of wherein the object is a foreground object, the .

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MATTHEW D SALVUCCI whose telephone number is (571)270-5748. The examiner can normally be reached M-F: 7:30-4:00PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, XIAO WU, can be reached on (571) 272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

/MATTHEW SALVUCCI/Primary Examiner, Art Unit 2613