DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

The amendment filed on December 14, 2020 has been entered. Claims 1-18 are now pending in the application. Applicant's amendments have addressed all informalities as previously set forth in the non-final action mailed on September 17, 2020.
Response to Arguments
Applicant’s arguments, see pages 8-9, filed December 14, 2020, with respect to the rejections of previous claims 1-18 under 35 USC § 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground of rejection is made in view of Choi (US 2017/0169313 A1).

In regards to independent claim 1, the Kutliroff reference was previously cited as it discloses techniques for 3D analysis of a scene including detection, segmentation and registration of objects within the scene (see abstract).
In regards to the amended limitation “determining whether the location overlaps another marker; for the determination indicative of the location not overlapping another marker: classifying the object from the 2D object detection”, The Choi reference is now 
Choi further discloses, with reference to the amended limitation, a scenario in which overlap between markers is detected: "when the bounding boxes determined to be used overlap each other, the overlapped bounding boxes can be merged and can be considered to be a bounding box surrounding a single object. Without assigning particular classifiers to individual objects in the merged bounding box, the classes of the individual objects can be classified using the result of pixel labeling including the pixel probability distribution information of the image on the basis of the reinforced feature map". The bounding boxes containing the pixel labels are interpreted as the markers. Choi further details method steps for the object detection. In step 435, "when the bounding box is generated, the image processing apparatus 100 can generate a confidence score. In an example, the generated bounding box may have a rectangular shape surrounding the periphery of an object". In step 437, when the bounding box is determined, the image processing apparatus 100 can distinguish the classes of the individual objects by objects surrounded with the bounding box and detect the objects from the input image. This is interpreted as the routine processing of classifying the object via 2D detection when the bounding boxes having the pixel labels do not overlap, since paragraph [0101] addresses the case in which they do overlap (see paragraphs [0098]-[0102]), as further detailed in the rejections of the Office action below.
In regards to independent claims 7 and 13, these claims recite limitations similar in scope to that of claim 1, and therefore remain rejected under the same rationale as provided above and further detailed in the rejections of the office action below.
In regards to dependent claims 2-6, 8-12, and 14-18, these claims depend from rejected base claims 1, 7, and 13, and therefore they remain rejected under the same rationale as provided above and further detailed in the rejections of the office action below.
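For illustration only, the overlap scenario of Choi paragraph [0101], as characterized above, can be sketched as follows. This is a minimal sketch and not Choi's implementation: it assumes axis-aligned rectangular boxes given as (x_min, y_min, x_max, y_max), it treats boxes that merely touch at an edge as non-overlapping, and the names Box, boxes_overlap, merge_boxes, and merge_overlapping are hypothetical and do not appear in either reference.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned 2D bounding box (hypothetical representation)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True when the two boxes share a region of non-zero area."""
    return (a.x_min < b.x_max and b.x_min < a.x_max
            and a.y_min < b.y_max and b.y_min < a.y_max)


def merge_boxes(a: Box, b: Box) -> Box:
    """Return the smallest box that surrounds both input boxes."""
    return Box(min(a.x_min, b.x_min), min(a.y_min, b.y_min),
               max(a.x_max, b.x_max), max(a.y_max, b.y_max))


def merge_overlapping(boxes: List[Box]) -> List[Box]:
    """Merge every group of overlapping boxes into one surrounding box.

    Boxes that overlap nothing are passed through unchanged, which is the
    case in which each box is handled individually.
    """
    merged: List[Box] = []
    for box in boxes:
        current = box
        changed = True
        while changed:
            changed = False
            for other in merged:
                if boxes_overlap(current, other):
                    # Absorb the overlapping box and re-check the grown box.
                    merged.remove(other)
                    current = merge_boxes(current, other)
                    changed = True
                    break
        merged.append(current)
    return merged
```

In this sketch, two boxes that overlap are replaced by a single surrounding box, while a box that overlaps no other box is left as is.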

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-3, 5-9, 11-15, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Kutliroff (US 2017/0243352 A1, hereinafter referenced “Kutliroff”) in view of Choi (US 2017/0169313 A1, hereinafter referenced “Choi”).

In regards to claim 1 (Currently Amended), Kutliroff discloses a method (Kutliroff, Abstract), comprising: 
-conducting raycasting on a plurality of images to generate a point cloud (Kutliroff, paragraphs [0033] and [0042]; Reference at [0033] discloses the camera pose calculation circuit 302 determines the 3D position of the camera at each frame (i.e. multiple images), in a global coordinate system. Consequently, 3D points extracted from the associated depth maps can also be transformed or projected to this coordinate system (i.e. raycasting or projection of 3D points or a point cloud)); 
-executing two dimensional (2D) object detection on the plurality of images (Kutliroff, Fig. 6 , paragraphs [0036] and [0061]; Reference at paragraph [0036] discloses that FIG. 6 illustrates an example of detected objects in a 3D image, in accordance with certain of the embodiments disclosed herein. The detected and recognized objects in an RGB image of the scene, associated with one camera pose, are shown including for example the lamp 610. Paragraph [0061] describes referring back to FIG. 6 examples are illustrated of 2D bounding boxes applied to recognized objects in an RGB image of the scene associated with one camera pose. The use of the 2D bounding box interpreted as the 2D object detection in the images); 
-for the 2D object detection recognizing an object: determining a location of the object in three dimensional (3D) space from the point cloud (Kutliroff, paragraphs [0035] and [0061]; Reference at paragraph [0035] discloses the object detection/recognition circuit 504 may be configured to process the RGB image, and in some embodiments the associated depth map as well, along with the 3D reconstruction, to generate a list of any objects of interest recognized in the image. The object location circuit 508 may be configured to determine an associated location of each object in the scene. Paragraph [0061] further details that the location is a 3D location of the center of the 2D bounding box computed for the object contained. Reference discusses generating rays from the camera to the location of projected points in 3D space (i.e. point cloud; see paragraph [0063] with respect to the object detected within the bounding box));


-and placing a marker in the 3D space to represent the object based on the classifying (Kutliroff, paragraph [0061]; Reference at paragraph [0061] discloses the object detection circuit 134 may be configured to process the RGB image, and in some embodiments the associated depth map as well, to generate a list of any objects of interest recognized in the image. A label may be attached to each of the recognized objects and a 2D bounding box is generated which contains the object. The label interpreted as the marker placed representing the object based on classifying).  
Kutliroff does not explicitly disclose but Choi teaches
-determining whether the location overlaps another marker (Choi, paragraph [0101]; Reference discloses a scenario in which overlap between markers is detected, as it discloses that when the bounding boxes determined to be used overlap each other, the overlapped bounding boxes can be merged and can be considered to be a bounding box surrounding a single object. Without assigning particular classifiers to individual objects in the merged bounding box, the classes of the individual objects can be classified using the result of pixel labeling including the pixel probability distribution information of the image on the basis of the reinforced feature map. The bounding boxes containing the pixel labels are interpreted as the markers);
-for the determination indicative of the location not overlapping another marker: classifying the object from the 2D object detection (Choi, paragraphs [0099] and [0102]; Reference at paragraph [0099] discloses step 435, in which, when the bounding box is generated, the image processing apparatus 100 can generate a confidence score. In an example, the generated bounding box may have a rectangular shape surrounding the periphery of an object. Paragraph [0102] discloses that in 437, when the bounding box is determined, the image processing apparatus 100 can distinguish the classes of the individual objects by objects surrounded with the bounding box and detect the objects from the input image (interpreted as the routine processing of classifying the object via 2D detection when the bounding boxes having the pixel labels do not overlap, as paragraph [0101] discusses the case in which they do overlap)).
Kutliroff and Choi are combinable because they are in the same field of endeavor regarding use of object detection features. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for the 3D scene analysis features of Kutliroff to include the deep learning features of Choi in order to provide the user with a system that allows for use of detection, segmentation, and registration steps for objects within the scene, where tools such as bounding boxes and labels can be inserted into the scene as taught by Kutliroff, while incorporating the deep learning features of Choi to allow for use of object detection and classification tools such as feature maps for objects identified in captured images for merging object analysis data in a final stage, thus increasing confidence of the object detection results, applicable to improving the object detection and reconstruction methods as taught in Kutliroff.
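For illustration only, the mapping relied upon above (3D points from a depth map transformed into a global coordinate system using the camera pose, with the object location taken at the center of the 2D bounding box) can be sketched as follows. This is a minimal sketch under stated assumptions and not Kutliroff's implementation: it assumes a pinhole camera model with intrinsics (fx, fy, cx, cy), a camera-to-world pose given as a 3x3 rotation matrix and a translation vector, and a depth map indexed as depth_map[row][column]. The function names backproject_pixel, camera_to_world, and bbox_center_location are hypothetical.

```python
def backproject_pixel(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with a known depth into camera coordinates.

    Assumes a pinhole model; fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def camera_to_world(point, rotation, translation):
    """Apply a camera-to-world pose: world = rotation * point + translation."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def bbox_center_location(bbox, depth_map, intrinsics, rotation, translation):
    """Return the global 3D location of the center pixel of a 2D bounding box.

    bbox is (x_min, y_min, x_max, y_max) in integer pixel coordinates.
    """
    x_min, y_min, x_max, y_max = bbox
    u = (x_min + x_max) // 2  # center column
    v = (y_min + y_max) // 2  # center row
    depth = depth_map[v][u]
    fx, fy, cx, cy = intrinsics
    point_camera = backproject_pixel(u, v, depth, fx, fy, cx, cy)
    return camera_to_world(point_camera, rotation, translation)
```

A marker for a recognized object could then be placed at the returned coordinates; whether either reference computes the location in exactly this way is not asserted here.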

In regards to claim 2 (Original). Kutliroff in view of Choi teach the method of claim 1.
Kutliroff further discloses
-wherein the plurality of images are associated with one or more of a position and acceleration of a device that captured the plurality of images (Kutliroff, paragraph [0029]; Reference discloses the reconstruction circuit is shown to include a camera pose calculation circuit 302, a depth pixel accumulation circuit 306 and, in some embodiments, inertial sensors 304 such as, for example, a gyroscope and/or an accelerometer. An example rendering 400 of a 3D reconstruction of the scene shown in FIG. 2 is illustrated in FIG. 4. This 3D reconstruction is composed of a relatively large number of points in 3D space, corresponding to structures within the scene, and may be represented in one of several ways including, for example, a signed distance function in a volumetric structure, or, equivalently, a polygonal mesh. The reconstruction circuit capturing pose and acceleration information from the image of the scene interpreted as wherein the plurality of images are associated with one or more of a position and acceleration of a device that captured the plurality of images); 
-projecting the 3D space for display on the device based on the one or more of the position and acceleration of the device (Kutliroff, paragraphs [0031] and [0049]; Reference at paragraph [0031] discloses in some embodiments, the camera pose may be calculated using an RGB-based Simultaneous Localization and Mapping (SLAM) algorithm which is configured to extract feature descriptors from each RGB frame, match corresponding features across multiple frames and calculate the 6DOF camera pose for each frame through triangulation. Alternatively, data from inertial sensors 304, such as gyroscopes and accelerometers, may be used, either independently, or in combination with the results of the RGB SLAM technique to obtain a more robust estimate of the camera pose. Paragraph [0049] discloses that the overall effect, as presented to the user, for example on display element 112, is that of the selected deleted object(s) being removed from the scene and the selected virtual object(s) being inserted into the scene. The described process may be repeated for each frame generated by the depth camera, so that as the user moves the depth camera, which may be integrated in a tablet, around the scene, the camera pose and the masks generated for each frame are continuously recomputed and super-imposed on the current frame from the camera, in order to maintain the realistic effect. The presentation of the scene on the display as the user moves and as the acceleration and poses are calculated is interpreted as the projecting the 3D space for display on the device based on the one or more of the position and acceleration of the device).

In regards to claim 3 (Original). Kutliroff in view of Choi teach the method of claim 2.
Kutliroff further discloses
(Kutliroff, paragraphs [0033] and [0042]; Reference at [0033] discloses the camera pose calculation circuit 302 determines the 3D position of the camera at each frame, in a global coordinate system. Consequently, 3D points extracted from the associated depth maps can also be transformed or projected to this coordinate system (i.e. raycast or projection of 3D points). Paragraph [0042] discloses the feature detection circuit 902 may be configured to detect features in both the detected/segmented objects and the source objects (models) (i.e. database of previously raycast 3D points). These features may include, for example, 3D corners or any other suitable distinctive features of the object. In some embodiments, the RGB image frames are stored and mapped to the 3D reconstruction, enabling the use of 2D feature detection techniques such as Scale Invariant Feature Transform (SIFT) detection and Speeded-Up Robust Feature (SURF) detection… Because some of the matches may be incorrect, and the object data set may be noisy and/or missing some points, the RANSAC circuit 906 may be configured to iteratively improve the 3D transformation alignment (i.e. the 3D points not meeting a sufficient density). An approximate 3D transformation, generated by the RANSAC circuit 906, is applied to the source object, and the Iterative Closest Point (ICP) circuit 908 may be configured to further improve or refine the computation of the 3D transformation).

In regards to claim 5 (Original). Kutliroff in view of Choi teach the method of claim 2.
Kutliroff further discloses
-providing an interface to the device configured to add or remove one or more objects detected in the 2D object detection from the plurality of images (Kutliroff, paragraph [0046]; Reference discloses in some embodiments, the AR manipulation circuit 140 may be configured to allow a user to select one or more objects of interest in the scene to be deleted or to be replaced by a virtual object. For example, the user may be presented with a list of the detected objects and allowed to make selections including deletion, replacement or insertion of new objects into the scene).

In regards to claim 6 (Original). Kutliroff in view of Choi teach the method of claim 1. 
Kutliroff further discloses
-wherein the classifying the object from the 2D detection comprises determining a type of the object (Kutliroff, paragraph [0061]; Reference at paragraph [0061] discloses the object detection circuit 134 may be configured to process the RGB image, and in some embodiments the associated depth map as well, to generate a list of any objects of interest recognized in the image. A label may be attached to each of the recognized objects and a 2D bounding box is generated which contains the object…For example, the object recognized and labeled as a lamp 610 is contained within a boundary box 620), 
-and determining a size of the marker from the type of the object (Kutliroff, paragraph [0049]; Reference discloses text and graphic displays overlaid onto the scene to guide the user, measure object sizes, etc.).

In regards to claim 7 (Currently Amended). Kutliroff discloses a non-transitory computer readable medium, storing instructions for executing a process (Kutliroff, paragraph [0148]; Reference discloses the embodiments implementing a computer-readable medium), the instructions comprising:
-conducting raycasting on a plurality of images to generate a point cloud (Kutliroff, paragraphs [0033] and [0042]; Reference at [0033] discloses the camera pose calculation circuit 302 determines the 3D position of the camera at each frame (i.e. multiple images), in a global coordinate system. Consequently, 3D points extracted from the associated depth maps can also be transformed or projected to this coordinate system (i.e. raycasting or projection of 3D points or a point cloud)); 
-executing two dimensional (2D) object detection on the plurality of images (Kutliroff, Fig. 6 , paragraphs [0036] and [0061]; Reference at paragraph [0036] discloses that FIG. 6 illustrates an example of detected objects in a 3D image, in accordance with certain of the embodiments disclosed herein. The detected and recognized objects in an RGB image of the scene, associated with one camera pose, are shown including for example the lamp 610. Paragraph [0061] describes referring back to FIG. 6 examples are illustrated of 2D bounding boxes applied to recognized objects in an RGB image of the scene associated with one camera pose. The use of the 2D bounding box interpreted as the 2D object detection in the images); 
-for the 2D object detection recognizing an object: determining a location of the object in three dimensional (3D) space from the point cloud (Kutliroff, paragraphs [0035] and [0061]; Reference at paragraph [0035] discloses the object detection/recognition circuit 504 may be configured to process the RGB image, and in some embodiments the associated depth map as well, along with the 3D reconstruction, to generate a list of any objects of interest recognized in the image. The object location circuit 508 may be configured to determine an associated location of each object in the scene. Paragraph [0061] further details that the location is a 3D location of the center of the 2D bounding box computed for the object contained. Reference discusses generating rays from the camera to the location of projected points in 3D space (i.e. point cloud; see paragraph [0063] with respect to the object detected within the bounding box));


-and placing a marker in the 3D space to represent the object based on the classifying (Kutliroff, paragraph [0061]; Reference at paragraph [0061] discloses the object detection circuit 134 may be configured to process the RGB image, and in some embodiments the associated depth map as well, to generate a list of any objects of interest recognized in the image. A label may be attached to each of the recognized objects and a 2D bounding box is generated which contains the object. The label interpreted as the marker placed representing the object based on classifying).  
Kutliroff does not explicitly disclose but Choi teaches
-determining whether the location overlaps another marker (Choi, paragraph [0101]; Reference discloses scenario in which overlap between markers is detected as it discloses when the bounding boxes determined to be used overlap each other, the overlapped bounding boxes can be merged and can be considered to be a bounding box surrounding a single object. Without assigning particular classifiers to individual objects in the merged bounding box, the classes of the individual objects can be classified using the result of pixel labeling including the pixel probability distribution information of the image on the basis of the reinforced feature map. The bounding boxes containing the pixel labels interpreted as markers);
-for the determination indicative of the location not overlapping another marker: classifying the object from the 2D object detection (Choi, paragraphs [0099] and [0102]; Reference at paragraph [0099] discloses step 435, in which, when the bounding box is generated, the image processing apparatus 100 can generate a confidence score. In an example, the generated bounding box may have a rectangular shape surrounding the periphery of an object. Paragraph [0102] discloses that in 437, when the bounding box is determined, the image processing apparatus 100 can distinguish the classes of the individual objects by objects surrounded with the bounding box and detect the objects from the input image (interpreted as the routine processing of classifying the object via 2D detection when the bounding boxes having the pixel labels do not overlap, as paragraph [0101] discusses the case in which they do overlap)).
Kutliroff and Choi are combinable because they are in the same field of endeavor regarding use of object detection features. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for the 3D scene analysis features of Kutliroff to include the deep learning features of Choi in order to provide the user with a system that allows for use of detection, segmentation, and registration steps for objects within the scene, where tools such as bounding boxes and labels can be inserted into the scene as taught by Kutliroff, while incorporating the deep learning features of Choi to allow for use of object detection and classification tools such as feature maps for objects identified in captured images for merging object analysis data in a final stage, thus increasing confidence of the object detection results, applicable to improving the object detection and reconstruction methods as taught in Kutliroff.

In regards to claim 8 (Original). Kutliroff in view of Choi teach the non-transitory computer readable medium of claim 7.
Kutliroff further discloses
-wherein the plurality of images are associated with one or more of a position and acceleration of a device that captured the plurality of images (Kutliroff, paragraph [0029]; Reference discloses the reconstruction circuit is shown to include a camera pose calculation circuit 302, a depth pixel accumulation circuit 306 and, in some embodiments, inertial sensors 304 such as, for example, a gyroscope and/or an accelerometer. An example rendering 400 of a 3D reconstruction of the scene shown in FIG. 2 is illustrated in FIG. 4. This 3D reconstruction is composed of a relatively large number of points in 3D space, corresponding to structures within the scene, and may be represented in one of several ways including, for example, a signed distance function in a volumetric structure, or, equivalently, a polygonal mesh. The reconstruction circuit capturing pose and acceleration information from the image of the scene interpreted as wherein the plurality of images are associated with one or more of a position and acceleration of a device that captured the plurality of images);  
-wherein the instructions further comprises projecting the 3D space for display on the device based on the one or more of the position and acceleration of the device (Kutliroff, paragraphs [0031] and [0049]; Reference at paragraph [0031] discloses in some embodiments, the camera pose may be calculated using an RGB-based Simultaneous Localization and Mapping (SLAM) algorithm which is configured to extract feature descriptors from each RGB frame, match corresponding features across multiple frames and calculate the 6DOF camera pose for each frame through triangulation. Alternatively, data from inertial sensors 304, such as gyroscopes and accelerometers, may be used, either independently, or in combination with the results of the RGB SLAM technique to obtain a more robust estimate of the camera pose. Paragraph [0049] discloses that the overall effect, as presented to the user, for example on display element 112, is that of the selected deleted object(s) being removed from the scene and the selected virtual object(s) being inserted into the scene. The described process may be repeated for each frame generated by the depth camera, so that as the user moves the depth camera, which may be integrated in a tablet, around the scene, the camera pose and the masks generated for each frame are continuously recomputed and super-imposed on the current frame from the camera, in order to maintain the realistic effect. The presentation of the scene on the display as the user moves and as the acceleration and poses are calculated is interpreted as the projecting the 3D space for display on the device based on the one or more of the position and acceleration of the device).

In regards to claim 9 (Original). Kutliroff in view of Choi teach the non-transitory computer readable medium of claim 8.
Kutliroff further discloses
(Kutliroff, paragraphs [0033] and [0042]; Reference at [0033] discloses the camera pose calculation circuit 302 determines the 3D position of the camera at each frame, in a global coordinate system. Consequently, 3D points extracted from the associated depth maps can also be transformed or projected to this coordinate system (i.e. raycast or projection of 3D points). Paragraph [0042] discloses the feature detection circuit 902 may be configured to detect features in both the detected/segmented objects and the source objects (models) (i.e. database of previously raycast 3D points). These features may include, for example, 3D corners or any other suitable distinctive features of the object. In some embodiments, the RGB image frames are stored and mapped to the 3D reconstruction, enabling the use of 2D feature detection techniques such as Scale Invariant Feature Transform (SIFT) detection and Speeded-Up Robust Feature (SURF) detection… Because some of the matches may be incorrect, and the object data set may be noisy and/or missing some points, the RANSAC circuit 906 may be configured to iteratively improve the 3D transformation alignment (i.e. the 3D points not meeting a sufficient density). An approximate 3D transformation, generated by the RANSAC circuit 906, is applied to the source object, and the Iterative Closest Point (ICP) circuit 908 may be configured to further improve or refine the computation of the 3D transformation).

In regards to claim 11 (Original). Kutliroff in view of Choi teach the non-transitory computer readable medium of claim 8.
Kutliroff further discloses
-the instructions further comprising providing an interface to the device configured to add or remove one or more objects detected in the 2D object detection from the plurality of images (Kutliroff, paragraph [0046]; Reference discloses in some embodiments, the AR manipulation circuit 140 may be configured to allow a user to select one or more objects of interest in the scene to be deleted or to be replaced by a virtual object. For example, the user may be presented with a list of the detected objects and allowed to make selections including deletion, replacement or insertion of new objects into the scene).

In regards to claim 12 (Original). Kutliroff in view of Choi teach the non-transitory computer readable medium of claim 8.
Kutliroff further discloses
-wherein the classifying the object from the 2D detection comprises determining a type of the object (Kutliroff, paragraph [0061]; Reference at paragraph [0061] discloses the object detection circuit 134 may be configured to process the RGB image, and in some embodiments the associated depth map as well, to generate a list of any objects of interest recognized in the image. A label may be attached to each of the recognized objects and a 2D bounding box is generated which contains the object…For example, the object recognized and labeled as a lamp 610 is contained within a boundary box 620), and determining a size of the marker from the type of the object (Kutliroff, paragraph [0049]; Reference discloses text and graphic displays overlaid onto the scene to guide the user, measure object sizes, etc.).  

In regards to claim 13 (Currently Amended). Kutliroff discloses an apparatus (Kutliroff, Abstract), comprising: 
-a processor, configured to: conduct raycasting on a plurality of images to generate a point cloud (Kutliroff, paragraphs [0033] and [0042]; Reference at [0033] discloses the camera pose calculation circuit 302 determines the 3D position of the camera at each frame (i.e. multiple images), in a global coordinate system. Consequently, 3D points extracted from the associated depth maps can also be transformed or projected to this coordinate system (i.e. raycasting or projection of 3D points or a point cloud)); 
-execute two dimensional (2D) object detection on the plurality of images (Kutliroff, Fig. 6 , paragraphs [0036] and [0061]; Reference at paragraph [0036] discloses that FIG. 6 illustrates an example of detected objects in a 3D image, in accordance with certain of the embodiments disclosed herein. The detected and recognized objects in an RGB image of the scene, associated with one camera pose, are shown including for example the lamp 610. Paragraph [0061] describes referring back to FIG. 6 examples are illustrated of 2D bounding boxes applied to recognized objects in an RGB image of the scene associated with one camera pose. The use of the 2D bounding box interpreted as the 2D object detection in the images); 
-for the 2D object detection recognizing an object: determine a location of the object in three dimensional (3D) space from the point cloud (Kutliroff, paragraphs [0035] and [0061]; Reference at paragraph [0035] discloses the object detection/recognition circuit 504 may be configured to process the RGB image, and in some embodiments the associated depth map as well, along with the 3D reconstruction, to generate a list of any objects of interest recognized in the image. The object location circuit 508 may be configured to determine an associated location of each object in the scene. Paragraph [0061] further details that the location is a 3D location of the center of the 2D bounding box computed for the object contained. Reference discusses generating rays from the camera to the location of projected points in 3D space (i.e. point cloud; see paragraph [0063] with respect to the object detected within the bounding box));


-and place a marker in the 3D space to represent the object based on the classification (Kutliroff, paragraph [0061]; Reference at paragraph [0061] discloses the object detection circuit 134 may be configured to process the RGB image, and in some embodiments the associated depth map as well, to generate a list of any objects of interest recognized in the image. A label may be attached to each of the recognized objects and a 2D bounding box is generated which contains the object. The label interpreted as the marker placed representing the object based on classifying).  
Kutliroff does not explicitly disclose but Choi teaches
-determining whether the location overlaps another marker (Choi, paragraph [0101]; Reference discloses a scenario in which overlap between markers is detected, as it discloses that when the bounding boxes determined to be used overlap each other, the overlapped bounding boxes can be merged and can be considered to be a bounding box surrounding a single object. Without assigning particular classifiers to individual objects in the merged bounding box, the classes of the individual objects can be classified using the result of pixel labeling including the pixel probability distribution information of the image on the basis of the reinforced feature map. The bounding boxes containing the pixel labels are interpreted as the markers);
-for the determination indicative of the location not overlapping another marker: classify the object from the 2D object detection (Choi, paragraphs [0099] and [0102]; Reference at paragraph [0099] discloses step 435, in which, when the bounding box is generated, the image processing apparatus 100 can generate a confidence score. In an example, the generated bounding box may have a rectangular shape surrounding the periphery of an object. Paragraph [0102] discloses that in 437, when the bounding box is determined, the image processing apparatus 100 can distinguish the classes of the individual objects by objects surrounded with the bounding box and detect the objects from the input image (interpreted as the routine processing of classifying the object via 2D detection when the bounding boxes having the pixel labels do not overlap, as paragraph [0101] discusses the case in which they do overlap)).
Kutliroff and Choi are combinable because they are in the same field of endeavor regarding use of object detection features. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for the 3D scene analysis features of Kutliroff to include the deep learning features of Choi in order to provide the user with a system that allows for use of detection, segmentation, and registration steps for objects within the scene where tools such as bounding boxes and labels can be inserted into the scene as taught by Kutliroff, while incorporating the deep learning features of Choi to allow for use of object detection and classification tools such as feature maps for objects identified in captured images for merging object analysis data in a final stage, thus increasing confidence of the object detection results, which is applicable to improving the object detection and reconstruction methods as taught in Kutliroff. 
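For illustration only, the overlap determination and the non-overlap classification branch discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Kutliroff, Choi, or the application: the names Box, overlaps, merge, and place_marker are hypothetical, markers are modeled as axis-aligned 2D bounding boxes, and the classifier is a caller-supplied function.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Box:
    """Hypothetical marker: an axis-aligned 2D bounding box with a label."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str = ""


def overlaps(a: Box, b: Box) -> bool:
    """True when the two boxes share interior area (touching edges do not count)."""
    return (a.x_min < b.x_max and b.x_min < a.x_max
            and a.y_min < b.y_max and b.y_min < a.y_max)


def merge(a: Box, b: Box) -> Box:
    """Smallest box enclosing both inputs; the label is left empty (assumption)."""
    return Box(min(a.x_min, b.x_min), min(a.y_min, b.y_min),
               max(a.x_max, b.x_max), max(a.y_max, b.y_max))


def place_marker(candidate: Box, markers: List[Box],
                 classify: Callable[[Box], str]) -> Box:
    """Merge on overlap; otherwise classify the candidate and add it as a marker."""
    for i, existing in enumerate(markers):
        if overlaps(candidate, existing):
            markers[i] = merge(candidate, existing)
            return markers[i]
    labeled = replace(candidate, label=classify(candidate))
    markers.append(labeled)
    return labeled
```

In this sketch the classifier is invoked only on the non-overlapping branch, which mirrors the conditional structure of the limitation; how a merged box is subsequently labeled is left open.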

In regards to claim 14 (Original). Kutliroff in view of Choi teach the apparatus of claim 13.
Kutliroff further discloses
-wherein the plurality of images are associated with one or more of a position and acceleration of a device that captured the plurality of images (Kutliroff, paragraph [0029]; Reference discloses the reconstruction circuit is shown to include a camera pose calculation circuit 302, a depth pixel accumulation circuit 306 and, in some embodiments, inertial sensors 304 such as, for example, a gyroscope and/or an accelerometer. An example rendering 400 of a 3D reconstruction of the scene shown in FIG. 2 is illustrated in FIG. 4. This 3D reconstruction is composed of a relatively large number of points in 3D space, corresponding to structures within the scene, and may be represented in one of several ways including, for example, a signed distance function in a volumetric structure, or, equivalently, a polygonal mesh. The reconstruction circuit capturing pose and acceleration information from the image of the scene interpreted as wherein the plurality of images are associated with one or more of a position and acceleration of a device that captured the plurality of images); 
-wherein the processor is configured to project the 3D space for display on the device based on the one or more of the position and acceleration of the device (Kutliroff, paragraphs [0031] and [0049]; Reference at paragraph [0031] discloses in some embodiments, the camera pose may be calculated using an RGB-based Simultaneous Localization and Mapping (SLAM) algorithm which is configured to extract feature descriptors from each RGB frame, match corresponding features across multiple frames and calculate the 6DOF camera pose for each frame through triangulation. Alternatively, data from inertial sensors 304, such as gyroscopes and accelerometers, may be used, either independently, or in combination with the results of the RGB SLAM technique to obtain a more robust estimate of the camera pose. Paragraph [0049] discloses that the overall effect, as presented to the user, for example on display element 112, is that of the selected deleted object(s) being removed from the scene and the selected virtual object(s) being inserted into the scene. The described process may be repeated for each frame generated by the depth camera, so that as the user moves the depth camera, which may be integrated in a tablet, around the scene, the camera pose and the masks generated for each frame are continuously recomputed and super-imposed on the current frame from the camera, in order to maintain the realistic effect. The presentation of the scene on the display as the user moves, and as the acceleration and pose are calculated, is interpreted as projecting the 3D space for display on the device based on the one or more of the position and acceleration of the device).  

In regards to claim 15 (Original). Kutliroff in view of Choi teach the apparatus of claim 14.
Kutliroff further discloses
-the processor further configured to, for the point cloud not meeting a sufficient density, project additional points from a database of previously raycast point clouds based on (Kutliroff, paragraphs [0033] and [0042]; Reference at [0033] discloses the camera pose calculation circuit 302 determines the 3D position of the camera at each frame, in a global coordinate system. Consequently, 3D points extracted from the associated depth maps can also be transformed or projected to this coordinate system (i.e. raycast or projection of 3D points). Paragraph [0042] discloses the feature detection circuit 902 may be configured to detect features in both the detected/segmented objects and the source objects (models) (i.e. database of previously raycast 3D points). These features may include, for example, 3D corners or any other suitable distinctive features of the object. In some embodiments, the RGB image frames are stored and mapped to the 3D reconstruction, enabling the use of 2D feature detection techniques such as Scale Invariant Feature Transform (SIFT) detection and Speeded-Up Robust Feature (SURF) detection…Because some of the matches may be incorrect, and the object data set may be noisy and/or missing some points, the RANSAC circuit 906 may be configured to iteratively improve the 3D transformation alignment (i.e. the 3D points not meeting a sufficient density). An approximate 3D transformation, generated by the RANSAC circuit 906 is applied to the source object, and the Iterative Closest Point (ICP) circuit 908 may be configured to further improve or refine the computation of the 3D transformation).  

In regards to claim 17 (Original). Kutliroff in view of Choi teach the apparatus of claim 14.
Kutliroff further discloses
(Kutliroff, paragraph [0046]; Reference discloses in some embodiments, the AR manipulation circuit 140 may be configured to allow a user to select one or more objects of interest in the scene to be deleted or to be replaced by a virtual object. For example, the user may be presented with a list of the detected objects and allowed to make selections including deletion, replacement or insertion of new objects into the scene). 

In regards to claim 18 (Original). Kutliroff in view of Choi teach the apparatus of claim 14.
Kutliroff further discloses
-wherein the processor is configured to classify the object from the 2D detection by determining a type of the object (Kutliroff, paragraph [0061]; Reference at paragraph [0061] discloses the object detection circuit 134 may be configured to process the RGB image, and in some embodiments the associated depth map as well, to generate a list of any objects of interest recognized in the image. A label may be attached to each of the recognized objects and a 2D bounding box is generated which contains the object…For example, the object recognized and labeled as a lamp 610 is contained within a boundary box 620), 
-and determining a size of the marker from the type of the object (Kutliroff, paragraph [0049]; Reference discloses text and graphic displays overlaid onto the scene to guide the user measure object sizes, etc.).

Claims 4, 10, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Kutliroff (US 2017/0243352 A1) in view of Choi (US 2017/0169313 A1) as applied to claim 1, and further in view of Reisner-Kollmann (US 2015/0062120 A1, hereinafter referenced “Kollmann”) and Hakim (US 2015/0229838A1, hereinafter referenced “Hakim”).

In regards to claim 4 (Original). Kutliroff in view of Choi teach the method of claim 2.
Kutliroff and Choi do not disclose but Kollmann teaches
-further comprising: searching the 3D space for one or more vacant areas (Kollmann, paragraphs [0069] and [0070]; Reference at [0069] discloses the boundaries (or borders) of the represented AR plane 310 may grow over time by analyzing more portions (e.g., cells) and adding them to the identified portion of the planar surface. This usually happens when more areas of the planar surface become visible due to a new viewpoint….depending on the algorithm utilized, the representation can continue to be improved and refined with the availability of additional information (e.g., images at different angles). Paragraph [0070] discloses the AR plane 310 can not only indicate where borders of a planar surface 210 may be, but also can include “holes” 320 where portions of the planar surface 210 are occluded.  The continued improvement based on adding more images of different angles with respect to the AR plane that includes “holes” or points of occlusion is interpreted as searching the 3D space for one or more vacant areas); 

Kollmann does not disclose but Hakim teaches
-and generating a recommendation for the device comprising a position and angle to conduct image capture (Hakim, paragraph [0034]; Reference discloses the horizontal and vertical attributes of an object are used to determine an angle of capture and to offer suggestions to adjust the angle of the capturing device to improve the quality of the image, based on rules… The rules identify commands requesting user action to adjust the capturing device when capturing the image. The commands are presented as suggestions. (Interpreted as generating recommendations for the camera regarding angles and position for image capture)).  
Hakim does not explicitly disclose
-based on the one or more vacant areas (However, the Kollmann reference at paragraphs [0069] and [0070], as previously cited, discloses the boundaries (or borders) of the represented AR plane 310 may grow over time by analyzing more portions (e.g., cells) and adding them to the identified portion of the planar surface. This usually happens when more areas of the planar surface become visible due to a new viewpoint….depending on the algorithm utilized, the representation can continue to be improved and refined with the availability of additional information (e.g., images at different angles). Paragraph [0070] discloses the AR plane 310 can not only indicate where borders of a planar surface 210 may be, but also can include “holes” 320 where portions of the planar surface 210 are occluded.  The continued improvement based on adding more images of different angles with respect to the AR plane that includes “holes” or points of occlusion is interpreted as searching the 3D space for one or more vacant areas).
Kutliroff and Choi are combinable because they are in the same field of endeavor regarding use of object detection features. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for the 3D scene analysis features of Kutliroff to include the deep learning features of Choi in order to provide the user with a system that allows for use of detection, segmentation, and registration steps for objects within the scene where tools such as bounding boxes and labels can be inserted into the scene as taught by Kutliroff, while incorporating the deep learning features of Choi to allow for use of object detection and classification tools such as feature maps for objects identified in captured images for merging object analysis data in a final stage, thus increasing confidence of the object detection results, which is applicable to improving the object detection and reconstruction methods as taught in Kutliroff. 
Kutliroff and Kollmann are also combinable because they are in the same field of endeavor regarding use of object detection features. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for the 3D scene analysis features of Kutliroff, in view of the deep learning features of Choi, to include the physical scene representation method of Kollmann in order to provide the user with a system that allows for use of detection, segmentation, and registration steps for objects within the scene where tools such as bounding boxes and labels can be inserted into the scene as taught by Kutliroff, while incorporating the deep learning features of Choi to allow for use of object detection and classification tools such as feature maps for objects identified in captured images for merging object analysis data in a final stage. Further incorporating the physical scene representation features of Kollmann allows for constructing a digital representation of a physical scene by obtaining information about the physical scene such as the planar surfaces and objects in AR supporting 3D reconstruction and real-time based recognition, which is applicable to improving the object detection and reconstruction methods as taught in Kutliroff and Choi. 
Kutliroff and Hakim are also combinable because they are in the same field of endeavor regarding 3D reconstruction. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for the 3D scene analysis features of Kutliroff, in view of the deep learning features of Choi in further view of the physical scene representation method of Kollmann, to include the photo composition and position guidance features of Hakim in order to provide the user with a system that allows for use of detection, segmentation, and registration steps for objects within the scene where tools such as bounding boxes and labels can be inserted into the scene as taught by Kutliroff, while incorporating the deep learning features of Choi to allow for use of object detection and classification tools such as feature maps for objects identified in captured images for merging object analysis data in a final stage. Further incorporating the physical scene representation features of Kollmann allows for constructing a digital representation of a physical scene by obtaining information about the physical scene such as the planar surfaces and objects in AR supporting 3D reconstruction and real-time based recognition. Adding the photo composition and position guidance features of Hakim allows for addition of tools for analyzing attributes of objects within an image and providing suggestions to the user of the device for adjusting the device for improving the image quality and allowing for more optimal image capture, which is applicable to improving the image capture and object detection functions in the methods as taught in Kutliroff, Choi, and Kollmann. 
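For illustration only, the combination of searching for vacant areas and generating a capture recommendation based on those areas can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of Kollmann, Hakim, or the application: the scene is modeled as a 2D grid of cells flagged as observed or not, and the names find_vacant_cells and recommend_view, the cell_size parameter, and the returned dictionary keys are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column) index into the grid


def find_vacant_cells(grid: List[List[bool]]) -> List[Cell]:
    """Return the cells not yet observed (False entries), scanning row by row."""
    return [(r, c)
            for r, row in enumerate(grid)
            for c, observed in enumerate(row)
            if not observed]


def recommend_view(camera_xy: Tuple[float, float],
                   vacant_cells: List[Cell],
                   cell_size: float = 1.0) -> Optional[Dict[str, object]]:
    """Suggest a target position and a yaw angle pointing at the vacant region.

    The target is the centroid of the vacant cell centers; the yaw is the
    direction from the camera to that centroid, in degrees. Returns None
    when there is nothing vacant to recommend.
    """
    if not vacant_cells:
        return None
    n = len(vacant_cells)
    target_x = sum((c + 0.5) * cell_size for _, c in vacant_cells) / n
    target_y = sum((r + 0.5) * cell_size for r, _ in vacant_cells) / n
    yaw = math.degrees(math.atan2(target_y - camera_xy[1],
                                  target_x - camera_xy[0]))
    return {"target": (target_x, target_y), "yaw_degrees": yaw}
```

The sketch keeps the two steps separate so that the recommendation depends only on the output of the vacancy search, which is the relationship the combined limitation recites.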

In regards to claim 10 (Original). Kutliroff in view of Choi teach the non-transitory computer readable medium of claim 8.
Kutliroff and Choi do not disclose but Kollmann teaches
-the instructions further comprising: searching the 3D space for one or more vacant areas (Kollmann, paragraphs [0069] and [0070]; Reference at [0069] discloses the boundaries (or borders) of the represented AR plane 310 may grow over time by analyzing more portions (e.g., cells) and adding them to the identified portion of the planar surface. This usually happens when more areas of the planar surface become visible due to a new viewpoint….depending on the algorithm utilized, the representation can continue to be improved and refined with the availability of additional information (e.g., images at different angles). Paragraph [0070] discloses the AR plane 310 can not only indicate where borders of a planar surface 210 may be, but also can include “holes” 320 where portions of the planar surface 210 are occluded.  The continued improvement based on adding more images of different angles with respect to the AR plane that includes “holes” or points of occlusion is interpreted as searching the 3D space for one or more vacant areas); 

Kollmann does not disclose but Hakim teaches
-and generating a recommendation for the device comprising a position and angle to conduct image capture (Hakim, paragraph [0034]; Reference discloses the horizontal and vertical attributes of an object are used to determine an angle of capture and to offer suggestions to adjust the angle of the capturing device to improve the quality of the image, based on rules… The rules identify commands requesting user action to adjust the capturing device when capturing the image. The commands are presented as suggestions. (Interpreted as generating recommendations for the camera regarding angles and position for image capture)).  
Hakim does not explicitly disclose
-based on the one or more vacant areas (However, the Kollmann reference at paragraphs [0069] and [0070], as previously cited, discloses the boundaries (or borders) of the represented AR plane 310 may grow over time by analyzing more portions (e.g., cells) and adding them to the identified portion of the planar surface. This usually happens when more areas of the planar surface become visible due to a new viewpoint….depending on the algorithm utilized, the representation can continue to be improved and refined with the availability of additional information (e.g., images at different angles). Paragraph [0070] discloses the AR plane 310 can not only indicate where borders of a planar surface 210 may be, but also can include “holes” 320 where portions of the planar surface 210 are occluded.  The continued improvement based on adding more images of different angles with respect to the AR plane that includes “holes” or points of occlusion is interpreted as searching the 3D space for one or more vacant areas).
Kutliroff and Hakim are also combinable because they are in the same field of endeavor regarding 3D reconstruction. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for the 3D scene analysis features of Kutliroff, in view of the deep learning features of Choi in further view of the physical scene representation method of Kollmann, to include the photo composition and position guidance features of Hakim in order to provide the user with a system that allows for use of detection, segmentation, and registration steps for objects within the scene where tools such as bounding boxes and labels can be inserted into the scene as taught by Kutliroff, while incorporating the deep learning features of Choi to allow for use of object detection and classification tools such as feature maps for objects identified in captured images for merging object analysis data in a final stage. Further incorporating the physical scene representation features of Kollmann allows for constructing a digital representation of a physical scene by obtaining information about the physical scene such as the planar surfaces and objects in AR supporting 3D reconstruction and real-time based recognition. Adding the photo composition and position guidance features of Hakim allows for addition of tools for analyzing attributes of objects within an image and providing suggestions to the user of the device for adjusting the device for improving the image quality and allowing for more optimal image capture, which is applicable to improving the image capture and object detection functions in the methods as taught in Kutliroff, Choi, and Kollmann. 

In regards to claim 16 (Original). Kutliroff in view of Choi teach the apparatus of claim 14.
Kutliroff and Choi do not disclose but Kollmann teaches
-the processor further configured to: search the 3D space for one or more vacant areas (Kollmann, paragraphs [0069] and [0070]; Reference at [0069] discloses the boundaries (or borders) of the represented AR plane 310 may grow over time by analyzing more portions (e.g., cells) and adding them to the identified portion of the planar surface. This usually happens when more areas of the planar surface become visible due to a new viewpoint….depending on the algorithm utilized, the representation can continue to be improved and refined with the availability of additional information (e.g., images at different angles). Paragraph [0070] discloses the AR plane 310 can not only indicate where borders of a planar surface 210 may be, but also can include “holes” 320 where portions of the planar surface 210 are occluded.  The continued improvement based on adding more images of different angles with respect to the AR plane that includes “holes” or points of occlusion is interpreted as searching the 3D space for one or more vacant areas); 

Kollmann does not disclose but Hakim teaches
-and generate a recommendation for the device comprising a position and angle to conduct image capture (Hakim, paragraph [0034]; Reference discloses the horizontal and vertical attributes of an object are used to determine an angle of capture and to offer suggestions to adjust the angle of the capturing device to improve the quality of the image, based on rules… The rules identify commands requesting user action to adjust the capturing device when capturing the image. The commands are presented as suggestions. (Interpreted as generating recommendations for the camera regarding angles and position for image capture)).  
Hakim does not explicitly disclose
-based on the one or more vacant areas (However, the Kollmann reference at paragraphs [0069] and [0070], as previously cited, discloses the boundaries (or borders) of the represented AR plane 310 may grow over time by analyzing more portions (e.g., cells) and adding them to the identified portion of the planar surface. This usually happens when more areas of the planar surface become visible due to a new viewpoint….depending on the algorithm utilized, the representation can continue to be improved and refined with the availability of additional information (e.g., images at different angles). Paragraph [0070] discloses the AR plane 310 can not only indicate where borders of a planar surface 210 may be, but also can include “holes” 320 where portions of the planar surface 210 are occluded.  The continued improvement based on adding more images of different angles with respect to the AR plane that includes “holes” or points of occlusion is interpreted as searching the 3D space for one or more vacant areas).
Kutliroff and Hakim are also combinable because they are in the same field of endeavor regarding 3D reconstruction. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for the 3D scene analysis features of Kutliroff, in view of the deep learning features of Choi in further view of the physical scene representation method of Kollmann, to include the photo composition and position guidance features of Hakim in order to provide the user with a system that allows for use of detection, segmentation, and registration steps for objects within the scene where tools such as bounding boxes and labels can be inserted into the scene as taught by Kutliroff, while incorporating the deep learning features of Choi to allow for use of object detection and classification tools such as feature maps for objects identified in captured images for merging object analysis data in a final stage. Further incorporating the physical scene representation features of Kollmann allows for constructing a digital representation of a physical scene by obtaining information about the physical scene such as the planar surfaces and objects in AR supporting 3D reconstruction and real-time based recognition. Adding the photo composition and position guidance features of Hakim allows for addition of tools for analyzing attributes of objects within an image and providing suggestions to the user of the device for adjusting the device for improving the image quality and allowing for more optimal image capture, which is applicable to improving the image capture and object detection functions in the methods as taught in Kutliroff, Choi, and Kollmann. 


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: See the Notice of References Cited (PTO-892).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TERRELL M ROBINSON whose telephone number is (571)270-3526.  The examiner can normally be reached on 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Zimmerman, can be reached at 571-272-7653.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private 






/TERRELL M ROBINSON/Examiner, Art Unit 2619