DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 7-11, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Shotton et al. (“Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images”) in view of Powers et al. (US 10504008 B1).
Regarding claim 1, Shotton teaches:
A method predicting… a three-dimensional (3D) representation in a 3D scene, using one or more two-dimensional (2D) representations obtained from at least one camera-generated image, wherein the predicting comprises … obtaining a regression tree forest, (page 2930, bottom of left column: “As illustrated in Fig. 1, the forest is trained to directly predict correspondences from any image pixel to points in the scene’s 3D world coordinate frame.” The image pixel lies in a 2D image frame captured by a camera.) and for said at least one image: 
extracting a 2D representation associated with one or more positions within the image, (Shotton, page 2931, left column, teaches extracting 2D information of an image pixel at position p: [excerpt reproduced as image media_image1.png])
predicting a 3D representation, corresponding to the extracted 2D representation, using the regression tree forest which comprises a set of possible associations between at least one 2D representation and a 3D representation, each possible association resulting from a predictive model, (Shotton, page 2931, right column, teaches that the 3D information for the pixel at location p is found by traversing the regression forest: [excerpt reproduced as image media_image2.png])
evaluating one or more of the possible associations of the regression tree forest, according to a predetermined confidence criterion, (Shotton, page 2931, bottom of right column, teaches that, when traversing the regression forest, the 2D information at location p is evaluated using the weak learner shown in formula 1 of the excerpt. Depending on the resulting value (1 or 0), a different traversal path is taken: [excerpt reproduced as image media_image3.png]) and 
However, Shotton does not explicitly teach, but Powers teaches:
updating the regression tree forest comprising a deactivation of one or more possible associations, depending on the evaluation of said possible associations. (In the regression forest, each leaf node represents an association. Powers, col. 13, upper left, teaches removing or deactivating some of the nodes during evaluation: “At 314, the spatial interaction system may train the forest using the training samples based on weak learner parameter space, and at 318, the spatial interaction system may train the leaf nodes of the forest using the scene specific samples based on one or more node splitting objectives. For example, during the leaf node training the spatial interaction system may remove any empty branches and add any missing branches based on the weak learner parameter and/or the node splitting objectives. For example, the spatial interaction system may remove dead branches during pre-training. For instance, it is possible that some intermediate split nodes don't have children nodes and/or some nodes do not receive training samples. As a consequence these branches are removed since these branches do not describe the appearance of the training samples. Similarly, during the real-time leaf node training, it could happen that some intermediate split nodes not containing children, need to be further split to produce deeper branches (filling missing branches during pre-training).”)
by a device (Powers, FIG. 10, e.g., a phone, HMD…, each of which includes a computer.)
Shotton teaches a training process for a regression forest. Powers teaches a combination of a pre-training process and a real-time training process. During the real-time training process, the associations between 2D and 3D image information, represented in the regression forest, may be updated based on the evaluation of those associations, so that the updated regression forest reflects the updated image information.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have combined the teachings of Shotton with the specific teachings of Powers to provide a method that can rebuild or relocate a new environment in a fast and easy way (Powers, col. 1, upper portion).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have combined the teachings of Shotton with the device teachings of Powers to make it possible to implement the method of Shotton in a computerized environment.
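
For illustration only, the forest traversal relied upon in the mapping of claim 1 can be sketched as follows. The sketch assumes a generic thresholded feature response as the weak learner, and assumes that a response of 0 selects the left child and a response of 1 selects the right child. The names Node, weak_learner, and predict are hypothetical and are not taken from Shotton or Powers.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Pixel = Tuple[int, int]
Point3D = Tuple[float, float, float]


@dataclass
class Node:
    # A split node sets feature, threshold, left, and right.
    # A leaf node sets prediction only (hypothetical layout).
    feature: Optional[Callable[[Pixel], float]] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prediction: Optional[Point3D] = None


def weak_learner(node: Node, p: Pixel) -> int:
    """Binary response: 1 if the feature response at p reaches the threshold, else 0."""
    return 1 if node.feature(p) >= node.threshold else 0


def predict(root: Node, p: Pixel) -> Point3D:
    """Walk from the root to a leaf and return the 3D point stored at that leaf."""
    node = root
    while node.prediction is None:
        # Assumed convention: response 0 goes left, response 1 goes right.
        node = node.right if weak_learner(node, p) == 1 else node.left
    return node.prediction


# Toy tree: the feature is simply the pixel's x coordinate (an assumption).
toy_tree = Node(
    feature=lambda p: float(p[0]),
    threshold=10.0,
    left=Node(prediction=(0.0, 0.0, 1.0)),
    right=Node(prediction=(1.0, 0.0, 1.0)),
)
```

In this toy tree, a pixel with x below 10 reaches the left leaf and any other pixel reaches the right leaf. A forest would repeat the same walk over several such trees.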

Regarding claim 2, Shotton in view of Powers teaches:
The method according to claim 1, wherein in the evaluating, the prediction can be evaluated for at least one possible association as being correct or incorrect according to the predetermined confidence criterion. (Shotton, Abstract, teaches that the 2D and 3D information is evaluated to determine whether the 3D information (coordinates) is an inlier (correct, and therefore used for camera pose updating) or an outlier (incorrect, and therefore not used for camera pose updating): “The camera pose is inferred using a robust optimization scheme. This starts with an initial set of hypothesized camera poses, constructed by applying the forest at a small fraction of image pixels. Preemptive RANSAC then iterates sampling more pixels at which to evaluate the forest, counting inliers, and refining the hypothesized poses. We evaluate on several varied scenes captured with an RGB-D camera and observe that the proposed technique achieves highly accurate relocalization and substantially out-performs two state of the art baselines.” Section 3.1 teaches the details of using an energy function to decide which pixels are inliers or outliers when evaluating the 3D coordinates acquired from the regression forest. The inlier pixels correspond to correct associations, which are used to update the camera pose.)
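
For illustration only, the correct/incorrect (inlier/outlier) evaluation referred to above can be sketched as a distance test against a hypothesized camera pose. The sketch does not reproduce the energy function of Shotton, Section 3.1. It assumes a plain Euclidean distance threshold tau, and the names transform and classify_associations are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def transform(rotation: Sequence[Sequence[float]], translation: Vec3, x: Vec3) -> Vec3:
    """Map a camera-space point x into world space under a hypothesized pose."""
    return tuple(
        sum(rotation[i][j] * x[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def classify_associations(
    predicted_world: Sequence[Vec3],
    camera_points: Sequence[Vec3],
    rotation: Sequence[Sequence[float]],
    translation: Vec3,
    tau: float,
) -> List[bool]:
    """Return True (inlier, treated as correct) or False (outlier) per association.

    An association is treated as correct when the forest's predicted world
    point lies within tau of the pose-transformed camera-space point.
    """
    return [
        math.dist(m, transform(rotation, translation, x)) < tau
        for m, x in zip(predicted_world, camera_points)
    ]


# Toy data: identity rotation and a one-unit shift along x (assumed values).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
flags = classify_associations(
    predicted_world=[(1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    camera_points=[(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
    rotation=identity,
    translation=(1.0, 0.0, 0.0),
    tau=0.1,
)
```

Counting the True entries gives an inlier count for the hypothesized pose, and only those entries would feed a subsequent pose refinement.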

Regarding claim 7, Shotton in view of Powers teaches: 
The method according to further comprising an update of the pose parameters of said camera, according to the possible associations assessed as correct or incorrect. (Shotton, Abstract: “The camera pose is inferred using a robust optimization scheme. This starts with an initial set of hypothesized camera poses, constructed by applying the forest at a small fraction of image pixels. Preemptive RANSAC then iterates sampling more pixels at which to evaluate the forest, counting inliers, and refining the hypothesized poses. We evaluate on several varied scenes captured with an RGB-D camera and observe that the proposed technique achieves highly accurate relocalization and substantially out-performs two state of the art baselines.” Section 3.1 teaches the details of using an energy function to decide which pixels are inliers or outliers when evaluating the 3D coordinates acquired from the regression forest. The inlier pixels correspond to correct associations, which are used to update the camera pose.)

Claim 8 recites limitations similar to those of claim 1 and is thus rejected under the same rationale.
Claim 9 recites limitations similar to those of claim 1, in the form of a device, and is thus rejected under the same rationale.
Claim 10 recites limitations similar to those of claim 2, in the form of a device, and is thus rejected under the same rationale.
Claim 11 recites limitations similar to those of claim 7, in the form of a device, and is thus rejected under the same rationale.

Regarding claim 14, Shotton in view of Powers teaches: 
A computer-readable storage medium having recorded thereon a computer program comprising program code instructions (Powers, FIG. 10, 1014. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have combined the teachings of Shotton with the computer-readable media of Powers to make it possible to implement the method of Shotton in a computerized environment.) 
The rest of claim 14 recites limitations similar to those of claim 1 and is thus rejected under the same rationale.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Powers in view of Shotton.
Regarding claim 12, Powers teaches:
An augmented reality system (Background: “The presence of three-dimensional (3D) imaging systems, mixed reality systems, and 3D representations of real physical environments are becoming more and more commonplace. In some cases, it is also common place for users of the 3D image or mixed reality systems to revisit physical environments or scenes on more than one occasion. In these cases, the system may have to rebuild the virtual environment or relocate the individual within the virtual environment, both of which can be computationally intensive and difficult to achieve in substantially real-time.”) comprising: 
a camera capable of acquiring an image of a three-dimensional scene, (FIG. 11, col. 20: “FIG. 11 illustrates an example physical environment 1100 including a user 1102 of a spatial interaction system including an integrated image capture and display device 1104” “As the user 1102 moves through the physical environment 1100, the display device 1104 allows the user 1102 to view a virtual representation of the physical environment 1100 (e.g., to view a specific virtual environment representative of the physical environment). In other cases, the user 1102 may utilize the spatial interaction system to view a scene or other imaginary virtual environment that may incorporate one or more features of images captured by the device 1104 as, for instance, a user input or manipulatable object within the virtual scene.”)
a module capable of delivering an output image from an input image acquired by said camera, representing said scene in three dimensions and at least one real or virtual object, from: a 3D representation of the at least one object in the scene, and camera pose parameters, (Col. 20, “As the user 1102 moves through the physical environment 1100, the display device 1104 allows the user 1102 to view a virtual representation of the physical environment 1100 (e.g., to view a specific virtual environment representative of the physical environment). In other cases, the user 1102 may utilize the spatial interaction system to view a scene or other imaginary virtual environment that may incorporate one or more features of images captured by the device 1104 as, for instance, a user input or manipulatable object within the virtual scene.” “FIG. 11 illustrates an example physical environment 1100 including a user 1102 of a spatial interaction system including an integrated image capture and display device 1104” “Thus, as the user 1102 utilizes the spatial interaction system, the spatial interaction system may locate or relocate features, objects, and/or the user 1102 within a given virtual representation or scene (e.g., the user 1102 may re-enter an existing virtual world). For example, the spatial interaction system may utilize the ORB approach or the relocalization forest approach discussed above to assist with locating features and/or the user 1102 within the virtual environment or scene. While the current example, illustrates a combined image capture and display device 1104, it should be understood that in some implementations the image capture device may be separate from the display device. In some cases, in addition to relocalization, the methods and system described above may be used by the spatial interaction system during pose recovery when a pose tracker loses tracking. For example, during scanning of the scene or environment, or when tracking within a virtual environment. In other cases, the spatial interaction system may be useful to recognize a scene when revisiting the scene for purposes of loop closing (e.g., when a user revisits the same place multiple times, the system may leverage that knowledge to improve the quality of the 3D reconstructions of the physical environment the user is in).”)
the module (FIG. 10, device) comprising a device for updating installation parameters of said camera, (col. 16, middle teaches the camera pose information is updated: “Once the 2D to 3D matches are known, the spatial interaction system 700 may determine 2D to 3D correspondences 712 to locate the camera pose within the virtual environment based at least in part on the relocalization forest 710 and the 2D to 3D matches.”)
the device comprising a computing machine (FIG. 10) dedicated to or configured to: 
obtain a regression tree forest, (FIG. 3, obtain the pre-trained regression forest after step 308.)
and for said input image acquired by the camera: (col. 12, bottom: “At 310, the spatial interaction system may receive scene specific RGBD data. For example, an individual may generate a virtual representation of a specific physical environment (e.g., a user's home, bedroom, office, backyard, etc.). In these cases, the RGBD data may be captured of the specific physical environment and used to train the leaf nodes of the forest. In some cases, the scene specific RGBD data may be captured over a plurality of time periods, while in other cases, the scene specific RGBD data may be captured in close temporal proximity.”)
a 2D representation associated with one or more positions within the image, (col. 4, “The feature descriptions … and the metadata associated with each training RGBD frame (e.g., 2D feature location for each stored feature, the index of the RGBD data, the camera pose associated with the RGBD data, etc.).”)
predict the 3D representation, corresponding to the extracted 2D representation, using the regression tree forest which comprises set of possible associations between at least one 2D representation and a 3D representation, each possible association resulting from a predictive model, (FIG. 8, col. 17, upper: “This canonical feature descriptor allows for improved feature matching between 2D features 828 observed in the relocalization image 820 and 3D model features 816 learned during training.”)
evaluate one or more of the possible associations defined by the forest of regression trees, according to a predetermined confidence criterion, (col. 12, middle, teaches that, when traversing the regression tree that defines the association of the 2D data with the 3D data, the weak learner evaluation function is used: “At 304, the spatial interaction system may generate training samples from the RGBD training data. For example, the spatial interaction system may generate a dataset including a weak learner parameter space and a plurality of training features. In some cases, the training features may include gravity aligned features and/or magnetic aligned features. In some cases, each tree of the forest discussed herein, may include nodes having a weak learner, represented as a θ=(ϕ, τ), where ϕ is a feature and τ is a scalar threshold, and a splitting objective. The weak learner is a function of the feature f(ϕ) and the scalar threshold. That is, a weak learner is a function that takes a feature with its corresponding parameters and a scalar threshold and produces a binary response (produces a simplified binary description of the feature). If the feature descriptor f(ϕ) is above a threshold, the weak learner function evaluates to 1. In contrast, if the feature descriptor f(ϕ) evaluates to a value under the threshold, the weak learner function evaluates to 0. Therefore, the weak learner parameter space represents the space of possible features to test as well as the space of possible thresholds to consider for testing”) and 
update the regression tree forest including a deactivation of one or more possible associations, depending on the evaluation of said possible associations, (col. 13: “At 314, the spatial interaction system may train the forest using the training samples based on weak learner parameter space, and at 318, the spatial interaction system may train the leaf nodes of the forest using the scene specific samples based on one or more node splitting objectives. For example, during the leaf node training the spatial interaction system may remove any empty branches and add any missing branches based on the weak learner parameter and/or the node splitting objectives. For example, the spatial interaction system may remove dead branches during pre-training. For instance, it is possible that some intermediate split nodes don't have children nodes and/or some nodes do not receive training samples. As a consequence these branches are removed since these branches do not describe the appearance of the training samples. Similarly, during the real-time leaf node training, it could happen that some intermediate split nodes not containing children, need to be further split to produce deeper branches (filling missing branches during pre-training).”) and 
a display module capable of displaying the output image (col. 20, middle: “As the user 1102 moves through the physical environment 1100, the display device 1104 allows the user 1102 to view a virtual representation of the physical environment 1100 (e.g., to view a specific virtual environment representative of the physical environment). In other cases, the user 1102 may utilize the spatial interaction system to view a scene or other imaginary virtual environment that may incorporate one or more features of images captured by the device 1104 as, for instance, a user input or manipulatable object within the virtual scene. Thus, as the user 1102 utilizes the spatial interaction system, the spatial interaction system may locate or relocate features, objects, and/or the user 1102 within a given virtual representation or scene (e.g., the user 1102 may re-enter an existing virtual world). For example, the spatial interaction system may utilize the ORB approach or the relocalization forest approach discussed above to assist with locating features and/or the user 1102 within the virtual environment or scene. While the current example, illustrates a combined image capture and display device 1104, it should be understood that in some implementations the image capture device may be separate from the display device.”)
However, Powers does not explicitly teach, but Shotton teaches:
extract a 2D representation associated with one or more positions within the image (Shotton, page 2931, left column, teaches extracting 2D information of an image pixel at position p: [excerpt reproduced as image media_image1.png])
update the pose parameters related to said camera, according to the possible associations assessed as correct or incorrect, (Shotton, Abstract: “The camera pose is inferred using a robust optimization scheme. This starts with an initial set of hypothesized camera poses, constructed by applying the forest at a small fraction of image pixels. Preemptive RANSAC then iterates sampling more pixels at which to evaluate the forest, counting inliers, and refining the hypothesized poses. We evaluate on several varied scenes captured with an RGB-D camera and observe that the proposed technique achieves highly accurate relocalization and substantially out-performs two state of the art baselines.” Section 3.1 teaches the details of using an energy function to decide which pixels are inliers or outliers when evaluating the 3D coordinates acquired from the regression forest. The inlier pixels correspond to correct associations, which are used to update the camera pose.)
Powers teaches updating a regression tree and updating camera parameters based on the regression tree. However, Shotton provides more details regarding the implementation of the camera parameter update and the extraction of 2D image information. Shotton’s camera parameter update uses only the 3D coordinates that are deemed correct (inliers) to achieve highly accurate relocalization of the camera.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have combined the teachings of Powers with the specific teachings of Shotton to achieve highly accurate relocalization of the camera (Shotton, Abstract).
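
For illustration only, the removal of branches that receive no training samples, as described in the Powers passage at col. 13 cited above, can be sketched as follows. The sketch is not the implementation of Powers. The names Branch and prune_empty, and the per-node sample_count field, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Branch:
    # Number of training samples that reached this node (hypothetical field).
    sample_count: int = 0
    left: Optional["Branch"] = None
    right: Optional["Branch"] = None


def prune_empty(node: Optional[Branch]) -> Optional[Branch]:
    """Drop every subtree whose root received no training samples.

    Returns the node itself when it is kept, or None when it is removed.
    """
    if node is None or node.sample_count == 0:
        return None
    node.left = prune_empty(node.left)
    node.right = prune_empty(node.right)
    return node


# Toy tree: the right subtree received no samples and is removed (assumed counts).
toy_root = Branch(5, left=Branch(5), right=Branch(0, left=Branch(0)))
pruned = prune_empty(toy_root)
```

After pruning, only the left child of the toy root remains, since the right branch never received a training sample.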

Allowable Subject Matter
Claims 3-6 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: none of the references, alone or in combination, teaches the limitations of “wherein the predictive model of a possible association is defined by a distribution characterising the at least one 2D representation associated with the 3D representation for said possible association, and by the following parameters a first parameter representative of a status of the predictive model of the possible association; a second parameter representative of a number of consecutive predictions evaluated as incorrect for the possible association; a third parameter representative of the extracted 2D representations associated with the 3D representation by said possible association,” as recited in claim 3.
Claims 4-6 are objected to because they depend on claim 3.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YANNA WU whose telephone number is (571)270-0725. The examiner can normally be reached Monday-Thursday 8:00-5:30 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung can be reached on 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/YANNA WU/Primary Examiner, Art Unit 2611