Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 1 is objected to because of the following informality:  Claim 1 recites “taken with the image capturing device (100).”  The reference numeral “(100)” appears to have been left in inadvertently and should be deleted.  Appropriate correction is required.

Specification 
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

Allowable Subject Matter 
Claims 6-8 and 17-18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claims 6 and 17, in combination with their parent claims, are distinguished from Civera et al. (Civera) (“Inverse Depth Parametrization for Monocular SLAM”) because of the limitations similar to “generating the baseline initialization coordinates, responsive to the mean of the inverses of the plurality of distances plus the standard deviation of the inverses of the plurality of distances being greater than a second scaled value; and refraining from generating the baseline initialization coordinates, responsive to the mean of the inverses of the plurality of distances plus the standard deviation of the inverses of the plurality of distances being less than the second scaled value.”  (Claim 6).  Claim 7 depends on Claim 6.   

Claims 8 and 18, in combination with their parent claims, are distinguished from Civera et al. (Civera) (“Inverse Depth Parametrization for Monocular SLAM”) because of the limitations similar to “responsive to the updated standard deviation of the inverses of the plurality of distances being greater than the first scaled value of the updated mean of the inverses of the plurality of distances, and responsive to the updated mean of the inverses of the plurality of distances plus the updated standard deviation of the inverses of the plurality of distances being greater than a second scaled value; and refraining from generating the baseline initialization coordinates, responsive to the updated standard deviation of the inverses of the plurality of distances being less than the first scaled value of the updated mean of the inverses of the plurality of distances, or responsive to the updated mean of the inverses of the plurality of distances plus the updated standard deviation of the inverses of the plurality of distances being less than the second scaled value”  (Claim 8).  
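For orientation only, the two-part condition recited in the limitations quoted above can be sketched as follows. The function name `should_generate_baseline` and the scale values `k1` and `k2` are hypothetical; neither the quoted claim language nor Civera specifies numeric values. The quoted language does not address exact equality, and this sketch treats equality as refraining.

```python
def should_generate_baseline(mean_inv, std_inv, k1, k2):
    """Return True when both quoted conditions hold (hypothetical helper).

    mean_inv, std_inv: mean and standard deviation of the inverse distances.
    k1: factor applied to the mean to form the "first scaled value" (assumed).
    k2: the "second scaled value" threshold (assumed).
    """
    spread_ok = std_inv > k1 * mean_inv    # std greater than the scaled mean
    range_ok = (mean_inv + std_inv) > k2   # mean plus std greater than second value
    return spread_ok and range_ok


# Illustrative numbers only; they are not taken from the application or Civera.
print(should_generate_baseline(0.10, 0.50, k1=0.5, k2=0.2))  # both conditions hold
print(should_generate_baseline(0.10, 0.02, k1=0.5, k2=0.2))  # spread too small
```

Failing either condition leads to refraining, which mirrors the "or" in the quoted refraining clause.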

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 9-15, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Civera et al. (Civera) (“Inverse Depth Parametrization for Monocular SLAM”).
Regarding Claim 1, Civera teaches or suggests A method of tracking a position of an image capturing device from Two-Dimensional, 2D images (
“A MONOCULAR camera is a projective sensor that measures the bearing of image features. Given an image sequence of a rigid 3-D scene taken from a moving camera, it is now well known that it is possible to compute both a scene structure and a camera motion up to a scale factor. To infer the 3-D position of each feature, the moving camera must observe it repeatedly each time, capturing a ray of light from the feature to its optic center. The measured angle between the captured rays from different viewpoints is the feature’s parallax—this is what allows its depth to be estimated.”  Civera Introduction.

[Image: media_image1.png]

A monocular camera image as shown in Fig. 6 is a 2D image. 
Camera motion includes at least a series of camera positions: 
[Image: media_image2.png]
), the method comprising: 
receiving a first 2D image and a second 2D image taken with the image capturing device (100) (
“To infer the 3-D position of each feature, the moving camera must observe it repeatedly each time, capturing a ray of light from the feature to its optic center. The measured angle between the captured rays from different viewpoints is the feature’s parallax—this is what allows its depth to be estimated.”  Civera Introduction.
The moving camera captures a sequence of 2D images.
Civera Fig. 6.); 
identifying a plurality of first feature points in the first 2D image and a corresponding plurality of second feature points in the second 2D image (“To infer the 3-D position of each feature, the moving camera must observe it repeatedly each time, capturing a ray of light from the feature to its optic center. The measured angle between the captured rays from different viewpoints is the feature’s parallax—this is what allows its depth to be estimated.”  Civera Introduction.
The same features are observed in the sequence of images captured.  The repeatedly observed features correspond to the plurality of first feature points and the corresponding plurality of second feature points.); 
estimating a plurality of distances based on corresponding ones of the plurality of first feature points and based on corresponding ones of the plurality of second feature points (
[Image: media_image3.png]
); 
determining a mean and a standard deviation of inverses of the plurality of distances that were estimated (

[Image: media_image4.png]

“The initial value for ρ0 and its standard deviation are set empirically such that the 95% confidence region spans a range of depths from close to the camera up to infinity. In our experiments, we set ρ0 = 0.1, σρ = 0.5, which gives an inverse depth confidence region [1.1, −0.9]. Notice that infinity is included in this range.”  Civera pages 937 and 938. 
ρ0 may correspond to the “inverses of the plurality of distances.”  ρ0 represents feature points (plural) at multiple locations. ); 
generating baseline initialization coordinates based on the mean and the standard deviation of inverses of the plurality of distances (

[Image: media_image4.png]
 When a feature has been initialized, the baseline initialization coordinates have been generated.); and 
generating a 3D image based on the baseline initialization coordinates (
“A MONOCULAR camera is a projective sensor that measures the bearing of image features. Given an image sequence of a rigid 3-D scene taken from a moving camera, it is now well known that it is possible to compute both a scene structure and a camera motion up to a scale factor. To infer the 3-D position of each feature, the moving camera must observe it repeatedly each time, capturing a ray of light from the feature to its optic center. The measured angle between the captured rays from different viewpoints is the feature’s parallax—this is what allows its depth to be estimated.”  Civera Introduction.  The scene structure may correspond to the 3D image.). 
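As a minimal sketch of the claimed step of "determining a mean and a standard deviation of inverses of the plurality of distances," the following computes those two statistics for a list of estimated distances. The helper name `inverse_distance_stats`, the sample distances, and the choice of the population standard deviation are assumptions made for illustration; they are not taken from Claim 1 or from Civera.

```python
import statistics


def inverse_distance_stats(distances):
    """Return (mean, std) of the reciprocals of the given positive distances."""
    if not distances or any(d <= 0 for d in distances):
        raise ValueError("distances must be a non-empty list of positive values")
    inverses = [1.0 / d for d in distances]
    mean_inv = statistics.mean(inverses)
    # Population standard deviation; a sample standard deviation is an
    # equally plausible reading and would give a slightly larger value.
    std_inv = statistics.pstdev(inverses)
    return mean_inv, std_inv


# Hypothetical distances (arbitrary units) from the camera to four feature points.
mean_inv, std_inv = inverse_distance_stats([1.0, 2.0, 4.0, 4.0])
print(mean_inv, std_inv)
```

Note that the statistics are taken over the reciprocals, not over the distances themselves, so nearby feature points dominate the spread.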


Regarding Claim 2, Civera teaches or suggests The method of Claim 1, 
wherein the generating baseline initialization coordinates comprises: 
generating the baseline initialization coordinates, responsive to the standard deviation of the inverses of the plurality of distances being greater than a scaled value of the mean of the inverses of the plurality of distances (
“The initial value for ρ0 and its standard deviation are set empirically such that the 95% confidence region spans a range of depths from close to the camera up to infinity. In our experiments, we set ρ0 = 0.1, σρ = 0.5, which gives an inverse depth confidence region [1.1, −0.9]. Notice that infinity is included in this range. Experimental validation has shown that the precise values of these parameters are relatively unimportant to the accurate operation of the filter as long as infinity is clearly included in the confidence interval.”  Civera pages 937 and 938. 
An inverse depth of ρ = 0 corresponds to infinity.  Therefore, if σρ is greater than a scaled value of the mean of the inverses of the plurality of distances, 0 (infinity) is included); and 
refraining from generating the baseline initialization coordinates, responsive to the standard deviation of the inverses of the plurality of distances being less than the scaled value of the mean of the inverses of the plurality of distances (Therefore, if σρ is less than a scaled value of the mean of the inverses of the plurality of distances, 0 (infinity) is not included.  According to Civera, the filter does not produce good results in this case.).  
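The numbers quoted from Civera can be checked with a short calculation. Assuming the 95% confidence region is approximated as the mean plus or minus two standard deviations (an assumption, although it reproduces the quoted interval), an inverse depth of zero, i.e., infinite depth, lies inside the region for a positive mean exactly when the standard deviation is at least half the mean. The function names below are hypothetical.

```python
def confidence_region(rho0, sigma, n_sigma=2.0):
    """Return (lower, upper) bounds of the region rho0 +/- n_sigma * sigma."""
    return rho0 - n_sigma * sigma, rho0 + n_sigma * sigma


def includes_infinity(rho0, sigma, n_sigma=2.0):
    """True when inverse depth 0 (infinite depth) lies within the region."""
    lower, upper = confidence_region(rho0, sigma, n_sigma)
    return lower <= 0.0 <= upper


# Values quoted from Civera: rho0 = 0.1, sigma_rho = 0.5.
lower, upper = confidence_region(0.1, 0.5)
print(lower, upper)                  # approximately -0.9 and 1.1
print(includes_infinity(0.1, 0.5))   # True: sigma exceeds half the mean
print(includes_infinity(0.1, 0.01))  # False: sigma is below half the mean
```

Under the two-sigma assumption, the "scaled value of the mean" in the mapping above would correspond to a scale factor of 0.5.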

Regarding Claim 3, Civera teaches or suggests The method of Claim 2, wherein the refraining from generating the baseline initialization coordinates comprises: 
receiving a third 2D image (
“Once initialized, features are processed with the standard EKF prediction-update loop—even in the case of negative inverse depth estimates—and immediately contribute to camera location estimation within SLAM.”  Civera 937, right col. lines 17-22.  Therefore, the prediction described in section V. Feature Initialization is repeated for prediction-update.);  In re: Johannes ELG et al. PCT Application No.: PCT/US2017/049582 Filed: August 31, 2017 Page 4 
identifying a plurality of third feature points in the third 2D image that correspond to the plurality of first feature points in the first 2D image (See Claim 1 rejection for detailed analyses regarding a similar limitation regarding the first and second feature points.); 
estimating an updated plurality of distances based on corresponding ones of the plurality of first feature points and based on corresponding ones of the plurality of third feature points (See Claim 1 rejection for detailed analyses regarding a similar limitation regarding the first and second feature points.); 
generating an updated mean and an updated standard deviation of the inverses of the updated plurality of distances (See Claim 1 rejection for detailed analyses regarding a similar limitation regarding the first and second feature points.); and 
generating the baseline initialization coordinates based on the updated mean and the updated standard deviation (See Claim 1 rejection for detailed analyses regarding a similar limitation regarding the first and second feature points.).  

Regarding Claim 4, Civera teaches or suggests The method of Claim 3, wherein the third 2D image is captured at a third location that is different from a first location where the first 2D image was captured and is different from a second location where the second 2D image was captured, and wherein the first location is different from the second location (
“To infer the 3-D position of each feature, the moving camera must observe it repeatedly each time, capturing a ray of light from the feature to its optic center. The measured angle between the captured rays from different viewpoints is the feature’s parallax—this is what allows its depth to be estimated.”  Civera Introduction.  A moving camera occupies different locations, which include the first, second, and third locations.   
[Image: media_image5.png]
 Fig. 8 also shows the Camera trajectory in 3D space.).  

Regarding Claim 9, Civera teaches or suggests The method of Claim 1, 
wherein the 3D image comprises a first 3D image further comprising: 
receiving a user input indicating that the first 3D image is to be generated (
The user input may be mapped to a user’s input to start software that implements the algorithms described in Civera. 
The Examiner takes Official Notice that it was well known in the art that a user input may be used to start a piece of software implementing an algorithm.  The benefit of combining this well-known knowledge would have been that a user would have more control of a process.), 
wherein the first 2D image is captured at a time before a second 3D image is captured (
The recited “a second 3D image” is a dangling limitation.  It could be any 3D image captured by any device at any future point in time. 
There exists a second 3D image captured after the first 2D image.).  

Regarding Claim 10, Civera teaches or suggests The method of Claim 1, the inverses of the plurality of distances comprise reciprocals of the respective distances from a camera to the feature point (
[Image: media_image3.png]

[Image: media_image6.png]
  The point’s depth along the ray may correspond to a distance from a camera to the feature point.  The cameras are illustrated as black boxes in Fig. 3.).  
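The reciprocal relationship relied on above can be sketched as follows: a point with inverse depth ρ lies at distance 1/ρ from the camera's optical center along the observation ray. The function name and the sample values are hypothetical, and the ray is supplied directly as a direction vector rather than through any particular angle parametrization.

```python
import math


def point_from_inverse_depth(camera_center, direction, rho):
    """Return the 3D point at distance 1/rho from camera_center along direction."""
    if rho <= 0:
        raise ValueError("rho must be positive for a finite point in front of the camera")
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    depth = 1.0 / rho  # the distance is the reciprocal of the inverse depth
    return tuple(o + depth * c / norm for o, c in zip(camera_center, direction))


# Hypothetical example: camera at the origin looking along +z, inverse depth 0.25.
point = point_from_inverse_depth((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 0.25)
print(point)  # (0.0, 0.0, 4.0)
```

The sketch rejects ρ = 0 because that value corresponds to a point at infinity, which has no finite coordinates.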

Regarding Claim 11, Civera teaches or suggests The method of Claim 1, 
wherein the receiving a first 2D image and a second 2D image comprises receiving a plurality of images, wherein the first 2D image comprises a 2D image that was received earliest in time of the plurality of images that were received, and wherein the second 2D image comprises a 2D image that was received most recently in time of the plurality of image that were received (
“A MONOCULAR camera is a projective sensor that measures the bearing of image features. Given an image sequence of a rigid 3-D scene taken from a moving camera, it is now well known that it is possible to compute both a scene structure and a camera motion up to a scale factor. To infer the 3-D position of each feature, the moving camera must observe it repeatedly each time, capturing a ray of light from the feature to its optic center. The measured angle between the captured rays from different viewpoints is the feature’s parallax—this is what allows its depth to be estimated.”  Civera Introduction.
The image sequence taken from a moving camera may correspond to the first 2D image and the second 2D image.  Because the camera is moving, the images are taken at different points in time.).  

Regarding Claim 12, Civera teaches or suggests An imaging system for processing images, the imaging system comprising: a processor; and a memory coupled to the processor and storing computer readable program code that when executed by the processor causes the processor to perform operations comprising: 
receiving a first 2D image and a second 2D image from an image capturing system (See Claim 1 rejection for detailed analysis.); 
identifying a plurality of first feature points in the first 2D image and a corresponding plurality of second feature points in the second 2D image (See Claim 1 rejection for detailed analysis.); 
estimating a plurality of distances based on corresponding ones of the plurality of first feature points and based on corresponding ones of the plurality of second feature points (See Claim 1 rejection for detailed analysis.); 
determining a mean and a standard deviation of inverses of the plurality of distances that were estimated (See Claim 1 rejection for detailed analysis.); 
generating baseline initialization coordinates based on the mean and the standard deviation of inverses of the plurality of distances (See Claim 1 rejection for detailed analysis.); and 
generating a 3D image based on the baseline initialization coordinates (See Claim 1 rejection for detailed analysis.).  

Regarding Claim 13, Civera teaches or suggests The imaging system of Claim 12, 
wherein the generating the baseline initialization coordinates comprises: 
generating the baseline initialization coordinates, responsive to the standard deviation of the inverses of the plurality of distances being greater than a scaled value of the mean of the inverses of the plurality of distances (See Claim 2 rejection for detailed analysis.); and 
refraining from generating the baseline initialization coordinates, responsive to the standard deviation of the inverses of the plurality of distances being less than the scaled value of the mean of the inverses of the plurality of distances (See Claim 2 rejection for detailed analysis.).  

Regarding Claim 14, Civera teaches or suggests The imaging system of Claim 13, wherein the refraining from generating the baseline initialization coordinates comprises: 
receiving a third 2D image (See Claim 3 rejection for detailed analysis.); 
identifying a plurality of third feature points in the third 2D image that correspond to the plurality of first feature points in the first 2D image (See Claim 3 rejection for detailed analysis.); 
estimating an updated plurality of distances based on corresponding ones of the plurality of first feature points and based on corresponding ones the plurality of third feature points (See Claim 3 rejection for detailed analysis.); 
generating an updated mean and an updated standard deviation of the inverses of the updated plurality of distances (See Claim 3 rejection for detailed analysis.); and 
generating the baseline initialization coordinates based on the updated mean and the updated standard deviation (See Claim 3 rejection for detailed analysis.).  

Regarding Claim 15, Civera teaches or suggests The imaging system of Claim 14, 
wherein the third 2D image is captured by the image capturing system at a third location that is different from a first location where the first 2D image was captured and is different from a second location where the second 2D image was captured, and wherein the first location is different from the second location (See Claim 4 rejection for detailed analysis.). 

Regarding Claim 19, Civera teaches or suggests The imaging system of Claim 12, wherein the processor is configured to perform operations further comprising: receiving a user input indicating that the 3D image is to be generated, wherein the first 2D image is captured at a time before a second 3D image is captured (See Claim 9 rejection for detailed analysis.).  

Regarding Claim 20, Civera teaches or suggests A computer program product for operating an image capturing system, the computer program product comprising a non-transitory computer readable storage medium having computer readable program code embodied in the medium that when executed by a processor causes the processor to perform the method of Claim 1 (See Claim 1 rejection for detailed analysis.).

Claims 5 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Civera et al. (Civera) (“Inverse Depth Parametrization for Monocular SLAM”) in view of Lin et al. (Lin) (US 9607388 B2).
Regarding Claim 5, Civera discloses The method of Claim 4. 
However, Civera does not explicitly disclose wherein the receiving the third 2D image comprises: 
receiving the third 2D image, responsive to the third location associated with the third 2D image being less than a threshold angular separation from the first location associated with the first 2D image; and 
refraining from generating the baseline initialization coordinates, responsive to the third location associated with the third 2D image being greater than the threshold angular separation from the first location associated with the first 2D image.
Lin discloses wherein the receiving the third 2D image comprises: 
receiving the third 2D image, responsive to the third location associated with the third 2D image being less than a threshold angular separation from the first location associated with the first 2D image (
 “As another example, the keyframe selector 242 may iteratively compare the camera pose estimate 250 to a set of one or more keyframes (of the keyframes 214) until the keyframe selector 242 identifies a particular keyframe that satisfies a relative position threshold and/or a relative distance threshold. For example, the keyframe selector 242 may determine the first difference and/or the second difference for a first keyframe (e.g., a most recently generated keyframe). The keyframe selector 242 may compare the first difference to the relative position threshold and/or may compare the second difference to the relative angle threshold. If the first difference satisfies (is less than or equal to) the relative position threshold and/or if the second difference satisfies (is less than or equal to) the relative angle threshold, the keyframe may be identified as being similar (e.g., the most similar) to the camera pose estimate 250 and the camera pose estimate 250 may not be compared to another keyframe (e.g., a keyframe generated prior to the most recently generated keyframe). If none of the set of one or more keyframes satisfies the relative position threshold and/or the relative angle threshold, a most similar keyframe is not selected and the keyframe generator 201 may be instructed to generate a new keyframe based on the image frame that corresponds to the camera pose estimate 250.”  Lin col. 12 lines 20-43.); and 
refraining from generating the baseline initialization coordinates, responsive to the third location associated with the third 2D image being greater than the threshold angular separation from the first location associated with the first 2D image (
When the angle is greater than the threshold, the scene is too different from the original scene, and separate estimation may be needed.  After the combination of Civera and Lin, the system refrains from generating the baseline initialization coordinates.).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine Civera with Lin.  The suggestion/motivation would have been to increase efficacy and accuracy.  A large angular separation may introduce inaccuracies into the initialization.
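A minimal sketch of the threshold test discussed for Claim 5 follows. It measures the angle between two viewing directions and compares it with a threshold. The function names, the use of viewing-direction vectors to stand in for the "locations," and the 30-degree threshold are all assumptions for illustration; neither Lin nor the claim language as quoted gives these specifics.

```python
import math


def angular_separation_deg(dir_a, dir_b):
    """Angle in degrees between two non-zero 3D direction vectors."""
    dot = sum(a * b for a, b in zip(dir_a, dir_b))
    norm_a = math.sqrt(sum(a * a for a in dir_a))
    norm_b = math.sqrt(sum(b * b for b in dir_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("direction vectors must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))


def accept_third_image(first_dir, third_dir, threshold_deg):
    """True when the separation is below the threshold; otherwise refrain."""
    return angular_separation_deg(first_dir, third_dir) < threshold_deg


# Hypothetical 30-degree threshold.
print(accept_third_image((0, 0, 1), (0, 1, 10), 30.0))  # small rotation: accepted
print(accept_third_image((0, 0, 1), (1, 0, 0), 30.0))   # 90 degrees apart: refrained
```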

Regarding Claim 16, Civera in view of Lin discloses The imaging system of Claim 15, wherein the receiving the third 2D image from the image capturing system comprises: receiving the third 2D image from the image capturing system, responsive to the third location associated with the third 2D image being less than a threshold angular separation from the first location associated with the first 2D image; and refraining from generating the baseline initialization coordinates, responsive to the third location associated with the third 2D image being greater than the threshold angular separation from the first location associated with the first 2D image (See Claim 5 rejection for detailed analysis.).

Conclusion 
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Montiel et al. (“Unified Inverse Depth Parametrization for Monocular SLAM”) also discloses the use of inverse depth for a SLAM algorithm. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZHENGXI LIU whose telephone number is (571)270-7509.  The examiner can normally be reached on M-F 9 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at (571) 272-7794.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ZHENGXI LIU/Primary Examiner, Art Unit 2611