DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The Information Disclosure Statement filed on 06/07/2020 has been considered. An initialed copy of the Form 1449 is enclosed herewith.
Status of Claims
Claims 1-20 were originally filed on 06/07/2020. The application is a 371 of PCT/CN2018/093743, which was filed on 06/29/2018.
Claim Objections
Claim 5 is objected to because line 4 states “resemble to the snapshot”. This is improper grammar. The claim should be written as “resemble the snapshot”. 
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 7-8, 10, 13-14, and 18-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Regarding Claims 7-8, claim 7 states “if the tracked movement trajectory starting from one of the plurality of relocalization hypotheses is compatible with the pre-built visual map, the hypothesis is rejected”. However, according to paragraphs 0008 and 0039 of the specification, the hypothesis is rejected when the movement trajectory is incompatible. It is unclear if applicant has made a typo in claim 7, or if applicant is claiming the opposite of what is written in the specification. Therefore, the claim is rejected under 112b. (For examination purposes, examiner will interpret the claim as if there is a typo and that the hypothesis is rejected when the movement trajectory is incompatible).
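For illustration only, the interpretation adopted above (a hypothesis is rejected when the tracked movement trajectory starting from it is incompatible with the map) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the specification or the claims: the map is reduced to a hypothetical set of free grid cells, "compatible" is assumed to mean that every position along the trajectory stays inside that set, and the name `filter_by_trajectory` is invented for this example.

```python
def filter_by_trajectory(hypotheses, steps, free_cells):
    """Keep only hypotheses whose replayed trajectory stays in mapped free space.

    hypotheses: list of (x, y) integer grid cells (assumed start positions).
    steps:      list of (dx, dy) integer displacements tracked between poses.
    free_cells: set of (x, y) cells the map marks as traversable (assumption).
    """
    survivors = []
    for start in hypotheses:
        x, y = start
        # A hypothesis that starts outside free space is already incompatible.
        compatible = start in free_cells
        if compatible:
            for dx, dy in steps:
                x, y = x + dx, y + dy
                if (x, y) not in free_cells:
                    # Trajectory leaves the mapped free space: reject.
                    compatible = False
                    break
        if compatible:
            survivors.append(start)
    return survivors
```

Under these assumptions, a hypothesis is dropped as soon as the replayed trajectory reaches a cell the map does not mark as free, which matches the "incompatible, therefore rejected" reading used for examination purposes.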

	Regarding Claim 10, claim 10 states “a second pose” in line 4. It is unclear if this is the same second pose as the one mentioned earlier in claim 9, or if this is a different second pose. Therefore, the claim is rejected under 112b. (For examination purposes, examiner will interpret the second pose in claim 10 to be any pose that is different from the first pose).

	Regarding Claims 13-14, claim 13 lacks antecedent basis for “the new pose” in line 7. It is unclear if the new pose is the same as the second pose mentioned earlier in claim 9, or if this is a different pose. Therefore, the claim is rejected under 112b. (For examination purposes, examiner will interpret the new pose in claim 13 to be the same as the second pose).
	Claim 14 lacks antecedent basis for “the error threshold”. It is unclear what error threshold applicant is referring to.

	Regarding Claim 18, claim 18 lacks antecedent basis for “the new pose” in line 8. It is unclear if the new pose is the same as the second pose mentioned earlier in claim 15, or if this is a different pose. Therefore, the claim is rejected under 112b. (For examination purposes, examiner will interpret the new pose in claim 18 to be the same as the second pose).

	Regarding Claim 19, claim 19 lacks antecedent basis for “the winning hypothesis” in line 6. It is unclear if the winning hypothesis is the same as the last relocalization hypothesis mentioned earlier in the claim, or if this is a different hypothesis. Therefore, the claim is rejected under 112b. (For examination purposes, examiner will interpret the winning hypothesis to be the same as the last relocalization hypothesis).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-6, 9, and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Samarasekera et al (US 20080167814 A1) in view of Goncalves et al (US 20040167667 A1) (Hereinafter referred to as Samarasekera and Goncalves, respectively).

Regarding Claim 1, Samarasekera discloses a method for self-relocalization in a pre-built visual map (See at least Samarasekera Paragraphs 0015-0016, the position of a user is determined and a map of the environment is used to identify objects of interest), comprising: 
capturing a snapshot using at least one visual sensor at an initial pose (See at least Samarasekera Paragraph 0035, the cameras are interpreted as visual sensors; See at least Samarasekera Paragraph 0047-0049, the first view is interpreted as a snapshot at an initial pose); 
establishing a plurality of relocalization hypotheses in the visual map based at least on the snapshot (See at least Samarasekera Paragraphs 0049-0050, the pose hypotheses are interpreted as relocalization hypotheses);
…taking additional visual measurement at the new pose (See at least Samarasekera Paragraph 0036 and Figure 5, there are multiple cameras, and each camera captures video from its own perspective, the second camera’s perspective is interpreted as the new pose); and 
implementing hypotheses refinement by using at least one of the movement trajectory and the additional visual measurement, to reject one or more relocalization hypotheses (See at least Samarasekera Paragraphs 0057-0058 and Figure 6, the pose refinement is interpreted as implementing hypothesis refinement, and the views from each camera are used to determine the pose hypothesis with the highest global score to be selected, meaning all other hypotheses are rejected).
Even though Samarasekera teaches taking visual measurements from different poses, Samarasekera fails to explicitly disclose moving the at least one visual sensor from the initial pose to a new pose with a movement trajectory tracked.
However, Goncalves teaches this limitation (See at least Goncalves Paragraphs 0161 and 0163, the camera is moved while taking images and the distance of movement between images is determined). 
It would have been obvious to one of ordinary skill to modify the teachings disclosed in Samarasekera with Goncalves to move the visual sensor from the initial pose to a new pose with the movement trajectory tracked. Taking multiple images at different poses allows the system to determine common features within the images, and use those common features to identify 3-d positions of the feature points relative to the camera (See at least Goncalves Paragraphs 0165-0168 and Figure 10). This would allow the system to localize itself using recognized features and their locations relative to the visual sensor, and allow the system to mark positions of new landmarks (See at least Goncalves Figure 10). 

Regarding Claim 5, Samarasekera discloses establishing the plurality of relocalization hypotheses comprising: 
applying an appearance-based approach to retrieve one or more key frames stored in the visual map that resemble to the snapshot (See at least Samarasekera Paragraphs 0015-0016, the map is used to identify the location of the user by using features in the video images to match the map);
matching one or more two-dimensional (2D) features in the snapshot to 2D features in the retrieved frames of the pre-built visual map (See at least Samarasekera Paragraphs 0019 and 0072, 2d features are extracted from the snapshots and matched to 2d features in the map); 
inferring or mapping the matching points in the frames to three-dimensional (3D) points in the pre-built visual map (See at least Samarasekera Paragraph 0050, the 2d image points are corresponded to 3d world points); 
using the inferred 3D points to calculate a plurality of candidate poses (See at least Samarasekera Paragraph 0050, the pose hypothesis is generated using the correspondence between the image points and the 3d world points); 
calculating reprojection error for each of the plurality of candidate poses (See at least Samarasekera Paragraphs 0054-0056, the reprojection error is calculated); and 
based on at least the calculated reprojection error, selecting one or more candidate poses as the hypotheses (See at least Samarasekera Paragraph 0057, the pose with the highest global score is selected; See at least Samarasekera Paragraph 0050, the reprojection error is used during the scoring process).
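For illustration only, the last two limitations of claim 5 (calculating a reprojection error for each candidate pose and selecting hypotheses based on that error) can be sketched as follows. This is a minimal sketch and is not taken from Samarasekera or from the application: the pinhole camera model, the mean pixel error, the `max_error` cutoff, and all function names are assumptions made for this example.

```python
import math

def project(point, rotation, translation, fx, fy, cx, cy):
    """Project a 3D map point into the image using an assumed pinhole model."""
    # Map frame to camera frame: Xc = R * Xw + t.
    xc = [sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
          for i in range(3)]
    # Assumes the point lies in front of the camera (xc[2] > 0).
    return fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy

def mean_reprojection_error(pose, points3d, points2d, intrinsics):
    """Average pixel distance between projected 3D points and observed 2D features."""
    rotation, translation = pose
    total = 0.0
    for p3, (u_obs, v_obs) in zip(points3d, points2d):
        u, v = project(p3, rotation, translation, *intrinsics)
        total += math.hypot(u - u_obs, v - v_obs)
    return total / len(points3d)

def select_hypotheses(candidates, points3d, points2d, intrinsics, max_error):
    """Return indices of candidate poses whose error is within max_error, best first."""
    scored = [(mean_reprojection_error(pose, points3d, points2d, intrinsics), i)
              for i, pose in enumerate(candidates)]
    return [i for err, i in sorted(scored) if err <= max_error]
```

Each candidate is a `(rotation, translation)` pair, with `rotation` a 3x3 nested list and `intrinsics` the tuple `(fx, fy, cx, cy)`. Candidates that explain the matched 2D features poorly receive a large error and are not kept as hypotheses.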

Regarding Claim 6, Samarasekera discloses the movement trajectory is tracked using non-visual sensors (See at least Samarasekera Paragraph 0011, position measurement data can be measured by an IMU, which is interpreted as a non-visual sensor).

Regarding Claim 9, Samarasekera discloses a system for self-relocalization in a pre-built visual map (See at least Samarasekera Paragraphs 0015-0016, the position of a user is determined and a map of the environment is used to identify objects of interest; See at least Samarasekera Paragraph 0035 and Figure 4, the vision-based navigation system is interpreted as the system) comprising: 
at least one visual sensor for visual measurement (See at least Samarasekera Paragraph 0035, the cameras are interpreted as visual sensors); 
…a processor coupled to the at least one visual sensor (See at least Samarasekera Paragraph 0035, the system includes a processing device; See at least Samarasekera Paragraph 0016, the images are processed by the processing system)…
a non-volatile memory storing one or more instructions (See at least Samarasekera Paragraph 0035, the program is interpreted as the instructions, and computers have non-volatile memories), when executed by the processor, causing the processor to perform the following operations: 
instructing the at least one visual sensor for visual measurement at a first pose (See at least Samarasekera Paragraph 0047-0049, the first view is interpreted as a visual measurement at a first pose); 
implementing single-shot relocalization to localize the system with respect to a pre-built visual map using the visual measurement at the first pose (See at least Samarasekera Paragraphs 0015-0016, the position of a user is determined and a map of the environment is used to identify objects of interest), the localization result comprising candidate relocalization hypotheses of the at least one visual sensor at the first pose in the pre-built visual map (See at least Samarasekera Paragraphs 0049-0050, the pose hypotheses are interpreted as relocalization hypotheses);
… instructing the at least one visual sensor for visual measurement at the second pose (See at least Samarasekera Paragraph 0036 and Figure 5, there are multiple cameras, and each camera captures video from its own perspective, the second camera’s perspective is interpreted as the second pose); and   
implementing hypotheses refinement, by using at least one of the tracked movement trajectory and the additional visual measurement, to reject one or more relocalization hypotheses (See at least Samarasekera Paragraphs 0057-0058 and Figure 6, the pose refinement is interpreted as implementing hypothesis refinement, and the views from each camera are used to determine the pose hypothesis with the highest global score to be selected, meaning all other hypotheses are rejected).
Even though Samarasekera teaches taking visual measurements from different poses, Samarasekera fails to explicitly disclose a motion system to move the at least one visual sensor; 
a processor coupled to the at least one visual sensor and the motion system; and 
instructing the motion system to move the at least one visual sensor from the first pose to a second pose with a movement trajectory tracked.
However, Goncalves teaches a motion system to move the at least one visual sensor (See at least Goncalves Paragraphs 0161 and 0163, the camera is moved while taking images; See at least Goncalves Paragraph 0064, the motors coupled to the wheels are interpreted as the motion system); 
a processor coupled to the…the motion system (See at least Goncalves Paragraphs 0065-0066, the motors are controlled by the control, which has a microprocessor); and 
instructing the motion system to move the at least one visual sensor from the first pose to a second pose with a movement trajectory tracked (See at least Goncalves Paragraphs 0161 and 0163, the camera is moved while taking images and the distance of movement between images is determined).
It would have been obvious to one of ordinary skill to modify the teachings disclosed in Samarasekera with Goncalves to move the visual sensor from the first pose to a second pose with the movement trajectory tracked. Taking multiple images at different poses allows the system to determine common features within the images, and use those common features to identify 3-d positions of the feature points relative to the camera (See at least Goncalves Paragraphs 0165-0168 and Figure 10). This would allow the system to localize itself using recognized features and their locations relative to the visual sensor, and allow the system to mark positions of new landmarks (See at least Goncalves Figure 10). 

	Regarding Claim 11, Samarasekera discloses the candidate relocalization hypotheses are hypotheses with the highest N scores or the lowest N uncertainties that pass an absolute error threshold, an error score ratio test, a heuristic algorithm, or a combination thereof, to prune obvious false hypotheses (See at least Samarasekera Paragraphs 0054-0056, the reprojection errors are calculated for each image point; See at least Samarasekera Paragraphs 0049-0050 and 0058, the pose hypothesis with the highest score is selected, and the reprojection error is used during the scoring phase; the dropping out of the least scoring half is interpreted as an error score ratio test), N being an positive integer number (See at least Samarasekera Paragraphs 0049-0050, the scoring is done based on matching points, which would be a positive integer).
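For illustration only, the pruning recited in claim 11 (keeping the N best hypotheses that pass an absolute error threshold and an error score ratio test) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Samarasekera's scoring: lower error is assumed to be better, the ratio test is assumed to compare each error against the best error, and the names `prune_hypotheses`, `abs_threshold`, and `max_ratio` are invented for this example.

```python
def prune_hypotheses(errors, n, abs_threshold, max_ratio):
    """Return the names of at most n hypotheses that pass both tests, best first.

    errors:        dict mapping a hypothesis name to its error (lower is better).
    abs_threshold: a hypothesis must have error <= abs_threshold (absolute test).
    max_ratio:     a hypothesis must have error <= max_ratio * best error
                   (assumed form of the error score ratio test).
    """
    if not errors:
        return []
    best = min(errors.values())
    passing = [name for name, err in errors.items()
               if err <= abs_threshold and err <= max_ratio * best]
    # Sort by error so the lowest-error hypotheses come first, then keep n.
    passing.sort(key=lambda name: errors[name])
    return passing[:n]
```

With errors of 1.0, 1.5, 4.0, and 9.0, an absolute threshold of 5.0, and a ratio of 2.0, only the first two hypotheses survive: the third passes the absolute test but fails the ratio test, and the fourth fails both.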

	Regarding Claim 12, Samarasekera discloses the system further comprises at least one non-visual sensor configured to track the movement trajectory of the at least one visual sensor (See at least Samarasekera Paragraph 0011, position measurement data can be measured by an IMU, which is interpreted as a non-visual sensor).

Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Samarasekera in view of Goncalves, and in further view of Poropat (US 20020049530 A1) (Hereinafter referred to as Poropat).

Regarding Claim 2, Samarasekera discloses outputting a winning hypothesis among the plurality of relocalization hypotheses (See at least Samarasekera Paragraphs 0057-0058 and Figure 6, the pose hypothesis with the highest global score is selected).
Even though Samarasekera discloses outputting a winning hypothesis, modified Samarasekera fails to explicitly disclose outputting a refined relocalization pose, which is obtained by superimposing the tracked movement trajectory with the winning hypothesis.
However, Poropat teaches this limitation (See at least Poropat Paragraphs 0017-0022, the new position, which is interpreted as the refined relocalization pose, is determined using the initial position, which is interpreted as the winning hypothesis, and the displacement of the object, which is interpreted as the movement trajectory). 
It would have been obvious to one of ordinary skill to modify the teachings disclosed in modified Samarasekera with Poropat to output the winning hypothesis with a refined relocalization pose. This would allow the system to know where the visual sensor was moved to (See at least Poropat Paragraphs 0017-0022), which would increase the awareness of the system. 
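For illustration only, obtaining a refined pose by superimposing a tracked movement onto a winning hypothesis can be sketched as a pose composition. This is a minimal sketch and is not taken from Poropat or from the application: a planar pose `(x, y, theta)`, a motion expressed in the sensor's own frame, and the name `apply_tracked_motion` are all assumptions made for this example.

```python
import math

def apply_tracked_motion(hypothesis, motion):
    """Compose a planar pose hypothesis with a tracked relative motion.

    hypothesis: (x, y, theta) of the sensor in the map frame, theta in radians.
    motion:     (dx, dy, dtheta) tracked in the sensor frame at the hypothesis.
    """
    x, y, theta = hypothesis
    dx, dy, dtheta = motion
    # Rotate the sensor-frame displacement into the map frame, then add it.
    new_x = x + math.cos(theta) * dx - math.sin(theta) * dy
    new_y = y + math.sin(theta) * dx + math.cos(theta) * dy
    # Wrap the heading into [-pi, pi).
    new_theta = (theta + dtheta + math.pi) % (2 * math.pi) - math.pi
    return new_x, new_y, new_theta
```

For example, a hypothesis at `(1, 2)` facing along the positive y axis, followed by one unit of forward motion, yields a refined pose at approximately `(1, 3)` with the same heading.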

Regarding Claim 3, Samarasekera discloses hypotheses refinement is implemented by using…the additional visual measurement (See at least Samarasekera Paragraphs 0057-0058 and Figure 6, the views from each camera are used to determine the pose hypothesis with the highest global score to be selected, which is interpreted as using additional visual measurement). 
Modified Samarasekera fails to explicitly disclose hypotheses refinement is implemented by using… the movement trajectory. 
However, Poropat teaches this limitation (See at least Poropat Paragraphs 0017-0022, the new position of the object is determined using the displacement, which is interpreted as the movement trajectory). 
It would have been obvious to one of ordinary skill to modify the teachings disclosed in modified Samarasekera with Poropat to implement hypothesis refinement by using the movement trajectory. Using the movement trajectory allows the system to know where the visual sensor was moved to (See at least Poropat Paragraphs 0017-0022), which would increase the awareness of the system and allow the system to determine the object’s current position. 

Claims 4 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Samarasekera in view of Goncalves, and in further view of Dan (US 20210339393 A1) (Hereinafter referred to as Dan).

Regarding Claim 4, modified Samarasekera fails to explicitly disclose in response to the plurality of relocalization hypotheses rejected in total, moving the at least one visual sensor to capture a new snapshot with new relocalization hypotheses for next round of self-relocalization.
However, Dan teaches taking new snapshots from a different pose when the system cannot be localized using the first snapshot (See at least Dan Paragraph 0108, when the current position cannot be determined using the features of the ground from the original photograph, a photograph of the environment is taken instead, to start a next round of self-relocalization). 
It would have been obvious to one of ordinary skill to modify the teachings disclosed in modified Samarasekera with Dan to move the visual sensor to capture a new snapshot when all the relocalization hypotheses are rejected. This would allow the system to determine the current position using a second snapshot when the features from the first snapshot cannot be used to identify the current position (See at least Dan Paragraph 0108), which would increase the effectiveness of the system. 

Regarding Claim 10, Samarasekera fails to explicitly disclose instructing the motion system to move the at least one visual sensor from the first pose to a second pose with a movement trajectory tracked. 
However, Goncalves teaches this limitation (See at least Goncalves Paragraphs 0161 and 0163, the camera is moved while taking images and the distance of movement between images is determined). 
It would have been obvious to one of ordinary skill to modify the teachings disclosed in Samarasekera with Goncalves to move the visual sensor from the first pose to a second pose with the movement trajectory tracked. Taking multiple images at different poses allows the system to determine common features within the images, and use those common features to identify 3-d positions of the feature points relative to the camera (See at least Goncalves Paragraphs 0165-0168 and Figure 10). This would allow the system to localize itself using recognized features and their locations relative to the visual sensor, and allow the system to mark positions of new landmarks (See at least Goncalves Figure 10).
Modified Samarasekera fails to explicitly disclose wherein in response to zero relocalization hypothesis established at the first pose, the processor is configured to further perform:
instructing the at least one visual sensor for visual measurement at the second pose; and 
re-implementing the single-shot relocalization to localize the system in the pre-built visual map using the visual measurement at the second pose.
However, Dan teaches this limitation (See at least Dan Paragraph 0108, when the current position cannot be determined using the features of the ground from the original photograph, a photograph of the environment is taken instead, to start a next round of self-relocalization). 
It would have been obvious to one of ordinary skill to modify the teachings disclosed in modified Samarasekera with Dan to have the visual sensor capture a new snapshot at the second pose when all the relocalization hypotheses are rejected. This would allow the system to determine the current position using a second snapshot when the features from the first snapshot cannot be used to identify the current position (See at least Dan Paragraph 0108), which would increase the effectiveness of the system. 

Allowable Subject Matter
Claims 7-8 and 13-14 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.

Claims 15-17 and 20 are allowed.

Claims 18-19 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Duong et al (US 20210174539 A1) teaches a method for estimating the pose of a camera using the images.
Zhang et al (US 10571926 B1) teaches using features from images to determine the location.
Balan et al (US 20190325600 A1) teaches using the features from images to determine the pose of a handheld object.
Kawanishi et al (US 10393515 B2) teaches determining a movement candidate by taking images at different positions.
Park et al (US 20190138026 A1) teaches imposing the movement trajectory onto the plurality of localization hypotheses.
Heinla et al (US 20180253107 A1) teaches localizing a robot using images.
Denda (US 20160152241 A1) teaches determining the movement amount of an autonomous mobile body.


Any inquiry concerning this communication or earlier communications from the examiner should be directed to ESVINDER SINGH whose telephone number is (571)272-7875. The examiner can normally be reached Monday-Friday, 9 am-5 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abby Lin, can be reached at 571-270-3976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/E.S./Examiner, Art Unit 3664
/BHAVESH V AMIN/Primary Examiner, Art Unit 3664