DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This office action is made in response to Applicant’s remarks filed on 1/21/22. Claims 9-10 and 19-20 have been added. Claims 1-5, 7-8, 11-15, and 17 have been amended. Claims 1-20 are pending. 

Response to Arguments
Applicant’s amendments regarding Examiner's rejections under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, have been considered and are accepted. These rejections are accordingly withdrawn.
Applicant’s arguments with respect to Examiner's rejections under 35 USC 103 have been considered but are not persuasive. Therefore, these rejections are maintained.
Regarding claim 1, Applicant suggests that the cited prior art does not teach "calculating a relative pose between the current visual data frame and the target visual data frame, in response to the target visual data frame being found during the loop closure detection; and performing a loop closure optimization on pose data of one or more frames between the current visual data frame and the target visual data frame, by using the relative pose, wherein the pose data are calculated based on the plurality of distance data frames" (Remarks at pg. 10). Examiner, however, respectfully disagrees.
Namely, Liang discloses calculating a relative pose (e.g. at least translation and rotation between the two frames, see e.g. at least pg. 10, p. 2 – pg. 11, p. 2) between the current visual data frame and the target visual data frame, in response to the target visual data frame being found during the loop closure detection (id., calculating the translation and rotation between two frames); and 
performing a loop closure optimization (e.g. at least map optimization, see e.g. at least Fig. 3-1, and related text) on pose data of one or more frames between the current visual data frame and the target visual data frame, by using the relative pose, wherein the pose data are calculated based on the plurality of distance data frames (see e.g. at least Abstract, pg. 6, p. 1 – pg. 7, p. 2, pg. 8, p. 1, pg. 18, p. 1-4, Fig. 3-1, 3-2, 3-5, and related text).


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Claim 1 recites: "A computer implemented visual assisted distance-based simultaneous localization and mapping method for a mobile robot, comprising executing on a processor the steps of:
obtaining a plurality of distance data frames from a laser sensor and a plurality of visual data frames from a camera, wherein each of the plurality of visual data frames corresponds to one of the plurality of distance data frames, and the corresponding visual data frame and the distance data frame are obtained at a same time;
performing a loop closure detection based on a current visual data frame in the plurality of visual data frames, and determining whether a target visual data frame is found during the loop closure detection, wherein the target visual data frame and the current visual data frame are obtained in a same scene;
calculating a relative pose between the current visual data frame and the target visual data frame, in response to the target visual data frame being found during the loop closure detection; and
performing a loop closure optimization on pose data of one or more frames between the current visual data frame and the target visual data frame, by using the relative pose, wherein the pose data are calculated based on the plurality of distance data frames."
This language is vague and indefinite for at least the following reasons:
Generally Unclear: The terms “scene” and “same scene” as employed in the claim are vague and indefinite as the scope of these terms is not clearly articulated. Namely, the term “scene” is not defined by the claims (or the Specification), such that the metes and bounds of the scope of this term are vague and indefinite. For example, it is unclear what constitutes a scene (e.g. is a scene a general environment that is observed within an indefinite or otherwise undefined, limited, timeframe?). Moreover, is the scene defined by the configuration of an environment itself (regardless of a perspective from which it is observed), or is the scene further defined by a perspective by which the environment is observed? Moreover, the metes and bounds of what constitutes a “scene” are undefined, such that it is unclear what distinguishes a “scene” from another “scene”, and correspondingly, what constitutes a “same scene”. Accordingly, the metes and bounds of the scope of the terms “scene” and “same scene” are vague and indefinite, such that persons of ordinary skill in the art would not readily be able to ascertain the scope of these terms and the corresponding language of the claim. Finally, the scope and nature of the term “[current/target] visual data frame” is vague and indefinite (see rejection of claim 3, below).
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
"A computer implemented visual assisted distance-based simultaneous localization and mapping method for a mobile robot, comprising executing on a processor the steps of:
obtaining a plurality of distance data frames from a laser sensor and a plurality of visual data frames from a camera, wherein each of the plurality of visual data frames corresponds to one of the plurality of distance data frames, and the corresponding visual data frame and the distance data frame are obtained at a same time;
performing a loop closure detection based on a current visual data frame in the plurality of visual data frames, and determining whether a target visual data frame is found during the loop closure detection, wherein the target visual data frame and the current visual data frame are obtained 
calculating a relative pose between the current visual data frame and the target visual data frame, in response to the target visual data frame being found during the loop closure detection; and
performing a loop closure optimization on pose data of one or more frames between the current visual data frame and the target visual data frame, by using the relative pose, wherein the pose data are calculated based on the plurality of distance data frames."
Claims 2-7, 10, and 19-20 are further rejected as depending on this claim.
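For illustration only, the relationship between the recited "relative pose" and the recited "pose data" can be sketched as below. This is a minimal sketch under stated assumptions and is not drawn from the claims, the Specification, or Liang: poses are assumed to be planar (x, y, theta) tuples in a common map frame, and the names relative_pose, wrap_angle, and loop_closure_residual are hypothetical. The sketch only computes the mismatch between the relative pose implied by the distance-based poses and a visually measured relative pose; an actual loop closure optimization would distribute such a mismatch over the intermediate frames, which is not shown.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def relative_pose(pose_a, pose_b):
    """Pose of frame b expressed in the coordinates of frame a.

    Both inputs are hypothetical (x, y, theta) tuples in a common map frame.
    """
    xa, ya, ta = pose_a
    xb, yb, tb = pose_b
    dx, dy = xb - xa, yb - ya
    c, s = math.cos(ta), math.sin(ta)
    # Rotate the map-frame displacement into frame a.
    return (c * dx + s * dy, -s * dx + c * dy, wrap_angle(tb - ta))


def loop_closure_residual(pose_target, pose_current, measured_relative):
    """Measured relative pose minus the one implied by the stored poses."""
    predicted = relative_pose(pose_target, pose_current)
    return (
        measured_relative[0] - predicted[0],
        measured_relative[1] - predicted[1],
        wrap_angle(measured_relative[2] - predicted[2]),
    )


# Hypothetical values: the stored current pose has drifted in y and heading.
residual = loop_closure_residual((0.0, 0.0, 0.0), (1.0, 0.2, 0.1), (1.0, 0.0, 0.0))
```

A nonzero residual indicates accumulated drift in the stored poses relative to the visual measurement.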

Claim 2 recites: "The method of claim 1, wherein the step of performing the loop closure detection based on the current visual data frame in the plurality of visual data frames:
finding at least one candidate visual data frame as the target visual data frame, wherein a similarity between the current visual data frame and each of the at least one candidate visual data frame is greater than a preset threshold, the at least one candidate visual data frame precedes the current visual data frame, and a frame spacing between the current visual data frame and each of the at least one candidate visual data frame is within a preset range."
This language is also rejected as vague and indefinite for at least the following reasons:
Subjective/Relative Terms: The language "a similarity between the current visual data frame and each of the at least one candidate visual data frame is greater than a preset threshold” is subjective and/or relative such that the scope of the term is unclear (i.e. the metes and bounds of the term are vaguely articulated such that persons of ordinary skill in the art would not be reasonably apprised of the precise scope of the term and corresponding claim). Furthermore, the language is not defined by the claim, and the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Generally Unclear: The language "wherein the step of performing the loop closure detection based on the current visual data frame in the plurality of visual data frames: finding … the at least one candidate visual data frame precedes the current visual data frame, and a frame spacing between the current visual data frame and each of the at least one candidate visual data frame is within a preset range" is generally narrative and indefinite, failing to conform with current U.S. practice. This language appears to be a literal translation into English from a foreign document and is replete with grammatical and idiomatic errors. First, it is unclear how the body of the claim is connected to the invention as a whole. Second, it is unclear what constitutes an order, hierarchy, space, and/or distance of [candidate/current] visual data frames. Are the frames defined by a score, by time, or by an arbitrarily (or otherwise undefined) assigned enumeration or valuation? Accordingly, it is unclear what constitutes “space” or “spacing” between [candidate/current] data frames.
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
"The method of claim 1, wherein the step of performing the loop closure detection based on the current visual data frame in the plurality of visual data frames further comprises:
finding at least one candidate visual data frame as the target visual data frame."
Claims 3-5 and 20 are further rejected as depending on this claim.
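For illustration only, one possible reading of the candidate search recited in claim 2 is sketched below. The sketch rests on assumptions that the claim itself does not supply: the similarity is taken to be a precomputed numeric score per earlier frame, the "frame spacing" is taken to be a difference of frame indices, and the function name find_candidates and all parameter names and values are hypothetical.

```python
def find_candidates(similarities, current_index, threshold, min_spacing, max_spacing):
    """Return indices of earlier frames that pass both assumed tests.

    similarities[i] is an assumed similarity score between frame i and the
    current frame; spacing is assumed to be a difference of frame indices.
    """
    candidates = []
    # Only frames that precede the current frame are considered.
    for index, score in enumerate(similarities[:current_index]):
        spacing = current_index - index
        if score > threshold and min_spacing <= spacing <= max_spacing:
            candidates.append(index)
    return candidates


# Hypothetical scores for frames 0..4, with frame 4 as the current frame.
scores = [0.9, 0.2, 0.85, 0.95, 0.99]
found = find_candidates(scores, current_index=4, threshold=0.8,
                        min_spacing=2, max_spacing=10)
```

Under a different definition of spacing (for example, elapsed time), the same frames could pass or fail the test, which bears on the ambiguity noted above.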

Claim 3 recites: "The method of claim 2, wherein the step of performing the loop closure detection based on the current visual data frame in the plurality of visual data frames further comprises:
checking the current visual data frame and the target visual data frame, and removing at least one unqualified visual data frame from the target visual data frame, wherein the at least one unqualified visual data frame is similar to the current visual data frame in appearance, but are not obtained in the same scene as the current visual data frame."
This language is also rejected as vague and indefinite for the same reasons discussed in the rejection of claims 1-2 above. Moreover, this language is further rejected as vague and indefinite for at least the following reasons:
Generally Unclear: The language "checking the current visual data frame and the target visual data frame, and removing at least one unqualified visual data frame from the target visual data frame, wherein the at least one unqualified visual data frame is similar to the current visual data frame in appearance, but are not obtained in the same scene as the current visual data frame" is generally narrative and indefinite, failing to conform with current U.S. practice. This language appears to be a literal translation into English from a foreign document and is replete with grammatical and idiomatic errors. First, it is unclear what constitutes “checking”, how this action is performed, and what the intended consequence of this action might be. Second, the language “removing at least one unqualified visual data frame from the target visual data frame” is vague and indefinite as the nature of the terms [at-least-one-unqualified/target] visual data frame is unclear and the relationship between the at least one unqualified visual data frame and the target visual data frame is unclear (e.g. is a visual data frame a set of elements comprising other visual data frames? How does one “remove” an unqualified visual data frame from a target visual data frame, and what does it mean to do so?). Third, it is unclear what constitutes “unqualified” (and “qualified”) as the metes and bounds of this term are undefined. Fourth, for reasons discussed in the paragraphs above, it is unclear what constitutes “similar to the current visual data frame in appearance”. Likewise, the language “but are not obtained in the same scene as the current visual data frame” is similarly vague and indefinite.
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
"The method of claim 2, wherein the step of performing the loop closure detection based on the current visual data frame in the plurality of visual data frames further comprises:
analyzing the current visual data frame and the target visual data frame, and identifying at least one unqualified visual data frame simultaneously with the current visual data frame."
Claims 4 and 20 are further rejected as depending on this claim.

Claim 4 recites: “The method of claim 3, wherein the step of checking the current visual data frame and the target visual data frame comprises:
performing a random sampling consistency filtering on map point data of the current visual data frame and map point data of the target visual data frame.”
This language is also rejected as vague and indefinite for the same reasons discussed in the rejection of claims 1-3 above.
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
“The method of claim 3, wherein the step of analyzing the current visual data frame and the target visual data frame comprises:
performing a random sampling consistency filtering on map point data of the current visual data frame and map point data of the target visual data frame.”
Claim 20 is further rejected as depending on this claim.

Claim 9 recites: “The method of claim 8, wherein when a tracking is lost during a booting or localization process of the mobile robot, the step of obtaining the current visual data by the camera is performed; and
the loop closure detection on the current visual data is performed based on similarity of images.”
This language is rejected as vague and indefinite for at least the following reasons:
Idiomatic Language: The language “when a tracking is lost” is generally narrative and indefinite, failing to conform with current U.S. practice. This language appears to be a literal translation into English from a foreign document and is replete with grammatical and idiomatic errors. Namely, the expression “when a tracking is lost” as used in the claim is vague and indefinite and leaves the reader in doubt as to the meaning of the technical features to which it refers, thereby rendering the definition and scope of the subject-matter of said claim unclear. 
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
“The method of claim 8, wherein 
the loop closure detection on the current visual data is performed based on similarity of images.”

Claim 11 recites: "A mobile robot, comprising:
a processor;
a laser sensor:
a camera; and
one or more computer programs stored in the memory and executable on the processor, wherein the processor is coupled to each of the laser sensor and the camera, and the one or more computer programs comprise:
instructions for obtaining a plurality of distance data frames from the laser sensor and a plurality of visual data frames from the camera, wherein each of the plurality of visual data frames corresponds to one of the plurality of distance data frames, and the corresponding visual data frame and the distance data frame are obtained at a same time;
instructions for performing a loop closure detection based on a current visual data frame in the plurality of visual data frames, and determining whether a target visual data frame is found during the loop closure detection, wherein the target visual data frame and the current visual data frame are obtained in a same scene:
instructions for calculating a relative pose between the current visual data frame and the target visual data frame, in response to the target visual data frame being found during the loop closure detection; and 
instructions for performing a loop closure optimization on pose data of one or more frames between the current visual data frame and the target visual data frame by using the relative pose, wherein the pose data are calculated based on the plurality of distance data frames.”
This language is also rejected as vague and indefinite for the same reasons discussed in the rejection of claim 1 above. Moreover, this language is further rejected as vague and indefinite for at least the following reasons:
Antecedent Basis: The following terms lack proper antecedent basis:
“the memory”
“the corresponding visual data frame”
“the distance data frame”
Intended Use: The claim contains the following language that is vague and indefinite as it is unclear whether the scope of this language is intended to affirmatively require specific performance by the processor or whether this language is deliberately articulated as an expression of intended use of (transitory) instructions:
“instructions for obtaining … time”
“instructions for performing … scene”
“instructions for calculating … detection”
“instructions for performing … frames”
Accordingly, this language does not serve to patentably distinguish the claimed structure over that of the reference. See In re Pearson, 181 USPQ 641; In re Yanush, 177 USPQ 705; In re Finsterwalder, 168 USPQ 530; In re Casey, 152 USPQ 235; In re Otto, 136 USPQ 458; Ex parte Masham, 2 USPQ2d 1647.
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
"A mobile robot, comprising:
a processor;
a laser sensor:
a camera; and
one or more computer programs stored in a non-transitory memory and executable on the processor, wherein the processor is coupled to each of the laser sensor and the camera, and the one or more computer programs comprise:
instructions [intended for obtaining a plurality of distance data frames from the laser sensor and a plurality of visual data frames from the camera, wherein each of the plurality of visual data frames corresponds to one of the plurality of distance data frames, and wherein each distance data frame is obtained contemporaneously with a corresponding visual data frame ];
instructions [intended for performing a loop closure detection based on a current visual data frame in the plurality of visual data frames, and determining whether a target visual data frame is found during the loop closure detection, wherein the target visual data frame and the current visual data frame are obtained ]:
instructions [intended for calculating a relative pose between the current visual data frame and the target visual data frame, in response to the target visual data frame being found during the loop closure detection]; and 
instructions [intended for performing a loop closure optimization on pose data of one or more frames between the current visual data frame and the target visual data frame by using the relative pose, wherein the pose data are calculated based on the plurality of distance data frames].”
Claims 12-18 are further rejected as depending on this claim.

Claim 12 recites: "The mobile robot of claim 11, wherein the instructions for performing the loop closure detection based on the current visual data frame in the plurality of visual data frames comprise:
instructions for finding at least one candidate visual data frame as the target visual data frame, wherein a similarity between the current visual data frame and each of the at least one candidate visual data frame is greater than a preset threshold, the at least one candidate visual data frame precedes the current visual data frame, and a frame spacing between the current visual data frame and each of the at least one candidate visual data frame is within a preset range."
This language is also rejected as vague and indefinite for the same reasons discussed in the rejection of claims 1-2 and 11 above. 
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
"The mobile robot of claim 11, wherein the instructions [intended for performing the loop closure detection based on the current visual data frame in the plurality of visual data frames comprise:
instructions [intended for finding at least one candidate visual data frame as the target visual data frame]]."
Claims 13-15 are further rejected as depending on this claim.

Claim 13 recites: "The mobile robot of claim 12, wherein the instructions for performing the loop closure detection based on the current visual data frame in the plurality of visual data frames further comprise:
instructions for checking the current visual data frame and the target visual data frame, and removing at least one unqualified visual data frame from the target visual data frame, wherein the at least one unqualified visual data frame is similar to the current visual data frame in appearance, but are not obtained in the same scene as the current visual data frame."
This language is also rejected as vague and indefinite for the same reasons discussed in the rejection of claims 1-3 and 11-12 above. 
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
"The mobile robot of claim 12, wherein the instructions [intended for performing the loop closure detection based on the current visual data frame in the plurality of visual data frames further comprise:
instructions [intended for analyzing the current visual data frame and the target visual data frame, and identifying at least one unqualified visual data frame simultaneously with the current visual data frame]]."
Claim 14 is further rejected as depending on this claim.

Claim 14 recites: "The mobile robot of claim 13, wherein the instructions for checking the current visual data frame and the target visual data frame comprise:
instructions for performing a random sampling consistency filtering on map point data of the current visual data frame and map point data of the target visual data frame."
This language is also rejected as vague and indefinite for the same reasons discussed in the rejection of claims 1-4 and 11-13 above. 
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
"The mobile robot of claim 13, wherein the instructions [intended for analyzing the current visual data frame and the target visual data frame comprise:
instructions [intended for performing a random sampling consistency filtering on map point data of the current visual data frame and map point data of the target visual data frame]]."

Claim 16 recites: "The mobile robot of claim 11, wherein the one or more computer programs further comprise:
instructions for storing the pose data after the loop closure optimization in map data.”
This language is also rejected as vague and indefinite for the same reasons discussed in the rejection of claim 11 above. Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
"The mobile robot of claim 11, wherein the one or more computer programs further comprise:
instructions [intended for storing the pose data after the loop closure optimization in map data].”
Claim 17 is further rejected as depending on this claim.

Claim 17 recites: “The mobile robot of claim 16, wherein the one or more computer programs further comprise:
instructions for obtaining current visual data from the camera;
instructions for performing a loop closure detection on the current visual data, and determining whether a visual data frame matching the current visual data is found in the plurality of visual data frames;
instructions for calculating a relative pose between the current visual data and the visual data frame matching the current visual data, in response to the visual data frame matching the current visual data being found in the plurality of visual data frames; and
instructions for current pose of the mobile robot by using the relative pose between the current visual data and the visual data frame matching the current visual data, and pose data associated with the visual data frame matching the current visual data.”
This language is also rejected as vague and indefinite for the same reasons discussed in the rejection of claim 11 above. 
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
“The mobile robot of claim 16, wherein the one or more computer programs further comprise:
instructions [intended for obtaining current visual data from the camera];
instructions [intended for performing a loop closure detection on the current visual data, and determining whether a visual data frame matching the current visual data is found in the plurality of visual data frames];
instructions [intended for calculating a relative pose between the current visual data and the visual data frame matching the current visual data, in response to the visual data frame matching the current visual data being found in the plurality of visual data frames]; and
instructions [intended for current pose of the mobile robot by using the relative pose between the current visual data and the visual data frame matching the current visual data, and pose data associated with the visual data frame matching the current visual data].”

Claim 19 recites: "The method of claim 1, further comprising:
in response to the loop closure optimization being completed or the target visual data frame not being found during the loop closure detection, using another visual data frame in the plurality of visual data frames as the current data frame, and returning to the step of performing the loop closure detection based on the current visual data frame in the plurality of visual data frames, and determining whether the target visual data frame is found during the loop closure detection."
This language is rejected as vague and indefinite for at least the following reasons:
Antecedent Basis: The following terms lack proper antecedent basis:
“the current data frame”
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
"The method of claim 1, further comprising:
in response to the loop closure optimization being completed or the target visual data frame not being found during the loop closure detection, using another visual data frame in the plurality of visual data frames as a current data frame, and returning to the step of performing the loop closure detection based on the current visual data frame in the plurality of visual data frames, and determining whether the target visual data frame is found during the loop closure detection."

Claim 20 recites: "The method of claim 4, wherein the step of performing the random sampling consistency filtering on the map point data of the current visual data frame and the map point data of the target visual data frame comprises:
performing a point pair matching between feature points in the map point data of the target visual data frame and feature points in the map point data of the current visual data frame;
estimating a pose between the current visual data frame and the target visual data frame using randomly adopted point pairs;
verifying correctness of the pose using point pairs other than the randomly adopted point pairs, taking a point pair meeting the pose as inner points, and recording a number of the inner points;
repeating the point pair matching for several times, and determining whether a number of the inner points of a pose with the most inner points is less than a threshold; and
determining that the target visual data frame is the unqualified visual data frame, in response to the number of the inner points of the pose with the most inner points being less than the threshold."
This language is also rejected as vague and indefinite for the same reasons discussed in the rejection of claim 1 above. Moreover, this language is further rejected as vague and indefinite for at least the following reasons:
Generally Unclear: The language “estimating a pose … using randomly adopted point pairs” is vague and indefinite as the scope of this language is not clearly articulated. Namely, it is unclear how this operation is performed. More specifically, it is unclear if this operation is intended to be distinct from the language “performing a point pair matching … current visual data frame”, or whether these are directed to the same operation. Similarly, the language “verifying correctness of the pose using point pairs other than the randomly adopted point pairs … a number of the inner points” is vague and indefinite as the scope of this language is not clearly articulated: it is unclear how this operation is performed, and whether this operation is intended to be distinct from the language “performing a point pair matching … current visual data frame”. Furthermore, the terms “correctness” and “verifying correctness” are vague and undefined (e.g. what constitutes “correctness”?). Moreover, the language “taking a point pair meeting the pose as inner points” is generally narrative and indefinite and unclear as to its intended meaning, failing to conform with current U.S. practice. This language appears to be a literal translation into English from a foreign document and is replete with grammatical and idiomatic errors. For example, it is unclear if “taking a point pair meeting” is intended to be directed to an operation distinct from “performing a point pair matching”. Moreover, the term “the pose” lacks proper antecedent basis. Furthermore, the term “inner points” is vague and undefined.
Accordingly, it is unclear what meaning is intended by the language “determining whether a number of the inner points of a pose with the most inner points is less than a threshold” and “determining … in response to the number of the inner points of the pose with the most inner points being less than the threshold.”
Relative Terms: The term “several” and/or “for several times” is subjective and/or relative such that the scope of the term is unclear (i.e. the metes and bounds of the term are vaguely articulated such that persons of ordinary skill in the art would not be reasonably apprised of the precise scope of the term and corresponding claim). Furthermore, the term(s) is/are not defined by the claim, and the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Antecedent Basis: The following terms lack proper antecedent basis:
“the inner points of a pose”
“the most inner points”
“the unqualified visual data frame”
Undefined term: The term “the unqualified visual data frame” is vague and undefined.
Although the following language does not necessarily cure the issues discussed above, for purposes of examination under 35 USC 102 and 103, Examiner will interpret this language as reading:
"The method of claim 4, wherein the step of performing the random sampling consistency filtering on the map point data of the current visual data frame and the map point data of the target visual data frame comprises:
performing a point pair matching between feature points in the map point data of the target visual data frame and feature points in the map point data of the current visual data frame;
estimating a pose between the current visual data frame and the target visual data frame using randomly adopted point pairs;
verifying correctness of a pose using point pairs other than the randomly adopted point pairs, 
repeating the point pair matching 
determining 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Liang (Liang, X. Research on Robot Indoor Localization and Mapping Based on Integration of Laser and Monocular Vision. Chinese Master's Theses Full-text Database. Information Science and Technology 2016. For purposes of this examination, Examiner will refer to the English language translation of this reference provided with this Office Action) in view of Wu (Wu Q. Visual and LiDAR-based for the mobile 3D mapping (IEEE 2016)).

Regarding claim 1, Liang discloses a visual assisted distance-based simultaneous localization and mapping method for a mobile robot (see e.g. at least Abstract, Fig. 3-1, 3-2, and related text), comprising the steps of:
obtaining a plurality of distance data frames from a laser sensor (e.g. at least laser sensor, see e.g. at least Abstract, Fig. 3-1, and related text) and a plurality of visual data frames from a camera (e.g. at least camera, see e.g. at least § 3.2.3, Fig. 3-1, and related text), wherein each of the plurality of visual data frames corresponds to one of the plurality of distance data frames, and the corresponding visual data frame and the distance data frame are obtained at a same time (see e.g. at least Abstract, § 3.1, Fig. 3-1, and related text);
performing a loop closure detection based on a current visual data frame in the plurality of visual data frames, and determining whether a target visual data frame is found during the loop closure detection (see e.g. at least §§ 3.2.1, 3.2.3-3.2.4, 3.3.1, Fig. 3-2, and related text), wherein the target visual data frame and the current visual data frame are obtained (id.);
calculating a relative pose (e.g. at least translation and rotation between the two frames, see e.g. at least § 3.2.2) between the current visual data frame and the target visual data frame, in response to the target visual data frame being found during the loop closure detection (id., calculating the translation and rotation between two frames); and 
performing a loop closure optimization (e.g. at least map optimization, see e.g. at least Fig. 3-1, and related text) on pose data of one or more frames between the current visual data frame and the target visual data frame, by using the relative pose, wherein the pose data are calculated based on the plurality of distance data frames (see e.g. at least Abstract, §§ 3.1-3.2.1, 3.2.4, 3.3.1, Fig. 3-1, 3-2, 3-5, and related text).
Additionally, Wu teaches limitations not expressly disclosed by Liang including namely: computer-implemented visual assisted distance-based simultaneous localization and mapping method for a mobile robot (see e.g. at least Abstract, Fig. 1, and related text), comprising executing steps on a processor (id.).
Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the invention to modify the teaching of Liang by configuring a computer-implemented visual assisted distance-based simultaneous localization and mapping method for a mobile robot, comprising executing the steps of the method on a processor as taught by Wu in order to improve accuracy of autonomous robot navigation optimization by creating a 3D map with RGB information using LiDAR and panoramic camera (Wu: Abstract, p. 1527).

Regarding claim 2, Modified Liang teaches that the step of performing the loop closure detection based on the current visual data frame in the plurality of visual data frames further comprises:
finding at least one candidate visual data frame as the target visual data frame (Liang: see e.g. at least §§ 3.2.1-3.2.2, wherein the error is smaller than a certain threshold; see also e.g. at least § 3.2.2, wherein the number of inliers meet the requirement; see also e.g. at least § 3.2.4, using key frames that are near the current frame and a threshold number M of co-visibility points included between the two frames).

Regarding claim 3, Modified Liang teaches that the step of performing the loop closure detection based on the current visual data frame in the plurality of visual data frames further comprises:
analyzing the current visual data frame and the target visual data frame, and identifying at least one unqualified visual data frame, wherein the at least one unqualified visual data frame is not obtained simultaneously with the current visual data frame (Liang: see e.g. at least §§ 3.1-3.2.1, 3.2.4, eliminating all key frames with a score lower than Smin).

Regarding claim 4, Modified Liang teaches that the step of analyzing the current visual data frame and the target visual data frame comprises:
performing a random sampling consistency filtering on map point data of the current visual data frame and map point data of the target visual data frame (Liang: see e.g. at least §§ 3.1-3.2.1, 3.2.4).

Regarding claim 5, Liang discloses the method of claim 2, wherein a maximum value is positively correlated with a frame rate of the distance data frames (Liang: see e.g. at least § 3.2.4).

Regarding claim 6, Modified Liang teaches storing the pose data after the loop closure optimization in map data (Liang: see e.g. at least § 3.1, Fig. 3-1, and related text).

Regarding claim 7, Modified Liang teaches:
obtaining current visual data from the camera (Liang: see e.g. at least § 3.2.1, Fig. 3-1, 3-2, and related text, continuously extracting the current frame);
performing a loop closure detection on the current visual data, and determining whether a visual data frame matching the current visual data is found in the plurality of visual data frames (Liang: see e.g. at least §§ 3.2.1, 3.2.3, 3.2.4, 3.3.1, Fig. 3-2, and related text);
calculating a relative pose between the current visual data and the visual data frame matching the current visual data, in response to the visual data frame matching the current visual data being found in the plurality of visual data frames (Liang: see e.g. at least § 3.2.2, calculating the translation and rotation between two frames); and
calculating a current pose of the mobile robot by using the relative pose between the current visual data and the visual data frame matching the current visual data, and pose data associated with the visual data frame matching the current visual data (Liang: see e.g. at least Abstract, §§ 3.1, 3.2.1, 3.2.4, 3.3.1, Fig. 3-1, 3-2, 3-5, and related text).

Regarding claim 8, Liang discloses a visual assisted distance-based simultaneous localization and mapping method for a mobile robot (see e.g. at least Abstract, Fig. 3-1, 3-2, and related text), comprising the steps of:
obtaining current visual data by a camera (e.g. at least camera, see e.g. at least § 3.2.3, Fig. 3-1, and related text);
performing a loop closure detection on the current visual data, and determining whether a visual data frame matching the current visual data is found in a plurality of stored visual data frames (see e.g. at least Abstract, §§ 3.1, 3.2.1, 3.2.3, 3.2.4, Fig. 3-1, 3-2, and related text);
calculating a relative pose between the current visual data and the visual data frame matching the current visual data, in response to the visual data frame matching the current visual data being found in the plurality of stored visual data frames (see e.g. at least § 3.2.2, calculating the translation and rotation between two frames); and
calculating a current pose of the mobile robot by using the relative pose, and pose data associated with the visual data frame matching the current visual data (see e.g. at least Abstract, §§ 3.1, 3.2.1, 3.2.4, 3.3.1, Fig. 3-1, 3-2, 3-5, and related text).
Additionally, Wu teaches limitations not expressly disclosed by Liang including namely: computer-implemented visual assisted distance-based simultaneous localization and mapping method for a mobile robot (see e.g. at least Abstract, Fig. 1, and related text), comprising executing steps on a processor (id.).
Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the invention to modify the teaching of Liang by configuring a computer-implemented visual assisted distance-based simultaneous localization and mapping method for a mobile robot, comprising executing the steps of the method on a processor as taught by Wu in order to improve accuracy of autonomous robot navigation optimization by creating a 3D map with RGB information using LiDAR and panoramic camera (Wu: Abstract, p. 1527).
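For illustration only, the final step recited in claims 7 and 8 (calculating a current pose from the relative pose and the pose data associated with the matching visual data frame) can be sketched as a composition of planar rigid transforms. The sketch below is an assumption made for clarity: it is not taken from Liang, Wu, or the application; the function name `compose_pose` is hypothetical; and it assumes a 2D pose of the form (x, y, heading) with the relative pose expressed in the coordinate frame of the matching frame.

```python
import math


def compose_pose(stored_pose, relative_pose):
    """Compose a stored map pose with a relative pose (hypothetical helper).

    stored_pose:   (x, y, theta) of the matching frame in map coordinates.
    relative_pose: (dx, dy, dtheta) of the current frame, expressed in the
                   coordinate frame of the matching frame (an assumption).
    Returns the current pose (x, y, theta) in map coordinates.
    """
    x, y, theta = stored_pose
    dx, dy, dtheta = relative_pose
    c, s = math.cos(theta), math.sin(theta)
    # Rotate the relative translation into the map frame, then add it.
    cur_x = x + c * dx - s * dy
    cur_y = y + s * dx + c * dy
    # Add the headings and wrap the result into (-pi, pi].
    cur_theta = math.atan2(math.sin(theta + dtheta), math.cos(theta + dtheta))
    return cur_x, cur_y, cur_theta


if __name__ == "__main__":
    # Matching frame at (1, 2) facing +y; current frame is 1 unit ahead of it.
    print(compose_pose((1.0, 2.0, math.pi / 2), (1.0, 0.0, 0.0)))
```

Under these assumptions, a robot whose matching frame was stored at (1, 2) facing the +y direction, and whose current frame lies one unit ahead of that frame, is placed at approximately (1, 3) with an unchanged heading.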

Regarding claim 9, Modified Liang teaches that during a booting or localization process of the mobile robot, the step of obtaining the current visual data by the camera is performed (see e.g. at least §§ 3.2.1, 3.2.3-3.2.4, 3.3.1, Fig. 3-2, and related text); and
the loop closure detection on the current visual data is performed based on similarity of images (Liang: see e.g. at least §§ 3.2.1, 3.2.2, 3.2.4).

Regarding claim 10, Modified Liang teaches that the camera comprises at least one of an RGB camera and a depth camera (Wu: e.g. at least panoramic camera, see e.g. at least Abstract, pg. 1525, p. 2, Fig. 1, and related text); and
feature data extracted from image data and/or depth data obtained by the camera are used as the plurality of visual data frames (Liang: see e.g. at least Abstract, § 3.1, Fig. 3-1, and related text), and wherein the feature data comprise map point data extracted from the image data and/or the depth data (Liang: see e.g. at least §§ 1.1-1.2).

Regarding claim 11, Liang discloses a mobile robot (e.g. at least robot, see e.g. at least Abstract), comprising:
a laser sensor (e.g. at least laser sensor, see e.g. at least Abstract, Fig. 3-1, and related text);
a camera (e.g. at least camera, see e.g. at least § 3.2.3, Fig. 3-1, and related text); and
one or more computer programs stored in a non-transitory memory and executable on the processor, wherein the processor is coupled to each of the laser sensor and the camera, and the one or more computer programs comprise (see e.g. at least Abstract, Fig. 3-1, and related text):
instructions [intended for obtaining a plurality of distance data frames from the laser sensor and a plurality of visual data frames from the camera, wherein each of the plurality of visual data frames corresponds to one of the plurality of distance data frames, and wherein each distance data frame is obtained contemporaneously with a corresponding visual data frame] (see e.g. at least Abstract, § 3.1, Fig. 3-1, and related text);
instructions [intended for performing a loop closure detection based on a current visual data frame in the plurality of visual data frames, and determining whether a target visual data frame is found during the loop closure detection, wherein the target visual data frame and the current visual data frame are obtained] (see e.g. at least §§ 3.2.1, 3.2.3-3.2.4, 3.3.1, Fig. 3-2, and related text);
instructions [intended for calculating a relative pose between the current visual data frame and the target visual data frame, in response to the target visual data frame being found during the loop closure detection] (see e.g. at least § 3.2.2, calculating the translation and rotation between two frames); and
instructions [intended for performing a loop closure optimization on pose data of one or more frames between the current visual data frame and the target visual data frame by using the relative pose, wherein the pose data are calculated based on the plurality of distance data frames] (e.g. at least translation and rotation between the two frames, see e.g. at least § 3.2.2).
Additionally, Wu teaches limitations not expressly disclosed by Liang including namely: a processor (e.g. at least computer, see e.g. at least Fig. 1, and related text);
one or more computer programs stored in a memory and executable on the processor (e.g. at least computer, see e.g. at least Fig. 1, and related text), wherein the processor is coupled to each of the laser sensor (e.g. at least LiDAR, see e.g. at least Fig. 1, and related text) and the camera (e.g. at least camera, see e.g. at least Fig. 1, and related text), and the one or more computer programs comprise instructions for performing the method (see e.g. at least Abstract, Fig. 1, and related text).
Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the invention to modify the teaching of Liang by configuring a processor; one or more computer programs stored in a memory and executable on the processor, wherein the processor is coupled to each of the laser sensor and the camera, and the one or more computer programs comprise instructions for performing the method as taught by Wu in order to improve accuracy of autonomous robot navigation optimization by creating a 3D map with RGB information using LiDAR and panoramic camera (Wu: Abstract, p. 1527).

Regarding claim 12, Modified Liang teaches that the instructions [intended for performing the loop closure detection based on the current visual data frame in the plurality of visual data frames further comprise:
instructions [intended for analyzing the current visual data frame and the target visual data frame, and identifying at least one unqualified visual data frame, wherein the at least one unqualified visual data frame is not obtained simultaneously with the current visual data frame]] (Liang: see e.g. at least §§ 3.2.1-3.2.2, wherein the error is smaller than a certain threshold; see also e.g. at least § 3.2.2, wherein the number of inliers meet the requirement; see also e.g. at least § 3.2.4, using key frames that are near the current frame and a threshold number M of co-visibility points included between the two frames).

Regarding claim 13, Modified Liang teaches that the instructions [intended for performing the loop closure detection based on the current visual data frame in the plurality of visual data frames further comprise:
instructions [intended for analyzing the current visual data frame and the target visual data frame, and identifying at least one unqualified visual data frame, wherein the at least one unqualified visual data frame is not obtained simultaneously with the current visual data frame]] (Liang: see e.g. at least §§ 3.1-3.2.1, 3.2.4, eliminating all key frames with a score lower than Smin).

Regarding claim 14, Modified Liang teaches that the instructions [intended for analyzing the current visual data frame and the target visual data frame comprise:
instructions [intended for performing a random sampling consistency filtering on map point data of the current visual data frame and map point data of the target visual data frame]] (Liang: see e.g. at least §§ 3.1-3.2.1, 3.2.4).

Regarding claim 15, Modified Liang teaches that a maximum value of the preset range is positively correlated with a frame rate of the distance data frames (Liang: see e.g. at least § 3.2.4).

Regarding claim 16, Modified Liang teaches that the one or more computer programs further comprise:
instructions [intended for storing the pose data after the loop closure optimization in map data] (Liang: see e.g. at least Fig. 3-1, and related text).

Regarding claim 17, Modified Liang teaches that the one or more computer programs further comprise:
instructions [intended for obtaining current visual data from the camera] (Liang: see e.g. at least § 3.2.1, Fig. 3-1, 3-2, and related text, continuously extracting the current frame);
instructions [intended for performing a loop closure detection on the current visual data, and determining whether a visual data frame matching the current visual data is found in the plurality of visual data frames] (Liang: see e.g. at least §§ 3.2.1, 3.2.3, 3.2.4, 3.3.1, Fig. 3-2, and related text);
instructions [intended for calculating a relative pose between the current visual data and the visual data frame matching the current visual data, in response to the visual data frame matching the current visual data being found in the plurality of visual data frames] (Liang: see e.g. at least § 3.2.2, calculating the translation and rotation between two frames); and
instructions [intended for calculating a current pose of the mobile robot by using the relative pose between the current visual data and the visual data frame matching the current visual data, and pose data associated with the visual data frame matching the current visual data] (Liang: see e.g. at least Abstract, §§ 3.1, 3.2.1, 3.2.4, 3.3.1, Fig. 3-1, 3-2, 3-5, and related text).

Regarding claim 18, Modified Liang teaches that the laser sensor is a laser radar (Wu: e.g. at least LiDAR, see e.g. at least Abstract, pg. 1525, p. 2, Fig. 1, and related text), and the camera comprises at least one of an RGB camera and a depth camera (Wu: e.g. at least panoramic camera, see e.g. at least Abstract, pg. 1525, p. 2, Fig. 1, and related text).

Regarding claim 19, Modified Liang teaches:
in response to the loop closure optimization being completed or the target visual data frame not being found during the loop closure detection (see e.g. at least §§ 3.2.1, 3.2.3-3.2.4, 3.3.1, Fig. 3-2, and related text), using another visual data frame in the plurality of visual data frames as a current data frame (id.), and returning to the step of performing the loop closure detection based on the current visual data frame in the plurality of visual data frames (id.), and determining whether the target visual data frame is found during the loop closure detection (id.).

Regarding claim 20, Modified Liang teaches that the step of performing the random sampling consistency filtering on the map point data of the current visual data frame and the map point data of the target visual data frame comprises:
performing a point pair matching between feature points in the map point data of the target visual data frame and feature points in the map point data of the current visual data frame (Liang: see e.g. at least §§ 3.1-3.2.1, 3.2.3-3.2.4, 3.3.1, Fig. 3-2, and related text);
estimating a pose between the current visual data frame and the target visual data frame using randomly adopted point pairs (Liang: see e.g. at least Abstract, §§ 3.1-3.2.2, 3.2.4, 3.3.1-3.3.2, Fig. 3-1, 3-2, 3-5, and related text);
verifying correctness of a pose using point pairs other than the randomly adopted point pairs, and recording a number of points (id.);
repeating the point pair matching, and determining whether a number of points is less than a threshold (id.); and
determining the target visual data frame, in response to the number of points being less than the threshold (id.).
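For illustration only, the steps of claim 20 as interpreted above resemble a conventional random sample consensus (RANSAC) loop. The following is a minimal sketch under stated assumptions and is not the method of Liang, Wu, or the application: the point pairs are taken to be already matched 2D points, the pose is a planar rigid transform (theta, tx, ty), and the names `estimate_pose`, `apply_pose`, `ransac_pose`, `is_qualified`, `tol`, and `min_inliers` are hypothetical. Treating a candidate frame as unqualified when the best inlier count falls below the threshold is likewise an assumption drawn from the claim language discussed above.

```python
import math
import random


def estimate_pose(sample):
    """Estimate a 2D rigid pose (theta, tx, ty) from two matched point pairs."""
    (p1, q1), (p2, q2) = sample
    # Rotation is the angle between the segment p1->p2 and the segment q1->q2.
    theta = (math.atan2(q2[1] - q1[1], q2[0] - q1[0])
             - math.atan2(p2[1] - p1[1], p2[0] - p1[0]))
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated p1 onto q1.
    tx = q1[0] - (c * p1[0] - s * p1[1])
    ty = q1[1] - (s * p1[0] + c * p1[1])
    return theta, tx, ty


def apply_pose(pose, p):
    """Apply the rigid pose to a 2D point."""
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)


def ransac_pose(pairs, iterations=100, tol=0.05, seed=0):
    """Return (best_pose, inlier_count) over repeated random samples."""
    rng = random.Random(seed)
    best_pose, best_inliers = None, -1
    for _ in range(iterations):
        idx = rng.sample(range(len(pairs)), 2)   # randomly adopted point pairs
        sample = [pairs[i] for i in idx]
        if sample[0][0] == sample[1][0]:
            continue                             # degenerate sample, skip it
        pose = estimate_pose(sample)
        inliers = 0
        for i, (p, q) in enumerate(pairs):
            if i in idx:
                continue                         # verify with the other pairs only
            x, y = apply_pose(pose, p)
            if math.hypot(x - q[0], y - q[1]) <= tol:
                inliers += 1
        if inliers > best_inliers:
            best_pose, best_inliers = pose, inliers
    return best_pose, best_inliers


def is_qualified(pairs, min_inliers, **kwargs):
    """Assumed rule: reject the candidate frame if too few pairs agree."""
    _, inliers = ransac_pose(pairs, **kwargs)
    return inliers >= min_inliers
```

In this sketch the inlier count excludes the two sampled pairs, mirroring the recitation that correctness is verified "using point pairs other than the randomly adopted point pairs"; a fixed seed is used only so that repeated runs give the same result.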

Conclusion
	THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES J HAN whose telephone number is (571)270-3980.  The examiner can normally be reached on M-Th and every other F (7:30 AM - 5 PM).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Christian Chace can be reached on 571-272-4190.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHARLES J HAN/
Primary Examiner, Art Unit 3662