DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
The amendment filed on 10/28/2021 has been entered. Applicant has added new claims 17-20. Claims 1-20 are pending in the application. With respect to the drawings, Applicant has amended the reference character for auxiliary locationing sensors in Fig. 2 and added reference character 110 for Fig. 10B in the specification. Therefore, the drawing objections are withdrawn. With respect to the specification, Applicant has amended the specification to include a cross-reference to related applications section and a summary of the invention section. Applicant has also amended Paras. 0018, 0029, and 0033 to correct minor informalities. Therefore, the objections to the specification are withdrawn. With respect to the claims, Applicant has amended claims 1, 2, 8, and 16 to address the 112(b) rejections. However, the 112(b) rejection of claim 13 regarding lack of antecedent basis for “the track layout” was not addressed. Therefore, the 112(b) rejection of claim 13 is maintained (see below).

Response to Arguments
Applicant's arguments filed on 10/28/2021 with respect to claims 1-20 have been fully considered but are moot in view of the new ground(s) of rejection necessitated by the amendments. The amendments to independent claims 1, 14, and 16 have changed the scope of the claims as originally filed. The amended limitation in claims 1, 14, and 16 is as follows: “…based on at least a comparison of colors or highlights within the (rectified) stereoscopic images.” Therefore, the rejection has been 

Applicant's arguments regarding claims 5-13 on pages 17-19, filed 10/28/2021, have been fully considered but they are not persuasive. Applicant argues on page 17 that Susca et al. does not teach the following limitation in claim 5: “dividing an image region above a road surface into multiple tracks, and processing the multiple tracks to obtain multiple independent estimates of a trajectory of the vehicle”. However, the Examiner disagrees because, under the broadest reasonable interpretation of the claim language, the prior art reference Susca et al. (US 2009/0279741 A1) discloses the “multiple tracks” limitation as recited in claim 5 (see rejection below with updated comments in bold). More specifically, Susca et al. discloses dividing an imaging region into multiple captures (i.e. multiple tracks) in Paras. 0015-0016. Further, in response to Applicant's argument that the references fail to show certain features of Applicant’s invention, it is noted that the features upon which Applicant relies (i.e., a road region divided into three parallel tracks and a road region divided into six tracks as described in Paras. 0028-0029 of Applicant’s originally filed disclosure) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), first paragraph:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 14-15 and 17-18 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 14 recites “estimating motion and orientation between corresponding features…based on at least a comparison of colors or highlights within the stereoscopic images.” After reviewing Para. 0023 of the specification and Fig. 5, cited by the Applicant on pg. 13 of the remarks filed on 10/28/2021, the Examiner finds that the specification and figure do not disclose estimating motion and orientation of features based on a comparison of colors or highlights. The specification instead discloses in Para. 0023 that features are detected or identified based on a comparison of colors. Therefore, the specification fails to provide a written description showing that the inventor possessed the invention as recited in claim 14 (and claims 15 and 17-18 by dependency). It is believed that the comparison of colors or highlights is used to detect the features, not to estimate the motion and orientation of the features.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claim 13 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 13 recites the limitation "the track layout" in line 3.  There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-4, 16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over "Real-time Localization in Outdoor Environments using Stereo Vision and Inexpensive GPS" by Agrawal et al. in view of "Feature-based Visual Odometry Prior for Real-time Semi-dense Stereo SLAM" by Krombach et al. and further in view of Chou et al. (US 9,325,899 B1).
Regarding claim 1, Agrawal et al. teaches, a method of processing a sequence of images to determine an image based trajectory of a vehicle along a road, comprising by a data processing system (Abstract: we describe a real-time, low-cost system to localize a mobile robot in outdoor environments. Our system relies on stereo vision to robustly estimate frame-to-frame motion in real time (also known as visual odometry); Note: frame-to-frame motion in real time is used to determine the trajectory of a moving object. A mobile robot is a moving object, similar to a vehicle): 
rectifying each stereoscopic pair of images to a common epipolar plane; 
detecting features in each of the rectified stereoscopic pair of images in each successive frame (Pg. 2, Col. 1: our visual odometry system uses feature tracks to estimate the relative motion between two frames. Corner feature points are detected in the left image of each stereo pair and tracked across frames; Pg. 2, Col. 2: commonly used features for feature-based approaches are the Harris corners [8] or the more stable SIFT features [11]; Pg. 3, Col. 1: Harris corners [8] are detected in the left and the right image of each frame in the video sequence) based on at least a comparison of colors or highlights within the rectified stereoscopic images; 
matching points of a detected feature in a first image of the rectified stereoscopic pair with corresponding detected features in a second image of the stereoscopic pair of images to generate a feature disparity map (Pg. 3, Col. 1: the features detected in the left image are matched to features in the right image of the same row using normalized cross correlation (NCC) over an 11×11 window. Similarly, the features in the right image are matched to the left image; Pg. 3, Col. 1: The point M projects in the left image to the (x, y) point and its disparity is d; Pg. 3, Col. 1: for each feature point in the current frame, its NCC is evaluated for every feature point in the next frame that lies within a specified distance of its location in the current frame; Pg. 3, Col. 2: Corresponding to each rotation and translation pair hypothesis H (R, t), the disparity space homography H (R, t) can be calculated using equation 1. For a hypothesized correspondence in the disparity space); 
calculating a depth at each matched points of the detected features to obtain a sparse three-dimensional depth map of the road; 
and determining motion and orientation of the vehicle between successive stereoscopic image frames based on execution of an optical flow process by the data processing system that determines estimates of motion and orientation between successive stereoscopic pairs of images (Pg. 2, Col. 1: three of these points are used to estimate the motion using absolute orientation; Pg. 2, Col. 1: the relative motion between consecutive frames are chained together to obtain the absolute pose at each 
Agrawal et al. does not expressly disclose the following limitations underlined above: rectifying each stereoscopic pair of images to a common epipolar plane; rectified stereoscopic pair; based on at least a comparison of colors or highlights within the rectified stereoscopic images; calculating a depth at each matched points of the detected features to obtain a sparse three-dimensional depth map of the road. 
However, Krombach et al. teaches, rectifying each stereoscopic pair of images to a common epipolar plane (Pg. 29: a general prerequisite for stereo computation is to rectify the images. To allow different models for calibration we build a general rectification nodelet in ROS, that rectifies the images given respective look-up tables as input; Pg. 31: the non-rigid mounting of the stereo cameras introduces difficult conditions for stereo correspondence search along fixed epipolar lines); 
rectified stereoscopic pair (As shown in Pg. 29 and Fig. 17, stereo images are rectified);
calculating a depth at each matched points of the detected features to obtain a sparse three-dimensional depth map of the road (Pg. 7: in the depth map estimation, tracked frames are then used to refine the existing depth map of the key frame by many short-baseline stereo comparisons; Pg. 7: the depth is calculated by finding the best matching point along the epipolar line; Pg. 11: the depth map of each key frame is updated with instant stereo measurements as well as with propagated depth from the previous key frame; Pg. 6: deriving a dense 3D map from a set of sparse points as provided by any of the above SLAM systems).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include rectifying the stereoscopic images and calculating a depth 
The combination of Agrawal et al. and Krombach et al. does not expressly disclose the following limitation underlined above: based on at least a comparison of colors or highlights within the rectified stereoscopic images.
However, Chou et al. teaches, based on at least a comparison of colors or highlights within the rectified stereoscopic images (As shown in Col. 2, lines 54-57, a plurality of feature point correspondences are identified to calculate a homography matrix according to color information of a plurality of neighboring points of each feature point in the rectified images; Note: a homography matrix is a mapping of image planes that contain color information of the features. The mapping is a comparison, which means the color information (i.e. color components of the pixels) is compared during mapping).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include detecting features in the rectified images based on a comparison of colors as taught by Chou et al. into the combined image processing of Agrawal et al. and Krombach et al. in order to determine color information for a plurality of neighboring points in an image (Chou et al., Col. 2).
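For illustration only, the same-row window matching and the disparity-to-depth relationship discussed in the mapping of claim 1 above may be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Agrawal et al., Krombach et al., or Chou et al.: the function names, the 3×3 default window, the search range, and the focal length and baseline values are all hypothetical, and the images are assumed to be already rectified so that corresponding points lie on the same row.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity windows."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    # A flat (zero-variance) window carries no texture; score it as 0.
    return num / den if den > 0 else 0.0

def window(image, row, col, half):
    """Flatten the (2*half+1) x (2*half+1) window centered at (row, col)."""
    return [image[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]

def match_along_row(left, right, row, col, half=1, max_disp=4):
    """Search the same row of the right image (assumed rectified) and
    return (disparity, score) of the best NCC match for the left pixel."""
    ref = window(left, row, col, half)
    best = (0, -1.0)
    for d in range(max_disp + 1):
        c = col - d  # a left-image point shifts toward lower columns on the right
        if c - half < 0:
            break
        score = ncc(ref, window(right, row, c, half))
        if score > best[1]:
            best = (d, score)
    return best

def depth_from_disparity(d, focal_px, baseline_m):
    """Pinhole stereo relation Z = f * B / d; None when disparity is zero."""
    return focal_px * baseline_m / d if d > 0 else None
```

Applying `match_along_row` only at detected feature locations, rather than at every pixel, yields a sparse set of disparities, and `depth_from_disparity` converts each one into a depth value.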
Regarding claim 3, the combination of Agrawal et al., Krombach et al., and Chou et al. teaches the limitations as explained in claim 1.
Agrawal et al. in the combination further teaches, wherein the disparity map measures pixel displacement between the matched features (Pg. 3, Col. 1: for each feature point in the current frame, its NCC is evaluated for every feature point in the next frame that lies within a specified distance of its location in the current frame. This distance is taken to be 50 for our setup. As in the stereo matching 
Regarding claim 4, the combination of Agrawal et al., Krombach et al., and Chou et al. teaches the limitations as explained in claim 1.
Agrawal et al. in the combination further teaches, further comprising estimating a current trajectory of the vehicle based on the estimates of motion and orientation between successive stereoscopic pairs of images (Pg. 2, Col. 1: three of these points are used to estimate the motion using absolute orientation; Pg. 2, Col. 1: the relative motion between consecutive frames are chained together to obtain the absolute pose at each frame; Pg. 4, Col. 1: the hypothesis is evaluated and scored based on reprojection errors in both views, resulting in an accurate estimate of the motion; Abstract: our system relies on stereo vision to robustly estimate frame-to-frame motion in real time (also known as visual odometry); Fig. 2(b)).
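For illustration only, the chaining of frame-to-frame motion estimates into an absolute pose, as described in the cited passage of Agrawal et al., may be sketched as follows. This is a simplified planar (x, y, heading) sketch under stated assumptions rather than the reference's implementation; the function names `compose` and `chain` and the convention that each relative motion is expressed in the vehicle frame are hypothetical.

```python
import math

def compose(pose, delta):
    """Apply a relative motion (dx, dy, dtheta), expressed in the vehicle
    frame, to an absolute pose (x, y, heading)."""
    x, y, th = pose
    dx, dy, dth = delta
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

def chain(deltas, start=(0.0, 0.0, 0.0)):
    """Chain per-frame relative motions into a list of absolute poses."""
    poses = [start]
    for delta in deltas:
        poses.append(compose(poses[-1], delta))
    return poses
```

The resulting list of poses is one way to represent an estimated trajectory; because each pose depends on all earlier estimates, errors in the per-frame motion accumulate along the chain.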
Regarding claim 16, Agrawal et al. teaches, a computer program product for execution by a computer system and comprising at least one non-transitory computer-readable medium having computer readable program code portions embodied therein to process a sequence of images to determine an image based trajectory of a vehicle along a road (Pg. 2, Col. 1: motion estimation from video is a well-studied problem in computer vision; Pg. 3: since each of the hypotheses generated during the RANSAC needs to be scored with all the feature correspondences, it is extremely important to code this efficiently. We have coded these routines using SIMD instructions; Note: computer vision consists of training computers to interpret and understand visual inputs (i.e. images and video). Computers have a 
the computer-readable program code portions comprising: an executable code portion to rectify each stereoscopic pair of images to a common epipolar plane; 
an executable code portion to detect features in each of the rectified stereoscopic pair of images in each successive frame (Pg. 3: since each of the hypotheses generated during the RANSAC needs to be scored with all the feature correspondences, it is extremely important to code this efficiently. We have coded these routines using SIMD instructions; Note: computer vision consists of training computers to interpret and understand visual inputs (i.e. images and video). Computers have a computer-readable medium that stores instructions (i.e. code) to execute tasks such as image processing. In this case, SIMD is a computing method that allows for processing of multiple data with a single instruction (i.e. code); Pg. 2, Col. 1: our visual odometry system uses feature tracks to estimate the relative motion between two frames. Corner feature points are detected in the left image of each stereo pair and tracked across frames; Pg. 2, Col. 2: commonly used features for feature-based approaches are the Harris corners [8] or the more stable SIFT features [11]; Pg. 3, Col. 1: Harris corners [8] are detected in the left and the right image of each frame in the video sequence) based on at least a comparison of colors or highlights within the rectified stereoscopic images; 
an executable code portion to match points of a detected feature in a first image of the rectified stereoscopic pair with corresponding detected features in a second image of the stereoscopic pair of images to generate a feature disparity map (Pg. 3: since each of the hypotheses generated during the 
an executable code portion to calculate a depth at each matched points of the detected features to obtain a sparse three-dimensional depth map of the road; 
and an executable code portion to determine motion and orientation of the vehicle between successive stereoscopic image frames based on execution of an optical flow process by the data processing system that determines estimates of motion and orientation between successive stereoscopic pairs of images (Pg. 3: since each of the hypotheses generated during the RANSAC needs to be scored with all the feature correspondences, it is extremely important to code this efficiently. We have coded these routines using SIMD instructions; Note: computer vision consists of training computers to interpret and understand visual inputs (i.e. images and video). Computers have a computer-readable medium that stores instructions (i.e. code) to execute tasks such as image processing. In this case, SIMD is a computing method that allows for processing of multiple data with a 
Agrawal et al. does not expressly disclose the following limitations underlined above: the computer-readable program code portions comprising: an executable code portion to rectify each stereoscopic pair of images to a common epipolar plane; rectified stereoscopic pair; based on at least a comparison of colors or highlights within the rectified stereoscopic images; an executable code portion to calculate a depth at each matched points of the detected features to obtain a sparse three-dimensional depth map of the road.
However, Krombach et al. teaches, the computer-readable program code portions comprising: an executable code portion to rectify each stereoscopic pair of images to a common epipolar plane (Pg. 2: SLAM broadens this task by also requiring to compute a representation of the robot's surrounding referred to as map; Pg. 2: we extend monocular LSD-SLAM [1] to work with a stereo setup and restrict semi-dense matching to key frames for achieving a higher frame rate. In order to estimate the motion between key frames, we employ a feature-based VO method and use the estimated motion as initialization for the direct image alignment. Thus, we restrict the search space for direct image alignment and gain real-time performance. This paper builds upon our recent work [2] where we introduced a VO algorithm deploying both feature-based and semi-direct matching techniques. Here, we expand this approach to a fully- fledged SLAM system; Note: SLAM algorithms are based on concepts in computer vision and these algorithms (i.e. code) are stored in the memory of the computer system to carry out processing; Pg. 29: a general prerequisite for stereo computation is to rectify the images. To 
rectified stereoscopic pair (As shown in Pg. 29 and Fig. 17, stereo images are rectified);
an executable code portion to calculate a depth at each matched points of the detected features to obtain a sparse three-dimensional depth map of the road (Pg. 2: SLAM broadens this task by also requiring to compute a representation of the robot's surrounding referred to as map; Pg. 2: we extend monocular LSD-SLAM [1] to work with a stereo setup and restrict semi-dense matching to key frames for achieving a higher frame rate. In order to estimate the motion between key frames, we employ a feature-based VO method and use the estimated motion as initialization for the direct image alignment. Thus, we restrict the search space for direct image alignment and gain real-time performance. This paper builds upon our recent work [2] where we introduced a VO algorithm deploying both feature-based and semi-direct matching techniques. Here, we expand this approach to a fully-fledged SLAM system; Note: SLAM algorithms are based on concepts in computer vision and these algorithms (i.e. code) are stored in the memory of the computer system to carry out processing; Pg. 7: in the depth map estimation, tracked frames are then used to refine the existing depth map of the key frame by many short-baseline stereo comparisons; Pg. 7: the depth is calculated by finding the best matching point along the epipolar line; Pg. 11: the depth map of each key frame is updated with instant stereo measurements as well as with propagated depth from the previous key frame; Pg. 6: deriving a dense 3D map from a set of sparse points as provided by any of the above SLAM systems).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include rectifying the stereoscopic images and calculating a depth at each matched point to obtain a 3-D depth map as taught by Krombach et al. into the image processing of Agrawal et al. in order to track the motion fast and reliably and have an accurate 
The combination of Agrawal et al. and Krombach et al. does not expressly disclose the following limitation underlined above: based on at least a comparison of colors or highlights within the rectified stereoscopic images.
However, Chou et al. teaches, based on at least a comparison of colors or highlights within the rectified stereoscopic images (As shown in Col. 2, lines 54-57, a plurality of feature point correspondences are identified to calculate a homography matrix according to color information of a plurality of neighboring points of each feature point in the rectified images; Note: a homography matrix is a mapping of image planes that contain color information of the features. The mapping is a comparison, which means the color information (i.e. color components of the pixels) is compared during mapping).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include detecting features in the rectified images based on a comparison of colors as taught by Chou et al. into the combined image processing of Agrawal et al. and Krombach et al. in order to determine color information for a plurality of neighboring points in an image (Chou et al., Col. 2).
Regarding claim 19, the combination of Agrawal et al., Krombach et al., and Chou et al. teaches the limitations as explained above in claim 16.
Agrawal et al. in the combination further teaches, wherein the disparity map measures pixel displacement between the matched features (Pg. 3, Col. 1: the features detected in the left image are matched to features in the right image of the same row using normalized cross correlation (NCC) over an 11×11 window. Similarly, the features in the right image are matched to the left image; Pg. 3, Col. 1: The point M projects in the left image to the (x, y) point and its disparity is d; Pg. 3, Col. 1: for each feature point in the current frame, its NCC is evaluated for every feature point in the next frame that lies within .

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over "Real-time Localization in Outdoor Environments using Stereo Vision and Inexpensive GPS" by Agrawal et al. in view of "Feature-based Visual Odometry Prior for Real-time Semi-dense Stereo SLAM" by Krombach et al. and further in view of Chou et al. (US 9,325,899 B1) and Borisov (US 2017/0064287 A1).
Regarding claim 2, the combination of Agrawal et al., Krombach et al., and Chou et al. teaches the limitations as explained above in claim 1.
The combination of Agrawal et al., Krombach et al., and Chou et al. does not expressly disclose the following limitation: comprising by the data processing system, converting each image of the rectified stereoscopic pair of images to a grayscale format.
However, Borisov teaches, comprising by the data processing system, converting the image of the stereoscopic pair of images to a grayscale format (Abstract: a method of producing a 3-dimensional model of a scene; Para. 0029: we convert RGB images to grayscale… where I(x,y) is grayscale intensity of a pixel with coordinates (x,y)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include converting images to grayscale as taught by Borisov into the combined image processing of Agrawal et al., Krombach et al., and Chou et al. in order to reduce the processing resources for processing the images.
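For illustration only, converting an RGB image to a single grayscale intensity I(x,y) per pixel may be sketched as follows. The conversion formula in Borisov, Para. 0029 is not reproduced in this action, so the weights below are an assumption: they are the commonly used ITU-R BT.601 luma coefficients and are not asserted to be the weights Borisov uses. The function name `to_grayscale` is likewise hypothetical.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of grayscale intensities.

    Assumption: ITU-R BT.601 luma weights; the cited reference may use
    a different weighting.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]
```

Each pixel is reduced from three channel values to one, which is consistent with the stated motivation of reducing the processing resources needed for the subsequent image processing.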

Claims 5-11 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over "Real-time Localization in Outdoor Environments using Stereo Vision and Inexpensive GPS" by Agrawal et al. in view of "Feature-based Visual Odometry Prior for Real-time Semi-dense Stereo SLAM" by Krombach et al. and further in view of Chou et al. (US 9,325,899 B1) and Susca et al. (US 2009/0279741 A1).
Regarding claim 5, the combination of Agrawal et al., Krombach et al., and Chou et al. teaches the limitations as explained above in claim 1.
The combination of Agrawal et al., Krombach et al., and Chou et al. does not expressly disclose the following limitation: further comprising by the data processing system, dividing an image region above a road surface into multiple tracks, and processing the multiple tracks to obtain multiple independent estimates of a trajectory of the vehicle.
However, Susca et al. teaches, further comprising by the data processing system, dividing an image region above a road surface into multiple tracks (Para. 0015: as the object moves, the field of view of imaging device 104 changes. Based on these changes in the images taken of the field of view, system 100 (or system 200) determines the motion of the object; Para. 0016: method 300 is used to determine the motion of an object within the field of view of imaging device 104. For example, in one embodiment, imaging device 104 is mounted in a substantially stationary position. In another embodiment, imaging device 104 is not stationary; however, the movement of imaging device 104 is known and accounted for. Over time, as the object moves through the scene viewed by imaging device 104, imaging device 104 obtains images of the object. Based on the change in position of the object relative to imaging device 104 and/or other features within the scene, the motion of the object is determined; Note: the imaging region is divided into multiple captures (i.e. multiple tracks)),
 and processing the multiple tracks to obtain multiple independent estimates of a trajectory of the vehicle (Para. 0015-0016; Para. 0017: method 300 determines the motion of an object by comparing images from different points in time and analyzing the changes between the images. The motion of an object from one image to the next image can be separated into the objects rotation between the two 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include dividing an image region into tracks and obtaining multiple trajectory estimates of the tracks as taught by Susca et al. into the combined image processing of Agrawal et al., Krombach et al., and Chou et al. in order to determine position and attitude with sensors that measure incremental motion (Susca et al., Para. 0002).
Regarding claim 6, the combination of Agrawal et al., Krombach et al., Chou et al., and Susca et al. teaches the limitations as explained above in claim 5.
Susca et al. in the combination further teaches, further comprising by the data processing system, combining multiple ones of the estimates to determine estimates of motion and orientation of the vehicle between successive stereoscopic frames (Para. 0015-0016; Para. 0017: method 300 determines the motion of an object by comparing images from different points in time and analyzing the changes between the images. The motion of an object from one image to the next image can be separated into the object's rotation between the two images and the object's translation between the two images; Para. 0029-0030; Para. 0033: when LIDAR 202 is used as imaging device 104, the 3D feature information for the first image is built up based on a combination of the intensity and range images from LIDAR 202 at time X. Similarly, the 3D feature information for the second image is built up using the intensity and range images from LIDAR 202 at time X+1; Note: motion (i.e. trajectory) estimates are obtained at different points in time in the field of view and the motion is combined to determine the 3D feature information).
Regarding claim 7, the combination of Agrawal et al., Krombach et al., Chou et al., and Susca et al. teaches the limitations as explained above in claim 6.
Susca et al. in the combination further teaches, further comprising by the data processing 
Regarding claim 8, the combination of Agrawal et al., Krombach et al., Chou et al., and Susca et al. teaches the limitations as explained above in claim 5.
Susca et al. in the combination further teaches, further comprising by the data processing system, processing each of the tracks independently of other tracks to obtain a respective set of features for each stereographic image in the frame for each track (Para. 0015-0016; Para. 0017: method 300 determines the motion of an object by comparing images from different points in time and analyzing the changes between the images. The motion of an object from one image to the next image can be separated into the object's rotation between the two images and the object's translation between the two images; Para. 0020: to compare the first image and the second image, features within each of the images are identified. Features within the first image are then compared to the same features in the second image. As mentioned above, the location, size, and/or perspective of the feature changes as either the feature or imaging device 104 moves; Note: the imaging region is divided into multiple captures (i.e. multiple tracks)).
Regarding claim 9, the combination of Agrawal et al., Krombach et al., Chou et al., and Susca et al. teaches the limitations as explained above in claim 8.
Susca et al. in the combination further teaches, further comprising by the data processing system, determining a respective set of disparity and depth maps for each track (Para. 0025: to match feature A of the first image with a feature in the second image, the matching algorithm calculates the Euclidian distance between the descriptor of feature A and each of the features within the second 
Regarding claim 10, the combination of Agrawal et al., Krombach et al., Chou et al., and Susca et al. teaches the limitations as explained above in claim 9.
Susca et al. in the combination further teaches, further comprising by the data processing system, dividing an imaged road region into a rectangular array multiple tracks wide and multiple tracks long (Para. 0015: as the object moves, the field of view of imaging device 104 changes. Based on these changes in the images taken of the field of view, system 100 (or system 200) determines the motion of the object; Para. 0016: method 300 is used to determine the motion of an object within the field of view of imaging device 104. For example, in one embodiment, imaging device 104 is mounted in a substantially stationary position. In another embodiment, imaging device 104 is not stationary; however, the movement of imaging device 104 is known and accounted for. Over time, as the object moves through the scene viewed by imaging device 104, imaging device 104 obtains images of the object. Based on the change in position of the object relative to imaging device 104 and/or other features within the scene, the motion of the object is determined; Para. 0020: in order to effectively the imaging region is divided into multiple captures (i.e. multiple tracks) and the multiple scales of an image are the dimensions (i.e. length and width) of the tracks), 
and processing the tracks to obtain respective independent sets of estimates of motion and orientation along a trajectory of the vehicle (Para. 0015-0016; Para. 0017: method 300 determines the motion of an object by comparing images from different points in time and analyzing the changes between the images. The motion of an object from one image to the next image can be separated into the object's rotation between the two images and the object's translation between the two images; Note: motion (i.e. trajectory) estimates are obtained at different points in time in the field of view).
Regarding claim 11, the combination of Agrawal et al., Krombach et al., Chou et al., and Susca et al. teaches the limitations as explained above in claim 10.
Susca et al. in the combination further teaches, wherein the dividing comprises dividing the imaged road region into the rectangular array by capturing a stereographic image set and windowing the imaged road surface into the sets of tracks (Para. 0004: such sensor systems are stereo and monocular vision systems, Light Detection and Ranging (LIDARS), and RADARS. Such sensors have been used primarily as incremental navigation systems; Para. 0013; Para. 0012: optical camera 110 and optical camera 112 are positioned such that optical camera 110 and optical camera 112 are focused on  the imaging region is divided into multiple captures (i.e. multiple tracks) and the multiple scales of an image are the dimensions (i.e. length and width) of the tracks that can be imaged by LIDAR. LIDAR is used for incremental navigation (i.e. windowing image regions or tracks)).
Regarding claim 13, the combination of Agrawal et al., Krombach et al., Chou et al., and Susca et al. teaches the limitations as explained above in claim 10.
Susca et al. in the combination further teaches, wherein the dividing comprises dividing the imaged road region into the rectangular array using respective pairs of stereoscopic image devices arrayed according to the track layout (Para. 0004: such sensor systems are stereo and monocular vision systems, Light Detection and Ranging (LIDARS), and RADARS. Such sensors have been used primarily as  the imaging region is divided into multiple captures (i.e. multiple tracks) and the multiple scales of an image are the dimensions (i.e. length and width) of the tracks that can be imaged by LIDAR. LIDAR is used for incremental navigation (i.e. image regions or arrays of regions of the tracks)).

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over "Real-time Localization in Outdoor Environments using Stereo Vision and Inexpensive GPS" by Agrawal et al. in view of "Feature-based Visual Odometry Prior for Real-time Semi-dense Stereo SLAM" by Krombach et al. and further in view of Chou et al. (US 9,325,899 B1), Susca et al. (US 2009/0279741 A1), and Yamaguchi et al. (US 2017/0045889 A1).
Regarding claim 12, the combination of Agrawal et al., Krombach et al., Chou et al., and Susca et al. teaches the limitations as explained above in claim 10.
The combination of Agrawal et al., Krombach et al., Chou et al., and Susca et al. does not expressly disclose the following limitation: wherein the dividing comprises dividing the imaged road region into the rectangular array using optical elements configured to direct light reflecting from a surface of the road to respective ones of the sets of tracks.
However, Yamaguchi et al. teaches, wherein the dividing comprises dividing the imaged road region into the rectangular array using optical elements configured to direct light reflecting from a surface of the road to respective ones of the sets of tracks (Para. 0006: projects a patterned light beam onto a road surface around a vehicle; captures and thus obtains an image of the road surface around the vehicle including an area of the projected patterned light beam; calculates an orientation angle of the vehicle relative to the road surface from a position of the patterned light beam on the obtained image; sets a feature-point detection region surrounding the area of the projected patterned light beam on the obtained image; Para. 0046: the height at and direction in which to set the camera 12 are adjusted in a way that enables the camera 12 to capture images of feature points (textures) on the road surface 31 in front of the vehicle 10 and the patterned light beam 32b projected from the light projector 11. The focus and diaphragm of the lens of the camera 12 are automatically adjusted as well. The camera 12 repeatedly captures images at predetermined time intervals, and thereby obtains a series of image (frame) groups; Note: the projected light beam from the vehicle is an optical element and the images of the road surface around the vehicle are the tracks divided by the patterned light beam (i.e. arrays)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include dividing the imaged road region using optical elements .

Claims 14-15 and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Susca et al. (US 2009/0279741 A1) in view of Takiguchi et al. (US 2010/0034426 A1) and further in view of “Extensions to the Visual Odometry Pipeline for the Exploration of Planetary Surfaces” by Furgale.
Regarding claim 14, Susca et al. teaches, a method of processing a sequence of successive stereographic image frames to determine estimates of motion and orientation of a vehicle (Para. 0012: Optical camera 110 and optical camera 112 are positioned such that optical camera 110 and optical camera 112 are focused on the same scene (although from different angles). This is sometimes referred to in the art as a stereo camera system; Para. 0014: system 100 is included within a vehicle and used as a navigation aid for the vehicle; Para. 0020: method 300 begins at step 302 where a 2D image of a scene which is in the view of imaging device 104 is obtained at time X. In step 304, at time X+1 a second 2D image of a scene which is in the view of imaging device 104 is obtained. The first image and the second image are compared to determine the rotation between the first and second image. In order to compare the first image and the second image, features within each of the images are identified. Features within the first image are then compared to the same features in the second image. As mentioned above, the location, size, and/or perspective of the feature changes as either the feature or imaging device 104 moves; Para. 0029-0030), 
the method comprising: dividing an image region above a road surface into multiple tracks (Para. 0015: as the object moves, the field of view of imaging device 104 changes. Based on these changes in the images taken of the field of view, system 100 (or system 200) determines the motion of the object; Para. 0016: method 300 is used to determine the motion of an object within the field of view  the imaging region is divided into multiple captures (i.e. multiple tracks)); 
for each track, by the data processing system, estimating motion and orientation between corresponding features in respective stereographic image pairs (Para. 0015-0016; Para. 0017: method 300 determines the motion of an object by comparing images from different points in time and analyzing the changes between the images. The motion of an object from one image to the next image can be separated into the object's rotation between the two images and the object's translation between the two images. In one embodiment, two images are used to determine the motion; Para. 0019; Para. 0027: the rotation from the eight-point algorithm is then input into a modified absolute orientation algorithm, shown in steps 416-418 to determine translation. In one embodiment, the inputs to the eight-point algorithm are two sets of eight matched features from the first and second image. In other words, eight features from the first image were matched with eight features from the second image and those two sets of eight features are input into the eight-point algorithm; Note: the imaging region is divided into multiple captures (i.e. multiple tracks)) based on at least a comparison of colors or highlights within the stereoscopic images; 
removing, by the data processing system, inconsistent motion and orientation estimates as outliers (Para. 0026: the RANdom SAmple Consensus (RANSAC) algorithm is used to reject outliers. Outliers are sets of point features that do not preserve a unique motion; Para. 0029-0030; Para. 0034: RANSAC algorithm discussed above is used to reject outliers); 
determining, by the data processing system, estimates of motion and orientation based on an aggregation of consistent motion and orientation estimates; 
and processing the multiple tracks to obtain multiple independent estimates of a trajectory of the vehicle (Para. 0015-0016; Para. 0017: method 300 determines the motion of an object by comparing images from different points in time and analyzing the changes between the images. The motion of an object from one image to the next image can be separated into the object's rotation between the two images and the object's translation between the two images; Note: motion (i.e. trajectory) estimates are obtained at different points in time in the field of view).
Susca et al. does not expressly disclose the following limitations underlined above: based on at least a comparison of colors or highlights within the stereoscopic images; determining, by the data processing system, estimates of motion and orientation based on an aggregation of consistent motion and orientation estimates.
However, Takiguchi et al. teaches, based on at least a comparison of colors or highlights within the stereoscopic images (As shown in Para. 0057, motion stereo images are captured by a camera mounted on a vehicle and orientation of the feature is measured; As shown in Para. 0188, color information of pixels of the feature points on an image is compared with color information of each pixel on the epipolar line on another image; As shown in Para. 0105, orientation of the features on the road are obtained in time series; Note: motion and orientation of images are detected and color information is compared between the images).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate estimating motion and orientation between features based on a comparison of colors, as taught by Takiguchi et al., into the image processing of Susca et al. in order to discriminate corresponding points during matching (Takiguchi et al., Para. 0190).
The combination of Susca et al. and Takiguchi et al. does not expressly disclose the following limitation: determining, by the data processing system, estimates of motion and orientation based on an aggregation of consistent motion and orientation estimates.
However, Furgale teaches, determining, by the data processing system, estimates of motion and orientation based on an aggregation of consistent motion and orientation estimates (Fig. 4.1; Pg. 13: the goal of the nonlinear numerical solution is to find the state variables, a set of variables encoding (i) the change in position and orientation of the camera between images; Pg. 51: the points are accumulated into voxels at each frame. Outliers are filtered out using their local orientation and by selecting the threshold of range measurements required per voxel for a valid mesh vertex).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include determining estimates of motion and orientation based on an aggregation of consistent motion and orientation estimates, as taught by Furgale, into the combined image processing of Susca et al. and Takiguchi et al. in order to have smoother and more realistic reconstructions (Furgale, Pg. 50).
Regarding claim 15, the combination of Susca et al., Takiguchi et al., and Furgale teaches the limitations as explained above in claim 14.
Furgale in the combination further teaches, wherein consistent motion and orientation estimates are averaged to determine the motion and orientation estimates (As seen in Fig. 5.13, the mean feature count for lateral and angular deviations is determined; Pg. 11: tracks is subject to outlier rejection using RANSAC. After outlier rejection, all remaining tracks support a single motion hypothesis; Pg. 12: an algorithm that repeatedly (i) generates a model from a randomly selected minimal set of data, and (ii) scores the model by counting the number of data points with error below a fixed threshold; Pg. 12: after outlier rejection, the remaining feature tracks are consistent with a single motion estimate hypothesis (Figure 2.5)).
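For illustration only, the claim 15 limitation (averaging the consistent motion and orientation estimates once inconsistent ones are removed) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Furgale, Susca et al., or Applicant's disclosure: the function name `aggregate_estimates`, the `(dx, dy, yaw)` tuple layout, and the tolerance `tol` are hypothetical, and a simple median-based consistency check is substituted for the RANSAC procedure cited above.

```python
from statistics import median


def aggregate_estimates(estimates, tol=0.5):
    """Average per-track motion estimates after dropping inconsistent ones.

    estimates: list of (dx, dy, yaw) tuples, one per track (hypothetical layout).
    tol: maximum allowed deviation from the per-component median (assumed value).
    Note: yaw is averaged naively, which assumes small angles with no wraparound.
    """
    if not estimates:
        raise ValueError("no estimates supplied")
    # Per-component median serves as the reference "consensus" motion.
    meds = [median(e[i] for e in estimates) for i in range(3)]
    # Keep only estimates whose every component lies within tol of the median.
    consistent = [
        e for e in estimates
        if all(abs(e[i] - meds[i]) <= tol for i in range(3))
    ]
    if not consistent:
        raise ValueError("no consistent estimates remain")
    n = len(consistent)
    # Aggregate the surviving estimates by taking their component-wise mean.
    mean = tuple(sum(e[i] for e in consistent) / n for i in range(3))
    return mean, consistent


if __name__ == "__main__":
    tracks = [(1.0, 0.0, 0.1), (1.2, 0.0, 0.1), (0.8, 0.0, 0.1), (5.0, 3.0, 1.0)]
    mean, kept = aggregate_estimates(tracks)
    print(len(kept), mean)
```

In this sketch the fourth track is discarded as an outlier because it deviates from the median by more than `tol`, and the remaining three tracks are averaged.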
Regarding claim 17, the combination of Susca et al., Takiguchi et al., and Furgale teaches the 
Susca et al. in the combination further teaches, further comprising by the data processing system, processing each of the tracks independently of other tracks to obtain a respective set of features for each stereographic image in the frame for each track (Para. 0015-0016; Para. 0017: method 300 determines the motion of an object by comparing images from different points in time and analyzing the changes between the images. The motion of an object from one image to the next image can be separated into the object's rotation between the two images and the object's translation between the two images; Para. 0020: to compare the first image and the second image, features within each of the images are identified. Features within the first image are then compared to the same features in the second image. As mentioned above, the location, size, and/or perspective of the feature changes as either the feature or imaging device 104 moves; Note: the imaging region is divided into multiple captures (i.e. multiple tracks)).
Regarding claim 18, the combination of Susca et al., Takiguchi et al., and Furgale teaches the limitations as explained above in claim 17.
Susca et al. in the combination further teaches, further comprising by the data processing system, determining a respective set of disparity and depth maps for each track (Para. 0025: to match feature A of the first image with a feature in the second image, the matching algorithm calculates the Euclidean distance between the descriptor of feature A and each of the features within the second image. The feature within the second image having the minimum distance from the descriptor of feature A in the first image is selected as the match to feature A; Para. 0030: two additional quality criteria are applied: the epipolar and the disparity constraints. To impose the epipolar constraint the difference of row indices of the two features is required to be within a certain threshold. To impose the disparity constraint, the difference between the column index for the matched features from the image of camera 110 and the image of camera 112 is required to be negative; Para. 0033: when LIDAR 202 is .
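For illustration only, the matching steps summarized above from Susca et al., Paras. 0025 and 0030 (minimum Euclidean descriptor distance, followed by the epipolar and disparity constraints) can be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the function name `match_features`, the dictionary keys `row`, `col`, and `desc`, and the threshold `max_row_diff` are hypothetical, and the sign convention (column index in the camera 110 image minus column index in the camera 112 image must be negative) simply follows the cited paragraph.

```python
import math


def match_features(first, second, max_row_diff=2):
    """Match features between two stereo images of the same scene.

    first, second: lists of dicts with keys 'row', 'col', 'desc' (hypothetical
    layout), taken here to come from camera 110 and camera 112 respectively.
    Returns a list of (index_in_first, index_in_second) pairs.
    """
    matches = []
    if not second:
        return matches
    for i, fa in enumerate(first):
        # Nearest neighbour by Euclidean distance between descriptors.
        j = min(
            range(len(second)),
            key=lambda k: math.dist(fa["desc"], second[k]["desc"]),
        )
        fb = second[j]
        # Epipolar constraint: row indices must agree within a threshold.
        epipolar_ok = abs(fa["row"] - fb["row"]) <= max_row_diff
        # Disparity constraint: column difference must be negative.
        disparity_ok = (fa["col"] - fb["col"]) < 0
        if epipolar_ok and disparity_ok:
            matches.append((i, j))
    return matches


if __name__ == "__main__":
    first = [
        {"row": 10, "col": 5, "desc": (0.0, 1.0)},
        {"row": 20, "col": 30, "desc": (1.0, 0.0)},
    ]
    second = [
        {"row": 11, "col": 9, "desc": (0.1, 0.9)},
        {"row": 40, "col": 35, "desc": (0.9, 0.1)},
    ]
    print(match_features(first, second))
```

In this sketch the second candidate pair is rejected because its row indices differ by more than the assumed threshold, so only the first pair survives both constraints.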

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over "Real-time Localization in Outdoor Environments using Stereo Vision and Inexpensive GPS" by Agrawal et al. in view of "Feature-based Visual Odometry Prior for Real-time Semi-dense Stereo SLAM" by Krombach et al. and further in view of Chou et al. (US 9,325,899 B1) and Bellaiche (US 2018/0024562 A1).
Regarding claim 20, the combination of Agrawal et al., Krombach et al., and Chou et al. teaches the limitations as explained above in claim 16.
The combination of Agrawal et al., Krombach et al., and Chou et al. does not expressly disclose the following limitations: further comprising by the data processing system, dividing the first image and the second image into multiple tracks, and processing the multiple tracks to obtain multiple independent estimates of a trajectory of the vehicle.
However, Bellaiche teaches, further comprising by the data processing system, dividing the first image and the second image into multiple tracks (Para. 0245; As shown in Para. 0246 and Fig. 11C, there are different lanes of a road (i.e. multiple tracks); As shown in Para. 0260, a plurality of images are captured by the camera installed on a vehicle in which the position of the vehicle is identified in each frame; Note: a plurality of images taken of the road segmented in lanes shows there are multiple images taken in which the images are divided into lanes (i.e. tracks)), 
and processing the multiple tracks to obtain multiple independent estimates of a trajectory of the vehicle (As shown in Para. 0246, there are trajectories for one or more lanes of the road segments 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include dividing two images into tracks and obtaining estimates of a trajectory of a vehicle for each track, as taught by Bellaiche, into the combined image processing of Agrawal et al., Krombach et al., and Chou et al. in order to correct a position of a vehicle navigating a road segment (Bellaiche, Abstract).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Qu et al. (US 2018/0210465 A1) teaches rectifying images from each camera within the pair using epipolar lines, matching features, creating a disparity map describing the offset between pixels in the images, and calculating depth information for each pixel (Para. 0063).

Park et al. (US 2018/0211400 A1) teaches a stereo matching method in which feature points are matched in the first and second images (Abstract).

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Daniella M. DiGuglielmo whose telephone number is (571)272-2682. The examiner can normally be reached Monday - Friday 7:30 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nay Maung, can be reached at 571-272-7882. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PING Y HSIEH/Primary Examiner, Art Unit 2664