Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  The claim does not fall within at least one of the four categories of patent eligible subject matter because this claim recites: “a computer-readable medium …”. The specification discloses in paragraph [0025]: “… a computer-readable medium stores computer-executable program instructions that, when executed by a processor, cause the processor to perform operations including specifying a virtual camera pose by using 3-dimensional (3D) model data which is based on an image of an outdoor space captured from the air; rendering the image of the outdoor space from a perspective of the virtual camera, by using the virtual camera pose and the 3D model data; and generating a feature point map by using the rendered image and the virtual camera pose.”
The broadest reasonable interpretation of a claim drawn to ‘a computer-readable medium’ typically covers forms of non-transitory tangible media and transitory propagating signals per se in view of the ordinary and customary meaning of ‘computer-readable medium’, particularly when the specification is otherwise silent.
Applicant is advised to amend the claim to exclude such transitory embodiments by adding ‘non-transitory’ before ‘computer-readable medium …’, so that the claim recites “non-transitory computer-readable medium …”, which would render the claim statutory.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Čadík et al. ("Automated Outdoor Depth-Map Generation and Alignment", published 16 May 2018, pp. 109-118, 'ČADÍK') in view of Piemonte et al. (U.S. PG-PUB 2013/0321403, 'PIEMONTE').
Regarding claim 1, ČADÍK discloses a method of generating a map for visual localization, comprising: 
specifying a virtual camera pose by using 3-dimensional (3D) model data which is based on an image of an outdoor space (ČADÍK; p. 109, § 1. Intro., right col.; “… we present a fully automatic framework for depth-map generation and alignment for an outdoor photograph [‘image of an outdoor space’]. A virtual camera is … localized with the geo-tagging information of a photo and recent camera pose estimation … the 3D terrain model is rendered at the virtual camera to produce an … approximate depth map” p. 111, left column; “In … an outdoor scene, the DEM is … sufficient for camera localization and pose estimation …”) captured from the air (ČADÍK; p. 110, § 3.; “… digital terrain models are … available for the … planet. They are acquired from satellites and/or planes …”); and
rendering the image of the outdoor space from a perspective of the virtual camera, by using the virtual camera pose and the 3D model data (ČADÍK; p. 111, § 3., right col.; “Camera pose estimation. Given the camera location, we automatically estimate its pose, i.e. all the unknown camera orientation angles (yaw, pitch, and roll). We implemented a visual camera orientation estimation … The method is based on matching the edges detected in the photograph with the synthetic silhouettes rendered from the terrain model. … we … exploit the image EXIF data (assisted by a camera database) to perform rectilinear projection with the known field-of-view. We … render model silhouettes as depth discontinuities into a 2D cylindrical image, which is vectorized into a silhouette edge map.”; caption of FIG. 2; “… fully automatic depth-map generation framework [is] from a … landscape photograph. Based on the EXIF information of the photograph, the camera pose [used] to render the 3D terrain model is automatically aligned with the image. … the … coarse depth map is rendered from the model using the estimated camera pose …”).
ČADÍK does not explicitly disclose generating a feature point map by using the rendered image and the virtual camera pose, which PIEMONTE discloses (PIEMONTE; FIG. 8; ¶ 0158-159; “… a method for highlighting a feature in a 3D map [‘generating a feature point map’] while preserving depth information … the method may include displaying a view of a map that visually indicates 3D depth for … the features within the map (e.g., buildings …, geological features, signs, or 3D representations of labels on the map). … the method may include a map tool of a mapping application or a navigation application displaying a map drawn with perspective so that a person viewing the map receives visual clues about the relative depth of the real-world features depicted in the map” ¶ 0160; “… the method may include performing a blur operation on the map to produce a blurred version of the map that maintains the indication of 3D depth for the blurred map features, as in 820. … performing the blur operation may … include generating … additional (i.e., alternate) views of the map, each of which represents the map from the perspective of a different viewing position and/or angle (e.g., as viewed through a virtual camera in different positions and/or orientations), and blending those additional views together (with or without the original map view). … the method may also include rendering the blurred version of the map (as in 830), and rendering an unblurred version of the selected 3D feature of the map within the blurred version of the map (as in 840). … an unblurred version of the selected 3D feature may be superimposed on the blurred map in its original position in the originally displayed (i.e., unblurred) map view. 
Superimposing a crisp version of the selected 3D feature on the blurred version of the map may allow the map tool to provide visual clues about the relative depth of the real world features depicted in the blurred version of the map (e.g., through perspective), and may serve to highlight the selected feature in the map while visually indicating its position in 3D space relative to other features that are depicted in the map.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method of generating a map for visual localization of ČADÍK to include the generating of a feature point map by using the rendered image and the virtual camera pose, as disclosed by PIEMONTE. The motivation for this modification could have been to generate a sense of depth for objects within a map that a user may wish to use for interactive navigation.
Regarding claim 14, ČADÍK-PIEMONTE disclose a computer-readable medium [that is] storing computer-executable program instructions that, when executed by a processor, cause the processor to perform operations (PIEMONTE; FIG. 10; ¶ 0178; “… any or all of these modules may be implemented as program instructions encoded on a non-transitory, computer-readable medium that when executed on … computers cause the computers to perform the functionality described herein.”) including: … ([The remaining limitations are repeated verbatim from those of claim 1.]).
Regarding claim 2, ČADÍK-PIEMONTE disclose the method of claim 1, wherein the rendering of the image of the outdoor space comprises: 
rendering, together with the rendered image, a depth map corresponding to the rendered image by using the virtual camera pose (ČADÍK; p. 111, § 3., right col.; “Camera pose estimation. Given the camera location, we automatically estimate its pose, i.e. all the unknown camera orientation angles (yaw, pitch, and roll). We implemented a visual camera orientation estimation …”) and the 3D model data (ČADÍK; p. 112, § 3., left col.; “Depth map rendering. Given the camera parameters estimated in the steps described above, we can easily render the depth map from the terrain model [‘3D model data’] … The obtained depths calibrated in meters are … stored in an image file of high-dynamic-range format (to facilitate the absolute distance estimation).”).
Regarding claim 3, ČADÍK-PIEMONTE disclose the method of claim 2, wherein the generating of the feature point map comprises: 
extracting a feature point of an object positioned at the outdoor space by using the rendered image (PIEMONTE; ¶ 0031; “… when viewing a 3D map in perspective, a user may pick or select an annotation, building, or other map object in 3D space to be highlighted in the map. In response to the selection of the map feature, the display of the map may be modified in order to visually accent the selected feature. … the systems and methods described herein may enable a computing device to implement a map tool of mapping application or a navigation application that highlights a selected feature of a 3D map (e.g., a selected building, subway station, park, sign, 3D map annotation, or another 3D object visible in the scene) while preserving the perception of depth. … the map tools … pivot a 3D scene (e.g., a view of a 3D map) to generate … alternate views of the scene, and … create a blurred version of the 3D map by blending those views, while maintaining the crispness of a selected feature in the scene.”), and 
extracting 3D coordinates of the feature point (PIEMONTE; ¶ 0164; “… a ray is received from the virtual camera … into the 3D scene (e.g., into the center of the selected map feature). The pivot point P is the ray intersection of an eye forward vector (i.e., a vector in the direction in which the … camera is looking) from the eye into the selected feature in the map. The length of the vector L [is] determined based on the known position of the virtual camera in the scene (i.e., its position in 3D space, as represented by its x, y, z coordinates) and the determined position of the selected feature … (also represented by x, y, z coordinates).”) by using the rendered depth map (PIEMONTE; ¶ 0148; “FIG. 4 illustrates a … device 300 on which a mapping or navigation application is displaying a … bird's eye view of a 3D map … the map (which is drawn in perspective to provide visual clues indicating the relative depth of various features in the scene) depicts an area that contains two roads, a collection of buildings (modeled simply as 3D blocks of various sizes and shapes, and labeled as 402-414), and a fountain 416. … a map tool of the mapping or navigation application … detects selection of a feature of the map (or receive information specifying such a selection) and to highlight the selected feature using a blur operation on the map that preserves the visual clues indicating the depth of the illustrated features in 3D space.”).
Regarding claim 4, ČADÍK-PIEMONTE disclose the method of claim 3, wherein the feature point map includes the feature point, the 3D coordinates, and the virtual camera pose (PIEMONTE; ¶ 0164; “… a ray is received from the virtual camera … into the 3D scene [‘feature point map’] (e.g., into the center of the selected map feature [‘feature point’]). The pivot point P is the ray intersection of an eye forward vector (i.e., a vector in the direction in which the … camera is looking) from the eye into the selected feature in the map. The length of the vector L [is] determined based on the known position of the virtual camera [‘virtual camera pose’] in the scene (i.e., its position in 3D space, as represented by its x, y, z coordinates) and the determined position of the selected feature … (also represented by x, y, z coordinates [‘3D coordinates’]).”).
Claims 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over ČADÍK in view of PIEMONTE as applied to claim 1 above, and further in view of Rogan et al. (U.S. PG-PUB 2014/0368493, 'ROGAN').
Regarding claim 5, ČADÍK-PIEMONTE disclose the method of claim 1; however, ČADÍK-PIEMONTE do not explicitly disclose that the rendering of the image of the outdoor space from the perspective of the virtual camera comprises:


distinguishing unnecessary objects and necessary objects from each other, which ROGAN discloses (ROGAN; ¶ 0027; “These … techniques for evaluating a lidar point cloud 210 to detect and classify a set of objects 102 in an environment 100 may facilitate the process of generating a rendering of the environment 100 omitting such objects 102. FIG. 4 presents an illustration of an exemplary scenario 400 featuring an omission of such objects 102 from a rendering 408 of an environment 100. In … scenario 400, a representation of the environment 100 is captured from a capture perspective 402 (e.g., a position within the environment 100), which may include both the environment 100 and the objects 102 present therein, including vehicles, individuals 110, signs 112, and buildings 114. Some objects (such as the signs 112 and buildings 114) [‘necessary objects’] [are] regarded as part of the environment 100 that are to be included in the rendering of the environment 100 (e.g., as fixed-ground objects and background objects), while other objects 102 [are] regarded as transients to be removed [‘unnecessary objects’] from the rendering of the environment 100 (e.g., as moving objects and stationary foreground objects). … some objects 102 may include only an object portion of the object 102 that is to be omitted. … rather than omitting an entire individual 110 or vehicle, it may be desirable to omit only an object portion of the object 102 that may be associated with a particular individual 110, such as the individual's face, or a license plate 404 of a vehicle [‘unnecessary objects’].”), and 
rendering the image of the outdoor space from the perspective of the virtual camera by excluding the unnecessary objects, which ROGAN also discloses (ROGAN; ¶ 0028; “In order to generate a rendering 408 of the environment 100 satisfying these considerations, the representation of the environment 100, including the lidar point cloud 210 captured by a lidar detector 208, [is] evaluated to identify the objects 102 in the environment 100, and a movement classification 316 of such objects 102. A rendering 408 of the environment 100 assembled from the capturing 406 (e.g., a stitched-together image assembled from a set of panoramic and/or spherical images) … presents a spherical view 410 from the capture perspective 402 that omits any portions of the capturing 406 depicting the objects 102 detected within the environment 100 and according to the movement classification 316. … the rendering 408 … excludes all objects 102 that are classified to be moving. Objects 102 that are classified as stationary [are] evaluated to distinguish stationary foreground objects (e.g., objects 102 that are within a particular range of the capture perspective 402) from fixed-ground objects (such as signs 112) and/or background objects (such as buildings 114). … the rendering 408 of the environment 100 may contain omitted portions 412, e.g., spots in the rendering 408 that have been blurred, blackened, or replaced with a depiction of the environment 100 that is not obscured by an object 102. … it may be desirable to omit only an object portion 414 of an object 102, such as the license plate 404 of the vehicle. … various techniques [are] applied to utilize a lidar point cloud 210 (including … the evaluation of the lidar point cloud 210 in the exemplary scenario 300 of FIG. 3) in the omission of objects 104 in a rendering 408 of an environment 100 in view of the classification 316 of the objects 312 according to the lidar point cloud 210 …”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method of claim 1 of ČADÍK-PIEMONTE to include the distinguishing of unnecessary objects and necessary objects from each other, and the rendering of the image of the outdoor space from the perspective of the virtual camera by excluding the unnecessary objects, as disclosed by ROGAN. The motivation for this modification could have been to render a static image of an outdoor, unrestricted scene by omitting transient objects such as vehicles and pedestrians. Further, it may be desirable to omit faces of people and license plates of vehicles to maintain privacy if the image were to be made available publicly, such as on a website or a social media application.
Regarding claim 6, ČADÍK-PIEMONTE-ROGAN disclose the method of claim 5, wherein the unnecessary objects include at least one of (ROGAN; ¶ 0027; “In … scenario 400, a representation of the environment 100 is captured from a capture perspective 402 (e.g., a position within the environment 100), which may include both the environment 100 and the objects 102 …, including vehicles, individuals 110, … other objects 102 may be regarded as transients to be removed from the rendering of the environment 100 (e.g., as moving objects and stationary foreground objects). Moreover, some objects 102 may include only an object portion of the object 102 that is to be omitted. … rather than omitting an entire individual 110 or vehicle, it may be desirable to omit only an object portion of the object 102 that may be associated with [an] … individual 110, such as the individual's face, or a license plate 404 of a vehicle” ¶ 0028; “… the rendering 408 may exclude all objects 102 that are classified to be moving. Objects 102 that are classified as stationary may further be evaluated to distinguish stationary foreground objects (e.g., objects 102 that are within a particular range of the capture perspective 402) from fixed-ground objects (such as signs 112) and/or background objects (such as buildings 114). As a result, the rendering 408 of the environment 100 may contain omitted portions 412, e.g., spots in the rendering 408 that have been blurred, blackened, or replaced with a depiction of the environment 100 that is not obscured by an object 102. … it may be desirable to omit only an object portion 414 of an object 102, such as the license plate 404 of the vehicle.”).
Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over ČADÍK in view of PIEMONTE and ROGAN as applied to claim 5 above, and further in view of Filip et al. (U.S. Patent 9,477,368; 'FILIP').


Regarding claim 7, ČADÍK-PIEMONTE-ROGAN disclose the method of claim 5; however, ČADÍK-PIEMONTE-ROGAN do not explicitly disclose that the image of the outdoor space captured from the air includes an area having no road between buildings separated from each other, which FILIP discloses (FILIP; FIG. 5; Col. 6, Lines 23-62; “… another format stores the objects in the street level image as 3D models. … rather than storing collections of points, the facades … of the buildings [are] represented as rectangles, triangles or other shapes defined by vertices having positions in space, such as latitude, longitude and altitude. … front facade 511 of building 510 [are] represented as a rectangular plane having four points 551-554, with each point defined in an (x, y, z) format (such as latitude, longitude, altitude). … side facade 512 [is] represented as a rectangular plane defined by four points 553-556, two of which are shared with the front facade 511. … the buildings … in a street level image [are] stored as 3D models comprising polygons. (36) … another format stores the objects as a set of planes corresponding with the object surfaces facing the camera that captured the street level image. Each plane may be associated with a unique index number, and the vertex of each plane may be defined in an (x, y, z) format (such as latitude, longitude, altitude). The data defining the planes may also associate each pixel of the street level image with one of the planes. Thus, instead of defining the distance between the camera and the object represented at the pixel, the value would represent the index of the plane representing the object surface at that pixel. Representing surfaces in this fashion may permit a processor to quickly retrieve and determine the position and orientation of each surface at each pixel of a street level image. Pixels that are not associated with a surface may be associated with a null or default surface value. 
(37) Many of the formats permit the surface information to be stored independently of the street level images taken by the camera. … if the building surfaces are stored as 3D models relative to the latitude and longitude of the earth, the data associated with the 3D model of building 510 does not change regardless of whether the camera view of the building is at first position 591 or second position 592. … the surface information may be stored without regard to a particular street level image” [The Examiner notes that FIG. 5 depicts two separated buildings as 3-D façade models without any depiction/representation of the road/street from which the buildings were originally captured.]).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method of claim 5 of ČADÍK-PIEMONTE-ROGAN to include the disclosure of FILIP that the image of the outdoor space captured from the air includes an area having no road between buildings separated from each other. The motivation for this modification could have been to develop an urban architectural model omitting roadways to save memory in a database.
Claims 8-11 are rejected under 35 U.S.C. 103 as being unpatentable over ČADÍK in view of PIEMONTE as applied to claim 1 above, and further in view of Qian et al. (U.S. PG-PUB 2017/0200309, 'QIAN') and Yoshimura (U.S. PG-PUB 2020/0329227, 'YOSHIMURA').
Regarding claim 8, ČADÍK-PIEMONTE disclose the method of claim 1; however, ČADÍK-PIEMONTE do not explicitly disclose that the method of claim 1 further comprises: 
generating lattice coordinates along a sidewalk positioned near a road by using two-dimensional (2D) map data of an outdoor space; and 
extracting vertical coordinates corresponding to the lattice coordinates, from the 3D model data which is based on the image of the outdoor space captured from the air, both of which QIAN discloses (QIAN; ¶ 0030; “… the LiDAR point cloud data is first converted to a digital surface model (e.g., a digital elevation model including geolocations of buildings [‘3D model data which is based on the image of the outdoor space captured from the air’] … on the surface of the scanned location). Digital surface models (DSMs) is a well understood model, commonly used to refer to the digital elevation models (DEMs) that do not exclude man-made structures from surface information provided by the model. In contrast, digital terrain models are the digital elevation models that only contain the elevation of the barren terrain of the earth and may ignore the existence of man-made structures on such terrain. The LiDAR point cloud data is first converted to a digital surface model through rasterization. … the digital surface model (DSM) may be derived from overlying a grid (e.g., parallel to ground level) over the cityscape [‘outdoor space’] point cloud data and selecting an appropriate height based on point data within the grid element (e.g., maximum height or average height of point cloud points within the grid element). … each grid element may be identified by an (x, y) coordinate [‘generating lattice coordinates … by using two-dimensional (2D) map data’] and be associated with a z coordinate [‘extracting vertical coordinates corresponding to the lattice coordinates’] to obtain the 3D surface model. The 3D surface model may include surface information (e.g., heights and (x, y) coordinate locations) of urban structures (e.g., buildings …) and surface information of the terrain (streets, sidewalks [‘sidewalk positioned near a road’], earth [etc.]).”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method of claim 1 of ČADÍK-PIEMONTE to include the generating of lattice coordinates along a sidewalk positioned near a road by using two-dimensional (2D) map data of an outdoor space, and the extracting of vertical coordinates corresponding to the lattice coordinates from the 3D model data which is based on the image of the outdoor space captured from the air, as disclosed by QIAN. The motivation for this modification could have been to use a relatively constant, predictable surface such as a sidewalk to overlay coordinates from a digital elevation model.
ČADÍK-PIEMONTE-QIAN do not explicitly disclose that the virtual camera pose is set based on 3D coordinates defined by the lattice coordinates and the vertical coordinates, which YOSHIMURA discloses (YOSHIMURA; ¶ 0052; “… indicate the position of the virtual camera by [3-D] coordinates and indicate the orientation of the virtual camera by the enumeration of the values of yaw, roll, and pitch.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method of claim 1 of ČADÍK-PIEMONTE-QIAN to include the disclosure of YOSHIMURA that the virtual camera pose is set based on 3D coordinates defined by the lattice coordinates and the vertical coordinates. The motivation for this modification could have been to create novel viewpoints of a cityscape or street-view based on digital elevation models.
Regarding claim 9, ČADÍK-PIEMONTE-QIAN-YOSHIMURA disclose the method of claim 8, wherein the generating of the lattice coordinates comprises: 
detecting longitude and latitude coordinates of nodes on the sidewalk (QIAN; FIG. 1; ¶ 0027; “The satellite image may include metadata to provide a geolocation (e.g., latitude, longitude and height) of the satellite when the image was taken (which may be referred to herein as a satellite image location) as well as other information (e.g., RPC model data) to assist in determining geolocations of features in the image (e.g., positions of buildings, building corners, streets, etc. in the image) or simply the locations associated with … pixels of the image.” [The Examiner asserts that a person having ordinary skill in the art would have understood that ‘nodes on the sidewalk’ are analogous to the features in the image, such as those on the streets as disclosed by QIAN.]) by using the 2D map data; and 
converting the longitude and latitude coordinates into the lattice coordinates (QIAN; ¶ 0031; “Known real world locations associated with the point cloud data may be used to provide (or to derive) real world locations for locations within the 3D mesh model. … known real world longitude, latitude and height may be derived from the 3D surface (mesh) model for locations within the 3D surface (mesh) model” FIG. 8; ¶ 0032; “Each of the x, y, z geolocation coordinates of the obtained digital surface model may be output directly to shader 806a of GPU (graphics processing unit) 806. Shader 806a of GPU may provide a new set of vertex Cartesian coordinates representing (or used to determine) vertices of the polygons of the mesh.” ¶ 0033; “Known real world locations (geolocations) associated with the point cloud data may be used to provide (or to derive) real world locations (geolocations) for locations within the 3D surface (mesh) model. … known real world longitude, latitude and height may be derived from the point cloud data for locations within the 3D surface (mesh) model [‘lattice coordinates’].”).
Regarding claim 10, ČADÍK-PIEMONTE-QIAN-YOSHIMURA disclose the method of claim 8, wherein the 2D map data of the outdoor space includes plane coordinates of the sidewalk (QIAN; FIG. 1; ¶ 0028; “In step S102, point cloud data of … cityscapes [is] obtained. Point cloud data may comprise a plurality of points, where each point is identified by a coordinate and represents a point of a determined surface location of an element of a cityscape. … the points may be defined using x, y and z coordinates in a Cartesian coordinate system, where the collection of points represents the surface of the cityscape, such as the surface of roads, sidewalks, buildings, etc.”).
Regarding claim 11, ČADÍK-PIEMONTE-QIAN-YOSHIMURA disclose the method of claim 8, wherein 
the 3D model data includes a digital elevation model representing a bare earth of the outdoor space (QIAN; ¶ 0030; “… digital terrain models are the digital elevation models that only contain the elevation of the barren terrain of the earth and may ignore the existence of man-made structures on such terrain.”), and wherein 
the vertical coordinates are extracted from the digital elevation model (QIAN; ¶ 0030; “The LiDAR point cloud data is first converted to a digital surface model through rasterization. … the digital surface model (DSM) may be derived from overlying a grid (e.g., parallel to ground level) over the cityscape point cloud data and selecting an appropriate height based on point data within the grid element (e.g., maximum height or average height of point cloud points within the grid element). … each grid element may be identified by an (x, y) coordinate and be associated with a z coordinate [‘vertical coordinates are extracted’] to obtain the 3D surface model. The 3D surface model may include surface information (e.g., heights and (x, y) coordinate locations) of urban structures (e.g., buildings or other man-made structures) and surface information of the terrain (streets, sidewalks, earth [etc.]).”).
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over ČADÍK in view of PIEMONTE as applied to claim 1 above, and further in view of Wang (US PG-PUB 2014/0133741, 'WANG').
Regarding claim 12, ČADÍK-PIEMONTE disclose the method of claim 1, wherein 
the image of the outdoor space captured from the air includes a plurality of picture images captured (ČADÍK; p. 110, § 3., right col., last paragraph; “… Google-Earth-like digital terrain models are … available for the whole planet. They are acquired from satellites and/or planes and published in form of geo-referred digital elevation maps (DEMs) even for less accessible regions. Such models are sufficient for our purposes (i.e., outdoor photographs)”) while the camera is moving ([The Examiner invokes OFFICIAL NOTICE, wherein it is widely known that planes are moving with respect to the surface of the earth, as well as non-geostationary satellites; therefore, any camera mounted onto a plane or non-geostationary satellite is necessarily in motion.]).
ČADÍK-PIEMONTE do not explicitly disclose that the 3D model data is generated by using a disparity among the plurality of picture images, which WANG discloses (WANG; ¶ 0105; “… to obtain the disparity, it is necessary to … find out the corresponding regions in the left and right aerial images. For the two stereo images, stereo matching is performed to obtain the correlation of regions in the right and left images.” ¶ 0363; “The [3-D] feature data generating device … in which the plane combining unit generates the [3-D] rooftop structure of each feature based on all the planes forming the rooftop, which are extracted by the plane extracting unit, and utilizes the disparity information obtained by the stereo disparity correcting unit to generate the [3-D] model of the feature.”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the method of claim 1 of ČADÍK-PIEMONTE to include the disclosure that the 3D model data is generated by using a disparity among the plurality of picture images of WANG. The motivation for this modification could have been to use aerial images from stereo capture to generate three-dimensional feature data.
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over ČADÍK in view of PIEMONTE and FILIP.
Regarding claim 13, ČADÍK discloses a visual localization system, comprising: 
a data base configured to store a feature point map (ČADÍK; p. 111, right col.; “… we first exploit the image EXIF data (assisted by a camera database …)”); and 
a driving unit executed in a mobile device (ČADÍK; p. 111, right col., top paragraph; “… approximate locations can be easily found from the picture itself in many cases; GPS is integrated with many recent cameras and smartphones [‘mobile devices’] and its information is recorded in EXIF tags. … we assume the approximate location of the camera is already known and in the following steps we sample only near proximity of the known location”) and configured to perform visual localization by using the feature point map and images captured by the mobile device (ČADÍK; p. 111, left col., bottom paragraph; “To render the initial depth map, the location of a real camera (used to capture the outdoor photograph) [is] required to be known in advance. By default, we can locate the position of unknown camera using the structural features of an input photo. … we use skyline/contours and … geometric constraints … when the image lacks information about camera location.”), and wherein
the 3D model data is generated based on an image of an outdoor space (ČADÍK; p. 109, § 1. Intro., right col.; “… we present a fully automatic framework for depth-map generation and alignment for an outdoor photograph [‘image of an outdoor space’]. A virtual camera is … localized with the geo-tagging information of a photo and recent camera pose estimation … the 3D terrain model is rendered at the virtual camera to produce an … approximate depth map”; p. 111, left column; “In … an outdoor scene, the DEM is … sufficient for camera localization and pose estimation …”) captured from the air (ČADÍK; p. 110, § 3.; “… digital terrain models are … available for the … planet. They are acquired from satellites and/or planes …”).
ČADÍK does not explicitly disclose that the feature point map is generated by using 3D coordinates and 3D model data, which PIEMONTE discloses (PIEMONTE; ¶ 0164; “… a ray is received from the virtual camera … into the 3D scene [‘feature point map’] (e.g., into the center of the selected map feature). The pivot point P is the ray intersection of an eye forward vector (i.e., a vector in the direction in which the … camera is looking) from the eye into the selected feature in the map. The length of the vector L [is] determined based on the known position of the virtual camera in the scene (i.e., its position in 3D space [‘3D model data’], as represented by its x, y, z coordinates) and the determined position of the selected feature … (also represented by x, y, z coordinates [‘3D coordinates’]).”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the visual localization system of ČADÍK to include the disclosure that the feature point map is generated by using 3D coordinates and 3D model data of PIEMONTE. The motivation for this modification could have been to generate a sense of depth for objects within a map that a user may wish to use for interactive navigation.
ČADÍK-PIEMONTE do not explicitly disclose that the 3D coordinates are generated by using the 3D model data, which FILIP discloses (FILIP; FIG. 5; Col. 6, Lines 23-62; “… another format stores the objects in the street level image as 3D models. … rather than storing collections of points, the facades … of the buildings [are] represented as rectangles, triangles or other shapes defined by vertices having positions in space, such as latitude, longitude and altitude. … front facade 511 of building 510 [is] represented as a rectangular plane having four points 551-554, with each point defined in an (x, y, z) format [‘3D coordinates’] (such as latitude, longitude, altitude). … side facade 512 [is] represented as a rectangular plane defined by four points 553-556, two of which are shared with the front facade 511. … the buildings … in a street level image [are] stored as 3D models comprising polygons. … another format stores the objects as a set of planes corresponding with the object surfaces facing the camera that captured the street level image. Each plane may be associated with a unique index number, and the vertex of each plane may be defined in an (x, y, z) format [‘3D coordinates’] (such as latitude, longitude, altitude).”).
Before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to modify the visual localization system of ČADÍK-PIEMONTE to include the disclosure that the 3D coordinates are generated by using the 3D model data of FILIP. The motivation for this modification could have been to develop an urban architectural model using vertices and surfaces to represent architectural facades to save memory in a database.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN M COFINO whose telephone number is (303) 297-4268. The examiner can normally be reached Monday-Friday 10A-4P MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KENT W CHANG, can be reached on (571) 272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JONATHAN M COFINO/ Examiner, Art Unit 2619

/KENT W CHANG/ Supervisory Patent Examiner, Art Unit 2619