DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 06/24/2021 has been entered.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1-10 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Applicant Remarks:
A. Kottenstette in View of Farooqi Does Not Disclose "a second feature depicted in the one or more images" as Recited by Kottenstette in View of Farooqi
In response to Applicant's previous arguments that Farooqi does not disclose determining measurements for one feature based on the measurements of another feature, the Advisory Action stated that the "examiner
Independent Claims 1 and 13 each recite: estimate a particular real-world measurement of a particular feature of a particular real-world structure based on a particular digital image set containing one or more images of the particular real-world structure; wherein the particular digital image set comprises metadata indicating a measurement of a second feature depicted in the one or more images other than the particular feature, wherein the measurement of the second feature is a value based on an actual real-world distance of a dimension of the second feature ... . 
(Emphasis added). Thus, the "particular digital image set" includes "a particular feature of a particular real-world structure" in "one or more images", and also a "second feature" in the same "one or more images". (Emphasis added). The "particular feature" and the "second feature" are both things which are captured and depicted in the same "one or more images". The "second feature" is not metadata itself, as asserted by the Advisory Action. Moreover, Claims 1 and 13 clearly recite the "metadata" and the "second feature" as separate features, reciting the "metadata indicating a measurement of [the] second feature". Consequently, because the asserted metadata does not correspond to the recited "second feature" as asserted by the Advisory Action, Applicant respectfully submits that Kottenstette in view of Farooqi does not disclose "a second feature depicted in the one or more images" as recited by independent Claims 1 and 13 and, as such, does not establish a prima facie case of obviousness for these claims or their dependents. (Emphasis added).
Examiner Response:
	The examiner notes that, for the arguments regarding the newly amended limitations, a new reference has been added to address any deficiencies. Therefore, the arguments are moot.
	However, the examiner notes that Kottenstette in view of Farooqi does in fact teach features from images. Although these are not explicitly recited as a first feature and a second feature, Kottenstette teaches in paragraph [0111]: “In some embodiments, the analytical module can implement and/or execute to detect objects, extract objects, align images” The examiner notes that features in an image are interpreted as objects within an image. Therefore, if multiple objects are detected, then a first feature and a second feature have been sufficiently taught.
Applicant Remarks:
B. The Asserted Modification of Kottenstette in View of Farooqi Would Not Provide Real-World Measurements
In response to Applicant's previous arguments that the asserted modification of Kottenstette in view of Farooqi would not provide real-world measurements, the Examiner stated: ... according the applicants claim the criteria for a measurement to be a real-world measurement is based on the type of object being measured. ... the objects being measured are all real-world objects. Therefore, the corresponding measurements are interpreted as real-world measurements. The same applies to the objects in Farooqi. Farooqi teaches measurements of a real-world object such as an apple.
The Advisory Action, p. 4. The Examiner further stated "... the claims of the instant application don't provide any type of dimension ... The Examiner suggests amending the claims to include details from the applicants specification." The Advisory Action, p. 4. 

Applicant respectfully submits that, if combined, Kottenstette and Farooqi would not satisfy all features of amended independent Claims 1 and 13. The Office Action relied upon Farooqi to teach "wherein the particular digital image set comprises metadata indicating a measurement of a second feature depicted in the one or more images other than the particular feature" and "... of the particular feature based, at least in part, on the measurement of the second feature". Farooqi, however, does not teach that "the measurement of the second feature is a value based on an actual real-world distance of a dimension of the second feature" as recited by Claims 1 and 13. (Emphasis added). Instead, Farooqi teaches measurements defined relative to an arbitrary grid that is overlaid on its images. Farooqi's paragraph [0044] states:
FIGS. 7A and 7B illustrate how some of such metadata can be generated. With reference to 700A, an apple is pictured within a proposed grid point mask (e.g., 20x20, etc.) for measurements. This grid point mask can be applied to a cropped binary image so that the length and height can be calculated using the grid points (Emphasis added). Farooqi does not teach that the grid corresponds to or is used to determine particular dimensions. Indeed, Farooqi does not 
Farooqi's paragraph [0044] further teaches that "[w]ith these calculations, the shape of the object can be characterized." Notably, the grid size for each of the above figures, FIGS. 7A and 7B, is different and, as noted above, Farooqi discloses no particular significance to the sizes of the grid nor their correspondence with real-world dimensions. It will be appreciated that, so long as the measurements retain a common reference (the grid point mask), characteristics such as the shape of the object can be determined since the relative sizes of different portions of an image remain consistent. Thus, Farooqi has not been established to disclose "the measurement of the second feature is a value based on an actual real-world distance", nor does it disclose a need for such real-world measurements, nor how to relate its measurements to "an actual real-world distance". As such, Applicant respectfully submits that the combination of Kottenstette with Farooqi does not disclose all features of Claims 1 and 13, including that "the particular digital image set comprises metadata indicating a measurement of a second feature depicted in the one or more images other than the particular feature, wherein the measurement of the second feature is a value based on an actual real-world distance of a dimension of the second feature." (Emphasis added). Accordingly, Applicant submits that the combination of Kottenstette and Farooqi does not establish a prima facie case of obviousness for at least these reasons.

Examiner Response:
	The examiner notes that a new reference has been added to address the deficiencies. Wang (U.S. 2007/0269102) further teaches the use of LIDAR, which captures direct measurements from a real-world scene. Refer to Wang, paragraph [0016], which recites “LIDAR ground 
	The examiner suggests further amending the claims to include additional details from the specification.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 6, 8-10, 12, 13, 15, 18, 20-22, 24 and 33-36 are rejected under 35 U.S.C. 103 as being unpatentable over Kottenstette (U.S. 2017/0076438) in view of Farooqi (U.S. 2018/0150713) and Wang (U.S. 2007/0269102).

Regarding claim 13, A system comprising: one or more processors (Kottenstette: Paragraph [0044] “processor”); one or more non-transitory computer-readable media storing instructions which, when executed by the one or more processors (Kottenstette: Paragraph [0045] “computer readable media that include executable instructions (e.g., computer program of instructions)”), cause performance of: 

training a machine learning model (Kottenstette: Paragraph [0045] “a classifier can be created based on the training images and the parameter sets. The classifier can be configured to determine a parameter set for an image. In some embodiments, the classifier can be created using a machine learning system, such as an ANN or a CNN as would be appreciated by one of ordinary skill in the art.” Training a machine learning model is taught as creating a machine learning model based on training images via a machine learning system.) to estimate real-world measurements of features of real-world structures (Kottenstette: Paragraph [0182] “Height is taken to be relative to local environment (e.g., the height of buildings above the surface of neighbors). With one or more images of the same area, an above ground height model (AGHM) image can be generated using a classifier, where every pixel of the AGHM image can be assigned to a height above ground. The classifier can estimate elevations or heights of geographic regions and/or objects in images by direct or relative prediction without performing feature search or extraction.” To estimate real-world measurements of features of real-world structures is taught as the classifier can estimate elevations or heights of geographic regions and/or objects in images[height of buildings is taught as the real-world measurement of real-world structures]); 

wherein training the machine learning model includes providing to the machine learning model: a plurality of image sets (Kottenstette: Paragraph [0252] “a first type of image sets can be received. In some embodiments, each first type of image set can include one or more first type of images.” A plurality of image sets is taught as the first type of image sets. The prior art further teaches a second type of image set.), wherein each image set of the plurality of image sets includes one or more images (Kottenstette: Paragraph [0252] “a first type of image sets can be received. In some embodiments, each first type of image set can include one or more first type of images.” Wherein each image set of the plurality of image sets includes one or more images is taught as the first image set including one or more first type of images.) of a corresponding real-world structure (Kottenstette: Paragraph [0005] “ the structural and geospatial data include the following: the area of real property including land and/or buildings; the square footage of a building; the roof size and/or type; the presence of a pool and its size and/or location; and the presence of trees and its type, size, and/or location.” A corresponding real-world structure is taught as the image geospatial data including buildings, land, roof and pool.), 

and a plurality of real-world measurements (Kottenstette: Figure 12. Step 1204. “Receive labels that indicate heights” A plurality of real-world measurements are taught as labels that indicate heights of objects in images. Paragraph [0231] “Each geometric object property can identify a corresponding geometric object by, for example, identifying a property or attribute of the corresponding geometric object. In some embodiments, a geometric object property can be one of slope, pitch, dominant pitch, material, area, height, or volume.” The objects in the images can be identified by slope, pitch, dominant pitch, material, area, height, or volume.), wherein the plurality of real-world measurements includes, for each image set of the plurality of image sets, a real-world measurement of a feature of the corresponding real-world structure (Kottenstette: Paragraph [0182] “The classifier can estimate elevations or heights of geographic regions and/or objects in images by direct or relative prediction without performing feature search or extraction. The classifier can generate an AGHM based on the estimation.” For each image set of the plurality of image sets, a real-world measurement of a feature of the corresponding real-world structure is taught as estimated elevations or heights of geographic regions and/or objects in images by direct or relative prediction without performing feature search or extraction.); 

and after training the machine learning model, using the machine learning model to estimate a particular real-world measurement of a particular feature of a particular real-world structure based on a particular digital image set containing one or more images of the particular real-world structure (Kottenstette: Paragraph [0207] “At step 1206, a regression model can be created based on the received training images and the received training labels. The regression model can be configured to determine a height of one or more regions of an image. In some embodiments, the regression model can be created using a machine learning system, such as an ANN or a CNN as would be appreciated by one of ordinary skill in the art.” After training the machine learning model, using the machine learning model to estimate a particular real-world measurement of a particular feature of a particular real-world structure based on a particular digital image set containing one or more images of the particular real-world structure is taught as a regression model can be created based on the received training images and the received training labels. The regression model can be configured to determine a height of one or more regions or objects of an image [including a building (real-world structure)]. The prior art is directed at estimating the height or slope [real-world measurement] of the roof [particular feature] of a building [particular real-world structure] and what type it is.)… wherein said using the machine learning model to estimate the particular real-world measurement of the particular feature comprises using the machine learning model to estimate the particular real-world measurement (Kottenstette: Paragraph [0207] “At step 1206, a regression model can be created based on the received training images and the received training labels. The regression model can be configured to determine a height of one or more regions of an image. In some embodiments, the regression model can be created using a machine learning system, such as an ANN or a CNN as would be appreciated by one of ordinary skill in the art.” Using the machine learning model to estimate the particular real-world measurement of the particular feature comprises using the machine learning model to estimate the particular real-world measurement is taught as a regression model configured to determine a height of one or more regions or objects of an image [including a building (real-world structure)]. The prior art is directed at estimating the height or slope [i.e. real-world measurement] of the roof [i.e. particular feature] of a building [i.e. particular real-world structure] and what type it is.)
Kottenstette does not explicitly disclose …, wherein the particular digital image set comprises metadata indicating a measurement of a second feature depicted in the one or more images other than the particular feature; … , wherein the measurement of the second feature is a value based on an actual real-world distance of a dimension of the second feature; … of the 

	Farooqi further teaches …, wherein the particular digital image set comprises metadata indicating a measurement of a second feature depicted in the one or more images other than the particular feature (Farooqi: Paragraph [0043] “extracts features from the image data which can, for example, include metadata characterizing the image. In some cases, the metadata is included as part of the image data while, in other implementations, the metadata can be stored separately (or derived separately from the optical sensor(s) that generated the image data). For example, the metadata can include measurements of an object within the bounding polygon such as, for example, length, height, depth, world-coordinates (3-D coordinates), average color, size and shape, time of day of image capture, and the like.” Wherein the particular digital image set comprises metadata indicating a measurement of a second feature depicted in the one or more images other than the particular feature is taught by the features [i.e. multiple features] that are extracted including metadata that can include measurements of an object within the bounding polygon such as, for example, length, height, depth, world-coordinates (3-D coordinates), average color, size and shape, time of day of image capture, and the like. [i.e. a second feature with metadata indicating a measurement of the second feature in the image]. The examiner notes that multiple features may be extracted from an image along with metadata including measurements of an object.); … of the particular feature based (Farooqi: Paragraph [0041] “the training of such machine learning models, features are established for image data which are then extracted from the historical image data to facilitate future predictions/determinations using the binary classifier 640. In some cases, the binary classifier 640 utilizes the depth information in the RGB-D data as one of the features used in both training the machine learning model and in determining whether a proposed bounding polygon encapsulates an object.” A machine learning model is used to make predictions/determinations based on features.), at least in part, on the measurement of the second feature (Farooqi: Paragraph [0043] “extracts features from the image data which can, for example, include metadata characterizing the image. In some cases, the metadata is included as part of the image data while, in other implementations, the metadata can be stored separately (or derived separately from the optical sensor(s) that generated the image data). For example, the metadata can include measurements of an object within the bounding polygon such as, for example, length, height, depth, world-coordinates (3-D coordinates), average color, size and shape, time of day of image capture, and the like.” The classification is based on the features (i.e. second feature) and metadata including measurements of an object in the image including height or length etc. The examiner interprets features to be multiple features which encompasses the scope of a second feature.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the machine learning object detection system of Kottenstette with the features and metadata of Farooqi in order to allow extracting features from the image data which can include metadata characterizing the image, thereby reducing the n class problem to an easier problem because the metadata can be used for pre-classification (Farooqi: Paragraph [0050] “The meta data is used here as pre-classification and reduces the n class problem to an easier problem because each machine learning model is only responsible for a subset of these n classes.”).

Kottenstette in view of Farooqi does not explicitly disclose, wherein the measurement of the second feature is a value based on an actual real-world distance of a dimension of the second feature;… , wherein the particular real-world measurement of the particular feature is a value based on an actual real-world distance of a dimension of the particular feature.

	Wang further teaches, wherein the measurement of the second feature is a value based on an actual real-world distance of a dimension of the second feature (Wang: Paragraph [0016] “The scene covered and represented by such a 3D Image is a three-dimensional real world scene where every visible thing in the 3D Image has 3D coordinates. The three-dimensional XYZ coordinates of all the pixels of a 3D Image are attributed by the method and system of this invention that generates 3D Images with airborne oblique/vertical imagery, GPS/IMU, and LIDAR ground surface elevation or range data, which are to be described in this document. A 3D Image allows direct measurements of the location, length, distance, height, area, and volume and indirect measurements including but not limited to profile and sight of view all in 3D.” The measurement of the second feature is a value based on an actual real-world distance of a dimension of the second feature is taught as a three-dimensional real world scene where the objects in the image are directly measured based on location, length, distance, height, area, and volume. For example, the Lidar system measures objects such as buildings and roads within images. Refer to Paragraph [0008].); …, wherein the particular real-world measurement of the particular feature is a value based on an actual real-world distance of a dimension of the particular feature (Wang: Paragraph [0016] “The scene covered and represented by such a 3D Image is a three-dimensional real world scene where every visible thing in the 3D Image has 3D coordinates. The three-dimensional XYZ coordinates of all the pixels of a 3D Image are attributed by the method and system of this invention that generates 3D Images with airborne oblique/vertical imagery, GPS/IMU, and LIDAR ground surface elevation or range data, which are to be described in this document. A 3D Image allows direct measurements of the location, length, distance, height, area, and volume and indirect measurements including but not limited to profile and sight of view all in 3D.” The particular real-world measurement of the particular feature is a value based on an actual real-world distance of a dimension of the particular feature is taught as a three-dimensional real world scene where the objects in the image are directly measured based on location, length, distance, height, area, and volume. The examiner notes that Wang teaches a method of measuring objects in images in a real world scene. The measurements are for a 3D Image that has 3D XYZ coordinates. The Lidar system measures objects such as buildings and roads within images. It is obvious that an image may capture multiple objects. Refer to Paragraph [0008].).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kottenstette and Farooqi with the 3D image model of Wang in order to capture a 3D image that allows direct measurements of the location, length, distance, height, area, and volume, thereby reducing cost and time in building 3D images with direct measurements (Wang: Paragraph [0005] “the entire process to produce such a 3D ground surface model is very time consuming and labor intensive, particularly for the making of building facets, which so far limits its use only to a limited number of big cities.” [0015] “ In a 3D Image based GR3DGSM system, all building facets are automatically geo-referenced and automatically at where they are supposed to be, which saves tremendous time and therefore reduces a lot of cost.”).
Claim 1 is similarly rejected; refer to claim 13 for further analysis.

Application 15/701,321, filed 09/11/2017, Attorney Docket No. 60452-0012

Regarding claim 15, Kottenstette in view of Farooqi and Wang teaches the system of claim 13, wherein the instructions, when executed, further cause comparing the particular real-world structure with the plurality of real-world structures to identify how different the particular real-world structure is from the plurality of real-world structures (Kottenstette: Paragraph [0177] “The accuracy and precision of the output from an object detector using machine learning can vary depending on many factors. These factors include, for example, the object detection method used, the type of input, the complexity of the input, the amount of data used in training the object detector, the quality of the data used in training the object detector, and the similarity between the training data and the input.” Comparing the particular real-world structure with the plurality of real-world structures to identify how different the particular real-world structure is from the plurality of real-world structures is taught by the accuracy and precision of the output from an object detector using machine learning varying depending on many factors, including the similarity between the training data and the input (i.e. the comparison distinguishing how different the real-world structures are from each other). The more similar they are, the more accurate the output is. If the real-world structure is not as similar, one could interpret that they are different.); based on how different the particular real-world structure is from the plurality of real-world structures determining a confidence level associated with the particular real-world measurement (Kottenstette: Paragraph [0116] “M-Turk validation 2832 can determine whether the ground truth candidate(s) in ground truth candidate DB 2830 is valid. In some embodiments, a ground truth candidate can be valid if its accuracy level can be determined to exceed a threshold value. If the ground truth candidate(s) is valid, it can be stored in ground truth DB 2828.” Determining a confidence level associated with the particular real-world measurement is taught as a ground truth candidate can be valid if its accuracy level can be determined to exceed a threshold value. [accuracy level is the confidence level]).
Claim 3 is similarly rejected; refer to claim 15 for further analysis.

Regarding claim 18, Kottenstette in view of Farooqi and Wang teaches the system of claim 13, wherein training the machine learning model is based on one or more metadata associated with at least one image in each image set of the plurality of image sets (Kottenstette: Paragraph [0291] “In the case of geospatial images, multiple sets of images can share common, known locations. The images of a common location can be identified via their GIS metadata, and the labels can be shared or transferred. ” Training the machine learning model is based on one or more metadata associated with at least one image in each image set of the plurality of image sets is taught as the geospatial images, multiple sets of images can be identified via their GIS metadata and labels that are used for training the classifier.). 
Claim 6 is similarly rejected; refer to claim 18 for further analysis.

Regarding claim 20, Kottenstette in view of Farooqi and Wang teaches the system of claim 13, wherein training the machine learning model is based on output from a second machine learning model (Kottenstette: Paragraph [0142] “The classifier can be configured to identify the class of each pixel of an image. In some embodiments, the classifier can be created using a machine learning system, which includes one or more classifiers such as an artificial neural network (ANN) including but not limited to a convolutional neural network (CNN), as would be appreciated by one of ordinary skill in the art.” Training the machine learning model is based on output from a second machine learning model is taught as the use of multiple classifiers that are created by the machine learning system.). 
Claim 8 is similarly rejected; refer to claim 20 for further analysis.

Regarding claim 21, Kottenstette in view of Farooqi and Wang teaches the system of claim 13, wherein the particular real-world structure is a particular building and the plurality of image sets correspond to a plurality of buildings (Kottenstette: Paragraph [0215] “The machine learning network can be trained with various types of data, such as data regarding building height and location (e.g., AGHM and table of values), images of buildings (e.g., nadir or near-nadir imagery), and information regarding the location of the camera and the sun.” The particular real-world structure is a particular building and the plurality of image sets correspond to a plurality of buildings is taught as the machine learning network can be trained with various types of data, such as data regarding building height and location (e.g., AGHM and table of values), images of buildings.).
Claim 9 is similarly rejected; refer to claim 21 for further analysis.

Regarding claim 22, Kottenstette in view of Farooqi and Wang teaches the system of claim 21, wherein the particular feature of the particular building is a roof of the particular building (Kottenstette: Paragraph [0053] “FIG. 2 illustrates a method for producing a heat map for identifying roofs in an image.” The particular feature of the particular building is a roof of the particular building is taught as identifying roofs in an image.), and the particular real-world measurement is an area of the roof (Kottenstette: Paragraph [0231] “a geometric object property can be one of slope, pitch, dominant pitch, material, area, height, or volume. For example, if a corresponding geometric object is a roof, the geometric object property can identify the pitch of the roof.” The particular real-world measurement is an area of the roof is taught as a geometric object being a roof in which the area is calculated.).

Claim 10 is similarly rejected; refer to claim 22 for further analysis.

Regarding claim 24, Kottenstette in view of Farooqi and Wang teaches the system of claim 22; Kottenstette further teaches wherein the machine learning model is a first machine learning model; and using the first machine learning model to estimate the particular real-world measurement (Kottenstette: Paragraph [0207] “At step 1206, a regression model can be created based on the received training images and the received training labels. The regression model can be configured to determine a height of one or more regions of an image. In some embodiments, the regression model can be created using a machine learning system, such as an ANN or a CNN as would be appreciated by one of ordinary skill in the art.” Using the first machine learning model to estimate a particular real-world measurement is taught as a regression model can be created based on the received training images and the received training labels. The regression model can be configured to determine a height of one or more regions or objects of an image [including a building (real-world structure)].) comprises: using the first machine learning model to estimate an outline of the roof (Kottenstette: Paragraph [0237] “In some embodiments, object of interest 3406 can be outlined on a different background as shown in image 3404 than the original image (RGB image 3402)…. For example, the extractor can determine that the area of the pool is 40 m² as shown in 3410.” Object of interest as indicated by the prior art may be a pool, roof or any object of interest. The object of interest may be outlined in order to calculate the area. In Paragraph [0236] “In some embodiments, the target geometric object property can be one of slope, pitch, dominant pitch, material, area, height, or volume. For example, if the corresponding geometric object is a roof, the target geometric object property can identify the dominant pitch of the roof.” The prior art indicated that the object property can be analyzed for the slope, pitch and area etc.), … to estimate a slope of the roof (Kottenstette: Paragraph [0237] “In some embodiments, object of interest 3406 can be outlined on a different background as shown in image 3404 than the original image (RGB image 3402)…. For example, the extractor can determine that the area of the pool is 40 m² as shown in 3410.” Object of interest as indicated by the prior art may be a pool, roof or any object of interest. The object of interest may be outlined in order to calculate the area. In Paragraph [0236] “In some embodiments, the target geometric object property can be one of slope, pitch, dominant pitch, material, area, height, or volume. For example, if the corresponding geometric object is a roof, the target geometric object property can identify the dominant pitch of the roof.” The prior art indicated that the object property can be analyzed for the slope, pitch and area etc.); and calculating the area of the roof based on the outline of the roof and the slope of the roof (Kottenstette: Paragraph [0236] “In some embodiments, the target geometric object property can be one of slope, pitch, dominant pitch, material, area, height, or volume. For example, if the corresponding geometric object is a roof, the target geometric object property can identify the dominant pitch of the roof.” The prior art indicated that the object property can be analyzed for the slope, pitch and area etc. based on the outline identified.).

Farooqi further teaches … using a second trained machine learning model (Farooqi: Paragraph [0039] “a first object classifier 660 and a second object classifier” Farooqi teaches a first and second classifier (i.e. a second trained machine learning model). Refer to the Abstract of Farooqi which states “a second object classifier having at least one machine learning model trained” Therefore each object classifier has at least one machine learning model.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the machine learning object detection system of Kottenstette with the features and metadata of Farooqi in order to allow the calculation of a final classification based on the output of the first and second classifier, thereby outputting the object having the highest score (Farooqi: Paragraph [0051] “the first object classifier 660 and the second object classifier 670 that indicate a likelihood of the objects identified by the respective object classifier 660, 670. The final object classification module 680 uses such scores/measures when selecting a final object. For example, the object having the highest score can be selected as the final object”).

Claim 12 is similarly rejected; refer to claim 24 for further analysis.

Regarding claim 29, Kottenstette in view of Farooqi and Wang teaches the system of claim 13, Kottenstette further teaches wherein the estimate of the particular real-world measurement of (Kottenstette: Figure 12. Step 1204. “Receive labels that indicate heights” A plurality of real-world measurements are taught as labels that indicate heights of objects in images. Paragraph [0231] “Each geometric object property can identify a corresponding geometric object by, for example, identifying a property or attribute of the corresponding geometric object. In some embodiments, a geometric object property can be one of slope, pitch, dominant pitch, material, area, height, or volume.” The objects in the images can be identified by slope, pitch, dominant pitch, material, area, height, or volume.)…

Farooqi further teaches the particular feature is different than the measurement of the second feature (Farooqi: Paragraph [0050] “first machine learning model can be trained to classify small objects, a second machine learning model can be trained to classify large objects” The estimate of the particular real-world measurement of the particular feature is different than the measurement of the second feature is taught as the first machine learning model that classifies small objects and the second machine learning model that classifies big objects. The examiner interprets the measurements of both of the features or objects to be different.).  

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the machine learning object detection system of Kottenstette with the features and metadata of Farooqi in order to allow the calculation of a final classification based on the output of the first and second classifier, thereby outputting the object having the highest score (Farooqi: Paragraph [0051] “the first object classifier 660 and the second object classifier 670 that indicate a likelihood of the objects identified by the respective object classifier 660, 670. The final object classification module 680 uses such scores/measures when selecting a final object. For example, the object having the highest score can be selected as the final object”).

Claim 25 is similarly rejected; refer to claim 29 for further analysis.

Regarding claim 30, Kottenstette in view of Farooqi and Wang teaches the system of claim 13, Kottenstette further teaches wherein the measurement (Kottenstette: Figure 12. Step 1204. “Receive labels that indicate heights” A plurality of real-world measurements are taught as labels that indicate heights of objects in images. Paragraph [0231] “Each geometric object property can identify a corresponding geometric object by, for example, identifying a property or attribute of the corresponding geometric object. In some embodiments, a geometric object property can be one of slope, pitch, dominant pitch, material, area, height, or volume.” The objects in the images can be identified by slope, pitch, dominant pitch, material, area, height, or volume.)… 

Farooqi further teaches of the second feature comprises output of a second machine learning model (Farooqi: Paragraph [0050] “a second machine learning model can be trained to classify large objects” The measurement of the second feature comprises output of a second machine learning model is taught as a second machine learning model can be trained to classify large objects).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the machine learning object detection system of Kottenstette with the features and metadata of Farooqi in order to allow the calculation of a final classification based on the output of the first and second classifier, thereby outputting the object having the highest score (Farooqi: Paragraph [0051] “the first object classifier 660 and the second object classifier 670 that indicate a likelihood of the objects identified by the respective object classifier 660, 670. The final object classification module 680 uses such scores/measures when selecting a final object. For example, the object having the highest score can be selected as the final object”).

Claim 26 is similarly rejected; refer to claim 30 for further analysis.

Regarding claim 31, Kottenstette in view of Farooqi and Wang teaches the system of claim 13, Kottenstette further teaches wherein using the machine learning model to estimate the particular real-world measurement of the particular feature (Kottenstette: Figure 12. Step 1204. “Receive labels that indicate heights” A plurality of real-world measurements are taught as labels that indicate heights of objects in images. Paragraph [0231] “Each geometric object property can identify a corresponding geometric object by, for example, identifying a property or attribute of the corresponding geometric object. In some embodiments, a geometric object property can be one of slope, pitch, dominant pitch, material, area, height, or volume.” The objects in the images can be identified by slope, pitch, dominant pitch, material, area, height, or volume.)…

Farooqi further teaches is further based, at least in part, on a spatial relationship between the particular feature and the second feature (Farooqi: Paragraph [0003] “A final classification for each bounding polygon is then determined based on the output of the first classifier machine learning model and the output of the second classifier machine learning model. Data characterizing the final classification for each bounding polygon can then be provided.” At least in part, on a spatial relationship between the particular feature and the second feature is taught as the final calculation that is determined based on the first and second machine learning models or the first and second features (i.e. a spatial relationship between the particular feature and the second feature). The features are scored and measured against each other to determine the likelihood of the objects classified.).  

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the machine learning object detection system of Kottenstette with the features and metadata of Farooqi in order to allow the calculation of a final classification based on the output of the first and second classifier, thereby outputting the object having the highest score (Farooqi: Paragraph [0051] “the first object classifier 660 and the second object classifier 670 that indicate a likelihood of the objects identified by the respective object classifier 660, 670. The final object classification module 680 uses such scores/measures when selecting a final object. For example, the object having the highest score can be selected as the final object”).

Claim 27 is similarly rejected; refer to claim 31 for further analysis.

Regarding claim 33, (New) Kottenstette in view of Farooqi and Wang teach the system of Claim 1, Wang further teaches wherein the particular real-world measurement of the particular feature is one or more of an actual real-world length of the particular feature, an actual real-world area of the particular feature, an actual real-world volume of the particular feature, and an actual real-world slope of the particular feature (Wang: Paragraph [0016] “The scene covered and represented by such a 3D Image is a three-dimensional real world scene where every visible thing in the 3D Image has 3D coordinates. The three-dimensional XYZ coordinates of all the pixels of a 3D Image are attributed by the method and system of this invention that generates 3D Images with airborne oblique/vertical imagery, GPS/IMU, and LIDAR ground surface elevation or range data, which are to be described in this document. A 3D Image allows direct measurements of the location, length, distance, height, area, and volume and indirect measurements including but not limited to profile and sight of view all in 3D.” The particular real-world measurement of the particular feature is a value based on an actual real-world distance of a dimension of the particular feature is taught as a three-dimensional real world scene where the objects in the image are directly measured based on location, length, distance, height, area, and volume. The examiner notes that Wang teaches a method of measuring objects in images in a real world scene based on length, distance, height, area, and volume. The measurements are for a 3D Image that has 3D XYZ coordinates. The Lidar system measures objects such as buildings and roads within images. It is obvious that an image may capture multiple objects. Refer to Paragraph [0008].).  

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kottenstette and Farooqi with the 3D image model of Wang in order to capture a 3D image that allows direct measurements of the location, length, distance, height, area, and volume, thereby reducing cost and time in building 3D images with direct measurements (Wang: Paragraph [0005] “the entire process to produce such a 3D ground surface model is very time consuming and labor intensive, particularly for the making of building facets, which so far limits its use only to a limited number of big cities.” [0015] “ In a 3D Image based GR3DGSM system, all building facets are automatically geo-referenced and automatically at where they are supposed to be, which saves tremendous time and therefore reduces a lot of cost.”).

Regarding claim 34, (New) Kottenstette in view of Farooqi and Wang teach the system of Claim 33, wherein the length of the particular feature is a height or a perimeter length (Kottenstette: Figure 12. Step 1204. “Receive labels that indicate heights” A plurality of real-world measurements are taught as labels that indicate heights of objects in images. Paragraph [0231] “Each geometric object property can identify a corresponding geometric object by, for example, identifying a property or attribute of the corresponding geometric object. In some embodiments, a geometric object property can be one of slope, pitch, dominant pitch, material, area, height, or volume.” The objects in the images can be identified by slope, pitch, dominant pitch, material, area, height, or volume.).  

Regarding claim 35, (New) Kottenstette in view of Farooqi and Wang teach the system of Claim 13, wherein the particular real-world measurement of the particular feature is one or more of an actual real-world length of the particular feature, an actual real-world area of the particular feature, an actual real-world volume of the particular feature, and an actual real-world slope of the particular feature (Wang: Paragraph [0016] “The scene covered and represented by such a 3D Image is a three-dimensional real world scene where every visible thing in the 3D Image has 3D coordinates. The three-dimensional XYZ coordinates of all the pixels of a 3D Image are attributed by the method and system of this invention that generates 3D Images with airborne oblique/vertical imagery, GPS/IMU, and LIDAR ground surface elevation or range data, which are to be described in this document. A 3D Image allows direct measurements of the location, length, distance, height, area, and volume and indirect measurements including but not limited to profile and sight of view all in 3D.” The particular real-world measurement of the particular feature is a value based on an actual real-world distance of a dimension of the particular feature is taught as a three-dimensional real world scene where the objects in the image are directly measured based on location, length, distance, height, area, and volume. The examiner notes that Wang teaches a method of measuring objects in images in a real world scene based on length, distance, height, area, and volume. The measurements are for a 3D Image that has 3D XYZ coordinates. The Lidar system measures objects such as buildings and roads within images. It is obvious that an image may capture multiple objects. Refer to Paragraph [0008].).  

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kottenstette and Farooqi with the 3D image model of Wang in order to capture a 3D image that allows direct measurements of the location, length, distance, height, area, and volume, thereby reducing cost and time in building 3D images with direct measurements (Wang: Paragraph [0005] “the entire process to produce such a 3D ground surface model is very time consuming and labor intensive, particularly for the making of building facets, which so far limits its use only to a limited number of big cities.” [0015] “ In a 3D Image based GR3DGSM system, all building facets are automatically geo-referenced and automatically at where they are supposed to be, which saves tremendous time and therefore reduces a lot of cost.”).

Regarding claim 36, (New) Kottenstette in view of Farooqi and Wang teach the system of Claim 35, wherein the length of the particular feature is a height or a perimeter length (Kottenstette: Figure 12. Step 1204. “Receive labels that indicate heights” A plurality of real-world measurements are taught as labels that indicate heights of objects in images. Paragraph [0231] “Each geometric object property can identify a corresponding geometric object by, for example, identifying a property or attribute of the corresponding geometric object. In some embodiments, a geometric object property can be one of slope, pitch, dominant pitch, material, area, height, or volume.” The objects in the images can be identified by slope, pitch, dominant pitch, material, area, height, or volume.).

Claims 7, 19, 28, and 32 are rejected under 35 U.S.C. 103 as being unpatentable over Kottenstette (U.S. 2017/0076438) in view of Farooqi (U.S. 2018/0150713), Wang (U.S. 2007/0269102) and Mercep (U.S. 2018/0314253).
Regarding claim 19, Kottenstette in view of Farooqi and Wang teaches the system of claim 13, Kottenstette further teaches further comprising: after training the machine learning model (Kottenstette: Paragraph [0227] “After the machine learning network has been trained,”  After training the machine learning model is taught by after the classifier has been trained), using the machine learning model to estimate a second real-world measurement of a second feature of a second real-world structure based on a dataset  (Kottenstette: Paragraph [0227] “After the machine learning network has been trained, the classifier can take the appearances of each face in multiple spectra (e.g., RGB, IR) and a model that describes the reflectance of the sun light from a surface with given material properties. Additionally, the material constants can be estimated whenever a sufficient number of facets are visible. The output of the production classifier can include a map of the buildings and roof pitch.” Using the machine learning model to estimate a second real-world measurement of a second feature of a second real-world structure based on a dataset is taught as after the machine learning classifier is trained the classifier can take IR or images in order to output an estimation of the buildings and roof pitch.) …
(Farooqi: Paragraph [0043] “extracts features from the image data which can, for example, include metadata characterizing the image. In some cases, the metadata is included as part of the image data while, in other implementations, the metadata can be stored separately (or derived separately from the optical sensor(s) that generated the image data). For example, the metadata can include measurements of an object within the bounding polygon such as, for example, length, height, depth, world-coordinates (3-D coordinates), average color, size and shape, time of day of image capture, and the like.” Comprising metadata describing second one or more images of the second real-world structure is taught as extracting the metadata from the images which include measurements of an object.), wherein using the machine learning model to estimate the second real-world measurement (Farooqi: Paragraph [0041] “the training of such machine learning models, features are established for image data which are then extracted from the historical image data to facilitate future predictions/determinations using the binary classifier 640. In some cases, the binary classifier 640 utilizes the depth information in the RGB-D data as one of the features used in both training the machine learning model and in determining whether a proposed bounding polygon encapsulates an object.” Using the machine learning model to estimate the second real-world measurement is taught as a machine learning model is used to make predictions/determinations based on features.) …

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the machine learning object detection system of Kottenstette with the features and metadata of Farooqi in order to allow extracting features from the image data which can include metadata characterizing the image, thereby reducing the n class problem to an easier problem because the metadata can be used for pre-classification (Farooqi: Paragraph [0050] “The meta data is used here as pre-classification and reduces the n class problem to an easier problem because each machine learning model is only responsible for a subset of these n classes.”).

Kottenstette in view of Farooqi and Wang does not explicitly disclose involves generating an estimate of the second real-world measurement without providing the machine learning model the second one or more images themselves.

Mercep further teaches involves generating an estimate of the second real-world measurement without providing the machine learning model the second one or more images themselves (Mercep: Paragraph [0008] “a machine learning object classifier to generate a classification for the sensor measurement data corresponding to the detection events, and optionally a confidence level associated with the generated classification. The computing system can input a matchable representation of the sensor measurement data into the machine learning object classifier, which can compare the matchable representation of the sensor measurement data to at least one object model describing a type of object capable of being located proximate to the vehicle.” Generating an estimate of the second real-world measurement without providing the machine learning model the second one or more images themselves is taught as a machine learning object classifier used to generate a classification for the sensor measurement data which defines a type of object. The applicant's specification recites in paragraph [0034] that metadata can be sensor data: “In an embodiment, metadata associated with a digital image includes sensor data from the image capture device that captured the digital image.” The examiner notes that the sensor data is interpreted as metadata; images are not input, and instead the sensor data itself is used.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kottenstette, Farooqi and Wang with the sensor data of Mercep in order to allow utilizing a classification system which classifies sensor measurement data, thereby including functionality in which an object model can identify an object based on various poses, orientations, transitional states, potential deformations for the poses or orientations, textural features, or the like (Mercep: Paragraph [0008] “The object model can include matchable data for a certain object type, which can have various poses, orientations, transitional states, potential deformations for the poses or orientations, textural features, or the like, to be compared against the matchable representation.”).

Claim 7 is similarly rejected; refer to claim 19 for further analysis.

Regarding claim 32, Kottenstette in view of Farooqi, Wang and Mercep teach the system of Claim 19, Farooqi further teaches wherein: the dataset consists of the metadata describing the second one or more images of the second real-world structure (Farooqi: Paragraph [0043] “extracts features from the image data which can, for example, include metadata characterizing the image. In some cases, the metadata is included as part of the image data while, in other implementations, the metadata can be stored separately (or derived separately from the optical sensor(s) that generated the image data). For example, the metadata can include measurements of an object within the bounding polygon such as, for example, length, height, depth, world-coordinates (3-D coordinates), average color, size and shape, time of day of image capture, and the like.” Comprising metadata describing second one or more images of the second real-world structure is taught as extracting the metadata from the images which include measurements of an object.);…

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the machine learning object detection system of Kottenstette with the features and metadata of Farooqi in order to allow extracting features from the image data which can include metadata characterizing the image, thereby reducing the n class problem to an easier problem because the metadata can be used for pre-classification (Farooqi: Paragraph [0050] “The meta data is used here as pre-classification and reduces the n class problem to an easier problem because each machine learning model is only responsible for a subset of these n classes.”).

Mercep further teaches and said using the machine learning model to estimate the second real-world measurement of the second feature of the second real-world structure is based exclusively on the dataset (Mercep: Paragraph [0008] “a machine learning object classifier to generate a classification for the sensor measurement data corresponding to the detection events, and optionally a confidence level associated with the generated classification. The computing system can input a matchable representation of the sensor measurement data into the machine learning object classifier, which can compare the matchable representation of the sensor measurement data to at least one object model describing a type of object capable of being located proximate to the vehicle.” Using the machine learning model to estimate the second real-world measurement of the second feature of the second real-world structure based exclusively on the dataset is taught as a machine learning object classifier used to generate a classification for the sensor measurement data which defines a type of object. The applicant's specification recites in paragraph [0034] that metadata can be sensor data: “In an embodiment, metadata associated with a digital image includes sensor data from the image capture device that captured the digital image.” The examiner notes that the sensor data is interpreted as metadata; images are not input, and instead the sensor data itself is used. Therefore, the classification is based exclusively on the sensor dataset [i.e. metadata]).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kottenstette, Farooqi and Wang with the sensor data of Mercep in order to allow utilizing a classification system which classifies sensor measurement data, thereby including functionality in which an object model can identify an object based on various poses, orientations, transitional states, potential deformations for the poses or orientations, textural features, or the like (Mercep: Paragraph [0008] “The object model can include matchable data for a certain object type, which can have various poses, orientations, transitional states, potential deformations for the poses or orientations, textural features, or the like, to be compared against the matchable representation.”).

Claim 28 is similarly rejected; refer to claim 32 for further analysis.

Claims 2 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Kottenstette (U.S. 2017/0076438) in view of Farooqi (U.S. 2018/0150713), Wang (U.S. 2007/0269102) and Xiong (U.S. 2017/0024642).
Regarding claim 14, Kottenstette in view of Farooqi and Wang teaches the system of claim 13, Kottenstette further teaches wherein estimating the particular real-world measurement (Kottenstette: Paragraph [0182] “Height is taken to be relative to local environment (e.g., the height of buildings above the surface of neighbors). With one or more images of the same area, an above ground height model (AGHM) image can be generated using a classifier, where every pixel of the AGHM image can be assigned to a height above ground. The classifier can estimate elevations or heights of geographic regions and/or objects in images by direct or relative prediction without performing feature search or extraction.” Estimating real-world measurements of features of real-world structures is taught as the classifier estimating elevations or heights of geographic regions and/or objects in images [the height of buildings is taken as the real-world measurement of real-world structures].)…

Xiong further teaches … is based on a polynomial regression (Xiong: Paragraph [0058] “polynomial regression” The polynomial regression is used for image classification in the prior art.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the methods for analyzing remote sensing imagery of Kottenstette, Farooqi and Wang with the polynomial regression of Xiong. Doing so would allow utilizing polynomial regression for image classification in order to yield improved accuracy and a substantial improvement (Xiong: Paragraph [0004]).

Claim 2 is similarly rejected; refer to claim 14 for further analysis.

Claims 4, 5, 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Kottenstette (U.S. 2017/0076438) in view of Farooqi (U.S. 2018/0150713), Wang (U.S. 2007/0269102) and Hovden (U.S. 2018/0365496).
Regarding claim 16, Kottenstette in view of Farooqi and Wang teaches the system of claim 13 wherein the particular digital image set includes an orthographic photo (Kottenstette: Paragraph [0211] “FIG. 13 illustrates example images in a feature space 1302 and a label space 1304. Feature space 1302 includes two input images—Input1 and Input2. Label space 1304 includes one input image, which is an AGHM image. An extractor 1306 can take these three images as input, in addition to other images, for training and creating a regression model that is configured to determine the heights of different regions in an image.” The example input images in Figure 13 are orthographic photos taken from above at an aerial view.) and…
Hovden further teaches one or more lateral photos (Hovden: Figure 2. One or more lateral photos is taught by the image in Figure 2. The photo is taken near ground level aiming at objects in the structure.). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kottenstette, Farooqi and Wang with the lateral images of Hovden. Doing so would allow utilizing lateral images in order to construct a 3D model of “house X” (Hovden: Paragraph [0070]).

Claim 4 is similarly rejected; refer to claim 16 for further analysis.

Regarding claim 17, Kottenstette in view of Farooqi and Wang teaches the system of claim 13. Kottenstette does not explicitly disclose wherein the particular digital image set includes one or more lateral photos and no orthographic photos.
Hovden further teaches wherein the particular digital image set includes one or more lateral photos and no orthographic photos (Hovden: Figure 3 shows a set of lateral images of a kitchen without any orthographic photos.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Kottenstette, Farooqi and Wang with the lateral images of Hovden. Doing so would allow utilizing lateral images in order to construct a 3D model of “house X” (Hovden: Paragraph [0070]).

Claim 5 is similarly rejected; refer to claim 17 for further analysis.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AHSIF A. SHEIKH whose telephone number is (571)272-2607.  The examiner can normally be reached on Mon-Fri 7:30-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached on 571-270-3428.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to 

/A.A.S./Examiner, Art Unit 2123                                                                                                                                                                                                        
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123