Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 16-18, 20-22 and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Yu et al.: ("LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop.") and further in view of Xu et al.: ("Show, attend and tell: Neural image caption generation with visual attention.").

Regarding claim 16, Yu et al. teaches a computer-implemented method for constructing a model in a neural network for object detection in an unprocessed image (Yu [Title and Abstract] To overcome the bottleneck of human labeling speed during dataset construction, we propose to amplify human effort using deep learning with humans in the loop. Our procedure comes equipped with precision and recall guarantees to ensure labeling quality, reaching the same level of performance as fully manual annotation. To demonstrate the power of our annotation procedure and enable further progress of visual recognition, we construct a scene-centric database called “LSUN” containing millions of labeled images in each scene category.), the construction of the model being performed based on at least one image training batch (Yu [Section 3, lines 4-5] As shown in Figure 1, we continuously train new models based on the labels in the set of images (i.e. image training batch) we want to separate), and the neural network configured with a set of specifications (Yu [Figure 1] These small number of labeled images are utilized to train a binary classifier with deep learning feature (3). Then, we run the binary classifier on the whole unlabeled images.), the method comprising:
establishing at least one image training batch which comprises at least one training image that includes one or more objects, wherein an individual object of the objects is a member of an object class (Yu [Section 2, lines 1-5] To build a large-scale image database, we have to acquire a large number of images for each scene category from the Internet to start with. As with the previous works, we turn to existing image search engine to avoid crawling the Internet by ourselves. Currently, all of our image URLs are from Google Images search. We use a technique based on scene adjective queries and synonyms to overcome image limit of each search query and we are able to obtain nearly 100 million URLs to relevant images for each query); 

with a graphical user interface (GUI), displaying a training image from the image training batch (Yu [Section 4.1, lines 1-2 and Figure 2] Built on top of the labeling interface from [19], our labeling interface is to let the human annotators to go through the images one by one.);

iteratively performing (Yu [Figure 1] The overview of our pipeline…The pipeline runs iteratively until the number of images in unknown set is affordable for exhaustive labeling.): 

a) annotating the one or more objects in the training image via a user interaction so as to generate individually annotated one or more objects (Yu [Section 4, lines 1-2; Figure 2] A critical part of our deep learning with humans in the loop annotation pipeline is to harvest high quality annotation from human annotators (i.e. user interaction)), 

b) associating an annotation with the object class for the annotated one or more objects in the training image via the user interaction (Yu [Section 4.1 and Figure 2] Built on top of the labeling interface from [19], our labeling interface is to let the human annotators to go through the images one by one. As shown in Figure 2, on the top of a screen, with a question like “Is this a kitchen?”, a single image at a time is shown at the center, with small thumbnails for previous three images on the left and upcoming three images on the right. The worker can press the left arrow key on their keyboard to go through the images, with default answer of each image set to “No” when the image is viewed. The worker is required to press the space key on the keyboard to toggle the answer if the current answer is wrong, encoded by the color of the boundary box.)

c) returning a user annotated image training dataset comprising at least one training image with the annotated one or more objects, each individual one of the one or more annotated objects being associated with the object class (Yu [Figure 1 and Figure 2] The overview of our pipeline. To annotate a large number of unlabeled images, we first randomly sample a small portion (1), and label them on Amazon Mechanical Turk using our interface (2). These small number of labeled images are utilized to train a binary classifier with deep learning feature (3). Then, we run the binary classifier on the whole unlabeled images. The images with high or low scores are labeled as positive or negative automatically, while the images ambiguous to the classifier are fed into the next iteration as the unlabeled images (4). The pipeline runs iteratively until the number of images in unknown set is affordable for exhaustive labeling.);
 
d) generating the model by training one or more collective model variables in the neural network to classify the individual annotated one or more objects as a member of the object class wherein, the model, together with the set of specifications when implemented in the neural network, is configured to effectuate the object detection in the unprocessed image with the particular probability of the object detection (Yu [Section 3, lines 4-9; Figure 1] As shown in Figure 1, we continuously train new models based on the labels in the set of images we want to separate. We then use statistical tests to take out of the images that the models are sure about for certain quality goal. Then new images are labeled from the remaining image set to train better models for it, and the iterations continue. The quality we seek for each split is to keep the ratio of positive images higher than 95% and to lose less than 1% recall among the negative images).
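For illustration only, the quality-controlled split described in the passage of Yu cited in step (d) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Yu's implementation: the function names `pick_thresholds` and `split_by_score` are hypothetical, the classifier is represented only by its scores, and a simple sweep over a human-labeled sample stands in for the statistical tests the reference mentions.

```python
from typing import List, Sequence, Tuple


def pick_thresholds(
    scores: Sequence[float],
    labels: Sequence[bool],
    min_precision: float = 0.95,
    max_recall_loss: float = 0.01,
) -> Tuple[float, float]:
    """Choose (high, low) score cut-offs from a human-labeled sample.

    high: lowest cut-off whose accepted set (score >= high) has
          precision >= min_precision.
    low:  highest cut-off whose rejected set (score < low) holds at most
          max_recall_loss of all positives in the sample.
    """
    candidates = sorted(set(scores))
    total_pos = sum(1 for y in labels if y)

    high = float("inf")  # accept nothing if the precision goal is never met
    for t in candidates:  # ascending: first hit is the lowest valid cut-off
        accepted = [y for s, y in zip(scores, labels) if s >= t]
        if sum(accepted) / len(accepted) >= min_precision:
            high = t
            break

    low = float("-inf")  # reject nothing if no cut-off is safe
    for t in reversed(candidates):  # descending: first hit is the highest
        lost = sum(1 for s, y in zip(scores, labels) if y and s < t)
        if lost <= max_recall_loss * total_pos:
            low = t
            break
    return high, low


def split_by_score(
    scores: Sequence[float], high: float, low: float
) -> Tuple[List[int], List[int], List[int]]:
    """Return indices auto-labeled positive, auto-labeled negative, unknown."""
    positive, negative, unknown = [], [], []
    for i, s in enumerate(scores):
        if s >= high:
            positive.append(i)
        elif s < low:
            negative.append(i)
        else:
            unknown.append(i)  # ambiguous: goes to the next labeling round
    return positive, negative, unknown
```

In this sketch, images scoring between the two cut-offs remain in the unknown set, which corresponds to the images the reference feeds into the next iteration.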

	Yu however fails to disclose performing an intelligent augmentation which includes processing the annotated objects in the training image, and providing each resulting augmented one of the annotated objects a weighting for a particular probability of an occurrence.

Xu et al. teaches the computer-implemented method further comprising performing an intelligent augmentation which includes processing the annotated objects in the training image, and providing each resulting augmented one of the annotated objects a weighting for a particular probability of an occurrence (Xu [Section 3.1.2, Page 3, Col. 1, lines 1-5, Page 4, Col. 1 and Equation 2] We use a long short-term memory (LSTM) network…In this work, we use a deep output layer (Pascanu et al., 2014) to compute the output word (i.e. annotated object) probability. Its input are cues from the image (the context vector), the previously generated word (i.e. annotated object), and the decoder state (h_t) where L_o ∈ R^(K×m), L_h ∈ R^(m×n), L_z ∈ R^(m×D), and E are learned parameters initialized randomly). 
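For reference, the deep output layer relied upon in the Xu citation can be written out as follows. This is a transcription of the output-word probability as it appears in Xu et al., offered only as a reading aid: y_t is the word generated at step t, a is the set of image annotation vectors, ẑ_t is the context vector, and h_t is the decoder state. The equation number should be confirmed against the cited copy of the reference.

```latex
p(\mathbf{y}_t \mid \mathbf{a}, \mathbf{y}_1^{t-1}) \propto
  \exp\!\left( \mathbf{L}_o \left( \mathbf{E}\,\mathbf{y}_{t-1}
  + \mathbf{L}_h \mathbf{h}_t + \mathbf{L}_z \hat{\mathbf{z}}_t \right) \right),
\qquad
\mathbf{L}_o \in \mathbb{R}^{K \times m},\;
\mathbf{L}_h \in \mathbb{R}^{m \times n},\;
\mathbf{L}_z \in \mathbb{R}^{m \times D}
```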
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Yu et al. with those of Xu et al. in order to allow for weighting for a particular probability of an occurrence, as both references are from the same field of endeavor, which is constructing a model in a neural network for classifying an object in an image. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having both the teachings of Yu et al. and Xu et al., to assign a weighting to known objects in the neural network to calculate the probability of occurrence. It is known to combine prior art elements according to known methods to yield predictable results. Therefore, it would have been obvious to combine Yu et al. with Xu et al. to obtain the invention as specified in the instant application claim.

Regarding claim 17, Yu et al. in view of Xu et al. teaches the computer-implemented method according to claim 16, further comprising iteratively performing: 
e) displaying the training image comprising one or more machine marked objects associated with a machine performed classification of the one or more individual objects, modifying at least one of a machine object marking or a machine object classification (Yu [Figure 1; Figure 2 and Figure 3] Our AMT Labeling Interface… Life time of a HIT on AMT. Then, we run the binary classifier on the whole unlabeled images. The images with high or low scores are labeled as positive or negative automatically, while the images ambiguous to the classifier are fed into the next iteration as the unlabeled images), and 
f) evaluating a level of the training of the collective model variables for terminating the training of the model (Yu [Section 3.5, lines 8-12] We use 20,000 of the labels as testing set to determine the threshold for 95% precision and 99% recall. We repeat the same process for a few more iterations, until the number of images in the unknown set is less than 300,000. However, if the fine tuning can still separate the unknown set efficiently, we will keep splitting. In the end, we will just manually label all the unknown images).
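The termination condition cited for step (f) can be illustrated with a short sketch. It assumes a hypothetical callable `split_once` that performs one round of training and splitting and returns the auto-labeled positives, the auto-labeled negatives, and the images still unknown; the name `run_until_affordable` and the no-progress check are likewise illustrative assumptions, not part of either reference.

```python
from typing import Callable, List, Tuple

Split = Tuple[List[int], List[int], List[int]]  # positive, negative, unknown


def run_until_affordable(
    unknown: List[int],
    split_once: Callable[[List[int]], Split],
    affordable: int = 300_000,
) -> Tuple[List[int], List[int], List[int], int]:
    """Repeat split rounds until the unknown set is small enough to label by hand."""
    positives: List[int] = []
    negatives: List[int] = []
    rounds = 0
    while len(unknown) >= affordable:
        pos, neg, rest = split_once(unknown)
        if len(rest) == len(unknown):
            break  # nothing was separated this round; stop iterating
        positives += pos
        negatives += neg
        unknown = rest
        rounds += 1
    # Whatever remains in `unknown` is left for exhaustive manual labeling.
    return positives, negatives, unknown, rounds


def quarter_split(items: List[int]) -> Split:
    """Toy stand-in for a training round: resolves half of the items."""
    q = len(items) // 4
    return items[:q], items[q:2 * q], items[2 * q:]
```

With the toy `quarter_split`, calling `run_until_affordable(list(range(16)), quarter_split, affordable=5)` stops after two rounds with four items left for manual labeling.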

Regarding claim 18, Yu et al. in view of Xu et al. teaches the computer-implemented method according to claim 16, wherein the sub steps (a)-(c) are performed iteratively before subsequently performing the sub step (d) (Yu [Figure 1] The overview of our pipeline. To annotate a large number of unlabeled images, we first randomly sample a small portion (1), and label them on Amazon Mechanical Turk using our interface (2). These small number of labeled images are utilized to train a binary classifier with deep learning feature (3). Then, we run the binary classifier on the whole unlabeled images. The images with high or low scores are labeled as positive or negative automatically, while the images ambiguous to the classifier are fed into the next iteration as the unlabeled images (4). The pipeline runs iteratively until the number of images in unknown set is affordable for exhaustive labeling.).

Regarding claim 20, Yu et al. in view of Xu et al. teaches the computer-implemented method according to claim 16, further comprising establishing at least one image verification batch for testing the generated model with a subsequent generated model that is generated after a subsequent training by comparing the particular probability of the object detection reached with the generated models (Yu [Table 1; Section 4.2, line 4 and Section 5.1, lines 1-3] Given a set of images that need to be labeled… The statistics of kitchen is shown in Table 1 as an example. The third column shows the cumulative number of positive images in each iteration. The fifth column shows the cumulative number of positive labels provided by human. The last column shows the ratio of human labeled images compared to all the positive images we obtained. We use the testing set to calculate the sampled positive image ratio, which is shown in the fourth column).

Regarding claim 21, Yu et al. in view of Xu et al. teaches the computer-implemented method according to claim 16, further comprising utilizing an accuracy by which the object detection is performed for at least one of (i) evaluating the generated model or a use of the neural network specifications, and for evaluating a use of a simpler model or a simpler neural network for reducing a complexity of the model, or (ii) reducing the specifications (Yu [Section 3.5, lines 7-9 and Figure 3] We use 20,000 of the labels as testing set to determine the threshold for 95% precision and 99% recall. To test our labeling pipeline and justify the need of bigger dataset for training, we have constructed an initial version of LSUN with millions of labeled images in each scene category. We experiment with popular deep nets using our dataset, and obtain a significant performance gain with the same model but trained using our bigger training set.).

Regarding claim 22, Yu et al. in view of Xu et al. teaches the computer-implemented method further comprising utilizing an accuracy by which the object detection is performed for evaluating an accuracy of the object detection of the generated model (Yu [Section 3.5, lines 4-11] Before each iteration of fine tuning, we sample 100,000 images and get the labels. However, we don’t label new images at the first fine tuning, because the existing labels can train a better features than the those pretrained. We use 20,000 of the labels as testing set to determine the threshold for 95% precision and 99% recall. We repeat the same process for a few more iterations, until the number of images in the unknown set is less than 300,000. However, if the fine tuning can still separate the unknown set efficiently, we will keep splitting. In the end, we will just manually label all the unknown images.), and for evaluating a use of a reduced image training batch (Yu [Section 3.5, lines 2-4] Given a unknown set that is hard to separate with the pretrained features on PLACES, we fine tune the deep learning model trained by PLACES to do binary classification using the labels in the unknown set.).

Regarding claim 27, Yu et al. in view of Xu et al. teaches the computer-implemented method according to claim 16, wherein navigation in the image training batch is performed using a computer-implemented navigation tool which facilitates: - a navigation by an image management procedure, and - a status on a progression of evaluating the image training batch (Yu [Figure 2] Our AMT Labeling Interface).

Claims 23-26 and 28-31 are rejected under 35 U.S.C. 103 as being unpatentable over Yu et al.: ("LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop.") in view of Xu et al.: ("Show, attend and tell: Neural image caption generation with visual attention.") and further in view of Cheriyadat et al. (US2015/001668 A1).

Regarding claim 23, Yu et al. in view of Xu et al. teaches all of the elements of the current invention stated in the computer-implemented method according to claim 16 (see claim 16 above) except wherein the annotating of the one or more objects is performed by an area-selection of the training image comprising an object-segmentation or a pixel-segmentation of the one or more objects.
Cheriyadat et al. teaches wherein the annotating of the one or more objects is performed by an area-selection of the training image comprising an object-segmentation or a pixel-segmentation of the one or more objects (Cheriyadat [Figure 7 and Figure 9] FIG. 7 is a graphical user interface showing user identified areas used in training a settlement mapping system…FIG. 9 is a settlement extraction process.) 
Yu et al. in view of Xu et al. and Cheriyadat et al. are all analogous to the art because they are in the same field of endeavor of constructing a model in a neural network for object detection in images. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having both the teachings of Yu et al. in view of Xu et al. and Cheriyadat et al. to provide multiple ways of annotating one or more objects in a training image when constructing a model based on images. Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art. 

Regarding claim 24, Yu et al. in view of Xu et al. teaches all of the elements of the current invention stated in the computer-implemented method according to claim 16 (see claim 16 above) except where annotating of the one or more objects is performed with a computer-implemented annotation tool configured with a zoom-function so as to at least one of:  provide an area-selection interface for an adjustable area-selection of the one or more objects in the training image via the user interaction.
Cheriyadat et al. teaches a graphical user interface (GUI), where the annotating of the one or more objects is performed with a computer-implemented annotation tool configured with a zoom-function (Cheriyadat [0028, lines 12-19] Zoom objects (shown as magnifiers) and a pan object (represented as a hand) are also rendered near the top of the window on the display. The zoom object allows a user to enlarge a selected portion of the image to fill the window on the screen. It may allow a user to detect, label or train enlarged portions of an image to render a finer or greater level of detail discrimination when identifying settlements. The pan object allows the user to move across the image parallel to the current view pane. In other words, the view rendered on the display moves perpendicular to the direction it is pointed with the direction not changing.) so as to at least one of: - provide an area-selection interface for an adjustable area-selection of the one or more objects in the training image via the user interaction (Cheriyadat [0028, lines 12-19 and Figure 4], the same passage as quoted for the zoom-function) 
or provide a pixel-segmentation interface for a pixel-segmentation of the one or more objects in the training image via the user interaction, wherein the pixel-segmentation is configured to pre-segment pixels by grouping the pixels similar to a small selection of the pixels chosen via the user interaction, wherein the annotation tool is configured to transform the annotation from the pixel-segmentation of the one or more objects into the area-selection of the one or more objects in the training image (Cheriyadat [0029, lines 1-9; 0031, lines 4-8; Figure 7; Figure 9 and Figure 13] Activating the detect graphic object (or element) under the classification function activates the settlement extraction engine on the loaded image. The extraction engine may manage and execute programs and functions including those programmed and linked to text objects in the pull-down menu adjacent to the detect object. On a large image, the settlement extraction engine may operate in block mode. The spacing of the edges of a selected image object, the relationship of the edges of the image object to surrounding materials or other image objects, the co-occurrence distribution of the image, etc., for example, may allow the extraction engine to identify discrete settlement structures within images as shown in the detections highlighted in Figures 2, 3, 5, 6 and 8.).
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to try modifying Yu et al. in view of Xu et al. to incorporate the teachings of Cheriyadat et al., so that the graphical user interface for annotation includes tools that help the user manipulate a training image for better viewing and offers multiple ways for the user to annotate objects in the training image.

Regarding claim 25, Yu et al. in view of Xu et al. in further view of Cheriyadat et al. teach the computer-implemented method of claim 24 and wherein the computer-implemented annotation tool facilitates at least one of: - a color-overlay annotation, wherein a color is associated with an object classification that is associated with the annotation (Yu [Section 4.1, lines 6-7 and Figure 2] The worker is required to press the space key on the keyboard to toggle the answer if the current answer is wrong, encoded by the color of the boundary box.) or - a re-classification of at least one of the one or more individual annotated objects or machine marked objects (Cheriyadat [Figure 2; Figure 3; 0009 and 0010] FIG. 2 is a graphical user interface displaying an automated detected settlement… FIG. 3 is a graphical user interface displaying a second automated detected settlement), wherein the annotation tool is configured to show all annotations and machine marks associated with an object class in the at least one training image (Cheriyadat [Figure 2; Figure 3; 0009 and 0010] FIG. 2 is a graphical user interface displaying an automated detected settlement… FIG. 3 is a graphical user interface displaying a second automated detected settlement). 

Regarding claim 26, Yu et al. in view of Xu et al. in further view of Cheriyadat et al. teach the computer-implemented method of claim 25 and where the computer-implemented annotation tool further provides a history of the performed annotation (Cheriyadat [0028, lines 5-7; Figure 1] A log window rendered on the graphical user display records and displays the actions executed by the user and the settlement mapping system. The log window also provides relevant image information including image dimensions, number of bands, and the bit depth. The output file names and locations may also be displayed through the log window.) 

Regarding claim 28, Yu et al. in view of Xu et al. teaches the method of claim 16 but fails to disclose wherein the at least one image training batch is collected using an airborne vehicle. 
Cheriyadat et al. discloses a settlement mapping system tool that detects, maps, and characterizes land use by analyzing high resolution bitmapped satellite and aerial images through a graphical user interface (Cheriyadat [0026, lines 1-6] This disclosure introduces technology that analyses high resolution bitmapped, satellite, and aerial images. It discloses a settlement mapping system (settlement mapping system/tool or SMTool) that automatically detects, maps, and characterizes land use. The system includes a settlement extraction engine and a settlement characterization engine. The settlement extraction engine identifies settlement regions from high resolution satellite and aerial images through a graphic element.). 
Before the effective filing date of the invention, it would have been obvious to one of ordinary skill in the art to combine Yu et al. in view of Xu et al. and Cheriyadat et al. to train on aerial images when constructing a model in a neural network for object detection. Doing so is known, especially within the area of geographic information system mapping. Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art. 

Regarding claim 29, Yu et al. teaches a computer-implemented method provided in a neural network for an object detection in an unprocessed image having a particular probability of the object detection (Yu [Abstract] To overcome the bottleneck of human labeling speed during dataset construction, we propose to amplify human effort using deep learning with humans in the loop. Our procedure comes equipped with precision and recall guarantees to ensure labeling quality, reaching the same level of performance as fully manual annotation. To demonstrate the power of our annotation procedure and enable further progress of visual recognition, we construct a scene-centric database called “LSUN” containing millions of labeled images in each scene category.), the method comprising: 

providing a generated model, the generation of the model being performed based on at least one image training batch, and the neural network configured with a set of specifications, comprising: - establishing at least one image training batch which comprises at least one training image that includes one or more objects, wherein an individual object of the objects is a member of an object class (Yu [Section 2, lines 1-2 and Figure 4] To build a large-scale image database, we have to acquire a large number of images for each scene category from the Internet to start with.); 

- with a graphical user interface (GUI), displaying a training image from the image training batch (Yu [Figure 2] Our AMT Interface); and 

- iteratively performing: (Yu [Figure 1] The overview of our pipeline…The pipeline runs iteratively until the number of images in unknown set is affordable for exhaustive labeling.)

a) annotating the one or more objects in the training image via a user interaction so as to generate individually annotated one or more objects (Yu [Figure 1 and Figure 2] The overview of our pipeline. To annotate a large number of unlabeled images, we first randomly sample a small portion (1), and label them on Amazon Mechanical Turk using our interface (2). These small number of labeled images are utilized to train a binary classifier with deep learning feature (3). Then, we run the binary classifier on the whole unlabeled images. The images with high or low scores are labeled as positive or negative automatically, while the images ambiguous to the classifier are fed into the next iteration as the unlabeled images (4). The pipeline runs iteratively until the number of images in unknown set is affordable for exhaustive labeling.), 

b) associating an annotation with the object class for the annotated one or more objects in the training image via the user interaction (Yu [Figure 2] Our AMT Labeling Interface), 

c) returning a user annotated image training dataset comprising the at least one training image with the annotated one or more objects, each individual one of the annotated one or more annotated objects associated with the object class (Yu [Figure 1 and Figure 2] Our AMT Labeling Interface)

d) generating the model by training one or more collective model variables in the neural network to classify the individual annotated one or more objects as a member of the object class, wherein, the model, together with the set of specifications when implemented in the neural network, is configured to effectuate the object detection in the unprocessed image with a particular probability of the object detection; establishing at least one unprocessed image batch that comprises at least one unprocessed image to be subject for the object detection (Yu [Figure 1] The overview of our pipeline. To annotate a large number of unlabeled images, we first randomly sample a small portion (1), and label them on Amazon Mechanical Turk using our interface (2). These small number of labeled images are utilized to train a binary classifier with deep learning feature (3). Then, we run the binary classifier on the whole unlabeled images.).

While Yu et al. does teach a GUI displaying one or more unprocessed images (Yu [Figure 2] Our AMT Labeling Interface), Yu et al. does not teach the GUI with a set of marked objects, each one of the marked objects being associated with the object class; performing the object detection in an unprocessed image; returning the unprocessed image with the set of marked objects, each of the marked objects being associated with the object class; and performing an intelligent augmentation which includes processing the annotated objects in the training image, and providing each resulting augmented one of the annotated objects a weighting for a particular probability of an occurrence.

Xu et al. teaches the computer-implemented method further comprising performing an intelligent augmentation which includes processing the annotated objects in the training image, and providing each resulting augmented one of the annotated objects a weighting for a particular probability of an occurrence (Xu [Section 3.1.2, Page 3, Col. 1, lines 1-5, Page 4, Col. 1 and Equation 2] We use a long short-term memory (LSTM) network…In this work, we use a deep output layer (Pascanu et al., 2014) to compute the output word (i.e. annotated object) probability. Its input are cues from the image (the context vector), the previously generated word (i.e. annotated object), and the decoder state (h_t) where L_o ∈ R^(K×m), L_h ∈ R^(m×n), L_z ∈ R^(m×D), and E are learned parameters initialized randomly). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Yu et al. with those of Xu et al. in order to allow for weighting for a particular probability of an occurrence, as both references are from the same field of endeavor, which is constructing a model in a neural network for classifying an object in an image. Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art, having both the teachings of Yu et al. and Xu et al., to assign a weighting to known objects in the neural network to calculate the probability of occurrence. It is known to combine prior art elements according to known methods to yield predictable results. Therefore, it would have been obvious to combine Yu et al. with Xu et al. to obtain the invention as specified in the instant application claim.

However, Yu et al. in view of Xu et al. does not disclose a GUI with a set of marked objects, each one of the marked objects being associated with the object class; performing the object detection in an unprocessed image; returning the unprocessed image with the set of marked objects, each of the marked objects being associated with the object class.

Cheriyadat et al. teaches the GUI displaying one or more unprocessed images with a set of marked objects, each one of the marked objects being associated with the object class (Cheriyadat [0006; 0007; 0008; 0011; Figure 2; Figure 3 and Figure 6] Figure 1 is a graphical user interface displaying a high-resolution image. Figure 2 is a graphical user interface displaying an automated detected settlement. Figure 6 is a graphical user interface displaying an automated detection of two settlements within Figure 4.), and performing the object detection in an unprocessed image and returning the unprocessed image with the set of marked objects, each of the marked objects being associated with the object class (Cheriyadat [0010, lines 1-2; 0011, lines 1-2; Figure 5 and Figure 6] Figure 5 is a graphical user interface displaying an automated detection of one settlement within Figure 4. Figure 6 is a graphical user interface displaying an automated detection of two settlements within Figure 4.). 
Before the effective filing date of the claimed invention it would have been obvious to a person of ordinary skill in the art to combine the teachings of Yu et al. in view of Xu et al. and Cheriyadat et al. to include a graphical user interface to display the image being processed and see the detected objects and their respective object class. Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art. The GUI configured in this way

Regarding claim 30, Yu et al. in view of Xu et al. in view of Cheriyadat et al. disclose the method of claim 29, further comprising providing access to the neural network for further training of one or more collective model variables of the model, such that the model is subject to an improved accuracy of the object detection (Yu [Section 3.5, lines 1-11] After one to two iterations of splitting based on pretrained deep learning features and SVM models, the difficulty of further splitting the remaining unknown image set increases. Given an unknown set that is hard to separate with the pretrained features on PLACES, we fine tune the deep learning model trained by PLACES to do binary classification using the labels in the unknown set. Before each iteration of fine tuning, we sample 100,000 images and get the labels. However, we don’t label new images at the first fine tuning, because the existing labels can train a better features than the those pretrained. We use 20,000 of the labels as testing set to determine the threshold for 95% precision and 99% recall. We repeat the same process for a few more iterations, until the number of images in the unknown set is less than 300,000. However, if the fine tuning can still separate the unknown set efficiently, we will keep splitting. In the end, we will just manually label all the unknown images.).
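The threshold step in the passage quoted from Yu Section 3.5 can be illustrated with a minimal sketch. This is not code from Yu et al.: the function name threshold_for_precision, the scores, and the labels are hypothetical, and the sketch assumes a binary classifier that outputs one score per held-out labeled image, with higher scores meaning "more likely positive". It shows only how a score cutoff might be chosen on a labeled test set so that the images at or above the cutoff reach a target precision.

```python
def threshold_for_precision(scores, labels, target=0.95):
    """Return the lowest score cutoff whose precision on the labeled set
    is at least `target`, or None if no cutoff reaches it.

    scores: hypothetical classifier scores (higher = more likely positive)
    labels: 1 for positive, 0 for negative, in the same order as scores
    """
    # Visit examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    best = None
    true_pos = 0
    for i, (score, label) in enumerate(ranked):
        true_pos += label
        # Evaluate only at the end of a run of tied scores, so that a
        # cutoff at `score` includes every example sharing that score.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie and true_pos / (i + 1) >= target:
            best = score  # later (lower) cutoffs overwrite earlier ones
    return best
```

A recall target could be handled the same way by also tracking true positives against the total number of positives; that extension is omitted here to keep the sketch short.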

Regarding claim 31, Yu et al. in view of Xu et al. in view of Cheriyadat et al. disclose the method of claim 29, wherein the unprocessed image is collected using an airborne vehicle (Cheriyadat [0026, lines 1-6] This disclosure introduces technology that analyses high resolution bitmapped, satellite, and aerial images. It discloses a settlement mapping system (settlement mapping system/tool or SMTool) that automatically detects, maps, and characterizes land use. The system includes a settlement extraction engine and a settlement characterization engine. The settlement extraction engine identifies settlement regions from high resolution satellite and aerial images through a graphic element. Aerial imagery refers to all imagery taken from an airborne craft. Cheriyadat therefore provides an image training batch consisting of images collected from an airborne vehicle.).

Response to Arguments
Applicant's arguments filed 14 April 2022 have been fully considered but they are not persuasive. 
The examiner withdraws the objections to the drawings and the rejection under 35 USC 112b for indefiniteness.
The applicant argues that the rejection under 35 USC 103 should be withdrawn because Xu does not disclose calculation of a particular probability of an occurrence in an image. Furthermore, the applicant argues that Xu bases the next word in the image caption on the previously generated words. This varies from the claimed invention in that the annotated object is provided with a weighting for a particular probability of an occurrence, completely independent of other objects in the image.
The examiner respectfully traverses the applicant’s argument, as the prior art does indeed disclose a weighting for objects in a neural network setup and discloses a probability of a word occurrence in Xu Section 3.1.2, Page 3, Col. 1, lines 1-5, Page 4, Col. 1 and Equation 2. The argument that the prior art probability is based on the prior words, while the instant application’s probability is not, does not remove Xu as prior art, as there is no limitation in the claims that would exclude the probability from being based on other words or objects. As such, the examiner maintains the rejection.
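The distinction at issue, a probability of occurrence that depends on previously generated words, can be illustrated with a toy sketch. It is not Xu's model or Xu's Equation 2: the vocabulary, all scores, and the function name next_word_probs are hypothetical, and the additive combination of scores is an assumption made only for illustration. The sketch shows that a softmax over combined scores still yields a probability for each candidate word, and that this probability changes when the previous word changes.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_word_probs(image_scores, prev_word_scores):
    """Probability of each candidate word given image evidence and the
    previous word. The scores are simply added (illustrative assumption)."""
    combined = [a + b for a, b in zip(image_scores, prev_word_scores)]
    return softmax(combined)

VOCAB = ["dog", "ball", "grass"]   # hypothetical vocabulary
IMAGE_SCORES = [2.0, 1.0, 0.5]     # hypothetical image-derived scores
AFTER_A = [1.0, 1.0, 0.0]          # hypothetical scores when the previous word is "a"
AFTER_THE = [0.0, 0.0, 1.0]        # hypothetical scores when the previous word is "the"

probs_after_a = next_word_probs(IMAGE_SCORES, AFTER_A)
probs_after_the = next_word_probs(IMAGE_SCORES, AFTER_THE)
```

In both cases the output is a probability of occurrence for each word; only the conditioning differs, which is the point on which the claim language is silent.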

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAULINHO E SMITH whose telephone number is (571)270-1358. The examiner can normally be reached Mon-Fri. 10AM-6PM CST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached on 571-270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/PAULINHO E SMITH/Primary Examiner, Art Unit 2127