DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 21 January 2022 has been entered.

Response to Amendment
Applicant’s response, filed 21 January 2022, to the last office action has been entered and made of record. 
The cancellation of claim 8 is acknowledged and made of record.
The amendments to the claims are acknowledged; they are supported by the original disclosure, and no new matter is added.
Amendments to the independent claims 1, 6, and 13 have necessitated an updated ground of rejection over the applied prior art. Please see below for the updated interpretations and rejections.

Response to Arguments
Applicant's arguments filed 21 January 2022 have been fully considered but they are not persuasive.
In response to Applicant’s remarks on pp. 8-9 of Applicant’s reply, that the combined teachings of Lin, Childress, and Gallagher fail to teach or suggest the amended claimed subject matter of “prior to receiving the plurality of unlabeled images, generating, in a single batch, one or more first signatures for the labeled images” and “comparing said second signature to at least one of the one or more first signatures to determine if the second signature matches any one of the first signatures”, the Examiner respectfully disagrees.
Examiner notes the claims are given their broadest reasonable interpretation consistent with the specification. See MPEP 2111. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). Furthermore, the test for obviousness is what the combined teachings of the references would have suggested to those of ordinary skill in the art. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981).
Upon further review of the cited prior art references, Lin further teaches that in the training process for a classifier algorithm, semantic features are identified by analyzing the semantic content of the training images, where the training images include different associated tags (see Lin [0033] and [0036]). The trained classifier is used to automatically tag a received input image (see Lin [0044]), and the training process for the classifier algorithm is performed prior to executing the trained classifier algorithm to determine that an input image is semantically similar to an example tagged image (see Lin Fig. 5 and [0063]-[0068]). Thus, Lin suggests that the training process, which performs the semantic features analysis of training images, is performed prior to receiving the input images, and provides a teaching for the broadest reasonable interpretation of “prior to receiving the plurality of unlabeled images, generating, in a single batch, one or more first signatures for the labeled images”.
Additionally, Lin teaches that a received input image is provided to a trained classifier algorithm which generates a feature vector for the input image, where the feature vector representing the semantic features of the input image can be matched with the feature vector representing the semantic features of an example tagged image (see Lin [0044]-[0045]), and further suggests that the input image is one of a plurality of input images (see Lin [0066]). The trained classifier algorithm matches an input image to the example tagged images based on the corresponding feature vectors (see Lin [0069]). Thus, Lin further suggests generating feature vectors representing semantic features of an input image to be compared with feature vectors of example tagged images / training images to determine a matching example tagged image / training image, and provides a teaching for the broadest reasonable interpretation of “comparing said second signature to at least one of the one or more first signatures to determine if the second signature matches any one of the first signatures”.

Examiner further notes that originally filed specification paragraph [0028], referenced by Applicant as exemplary support for the amendments to claims 2, 7, and 14 (see p. 8 of Applicant’s reply), does not appear to relate to the amended claim subject matter of generating first and second signatures using different neural network technologies; rather, specification paragraph [0028] discloses that another neural network may be used to compare the first and second signatures.
However, originally filed specification paragraphs [0023]-[0027] disclose that the signature generation module preferably includes a neural network, and that the signature generation module would separately generate a first signature based on the labeled image and a second signature based on the unlabeled image. The disclosure at [0023]-[0027] is relied upon to provide support for the broadest reasonable interpretation of the amended claim term “different neural network technologies”.
CLAIM INTERPRETATION

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 

Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: “signature-generation module”, “execution module”, “comparison module”, and “labeling module” in claims 13-17 and 20.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.


Claim Objections
Claims 2 and 5 are objected to because of the following informalities:
Amended claim 2 recites “(support 0028)” after the claim ends, which the Examiner assumes is a typographical error.
Amended claim 5 contains a grammatical error; “wherein said labeled images [[is]]are from a first video sequence and said unlabeled images [[is]]are from a second video sequence” is assumed to be intended.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-6 and 10-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al. (US 2016/0379091), herein Lin, in view of Childress et al. (US 2018/0341854, effectively filed 26 May 2017), herein Childress, and Gallagher.
Regarding claim 1, Lin discloses a method for associating a plurality of unlabeled images with one or more labeled images, the method comprising: 
prior to receiving the plurality of unlabeled images, generating, in a single batch, one or more first signatures for the labeled images (see Lin [0033], where in a training process for a classifier algorithm, semantic features are identified by analyzing the semantic content of the training images; see Lin [0036], where the training images include different associated tags; see Lin [0044], where a trained classifier is used to automatically tag a received input image; and see Lin Fig. 5 and [0063]-[0068], where the training process for the classifier algorithm is performed prior to executing the trained classifier algorithm to determine that an input image is semantically similar to an example tagged image, suggesting that the training process, which performs the semantic features analysis of training images, is performed prior to receiving the input images);
for each current unlabeled image of the plurality of unlabeled images (see Lin [0044], where an input image is received and provided to a trained classifier algorithm; see also Lin [0066], where a plurality of input images is suggested):
generating a second signature based on said unlabeled image (see Lin [0045], where the trained classifier algorithm can generate a second feature vector for the input image; see also Lin [0069], where one or more image features vectors are generated for the input image); 
comparing said second signature to at least one of the one or more first signatures to determine if the second signature matches any one of the first signatures indicating that the at least one labeled images and the current unlabeled image are of the same (see Lin [0045], where the feature vector representing the semantic features for the input images can be matched with the feature vector representing the semantic features of an example tagged image; see also Lin [0069], where the trained classifier algorithm matches the input image to the example tagged images based on the feature vectors), wherein said second signature matches any one of the first signatures when a difference therebetween is within a margin of tolerance (see Lin [0071], where the distance between a feature vector for a first example tagged image and an input feature vector for the input image is less than a threshold distance used for determining similarity); and 
when the at least one labeled image and the current unlabeled image are determined to be of the same, applying one or more structural labels of the at least one labeled image to the current unlabeled image (see Lin [0073], where a tag is generated for the input image using tag content from the tagged image based on the semantic similarity with the tagged image).
Although Lin discloses that the tag used for associating with images includes locations associated with or depicted in images (see Lin [0028]), Lin does not explicitly disclose comparing said second signature to at least one of the one or more first signatures to determine if the second signature matches any one of the first signatures indicating that the at least one labeled images and the current unlabeled image are of the same location, and, when the at least one labeled image and the current unlabeled image are determined to be of the same location, applying one or more structural labels of the at least one labeled image to the current unlabeled image.
Childress teaches in a related and pertinent method and system for location tagging a piece of visual data using deep learning (see Childress Abstract), where untagged visual data is used to search for a corresponding location according to a dataset in a neural network and the visual data is tagged with the determined location (see Childress [0046]-[0048]).
At the time of filing, one of ordinary skill in the art would have found it obvious to apply Childress’s technique to the teachings of Lin such that a determined location tag is assigned to an untagged input image that is semantically similar to an example tagged image which is associated with the determined location tag. This modification is a use of a known technique to improve similar methods in the same way.
Although Lin suggests that the images include one or more structural labels, such as a “room” tag (see Lin [0049]), Lin and Childress do not explicitly disclose that said labeled images have one or more structural labels associated with respective structural features present therein.
Gallagher teaches in a related and pertinent method for grouping images captured in a common location (see Gallagher Abstract), where the background in images is taught to be made up of typically large-scale and immovable elements in images (see Gallagher [0025]), non-background elements from images are identified, detected, and removed, leaving the remaining image areas as the image background (see Gallagher [0025]-[0029]), and color and texture features are computed for the background regions of the images and used to cluster the images, where images that have similar backgrounds are likely to have been captured at the same location and a label may be provided to identify the location and used to tag all images in the cluster (see Gallagher [0031]-[0042]).
At the time of filing, one of ordinary skill in the art would have found it obvious to apply Gallagher’s technique to the teachings of Lin and Childress, such that example tagged images include background features that are associated with corresponding location tags. This modification is rationalized as an application of a known technique to a known method ready for improvement to yield predictable results. In this instance, Lin and Childress teach a base method for tagging input images based on semantically similar example tagged images, where a location tag is generated for an input image by using the location tag from a tagged image that is semantically similar to the input image, where the distance between a feature vector for an example tagged image and an input feature vector of the input image being less than a threshold distance is used for determining similarity. Gallagher teaches a known technique of detecting background regions of images, clustering images with similar background region features and associating the cluster with the same location, and providing a label to identify the location depicted by each cluster and tagging all images in that cluster of similar backgrounds. One of ordinary skill in the art would have recognized that applying Gallagher’s technique to the method of Lin and Childress would allow the example tagged images to be tagged with a location tag associated with background region features of the example tagged images, and would allow features of the background regions of images to be used to determine semantically similar images for determining whether images have a similar location, predictably resulting in an improved location tagging method which uses more consistent features of the image background.
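
For illustration only, the sequence of steps mapped above (first signatures computed in one batch before any unlabeled image is processed, a second signature per unlabeled image, a distance comparison against a margin of tolerance, and label transfer on a match) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Lin, Childress, Gallagher, or the present application: the names `propagate_labels`, `embed`, and `tolerance` are hypothetical, `embed` stands in for any feature-vector generator, Euclidean distance is assumed as the difference measure, and an inclusive comparison is assumed for the tolerance.

```python
import math
from typing import Callable, List, Sequence, Tuple

Signature = Sequence[float]


def propagate_labels(
    labeled: List[Tuple[object, List[str]]],
    unlabeled: List[object],
    embed: Callable[[object], Signature],  # hypothetical signature generator
    tolerance: float,                      # hypothetical margin of tolerance
) -> List[List[str]]:
    """Return, for each unlabeled image, the labels copied from its match."""
    # First signatures are generated in a single batch, before the loop
    # over unlabeled images begins.
    first = [(embed(image), labels) for image, labels in labeled]

    results: List[List[str]] = []
    for image in unlabeled:
        second = embed(image)  # second signature for the current image
        # Find the closest first signature (assumed Euclidean distance).
        best = min(first, key=lambda pair: math.dist(pair[0], second), default=None)
        if best is not None and math.dist(best[0], second) <= tolerance:
            results.append(list(best[1]))  # match: apply the labels
        else:
            results.append([])             # no match within tolerance
    return results
```

In this sketch an unlabeled image whose nearest first signature lies outside the tolerance receives no labels, which corresponds to the "threshold distance" comparison cited from Lin [0071] only in a general sense.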

Regarding claim 2, please see the above rejection of claim 1. Lin, Childress, and Gallagher disclose the method according to claim 1, wherein said first signatures and said second signatures are generated using different neural network technologies (see Lin [0031]-[0034], where the classifier algorithm may be an algorithm that uses a neural network model to identify associations between semantic features, and that in the training process for the classifier algorithm, the classifier algorithm can generate one or more feature vectors representing the semantic features of the training images; see also Childress [0044]-[0048], where a trained classifier algorithm can generate the feature vector representing the semantic features for the input image, where the suggested use of an untrained neural network based classifier algorithm to generate feature vectors of training images and the use of a trained neural network based classifier algorithm to generate feature vectors of an input image provides a teaching for the broadest reasonable interpretation of different neural network technologies).

Regarding claim 3, please see the above rejection of claim 2. Lin, Childress, and Gallagher disclose the method according to claim 2, wherein one of the neural networks is a convolutional neural network (see Lin [0042], where the classifier algorithm may be a deep convolutional neural network algorithm; see also Childress [0024]-[0026], where a trained convolutional neural network may be used by the visual data location tagging program).

Regarding claim 4, please see the above rejection of claim 1. Lin, Childress, and Gallagher disclose the method according to claim 1, wherein said first signatures and said second signatures are numerical tensors (see Lin [0069], where feature vectors are computed for the input image and the example tagged images; Examiner notes that one of ordinary skill in the art would understand that a tensor is a mathematical object that generalizes scalars, vectors, and matrices; please see the pertinent art section of the non-final Office action, dated 11 January 2021, for corresponding evidence).

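Regarding claim 5, Lin, Childress, and Gallagher disclose the method wherein said labeled images are from a first video sequence and said unlabeled images are from a second video sequence (see Lin [0027], where an image may be one or more frames selected from a video).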

Regarding claim 6, Lin, Childress, and Gallagher disclose a method for associating a plurality of unlabeled frames with one or more labeled frames, the method comprising: 
prior to receiving the plurality of unlabeled frames, generating, in a single batch, one or more first signatures for the labeled frames (see Lin [0033], where in a training process for a classifier algorithm, semantic features are identified by analyzing the semantic content of the training images; see Lin [0036], where the training images include different associated tags; see Lin [0044], where a trained classifier is used to automatically tag a received input image; and see Lin Fig. 5 and [0063]-[0068], where the training process for the classifier algorithm is performed prior to executing the trained classifier algorithm to determine that an input image is semantically similar to an example tagged image, suggesting that the training process, which performs the semantic features analysis of training images, is performed prior to receiving the input images), said labeled frames having one or more structural labels associated with respective structural features present therein (see Lin [0044] and [0068], where an example tagged image has an associated tag; see Lin [0027], where an image may be one or more frames selected from a video; see Lin [0049], where tags from training images may include “room”; see Gallagher [0031]-[0042], where features of the background regions of images are used to cluster images with similar backgrounds and are associated with a same location, and a label is provided to identify the location and used to tag all images in the cluster of similar backgrounds), wherein the plurality of unlabeled frames are from an unlabeled sequence (see Lin [0044] and [0068], where input images are received for automatic tagging; see Lin [0027], where an image may be one or more frames selected from a video);
for each current unlabeled frame of the plurality of unlabeled frames in the unlabeled sequence (see Lin [0044], where an input image is received and provided to a trained classifier algorithm; see also Lin [0066], where a plurality of input images is suggested; see Lin [0027], where an image may be one or more frames selected from a video):
generating at least one second signature based on the current unlabeled frame in said unlabeled sequence (see Lin [0045], where the trained classifier algorithm can generate a second feature vector for the input image; see also Lin [0069], where one or more image features vectors are generated for the input image); 
comparing said second signature to at least one of the one or more first signatures to determine if the second signature matches any one of the first signatures indicating that the at least one labeled frame and the current unlabeled frame are of the same location (see Lin [0045], where the feature vector representing the semantic features for the input images can be matched with the feature vector representing the semantic features of an example tagged image; see also Lin [0069], where the trained classifier algorithm matches the input image to the example tagged images based on the feature vectors; see Childress [0046]-[0048], where untagged visual data is used to search for a corresponding location according to a dataset in a neural network and the visual data is tagged with the determined location), wherein said second signature matches any one of the first signatures when a difference therebetween is within a margin of tolerance (see Lin [0071], where the distance between a feature vector for a first example tagged image and an input feature vector for the input image is less than a threshold distance used for determining similarity); and 
when the at least one labeled frame and the current unlabeled frame are determined to be of the same location, applying one or more structural labels of the at least one labeled image (see Lin [0073], where a tag is generated for the input image using tag content from the tagged image based on the semantic similarity with the tagged image; see Childress [0046]-[0048], where untagged visual data is used to search for a corresponding location according to a dataset in a neural network and the visual data is tagged with the determined location).
Please see the above rejection for claim 1, as the rationale to combine the teachings of Lin, Childress, and Gallagher is similar, mutatis mutandis.

Regarding claim 10, see above rejection for claim 6. It is a method claim reciting similar subject matter as claim 2. Please see above claim 2 for detailed claim analysis as the limitations of claim 10 are similarly rejected.

Regarding claim 11, see above rejection for claim 10. It is a method claim reciting similar subject matter as claim 3. Please see above claim 3 for detailed claim analysis as the limitations of claim 11 are similarly rejected.

Regarding claim 12, see above rejection for claim 6. It is a method claim reciting similar subject matter as claim 4. Please see above claim 4 for detailed claim analysis as the limitations of claim 12 are similarly rejected.

Regarding claim 13, it recites a system performing the method of claim 1. Lin, Childress, and Gallagher teach a system performing the method of claim 1. Please see above for detailed claim analysis, with the exception of the following further limitations:
signature-generation module (see Lin [0101]-[0103], where computing devices including microprocessor based computer system accessing stored software to implement the disclosed teachings are described); 
execution module (see Lin [0101]-[0103], where computing devices including microprocessor based computer system accessing stored software to implement the disclosed teachings are described).
Please see the above rejection for claim 1, as the rationale to combine the teachings of Lin, Childress, and Gallagher is similar, mutatis mutandis.

Regarding claim 14, see above rejection for claim 13. It is a system claim reciting similar subject matter as claim 2. Please see above claim 2 for detailed claim analysis as the limitations of claim 14 are similarly rejected.

Regarding claim 15, see above rejection for claim 14. It is a system claim reciting similar subject matter as claim 3. Please see above claim 3 for detailed claim analysis as the limitations of claim 15 are similarly rejected.

Regarding claim 16, see above rejection for claim 13. It is a system claim reciting similar subject matter as claim 4. Please see above claim 4 for detailed claim analysis as the limitations of claim 16 are similarly rejected.

Regarding claim 17, please see the above rejection of claim 13. Lin, Childress, and Gallagher disclose the system according to claim 13, wherein said execution module further comprises a comparison module and a labeling module (see Lin [0101]-[0103], where computing devices including microprocessor based computer system accessing stored software to implement the disclosed teachings are described).

Regarding claim 18, please see the above rejection of claim 1. Lin, Childress, and Gallagher disclose the method according to claim 1, wherein the labeled image further comprises one or more transitory labels associated with respective transitory features present in the labeled image (see Gallagher [0025]-[0028], where people, vehicles, and main subject regions are identified in images according to the respective image features), wherein the transitory labels are not applied to the unlabeled image when the labeled image and the unlabeled image are determined to be of the same location (see Gallagher [0025]-[0029], where non-background elements from images are identified, detected, and removed, leaving the remaining image areas as the image background; and see Gallagher [0031]-[0042], where location descriptor tags are used to tag images with similar background region features; where the combined teachings of Lin, Childress, and Gallagher would suggest to one of ordinary skill that an unlabeled input image is tagged with a location tag associated with an example tagged image with similar background region features and that labels related to the removed non-background region are not assigned to the unlabeled input image).
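
For illustration only, the distinction drawn above between structural labels, which are carried over to a matched unlabeled image, and transitory labels, which are not, can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any cited reference or of the present application: the `LabeledImage` class, its field names, and `labels_to_propagate` are hypothetical, and the labeled image is assumed to have already been determined to be of the same location as the unlabeled image.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LabeledImage:
    # Hypothetical container: labels tied to fixed background features
    # versus labels tied to movable content such as people or vehicles.
    structural_labels: List[str] = field(default_factory=list)
    transitory_labels: List[str] = field(default_factory=list)


def labels_to_propagate(matched: LabeledImage) -> List[str]:
    """Return only the labels that are applied to a matched unlabeled image."""
    # Transitory labels are deliberately left out of the returned list.
    return list(matched.structural_labels)
```

A matched image labeled with a room name and a person would, under this sketch, pass on only the room name.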

Regarding claim 19, see above rejection for claim 6. It is a method claim reciting similar subject matter as claim 18. Please see above claim 18 for detailed claim analysis as the limitations of claim 19 are similarly rejected.

Regarding claim 20, see above rejection for claim 13. It is a system claim reciting similar subject matter as claim 18. Please see above claim 18 for detailed claim analysis as the limitations of claim 20 are similarly rejected.

Claims 7 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Lin, Childress, and Gallagher as applied to claim 6 above, and further in view of Patten et al. (US 2008/0232765), herein Patten.
Regarding claim 7, please see the above rejection of claim 6. Lin, Childress, and Gallagher do not explicitly disclose the method according to claim 6, wherein the for loop is further exited when a predetermined number of comparisons is reached.
However, Lin does disclose where an image may be one or more frames selected from a video (see Lin [0027]).
Patten discloses in a related and pertinent method for detecting flash frames to categorize and tag frames of a video (see Patten Abstract), where processing a sequence of video frames for detecting and tagging flash frames includes repeating the process flow for all frames (see Patten Fig. 3 and [0029]-[0035]).
At the time of filing one of ordinary skill in the art would have found it obvious to apply the teachings of Patten for processing video frame images for detecting and tagging feature to repeatedly process each frame of the video until no remaining frames are left to the combined teachings of Lin, Childress, and Gallagher. This modification is a use of a known technique to improve similar methods in the same way. In this instance, Lin, Childress, and Gallagher teach a base method for tagging input images based on semantically similar example tagged images, where images may be frames of a video. Patten teaches a comparable method for processing and tagging video frames, where processing sequence of video frames for detecting and tagging flash frames includes repeating the process flow for all frames. One of ordinary skill in the art could have applied Patten’s known technique in the same way to Lin, Childress, and Gallagher’s method for processing frames of a video, such that the semantic features generating and comparing of input video frames with example tagged images for tagging input images with tags associated with semantically similar tagged images is performed and repeated for all 

Regarding claim 9, please see the above rejection of claim 6. Lin, Childress, Gallagher, and Patten disclose the method according to claim 6, wherein a first unlabeled frame is an initial frame in said unlabeled sequence and a new unlabeled frame is a frame in said unlabeled sequence that is adjacent to said first unlabeled frame (see Lin [0027], where an image may be one or more frames selected from a video; see Patten Fig. 3 and [0029]-[0035], where when there are more frames to be processed, the next frame of a video is processed).
Please see the above rejection for claim 7, as the rationale to combine the teachings of Lin, Childress, Gallagher, and Patten is similar, mutatis mutandis.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TIMOTHY WING HO CHOI whose telephone number is (571)270-3814. The examiner can normally be reached 9:00 AM to 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VINCENT RUDOLPH, can be reached at (571) 272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To 



/TIMOTHY CHOI/Examiner, Art Unit 2661

/VINCENT RUDOLPH/Supervisory Patent Examiner, Art Unit 2661