DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments
This action is in response to remarks and amendments submitted on 11/05/2021, in which claims 1 and 5-21 were presented for further examination. The applicant’s remarks and amendments to the claims were considered with the following results:
In response to the last Office Action: 
Claims 1, 6-7, 9-11, 13, 16-18, and 21 are currently amended. 
Claims 2-4 were cancelled.
Claims 1 and 5-21 are pending.
The previous objection to the title of the invention has been withdrawn, as necessitated by applicant’s amendment to the title filed 11/05/2021.
The previous objection to the abstract of the invention has been withdrawn, as necessitated by applicant’s amendment to the abstract of the invention filed 11/05/2021.

Response to Arguments
The applicant’s remarks and/or arguments, filed 11/05/2021 with respect to claims 1 and 5-21, have been fully considered.



Applicant asserts the processes of receiving an image from an electronic device and transmitting a reference dataset to the electronic device are not acts that can be performed by a human mind.   

The applicant’s arguments are not persuasive. Aside from the recitation of generic computing components, the claim contains limitations that can reasonably be performed mentally, as well as limitations that can be regarded as insignificant extra-solution activity. For instance, the calculation and determination limitations within the claim language can reasonably be performed in the human mind with the aid of pen and paper or a slide rule. These limitations require observation, evaluation, judgement, and/or opinion. Further, the limitations directed to receiving information, extracting data, searching for data, generating a dataset, and transmitting information are all regarded as insignificant extra-solution activity (see MPEP 2106). The examiner suggests applicant explicitly point out on the record the computing technologies

Applicant asserts Mensink (US PGPub 20120269436) does not teach the amended features of each independent claim.   

The examiner notes the applicant’s amendments and arguments are persuasive. However, the applicant's amendments and arguments necessitated a new ground of rejection. The new ground of rejection is made under the combination of Mensink and Ross (US PGPub 20170243230). The combination of Mensink and Ross is shown to teach the combination of elements presented within the impacted claims.

It should be noted that any citations to specific pages, columns, lines, or figures in the prior art references, and any interpretation of the references, should not be considered limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. See MPEP 2123.


Claim Objections
Claims 1, 11, and 16-17 are objected to because of the following informalities:  
is a searched image…”.
Claim 16 needs to be reordered. Dependent claim 16 is erroneously separated from dependent claim 12, from which it depends. MPEP 608 states “A claim which depends from a dependent claim should not be separated by any claim which does not also depend from said dependent claim. It should be kept in mind that a dependent claim may refer to any preceding independent claim. See MPEP § 608.01(n)”. Appropriate action is required.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1 and 5-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Claims 1 and 5-21 are directed to an abstract idea. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Step 1
The computing device, as claimed in claim 11, is directed to a machine. The method, as claimed in claim 1, is directed to a process. The non-transitory computer-readable storage medium, as claimed in claim 17, is directed to an article of manufacture. 

Under revised Step 2A, Prong 1 of the 2019 PEG, the claim limitations, “calculating a respective difference between one or more feature vectors in the feature data relating to the object and corresponding one or more feature vectors in respective feature data of the image, and in accordance with a determination that the calculated difference is within a predetermined value, determining the image as a searched image containing the object”, are limitations that can reasonably be performed in the human mind with the aid of pen and paper or a slide rule. The concepts recited in claim 1 represent an idea 'of itself'. An idea 'of itself' is used to describe an idea standing alone such as a concept, plan, or scheme, as well as a mental process (thinking) that "can be performed in the human mind or by a human using a pen and paper or a slide rule". Mental processes are defined by the 2019 PEG as including “concepts performed in the human mind (including an observation, evaluation, judgement, opinion)”.
Under revised Step 2A, Prong 2 of the 2019 PEG, if it is determined that the claims recite a judicial exception, it is then necessary to evaluate whether the claims recite additional elements that integrate the judicial exception into a practical application of that exception. In this case, claim 1 includes additional elements such as computing device(s), claim 11 includes additional elements such as a server, memory and processor, and claim 17 includes additional elements such as a computing device. Further, the claims recite additional elements such as receiving information, extracting data, searching for data, generating a dataset, and transmitting information. In this case, the additional elements recited in the independent claims are recited and described in a generic manner and merely amount to no more than mere instructions to apply the exception, insignificant extra-solution activity, and/or a general link of the use of the abstract idea to a particular technological environment or field of use. 

Under Step 2B of the 2019 PEG, if it is determined that the claims recite a judicial exception that is not integrated into a practical application of that exception, it is then necessary to evaluate the additional elements individually and in combination to determine whether they provide an inventive concept (i.e., whether the additional elements amount to significantly more than the exception itself). As discussed above with respect to integration of the abstract idea into a practical application, the additional elements only amount to mere instructions to apply the exception using generic computer components, extra-solution activities and/or a generic link of the use of the exception to a particular technological environment or field of use. See court decisions: “Gathering and analyzing information using conventional techniques and displaying the result”, TLI Communications, 823 F.3d at 612-13, 118 USPQ2d at 1747-48; and “Selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display”, Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354-55, 119 USPQ2d 1739, 1742 (Fed. Cir. 2016).


Dependent claims 5-10, 12-16, and 18-21 do not aid in the eligibility of the respective independent claims. 
Claim 5 further specifies the information data. This can be described as insignificant extra solution activity, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Claim 6 further specifies filtering data. This can be described as insignificant extra solution activity, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Claim 7 further specifies generating a reference dataset in a tree form. The concept can be compared to concepts reasonably performed in the human mind (including an observation, evaluation, judgement, opinion) with the aid of pen and paper, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Claim 8 further defines the reference dataset. The concept can be compared to concepts reasonably performed in the human mind (including an observation, evaluation, judgement, opinion) with the aid of pen and paper, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Claim 9 further specifies providing suggested tags. This can be described as insignificant extra solution activity, and does not aid in the eligibility and/or patentability of the respective independent claims.

Claim 12 further specifies the information data. This can be described as insignificant extra solution activity, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Claim 13 further specifies generating a reference dataset in a tree form. The concept can be compared to concepts reasonably performed in the human mind (including an observation, evaluation, judgement, opinion) with the aid of pen and paper, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Claim 14 further defines the reference dataset. The concept can be compared to concepts reasonably performed in the human mind (including an observation, evaluation, judgement, opinion) with the aid of pen and paper, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Claim 15 further specifies providing suggested text. This can be described as insignificant extra solution activity, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Claim 16 further specifies receiving user input and providing information related to the user input. This can be described as insignificant extra solution activity, and does not aid in the eligibility and/or patentability of the respective independent claims.   
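Claim 18 further specifies generating a reference dataset in a tree form. The concept can be compared to concepts reasonably performed in the human mind (including an observation, evaluation, judgement, opinion) with the aid of pen and paper, and does not aid in the eligibility and/or patentability of the respective independent claims.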
Claim 19 further defines the reference dataset. The concept can be compared to concepts reasonably performed in the human mind (including an observation, evaluation, judgement, opinion) with the aid of pen and paper, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Claim 20 further specifies providing suggested text. This can be described as insignificant extra solution activity, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Claim 21 further specifies receiving user input and providing information related to the user input. This can be described as insignificant extra solution activity, and does not aid in the eligibility and/or patentability of the respective independent claims.   
Thus, dependent claims 5-10, 12-16, and 18-21 are also ineligible. 


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1 and 5-21 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication, US 20120269436, to Thomas Mensink et al., hereinafter “Mensink”, in view of U.S. Patent Application Publication, US 20170243230, to David Justin Ross et al., hereinafter “Ross”.

Regarding claim 1, Mensink teaches a method performed at a computing device that includes one or more processors and memory (Mensink, ¶ [0016], teaches FIG. 2 is a flow diagram illustrating a method for predicting labels for images in accordance with another aspect of the exemplary embodiment. Mensink, ¶ [0012], teaches instructions are provided in memory for generating feature-based predictions for values of labels in the set of labels based on features extracted from the image and for predicting a value for at least one label from the set of labels for the image based on the feature-based label predictions and predictive correlations of the structured model. The predicted value for the at least one label may also be based on an assigned value for at least one other label from the set of labels received for the image. A processor executes the instructions), the method comprising: receiving, from an electronic device, an input image containing an object (At least Mensink, FIG. 1, discloses images and image data being received. At least Mensink, FIG. 4, discloses an interactive image. Further, Mensink, ¶ [0014], teaches for each of the training images, for each of a set of labels, a feature function is generated, based on features extracted from the image, which is used to predict a value of the label for the image. Mensink, ¶ [0032], teaches images may be received by the system in any convenient file format, such as JPEG, TIFF, or the like … The images can be input to the system from an external source or generated within the system); in response to receiving the input image: extracting, from the input image, feature data relating to the object, the feature data including one or more feature vectors (Mensink, ¶ [0036], teaches on a new image, the classifier outputs a feature function which, for each label, indicates whether that label is true (e.g., as a binary value or a probability). 
In other embodiments, the feature representation extracted from the image may be used directly as the feature function. For example, a Fisher vector may be generated from features extracted from the image in which each value of the vector is associated with a visual word corresponding to one of the labels), performing a search of an image database based, at least in part, on the extracted feature data to determine one or more images containing the object (Mensink, ¶ [0047], teaches the interface shows the image to be labeled and may also display the set of labels predicted by the classifier, optionally updated to reflect the user's responses to a set of queries. The user clicks on a selection or otherwise indicates his response to each of the queries that are presented in turn, thereby assigning values to a subset (fewer than all) of the labels. Further, Mensink, ¶ [0062], teaches if at S122, if a stopping criterion has not been reached, the system decides to repeat the querying, the next label to be queried is selected at S124. The stopping criterion may be a predetermined number of questions to be asked of the user or may depend on a confidence the system has on the remained label predictions, or a combination thereof), obtaining information data of the one or more searched images from the image database (Mensink, ¶ [0034], teaches the images to be labeled may be images without any labels. In other embodiments, the images may have had some labels assigned (e.g., as tags obtained from a photo-sharing site such as Flikr™), but the annotation is not complete. Or, the images 16 may have received a small number of labels from a small label set automatically or manually applied and the objective is to expand the annotation to a larger label set); generating a reference dataset for the object based, at least in part, on the information data, the reference dataset including at least some of the information data (Mensink, ¶ [0047], teaches the feature representations of the training images and their respective labels may be use to train a classifier system. This step is optional. 
In other embodiments, the features vector (such as a Fisher vector) is used as a feature function of the image and is used directly as label predictions for each of the images), and transmitting the reference dataset to the electronic device (Mensink, FIG. 2, element S128, discloses outputting assigned labels/class for image).
Mensink teaches the limitations as identified above. 	Mensink does not explicitly teach: in response to receiving the image: performing a search of an image database based, at least in part, on the extracted feature data to determine one or more images containing the object, further including: for each of a plurality of images stored in the image database: calculating a respective difference between one or more feature vectors in the feature data relating to the object and corresponding one or more feature vectors in respective feature data of the image, and in accordance with a determination that the calculated difference is within a predetermined value, determining the image as a searched image containing the object; in accordance with the determination, obtaining information data of the one or more searched images from the image database. 
However, Ross teaches:
in response to receiving the image: performing a search of an image database based, at least in part, on the extracted feature data to determine one or more images containing the object, further including: for each of a plurality of images stored in the image database: calculating a respective difference between one or more feature vectors in the feature data relating to the object and corresponding one or more feature vectors in respective feature data of the image (Ross, ¶ [0058], teaches because we are exploiting natural features and often scanning the object under variable conditions, it is highly unlikely that two different “reads” will produce the exact same fingerprint. We therefore have to introduce the ability to look up items in the database when there is a near-miss. For example, two feature vectors [0, 1, 5, 5, 6, 8] and [0, 1, 6, 5, 6, 8] are not identical but (given the proper difference metric) may be close enough to say with certainty that they are from the same item that has been seen before. This is particularly true if, otherwise, the nearest feature vector of a different item is [5, 2, 5, 8, 6, 4]. For example, a distance between vectors of n-dimensions is easily calculated, and may be used as one metric of similarity or “closeness of match” between the vectors. One may also consider the distance to the next nearest candidate), and in accordance with a determination that the calculated difference is within a predetermined value, determining the image as a searched image containing the object (Ross, ¶ [0036], teaches once one or more suitable digital fingerprints of an object are acquired, the object (actually some description of it) and corresponding fingerprint may be stored or “registered” in a database. For example, in some embodiments, the fingerprint may comprise one or more feature vectors. The database should be secure. In some embodiments, a unique ID also may be assigned to an object. An ID may be a convenient index in some applications. However, it is not essential, as a digital fingerprint itself can serve as a key for searching a database. 
In other words, by identifying an object by the unique features and characteristics of the object itself, arbitrary identifiers, labels, tags, etc. are unnecessary and, as noted, inherently unreliable); in accordance with the determination, obtaining information data of the one or more searched images from the image database (Ross, ¶ [0034], teaches objects may have permanent labels or other identifying information attached to them. These can also be used as features for digital fingerprinting. For instance, wine may be put into a glass bottle and a label affixed to the bottle. Since it is possible for a label to be removed and reused, simply using the label itself as the authentication region is often not sufficient. In this case we may define the authentication region to include both the label and the substrate it is attached to—in this case some portion of the glass bottle. This “label and substrate” approach may be useful in defining authentication regions for many types of objects, such as consumer goods and pharmaceutical packaging. If a label has been moved from it's original position, this can be an indication of tampering or counterfeiting. If the object has “tamper-proof” packaging, such areas as may be damaged in attempts to counterfeit the contents may also be useful to include in the authentication region. Further, Ross, ¶ [0052], teaches as an example of the potential uses of sensor data, many products like food and beverages can degrade with exposure to certain environmental factors over the course of their storage and shipment. Examples of sensor data could include temperature, light exposure, altitude, oxygen level, or other factors, as well as location such as GPS data). 
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Mensink (disclosing labeling images and for generating an annotation system) to include the teachings of Ross (disclosing acquiring digital image data of an image of at least a portion of a target physical object) and arrive at functions directed to identifying differences in image features. One of ordinary skill in the art would have been motivated to make this combination to improve the effectiveness of searching for and retrieving relevant image data (see at least Ross, ¶ [0041]). In addition, the references Mensink and Ross teach features that are directed to analogous arts and they are directed to the same field of endeavor related to analyzing images.
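For illustration only, the threshold comparison recited in claim 1 and exemplified by the vectors quoted from Ross, ¶ [0058], can be sketched as follows. The sketch assumes a Euclidean distance metric and a hypothetical threshold of 2.0; neither the claims nor the cited passages specify a particular metric or value, and the function names and image identifiers are illustrative rather than taken from either reference.

```python
import math


def feature_distance(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_images(query, database, threshold):
    """Return ids of stored images whose feature vector lies within `threshold` of `query`."""
    return [
        image_id
        for image_id, vector in database.items()
        if feature_distance(query, vector) <= threshold
    ]


# Vectors taken from the passage of Ross quoted above; the ids are hypothetical.
query = [0, 1, 5, 5, 6, 8]
database = {
    "same_item": [0, 1, 6, 5, 6, 8],
    "different_item": [5, 2, 5, 8, 6, 4],
}

# Hypothetical predetermined value; only the near-miss vector falls within it.
matches = search_images(query, database, threshold=2.0)
```

Under these assumptions the near-miss vector lies at distance 1 from the query, while the vector of the different item lies at a distance of roughly 7.1, so only the former is determined to be a searched image.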

Regarding claim 5, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the information data includes at least one of a title, a detailed description, a category and a brand relating to the one or more images (Mensink, ¶ [0033], teaches the labels are drawn .  

Regarding claim 6, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches filtering the information data based on a predetermined list (Mensink, ¶ [0030], teaches the term predictions (image labels/attribute labels) can be used in tools for clustering, classification, retrieval and visualization and find application, for example, in multimedia content management systems, stock photography database indexing, and for exploration such as exploring images on photo-sharing websites), wherein the generating the reference dataset is based, at least in part, on the filtered information data (Mensink, ¶ [0075], teaches the feature vectors of an image are assigned to clusters. For example, a visual vocabulary is previously obtained by clustering low-level features extracted from training images, using for instance K-means. Each patch vector is then assigned to a nearest cluster and a histogram of the assignments can be generated).  

Regarding claim 7, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the generating the reference dataset includes constructing at least one tree data structure for the at least some of the information data (Mensink, ¶ [0075], teaches one or more structured models are generated, based on the labels 14 and on either the image features or the classifier output. This includes computing a graph structure based on the maximum spanning tree over a fully connected graph over the label variables with edge weights given by the mutual information between the label variables. The graph contains node potentials, which are a weighted sum of the image features or image classifier scores, and edge potentials of the tree-structured conditional model, which are scalar values. The parameters are then computed by log-likelihood maximization. At S108, one or more structured models are generated, based on the labels and the classifier output. This includes computing the maximum spanning tree over a fully connected graph over the label variables with edge weights given by the mutual information between the label variables).  

Regarding claim 8, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the reference dataset includes at least one title data for suggesting a title of the object (Mensink, ¶ [0103], teaches the clustering aims to associate labels in a group, based on the co-occurrence of their states in the training set) and at least one detailed description data for suggesting a detailed description of the object (Mensink, ¶ [0042], teaches through inference in the graphical model 44, the system fuses the information from the image content and the user responses, and is able to identify labels that are highly informative, once provided with some information by the user. Mensink, ¶ [0152], teaches any informative question will at least rule out one of the possible classes, and .  

Regarding claim 9, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches providing a suggested text relating to the object based, at least in part, on the reference dataset (Mensink, ¶ [0047], teaches the interface shows the image to be labeled and may also display the set of labels predicted by the classifier, optionally updated to reflect the user's responses to a set of queries. The user clicks on a selection or otherwise indicates his response to each of the queries that are presented in turn, thereby assigning values to a subset (fewer than all) of the labels).  

Regarding claim 10, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches receiving a user input relating to the object (Mensink, ¶ [0042], teaches structured models are able to transfer the user input for one image label to more accurate predictions on other image labels); and providing at least one candidate text relating to the user input based, at least in part, on the reference dataset  (Mensink, ¶ [0118], teaches the user interaction, in the case of the interactive mode, also takes place at the attribute level. The system asks for user input on the attribute level labels to improve the class predictions, rather than to improve the attribute prediction).  

Regarding claim 11, Mensink teaches a computing device, comprising: at least one processor; and at least one memory storing instructions (Mensink, ¶ [0012], teaches instructions are provided in memory for generating feature-based predictions for values of labels in the set of labels based on features extracted from the image and for predicting a value for at least one label from the set of labels for the image based on the feature-based label predictions and predictive correlations of the structured model. The predicted value for the at least one label may also be based on an assigned value for at least one other label from the set of labels received for the image. A processor executes the instructions) that, when executed by the at least one processor, cause the at least one processor to perform operations comprising: receiving, from an electronic device, an input image containing an object (At least Mensink, FIG. 1, discloses images and image data being received. At least Mensink, FIG. 4, discloses an interactive image. Further, Mensink, ¶ [0014], teaches for each of the training images, for each of a set of labels, a feature function is generated, based on features extracted from the image, which is used to predict a value of the label for the image. 
Mensink, ¶ [0032], teaches images may be received by the system in any convenient file format, such as JPEG, TIFF, or the like … The images can be input to the system from an external source or generated within the system); in response to receiving the input image: extracting, from the input image, feature data relating to the object, the feature data including one or more feature vectors (Mensink, ¶ [0036], teaches on a new image, the classifier outputs a feature function which, for each label, indicates whether that label is true (e.g., as a binary value or a probability). In other embodiments, the feature representation extracted from the image may be used directly as the feature function. For example, a Fisher vector may be generated from features extracted from the image in which each value of the vector is associated with a visual word corresponding to one of the labels); performing a search of an image database based, at least in part, on the extracted feature data to determine one or more images containing the object (Mensink, ¶ [0047], teaches the interface shows the image to be labeled and may also display the set of labels predicted by the classifier, optionally updated to reflect the user's responses to a set of queries. The user clicks on a selection or otherwise indicates his response to each of the queries that are presented in turn, thereby assigning values to a subset (fewer than all) of the labels. Further, Mensink, ¶ [0062], teaches if at S122, if a stopping criterion has not been reached, the system decides to repeat the querying, the next label to be queried is selected at S124. The stopping criterion may be a predetermined number of questions to be asked of the user or may depend on a confidence the system has on the remained label predictions, or a combination thereof), obtaining information data of the one or more searched images from the image database (Mensink, ¶ [0034], teaches the images to be labeled may be images without any labels. In other embodiments, the images may have had some labels assigned (e.g., as tags obtained from a photo-sharing site such as Flikr™), but the annotation is not complete. 
Or, the images 16 may have received a small number of labels from a small label set automatically or manually applied and the objective is to expand the annotation to a larger label set); generating a reference dataset for the object based, at least in part, on the information data, the reference dataset including at least some of the information data (Mensink, ¶ [0047], teaches the feature representations of the training images and their respective labels may be use to train a classifier system. This step is optional. In other embodiments, the features vector (such as a Fisher vector) is used as a feature function of the image and is used directly as label predictions for each of the images); and transmitting the reference dataset to the electronic device (Mensink, FIG. 2, element S128, discloses outputting assigned labels/class for image). 
Mensink teaches the limitations as identified above. 	Mensink does not explicitly teach: in response to receiving the image: performing a search of an image database based, at least in part, on the extracted feature data to determine one or more images containing the object, further including: for each of a plurality of images stored in the image database: calculating a respective difference between one or more feature vectors in the feature data relating to the object and corresponding one or more feature vectors in respective feature data of the image, and in accordance with a determination that the calculated difference is within a predetermined value, determining the image as a searched image containing the object; in accordance with the determination, obtaining information data of the one or more searched images from the image database. 
However, Ross teaches:
in response to receiving the image: performing a search of an image database based, at least in part, on the extracted feature data to determine one or more images containing the object, further including: for each of a plurality of images stored in the image database: calculating a respective difference between one or more feature vectors in the feature data relating to the object and corresponding one or more feature vectors in respective feature data of the image (Ross, ¶ [0058], teaches because we are exploiting natural features and often scanning the object under variable conditions, it is highly unlikely that two different “reads” will produce the exact same fingerprint. We therefore have to introduce the ability to look up items in the database when there is a near-miss. For example, two feature vectors [0, 1, 5, 5, 6, 8] and [0, 1, 6, 5, 6, 8] are not identical but (given the proper difference metric) may be close enough to say with certainty that they are from the same item that has been seen before. This is particularly true if, otherwise, the nearest feature vector of a different item is [5, 2, 5, 8, 6, 4]. For example, a distance between vectors of n-dimensions is easily calculated, and may be used as one metric of similarity or “closeness of match” between the vectors. One may also consider the distance to the next nearest candidate), and in accordance with a determination that the calculated difference is within a predetermined value, determining the image as a searched image containing the object (Ross, ¶ [0036], teaches once one or more suitable digital fingerprints of an object are acquired, the object (actually some description of it) and corresponding fingerprint may be stored or “registered” in a database. For example, in some embodiments, the fingerprint may comprise one or more feature vectors. The database should be secure. In some embodiments, a unique ID also may be assigned to an object. 
An ID may be a convenient index in some applications. However, it is not essential, as a digital fingerprint itself can serve as a key for searching a database. In other words, by identifying an object by the unique features ; in accordance with the determination, obtaining information data of the one or more searched images from the image database (Ross, ¶ [0034], teaches objects may have permanent labels or other identifying information attached to them. These can also be used as features for digital fingerprinting. For instance, wine may be put into a glass bottle and a label affixed to the bottle. Since it is possible for a label to be removed and reused, simply using the label itself as the authentication region is often not sufficient. In this case we may define the authentication region to include both the label and the substrate it is attached to—in this case some portion of the glass bottle. This “label and substrate” approach may be useful in defining authentication regions for many types of objects, such as consumer goods and pharmaceutical packaging. If a label has been moved from its original position, this can be an indication of tampering or counterfeiting. If the object has “tamper-proof” packaging, such areas as may be damaged in attempts to counterfeit the contents may also be useful to include in the authentication region. Further, Ross, ¶ [0052], teaches as an example of the potential uses of sensor data, many products like food and beverages can degrade with exposure to certain environmental factors over the course of their storage and shipment. Examples of sensor data could include temperature, light exposure, altitude, oxygen level, or other factors, as well as location such as GPS data). 
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Mensink (disclosing labeling images and for generating an annotation system) to include the teachings of 

Regarding claim 12, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the received information data includes at least one of a title, a detailed description, a category and a brand relating to the one or more images (Mensink, ¶ [0033], teaches the labels are drawn from a predefined set of labels (an “annotation vocabulary”), which may correspond to a set of visual categories, such as landscape, trees, rocks, sky, male, female, single person, no person, animal, and the like. In the exemplary embodiment, there are a large number of such categories, such as at least fifty categories. The training images are each manually labeled with one or more labels drawn from the set of labels).  

Regarding claim 13, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the generating the reference dataset includes constructing at least one tree data structure for the at least some of the information data (Mensink, ¶ [0075], teaches one or more .  

Regarding claim 14, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the reference dataset includes at least one title data for suggesting a title of the object (Mensink, ¶ [0103], teaches the clustering aims to associate labels in a group, based on the co-occurrence of their states in the training set) and at least one detailed description data for suggesting a detailed description of the object (Mensink, ¶ [0042], teaches through inference in the graphical model 44, the system fuses the information from the image content and the user responses, and is able to identify labels that are highly informative, once provided with some information by the user. Mensink, ¶ [0152], teaches any informative question will at least rule out one of the possible classes, and thus at most C−1 attributes need to be set by the user for the class to be known with .  

Regarding claim 15, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the operations further comprise: providing a suggested text relating to the object based, at least in part, on the reference dataset (Mensink, ¶ [0047], teaches the interface shows the image to be labeled and may also display the set of labels predicted by the classifier, optionally updated to reflect the user's responses to a set of queries. The user clicks on a selection or otherwise indicates his response to each of the queries that are presented in turn, thereby assigning values to a subset (fewer than all) of the labels).  

Regarding claim 16, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the operations further comprise: receiving, from the electronic device, a user input relating to the object (Mensink, ¶ [0042], teaches structured models are able to transfer the user input for one image label to more accurate predictions on other image labels); and providing at least one candidate text relating to the user input based, at least in part, on the reference dataset (Mensink, ¶ [0118], teaches the user interaction, in the case of the interactive mode, also takes place at the attribute level. The system 10 asks for user input on the attribute level labels to improve the class predictions, rather than to improve the attribute prediction).  

Regarding claim 17, Mensink teaches a non-transitory computer-readable storage medium having stored therein instructions executable by a computing device to cause the computing device to perform operations (Mensink, ¶ [0016], teaches FIG. 2 is a flow diagram illustrating a method for predicting labels for images in accordance with another aspect of the exemplary embodiment. Mensink, ¶ [0012], teaches instructions are provided in memory for generating feature-based predictions for values of labels in the set of labels based on features extracted from the image and for predicting a value for at least one label from the set of labels for the image based on the feature-based label predictions and predictive correlations of the structured model. The predicted value for the at least one label may also be based on an assigned value for at least one other label from the set of labels received for the image. A processor executes the instructions) comprising: receiving, from an electronic device, an input image containing an object (At least Mensink, FIG. 1, discloses images and image data being received. At least Mensink, FIG. 4, discloses an interactive image. Further, Mensink, ¶ [0014], teaches for each of the training images, for each of a set of labels, a feature function is generated, based on features extracted from the image, which is used to predict a value of the label for the image. 
Mensink, ¶ [0032], teaches images may be received by the system in any convenient file format, such as JPEG, TIFF, or the like … The images can be input to the system from an external source or generated within the system); in response to receiving the input image: extracting, from the input image, feature data relating to the object, the feature data including one or more feature vectors (Mensink, ¶ [0036], teaches on a new image, the ; performing a search of an image database based, at least in part, on the extracted feature data to determine one or more images containing the object (Mensink, ¶ [0047], teaches the interface shows the image to be labeled and may also display the set of labels predicted by the classifier, optionally updated to reflect the user's responses to a set of queries. The user clicks on a selection or otherwise indicates his response to each of the queries that are presented in turn, thereby assigning values to a subset (fewer than all) of the labels. Further, Mensink, ¶ [0062], teaches if at S122, if a stopping criterion has not been reached, the system decides to repeat the querying, the next label to be queried is selected at S124. The stopping criterion may be a predetermined number of questions to be asked of the user or may depend on a confidence the system has on the remained label predictions, or a combination thereof), obtaining information data of one or more images containing the object, the one or more images being searched from the image database (Mensink, ¶ [0034], teaches the images to be labeled may be images without any labels. In other embodiments, the images may have had some labels assigned (e.g., as tags obtained from a photo-sharing site such as Flikr™), but the annotation is not complete. 
Or, the images 16 may have received a small number of labels from a small label set automatically or manually applied and the objective is to expand the annotation ; generating a reference dataset for the object based, at least in part, on the information data (Mensink, ¶ [0047], teaches the feature representations of the training images and their respective labels may be used to train a classifier system. This step is optional. In other embodiments, the features vector (such as a Fisher vector) is used as a feature function of the image and is used directly as label predictions for each of the images); and transmitting the reference dataset to the electronic device (Mensink, FIG. 2, element S128, discloses outputting assigned labels/class for image).
Mensink teaches the limitations as identified above.
Mensink does not explicitly teach: in response to receiving the image: performing a search of an image database based, at least in part, on the extracted feature data to determine one or more images containing the object, further including: for each of a plurality of images stored in the image database: calculating a respective difference between one or more feature vectors in the feature data relating to the object and corresponding one or more feature vectors in respective feature data of the image, and in accordance with a determination that the calculated difference is within a predetermined value, determining the image as a searched image containing the object; in accordance with the determination, obtaining information data of the one or more images containing the object, the one or more images being searched from the image database. 
However, Ross teaches:
in response to receiving the image: performing a search of an image database based, at least in part, on the extracted feature data to determine one or more images containing the object, further including: for each of a plurality of images stored in the image database: calculating a respective difference between one or more feature vectors in the feature data relating to the object and corresponding one or more feature vectors in respective feature data of the image (Ross, ¶ [0058], teaches because we are exploiting natural features and often scanning the object under variable conditions, it is highly unlikely that two different “reads” will produce the exact same fingerprint. We therefore have to introduce the ability to look up items in the database when there is a near-miss. For example, two feature vectors [0, 1, 5, 5, 6, 8] and [0, 1, 6, 5, 6, 8] are not identical but (given the proper difference metric) may be close enough to say with certainty that they are from the same item that has been seen before. This is particularly true if, otherwise, the nearest feature vector of a different item is [5, 2, 5, 8, 6, 4]. For example, a distance between vectors of n-dimensions is easily calculated, and may be used as one metric of similarity or “closeness of match” between the vectors. One may also consider the distance to the next nearest candidate), and in accordance with a determination that the calculated difference is within a predetermined value, determining the image as a searched image containing the object (Ross, ¶ [0036], teaches once one or more suitable digital fingerprints of an object are acquired, the object (actually some description of it) and corresponding fingerprint may be stored or “registered” in a database. For example, in some embodiments, the fingerprint may comprise one or more feature vectors. The database should be secure. 
In some embodiments, a unique ; in accordance with the determination, obtaining information data of the one or more images containing the object, the one or more images being searched from the image database (Ross, ¶ [0034], teaches objects may have permanent labels or other identifying information attached to them. These can also be used as features for digital fingerprinting. For instance, wine may be put into a glass bottle and a label affixed to the bottle. Since it is possible for a label to be removed and reused, simply using the label itself as the authentication region is often not sufficient. In this case we may define the authentication region to include both the label and the substrate it is attached to—in this case some portion of the glass bottle. This “label and substrate” approach may be useful in defining authentication regions for many types of objects, such as consumer goods and pharmaceutical packaging. If a label has been moved from its original position, this can be an indication of tampering or counterfeiting. If the object has “tamper-proof” packaging, such areas as may be damaged in attempts to counterfeit the contents may also be useful to include in the authentication region. Further, Ross, ¶ [0052], teaches as an example of the potential uses of sensor data, many products like food and beverages can degrade with exposure to certain environmental factors over the course of their storage and shipment. Examples of sensor data could include . 
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teachings of Mensink (disclosing labeling images and for generating an annotation system) to include the teachings of Ross (disclosing acquiring digital image data of an image of at least a portion of a target physical object) and arrive at functions directed to identifying differences in image features. One of ordinary skill in the art would have been motivated to make this combination to improve the effectiveness of searching for and retrieving relevant image data (see at least Ross, ¶ [0041]). In addition, the references Mensink and Ross teach features that are directed to analogous arts and they are directed to the same field of endeavor related to analyzing images.
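
For illustration only, the near-miss lookup described in Ross, ¶ [0058], may be sketched as below. The three feature vectors are those quoted from Ross; the Euclidean metric, the item identifiers, the function names, and the threshold value of 2.0 are assumptions chosen for this sketch and are not taken from either reference or from the claims.

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length feature vectors (one possible difference metric)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_within_threshold(query, database, threshold):
    """Return ids of stored vectors whose distance to the query is within the threshold."""
    return [
        item_id
        for item_id, vector in database.items()
        if euclidean_distance(query, vector) <= threshold
    ]


# Vectors as quoted from Ross, para. [0058]; ids and threshold are hypothetical.
database = {
    "item_a": [0, 1, 5, 5, 6, 8],
    "item_b": [5, 2, 5, 8, 6, 4],
}
query = [0, 1, 6, 5, 6, 8]

matches = search_within_threshold(query, database, threshold=2.0)
print(matches)
```

Under these assumptions the query differs from the first stored vector in a single component, so only that item falls within the predetermined value, while the second item lies well outside it.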

Regarding claim 18, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the generating of the reference dataset includes constructing at least one tree data structure for at least some of the information data (Mensink, ¶ [0075], teaches one or more structured models are generated, based on the labels 14 and on either the image features or the classifier output. This includes computing a graph structure based on the maximum spanning tree over a fully connected graph over the label variables with edge weights given by the mutual information between the label variables. The graph contains node potentials, which are a weighted sum of the image features or image classifier scores, and edge potentials of the tree-structured conditional model, which are .  
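
For illustration only, the graph construction described in Mensink, ¶ [0075] (a maximum spanning tree over a fully connected graph of label variables, with edge weights given by mutual information) may be sketched as below. The label names, the binary training columns, the function names, and the use of Kruskal's algorithm are assumptions made for this sketch; the quoted passage does not state which spanning-tree algorithm is used, and the node and edge potentials are not modeled here.

```python
import math
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two binary label columns."""
    n = len(xs)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for x, y in zip(xs, ys) if x == a and y == b) / n
            p_a = sum(1 for x in xs if x == a) / n
            p_b = sum(1 for y in ys if y == b) / n
            if p_ab > 0:
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def maximum_spanning_tree(labels, columns):
    """Kruskal's algorithm over the fully connected label graph, heaviest edges first."""
    edges = sorted(
        (
            (mutual_information(columns[u], columns[v]), u, v)
            for u, v in combinations(labels, 2)
        ),
        reverse=True,
    )
    parent = {label: label for label in labels}

    def find(x):
        # Follow parent links to the representative of x's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _weight, u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so no cycle is created
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


# Hypothetical label states over eight training images (1 = label present).
labels = ["sky", "landscape", "animal"]
columns = {
    "sky": [1, 1, 1, 1, 0, 0, 0, 0],
    "landscape": [1, 1, 1, 0, 0, 0, 0, 0],
    "animal": [0, 1, 0, 1, 0, 1, 0, 1],
}

tree = maximum_spanning_tree(labels, columns)
print(tree)
```

In this hypothetical data, "sky" and "landscape" co-occur most strongly and are linked first, while "animal" is statistically independent of "sky" and is attached through its weak dependence on "landscape".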

Regarding claim 19, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the reference dataset includes at least one title data for suggesting a title of the object (Mensink, ¶ [0103], teaches the clustering aims to associate labels in a group, based on the co-occurrence of their states in the training set) and at least one detailed description data for suggesting a detailed description of the object (Mensink, ¶ [0042], teaches through inference in the graphical model, the system fuses the information from the image content and the user responses, and is able to identify labels that are highly informative, once provided with some information by the user. Mensink, ¶ [0152], teaches any informative question will at least rule out one of the possible classes, and thus at most C−1 attributes need to be set by the user for the class to be known with certainty. Of course, as with label elicitation, the aim is to limit the number of attributes elicited from the user, while ensuring an acceptable probability of correctly identifying the class).  

Regarding claim 20, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the operations further comprise: providing a suggested text relating to the object based, at least in part, on the reference dataset (Mensink, ¶ [0047], teaches the interface shows the image to be labeled and may also display the set of labels predicted by the classifier, optionally updated to reflect the user's responses to a set of queries. The user clicks on a selection or otherwise indicates his response to each of the queries that are presented in turn, thereby assigning values to a subset (fewer than all) of the labels).  

Regarding claim 21, the modification of Mensink and Ross teaches the claimed invention substantially as claimed, and Mensink further teaches the operations further comprise: receiving, from the electronic device, a user input relating to the object (Mensink, ¶ [0042], teaches structured models are able to transfer the user input for one image label to more accurate predictions on other image labels); and providing at least one candidate text relating to the user input based, at least in part, on the reference dataset (Mensink, ¶ [0118], teaches the user interaction, in the case of the interactive mode, also takes place at the attribute level. The system 10 asks for user input on the attribute level labels to improve the class predictions, rather than to improve the attribute prediction).


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US PGPub 20210224313 (Liu et al.) discloses a reverse image search method that may include receiving a search image; extracting feature points of the search image; finding classes corresponding to the feature points of the search image respectively in an image classification index table, the classes comprising images in an image library; and searching the classes corresponding to the feature points of the search image in the image classification index table to obtain a target image having the largest number of identical feature points of the search image.
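
For illustration only, the index-table lookup summarized above for Liu et al. may be sketched as below. Treating each feature point as a hashable token, the image identifiers, the library contents, and the function names are all assumptions made for this sketch; the reference itself may quantize or classify feature points differently.

```python
from collections import Counter, defaultdict


def build_index(library):
    """Map each feature point to the set of library images that contain it."""
    index = defaultdict(set)
    for image_id, feature_points in library.items():
        for point in feature_points:
            index[point].add(image_id)
    return index


def reverse_image_search(search_points, index):
    """Return the library image sharing the most feature points with the search image, or None."""
    votes = Counter()
    for point in set(search_points):
        for image_id in index.get(point, ()):
            votes[image_id] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical image library: image id -> feature points extracted from that image.
library = {
    "img_1": {"p1", "p2", "p3"},
    "img_2": {"p2", "p4"},
    "img_3": {"p5"},
}
index = build_index(library)

target = reverse_image_search(["p1", "p2", "p3", "p9"], index)
print(target)
```

Here the search image shares three feature points with "img_1" and one with "img_2", so "img_1" is returned as the target image; a search image with no indexed feature points returns None.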


Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALICIA M ANTOINE whose telephone number is (571)431-0687.  The examiner can normally be reached on Mon - Fri: 9am - 3pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, PIERRE M VITAL, can be reached at 571-272-4215.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/PIERRE M VITAL/Supervisory Patent Examiner, Art Unit 2162