Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This Office Action is taken in response to Applicant’s Amendment and Remarks filed on 03/08/2021 regarding application 14/279,346 filed on 05/16/2014.
Claims 1, 11, 18, and 20 have been amended. Claims 1-2 and 4-21 are currently pending for consideration.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 02/01/2021 was filed before the mailing date of the non-final office action. The submission is in compliance with the provisions of 37 C.F.R. § 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Response to Amendment and Remark
The applicant’s amendments and remarks have been fully and carefully considered, with Examiner’s response set forth below.
Applicant remarks, under “Claim Rejections based on U.S.C. § 112”, that “… claims 11-12 were rejected because the claim term “similar” was alleged to be undefined by the claim and, thus, indefinite. Claims 11-12 have been amended accordingly. In light of these amendments, Applicant respectfully requests that Examiner withdraw this rejection of claims 11-12.”
The examiner has reviewed the amendments and agrees to withdraw the rejection.
Applicant remarks, under “Claim Rejections based on U.S.C. § 103”, that “… claim 1 as amended recites, inter alia, “comparing the first distance and the second distance to one or more threshold values” and “when the first distance and the second distance do not exceed the one or more threshold values, selecting an image tag from the one or more image tags based on the first distance and the second distance” … claim 18 as amended… claim 20 as amended…”
Applicant's arguments with respect to the amendments of independent claims 1, 18 and 20 have been fully considered; however, the examiner disagrees, because Hill discloses that FIG. 2 shows a multi-faceted classification with filter scores in accordance with rules for a faceted taxonomy for aiding search the content for visual images of the content-based tagging artifacts… obtain empirical estimates from a separate validation to give a direct estimate of accuracy by comparing the threshold scores at s0 of P(y=1|s>=s0) in multiple feature spaces (first/second distance)… for the refined category tagging at different depths (i.e. the first distance) of the ontology… and the depth and the breadth of a target semantic similarity space (i.e. the second distance). (See Hill [col 8 ln 16-38, col 5 ln 40-45, col 6 ln 21-25]).
Also, Guo discloses [Claim 1, 16] perform an enhanced max-margin learning optimization in a dual space, dependent on inner products in a joint feature space of the feature vectors and the structured annotation word space (i.e. first distance) having the at least one constraint (i.e. threshold value)… and the structured semantic space (i.e. second distance) having the at least one constraint (i.e. threshold value) to minimize a prediction error… producing or identifying in response, comprising at least one of a response image and a response semantic expression.


Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. § 102 and 103 (or as subject to pre-AIA 35 U.S.C. § 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior 
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.


Claims 1-2, 4-5, 18, and 20-21 are rejected under 35 U.S.C. § 103 as being unpatentable over Guo et al. (US 10007679 B2, “Guo”) in view of Hill et al. (US 9710760 B2, “Hill”).
As to claim 1, Guo discloses a computer-implemented method comprising: 
receiving a natural language query comprising one or more terms for searching an image in a data structure, where the image is tagged with one or more image tags describing one or more objects depicted in; (Guo discloses [col 2 ln 14-35, col 9 ln 32-33, col 3 ln 57-60] learning the relationship between images and text in multimodal data mining to be formulated as a structured prediction for image searching… receiving the word query (i.e. image searching query) as given words (i.e. terms) input to generate corresponding images in i and the corresponding annotation word set Wi (i.e. image tags describing objects)).
The examiner notes that the term “data structure” is not clearly defined in either the claim or the specification; however, paragraph [0082] of the specification recites “Computer storage media, such as memory 910, includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.” The examiner interprets that Guo's “formulated a structured prediction for image searching” and the recited “… for storage of information such as computer readable instructions, data structures, program modules or other data” in [col 14 ln 1-4] correspond to the data structure.
computing a first distance in an image content ontology between the one or more terms and the one or more image tags, (Guo discloses [col 10 ln 63-67] in comparison with the retrieval based multimodal data mining analyzed… in across-stage inferencing using the computation the domain ontology in the word space (i.e. first distance)… in evaluations for the image database, this ontology is handcrafted and is implemented as a look-up table for word matching through an efficient hashing function between query terms and the image database with a respective associated label word set (i.e. selected image tags)).
computing a second distance in a semantic space of words between the one or more terms and the one or more image tags, wherein the second distance indicates a semantic similarity between the one or more terms and the one or more image tags, and (Guo discloses [col 9 ln 10-14, col 1 ln 52-55] The semantic similarity (i.e. a semantic word space) between feature vectors and the VReps in terms of the distance (i.e. a second distance) are computed. The top n most-relevant VReps are returned… the score between this VRep and 
wherein the semantic space of words is generated by a machine-learning program; (Guo discloses [Claim 1] representing each of a plurality of images in a database as information in an image space; associating a label word set, from an annotation word space (i.e. semantic word space), with each of the plurality of images… to define a plurality of training instances (i.e. machine-learning program), each respective training instance comprising a respective image and a respective associated label word set, and having at least one constraint). 
searching the data structure using the selected image tag; (Guo discloses [col 9 ln 36-53] Image retrieval refers to generating semantically similar images to a query image… by using the procedure… the computation complexity to the number of the VReps (i.e. data structure) which is searching the similarity between the VRep and each image in the image database in terms of the distance… with a respective associated label word set (i.e. selected image tags), for each of those top n most relevant VReps, the ranking-list of images in terms of the distance is provided).
retrieving the image from the data structure; and (Guo discloses [col 9 ln 43-53] Image retrieval refers to generating semantically similar images to a query image… by using the procedure… the computation complexity to the number of the VReps (i.e. data structure) which is the similarity between the VRep and each image in the image database in terms of the distance).
causing presentation on a user interface of the image. (Guo discloses [col 9 ln 40-42] obtain the top overall ranking-list in the image space... the top m images are returned as the query result… is viewed as a VRep as a user interface in FIG. 3).
However, Guo may not explicitly disclose all the aspects of wherein each of the one or more image tags is a concept of the image content ontology and the first distance is computed by traversing between concepts in the image content ontology;
comparing the first distance and the second distance to one or more threshold values;
when the first distance and the second distance do not exceed the one or more threshold values, selecting an image tag from the one or more image tags based on the first distance and the second distance.
Hill discloses wherein each of the one or more image tags is a concept of the image content ontology and the first distance is computed by traversing between concepts in the image content ontology; (Hill discloses [col 15 ln 50-55, col 5 ln 38-51, col 3 ln 10-12] The hierarchical multi-faceted classification structure is employed to search content for visual images and video with higher accuracy and repeatable results… The ontology-based category label refinement is through an interactive and iterative process (i.e. traversing) of deriving (i.e. computing) refined category labels (i.e. image tags) of the image… at the different depths (i.e. first distance) of the ontology… are preferably defined as relationships among visual semantic concepts in the image ontology-based category label… infer visual concepts and improve automatic category labeling (content tagging) among other things).
Hill discloses comparing the first distance and the second distance to one or more threshold values; (Hill discloses [col 8 ln 16-38, col 5 ln 40-45, col 6 ln 21-25] FIG. 2 shows a multi-faceted classification with filter scores in accordance with rules for a faceted taxonomy for aiding search the content for visual images of the content-based tagging artifacts… obtain empirical estimates from a separate validation to give a direct estimate of accuracy by comparing the threshold scores at s0 of P(y=1|s>=s0) in multiple feature spaces … for deriving refined category tagging at different depths (i.e. the first distance) of the ontology… and the depth and the breadth of a target semantic similarity space (i.e. the second distance)).

Hill discloses when the first distance and the second distance do not exceed the one or more threshold values, selecting an image tag from the one or more image tags based on the first distance and the second distance, (Hill discloses [col 7 ln 9-11, col 15 ln 50-60, col 5 ln 40-45, col 6 ln 21-25] using filter score (i.e. threshold value) filters out any contradicting scores to get a set of most probable labels (i.e. image tag) of images in multiple feature spaces in FIG. 2… allows multiple parallel hierarchies, each representing a different aspect for decomposing or hierarchically relating the label categories for aiding search the content for visual images of the content-based tagging artifacts (i.e. image tag) with higher accuracy and repeatable results… based on a refined category tagging at different depths (i.e. the first distance) of the ontology… and the depth and the breadth of a target semantic similarity space (i.e. the second distance)).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that, with both Guo and Hill disclosing querying images with natural language, and with Hill’s hierarchical multi-facet classification structure for image content ontology via traversing the nodes combined with Guo’s image searching mechanism for better structured prediction with multimodal data mining, the claimed limitations of wherein each of the one or more image tags is a concept of the image content ontology and the first distance is computed by traversing between concepts in the image content ontology;
comparing the first distance and the second distance to one or more threshold values;
when the first distance and the second distance do not exceed the one or more threshold values, selecting an image tag from the one or more image tags based on the first distance and the second distance would be obvious. The motivation to combine Guo and Hill is to provide taxonomies and classification scheme tools for categorizing digital artifacts efficiently. (See Hill [col 1 ln 20-21]).
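For illustration only, the three limitations discussed above (a first distance computed by traversing an ontology, a second distance in a semantic space of words, and threshold-based tag selection) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the application, Guo, or Hill: the toy ontology, the word vectors, the threshold values, the use of edge count and cosine distance, and all function names are hypothetical.

```python
import math
from collections import deque

# Hypothetical toy ontology: concept -> linked concepts (undirected edges).
ONTOLOGY = {
    "animal": ["dog", "cat"],
    "dog": ["animal", "puppy"],
    "cat": ["animal"],
    "puppy": ["dog"],
}

# Hypothetical word vectors standing in for a learned semantic space of words.
VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "cat": [0.1, 0.9, 0.0],
    "animal": [0.5, 0.5, 0.1],
}


def ontology_distance(start, goal):
    """First distance: number of edges traversed between two concepts (BFS)."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, hops = queue.popleft()
        if node == goal:
            return hops
        for nxt in ONTOLOGY.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return math.inf  # the two concepts are not connected


def semantic_distance(a, b):
    """Second distance: 1 - cosine similarity of the two word vectors."""
    va, vb = VECTORS[a], VECTORS[b]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return 1.0 - dot / norm


def select_tag(term, tags, max_hops=2, max_semantic=0.5):
    """Keep tags whose two distances do not exceed the (assumed) thresholds,
    then return the tag with the smallest combined distance, or None."""
    candidates = []
    for tag in tags:
        d1 = ontology_distance(term, tag)
        d2 = semantic_distance(term, tag)
        if d1 <= max_hops and d2 <= max_semantic:
            candidates.append((d1 + d2, tag))
    return min(candidates)[1] if candidates else None
```

In this sketch a tag is selected only when both distances are within their thresholds; summing the two distances to rank the surviving tags is one arbitrary choice among many.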
As to claim 2, Guo in view of Hill discloses the method of claim 1, wherein the concepts in the image content ontology are linked by edges according to relationships between the concepts. (Hill discloses [col 2 ln 10-12] the concepts in the image artifacts ontology, in which each category node is linked to one or more children that reflect semantic decomposition of the parent category with a multi-relational ontology in FIG. 3).
As to claim 4, Guo in view of Hill discloses the method of claim 1, wherein the machine-learning program is a neural network. (Hill discloses [col 5 ln 47-49] the machine learning, e.g., Bayes Network, SVM, KNN, HMM, GMM, MaxEnt, etc. (i.e. neural network)).
As to claim 5, Guo in view of Hill discloses the method of claim 1, wherein the second distance is computed using any of: cosine similarity, dot product, dice similarity, hamming distance, and city block distance. (Hill discloses [col 7 ln 59-65] Calibrating SVM output score includes parameterizing the feature space of the similarity with a cosine series expansion in a direction orthogonal to the separating hyperplane).
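For reference, the five measures named in claim 5 can be written out as below. This is a minimal sketch using their common textbook definitions over plain Python lists; it is not taken from the application or from the cited references, the function names are hypothetical, and the Dice coefficient is shown for binary (0/1) vectors only.

```python
import math


def dot_product(u, v):
    """Sum of element-wise products."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Dot product divided by the product of the two vector lengths."""
    return dot_product(u, v) / (
        math.sqrt(dot_product(u, u)) * math.sqrt(dot_product(v, v))
    )


def dice_similarity(u, v):
    """Dice coefficient for binary vectors: 2 * |overlap| / (|u| + |v|)."""
    overlap = sum(1 for a, b in zip(u, v) if a and b)
    return 2 * overlap / (sum(1 for a in u if a) + sum(1 for b in v if b))


def hamming_distance(u, v):
    """Number of positions at which the two vectors differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


def city_block_distance(u, v):
    """Sum of absolute element-wise differences (Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))
```

Note that cosine, dot product, and Dice are similarities (larger means closer), while Hamming and city block are distances (smaller means closer), so they would need to be put on a common footing before being compared to a single threshold.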
As to claim 21, Guo in view of Hill discloses the computer-implemented method of claim 1, wherein the second distance in the semantic space of words derives a vector from an operand that quantifies meaning based on the language context in which the words of the operand are used. (Hill discloses [col 5 ln 47-63] using the statistical classification method SVM (Support Vector Machines)… detecting confidence scores (i.e. quantifies meaning) for categories (i.e. semantic space of words), e.g., their ascendants, descendants, siblings, or other embedding vectors… results in a high dimensional embedding of each word based on the similarity of the use of the words” without the description of “an operand that quantifies meaning”. The examiner interprets “an operand” as a node (category) of the hierarchical classification structure and “quantifies meaning” as confidence scores for categories).
Regarding claims 18 and 20, these claims recite the system/computer-implemented method performed by the method of claim 1; therefore, the same rationale of rejection is applicable.

Claims 6, 15, 16, and 19 are rejected under 35 U.S.C. § 103 as being unpatentable over Guo in view of Hill and further in view of Badaskar (US 20140081633 A1, “Badaskar”).
As to claim 6, Guo in view of Hill may not explicitly disclose all the aspects of the method of claim 1, further comprising: computing a third distance in the semantic space of words between the natural language query and the plurality of image tags, the third distance being computed with a different operation than the second distance, wherein selecting the at least one image is further based on the third distance.
However, Badaskar discloses computing a third distance in the semantic space of words between the natural language query and the plurality of image tags, the third distance being computed with a different operation than the second distance, wherein selecting the at least one image is further based on the third distance. (Badaskar discloses [0117, 0121] the photo module performs natural language query operations on multiple query terms… a geo-code parameter computing a distance range of geo-codes associated with a 
The examiner notes that the third distance is not clearly defined in the specification; however, [0057] of the specification recites “To get the best result the distances calculated by the various distance modules 512-520 are combined” and Cosine Similarity 512… and City Block Distance 520 are cited in FIG. 5. The examiner uses the City Block Distance module as the third distance calculation.
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that, with both Guo in view of Hill and Badaskar disclosing querying images with natural language, and with Badaskar’s City Block Distance for the third distance combined with Guo in view of Hill’s image searching mechanism for better structured prediction with multimodal data mining, the claimed limitation of computing a third distance in the semantic space of words between the natural language query and the plurality of image tags, the third distance being computed with a different operation than the second distance, wherein selecting the at least one image is further based on the third distance would be obvious. The motivation to combine Guo in view of Hill and Badaskar is to provide a digital assistant system for voice-based media search efficiently. (See Badaskar [0002]).
As to claim 15, Guo in view of Hill may not explicitly disclose all the aspects of the method of claim 1, wherein the natural language query comprises a plurality of query terms and an indication of whether the terms of the plurality of query terms are to be located within a third distance of each other, and
in response to determining the terms are to be located within the third distance of each other, retrieving one or more images from the database of images tagged with each of the selected image tags wherein objects associated with the selected image tags are located within the third distance of each other.
Badaskar discloses wherein the natural language query comprises a plurality of query terms and an indication of whether the terms of the plurality of query terms are to be located within a third distance of each other, and (Badaskar discloses [0086] the photo module performs natural language query operations on and searches for photographs include one or more query terms from a photo database based on a user input to identify photographs (i.e. selected image tags), and locally stores photographs each in association with one or more tags which is located within a distance range of geo-codes that corresponds to a location identified in a search query).
Badaskar discloses in response to determining the terms are to be located within the third distance of each other, retrieving one or more images from the database of images tagged with each of the selected image tags (Badaskar discloses [0117] a geo-code parameter comprises a distance range of geo-codes associated with a location specified in response to the query terms. The distance range of geo-codes corresponds to the geographical boundaries of a city within a radius distance around the principal geo-code to retrieve the photo with geo-codes tag within the third distance).
Badaskar discloses wherein objects associated with the selected image tags are located within the third distance of each other. (Badaskar discloses [0096] digital assistant determines locations associated with the geo-code tags of the photographs to be selected, and determines those photos with geo-code tags that correspond to the location identified in the search query).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that, with both Guo in view of Hill and Badaskar disclosing querying images with natural language, the claimed limitations of wherein the natural language query comprises a plurality of query terms and an indication of whether the terms of the plurality of query terms are to be located within a third distance of each other, and
in response to determining the terms are to be located within the third distance of each other, retrieving one or more images from the database of images tagged with each of the selected image tags wherein objects associated with the selected image tags are located within the third distance of each other would be obvious. The motivation to combine Guo in view of Hill and Badaskar is to provide a digital assistant system for voice-based media search efficiently. (See Badaskar [0002]).
As to claim 16, Guo in view of Hill may not explicitly disclose all the aspects of the method of claim 1, further comprising automatically identifying, with a processor, at least one feature in each image of a plurality of untagged stored images; and 
associating, with the processor, a respective image tag of the plurality of image tags with each image of the plurality of untagged stored images, each respective image tag corresponding to the at least one feature identified in the corresponding image of the plurality of untagged stored images. 
Badaskar discloses automatically identifying, with a processor, at least one feature in each image of a plurality of untagged stored images; and (Badaskar discloses [0086] the photo module performs operations on and searches for photographs with creating tags automatically by associating tags with stored photographs (e.g., tagging the photograph) based on a user input to identify photographs, and locally stores photographs each in association with one or more tags (i.e. untagged stored images prior to the association)).
associating, with the processor, a respective image tag of the plurality of image tags with each image of the plurality of untagged stored images, each respective image tag corresponding to the at least one feature identified in the corresponding image of the plurality of untagged stored images. (Badaskar discloses [0008] metadata is stored with digital photographs (i.e. untagged stored images) when they are automatically captured (i.e. identified) and saved, and is cross-referenced (i.e. associating) with other user information to facilitate searching, e.g., associating user's vacation spans a certain set of days with the respective photos taken and saved corresponding to the calendar entry and geo-code with the respective photos taken and saved corresponding to a location identified in a search query).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that, with both Guo in view of Hill and Badaskar disclosing querying images with natural language, and with Badaskar’s City Block Distance for the third distance combined with Guo in view of Hill’s image searching mechanism for better structured prediction with multimodal data mining, the claimed limitations of automatically identifying, with a processor, at least one feature in each image of a plurality of untagged stored images; and
associating, with the processor, a respective image tag of the plurality of image tags with each image of the plurality of untagged stored images, each respective image tag corresponding to the at least one feature identified in the corresponding image of the plurality of untagged stored images would be obvious. The motivation to combine Guo in view of Hill and Badaskar is to provide a digital assistant system for voice-based media search efficiently. (See Badaskar [0002]).
As to claim 19, Guo in view of Hill may not explicitly disclose all the aspects of the system according to claim 18, the computing-based device being at least partially implemented using hardware logic selected from any one of more of: a field-programmable gate array, a program-specific integrated circuit, a program-specific standard product, a system-on-a-chip, a complex programmable logic device. 
Badaskar discloses the computing-based device being at least partially implemented using hardware logic selected from any one of more of: a field-programmable gate array, a program-specific integrated circuit, a program-specific standard product, a system-on-a-chip, a complex programmable logic device. (Badaskar discloses [0043] FIG. 2 shows memory includes instructions; various functions of the user device are implemented in hardware and/or in firmware, including in one or more signal processing and application specific integrated circuits, and the user device 104).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that, with both Guo in view of Hill and Badaskar disclosing querying images with natural language, and with Badaskar’s City Block Distance for the third distance combined with Guo in view of Hill’s image searching mechanism for better structured prediction with multimodal data mining, the claimed limitation of the computing-based device being at least partially implemented using hardware logic selected from any one of more of: a field-programmable gate array, a program-specific integrated circuit, a program-specific standard product, a system-on-a-chip, a complex programmable logic device would be obvious. The motivation to combine Guo in view of Hill and Badaskar is to provide a digital assistant system for voice-based media search efficiently. (See Badaskar [0002]).

Claims 7-10, 13-14, and 17 are rejected under 35 U.S.C. § 103 as being unpatentable over Guo in view of Hill and further in view of Zadeh et al. (US 20140201126 A1, “Zadeh”).
As to claim 7, Guo in view of Hill discloses the method of claim 1, wherein selecting at least one of the plurality of image tags further comprises: 
 selecting image tags that are within a predetermined first distance threshold and within a predetermined second distance threshold. 
Zadeh discloses selecting image tags that are within a predetermined first distance threshold and within a predetermined second distance threshold. (Zadeh discloses [1361] there are the accuracy factor and reliability factor involved in the search engine for selecting the image tags. That is, there is a threshold for each distance as to how much accuracy for the selection result which could be a fuzzy parameter itself).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that, with both Guo in view of Hill and Zadeh disclosing querying images with natural language, and with Zadeh’s selecting image tags with a threshold for each distance for accuracy combined with Guo in view of Hill’s image searching mechanism for better structured prediction with multimodal data mining, the claimed limitation of selecting image tags that are within a predetermined first distance threshold and within a predetermined second distance threshold would be obvious. The motivation to combine Guo in view of Hill and Zadeh is to provide efficient and fast algorithms for image searching process using Z-number fuzzy logics for machine learning. (See Zadeh [0182]).
As to claim 8, Guo in view of Hill discloses the method of claim 1, wherein selecting the image tag comprises: 
However, Guo in view of Hill may not explicitly disclose all the aspects of representing the first distance and the second distance as a vote for one or more image tags; 
combining the votes for each image tag; 
selecting the image tag based on the votes.
Zadeh discloses representing the first distance and the second distance as a vote for one or more image tags; (Zadeh discloses [2226] For the aggregation method, we take a 
Zadeh discloses combining the votes for each image tag; 
selecting the image tag based on the votes. (Zadeh discloses [2284] using the queries or type of queries to select the image, e.g. what majority of users are interested in or what is a hot topic as a collected (i.e. selecting) or aggregated (i.e. combining) votes from the society, users or social network).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that, with both Guo in view of Hill and Zadeh disclosing querying images with natural language, and with Zadeh’s selecting image tags with a threshold for each distance for accuracy combined with Guo in view of Hill’s image searching mechanism for better structured prediction with multimodal data mining, the claimed limitations of representing the first distance and the second distance as a vote for one or more image tags;
combining the votes for each image tag;
selecting the image tag based on the votes would be obvious. The motivation to combine Guo in view of Hill and Zadeh is to provide efficient and fast algorithms for image searching process using Z-number fuzzy logics for machine learning. (See Zadeh [0182]).
As to claim 9, Guo in view of Hill and Zadeh discloses the method of claim 8, wherein each vote is weighted based on a magnitude of the corresponding distance prior to combining the votes. (Zadeh discloses [2558] we have voting or weighted voting or consensus of users or friends to get the final result from multiple people or users. We also have higher weights based on the people who have more experience, or experts, or people with higher score for credibility prior to combining the votes).
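For illustration only, the voting scheme recited in claims 8 and 9 (each distance cast as a vote for a tag, votes weighted by distance magnitude, then combined) might look like the following. This is a minimal sketch under stated assumptions, not the method of the application or of Zadeh: the function name, the input format, and the weight formula 1 / (1 + distance) are all hypothetical.

```python
from collections import defaultdict


def select_by_votes(distances):
    """Pick a tag by weighted voting.

    `distances` is an iterable of (tag, distance) pairs, one pair per
    distance measure and tag. Each pair casts one vote for its tag. The
    weight 1 / (1 + distance) is an assumed formula chosen so that smaller
    distances produce heavier votes.
    """
    totals = defaultdict(float)
    for tag, distance in distances:
        totals[tag] += 1.0 / (1.0 + distance)  # weight the vote, then combine
    return max(totals, key=totals.get)  # tag with the largest combined vote


# Example: two distances per tag (e.g. an ontology distance and a semantic one).
votes = [("dog", 1.0), ("dog", 0.0), ("cat", 3.0), ("cat", 1.0)]
winner = select_by_votes(votes)
```

With an empty input the call to `max` raises `ValueError`, so a caller would need to handle the case where no tag received a vote.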
As to claim 10, Guo in view of Hill discloses the method of claim 1, further comprising:  
 displaying at least a portion of the image;
receiving information indicating the image has been selected; and 
displaying the image and information related to the image. 
Zadeh discloses displaying at least a portion of the image; (Zadeh discloses [1778] the features of sub-objects of the objects depicted in an image are extracted by preprocessing (e.g., mapping) a portion of an image into a segmented layout for display).
Zadeh discloses receiving information indicating the image has been selected; and 
displaying the image and information related to the image. (Zadeh discloses [2062] by using the received data (i.e. information) of the retrieved images indicating the user interface interaction of the links and displaying annotations on the selected image).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that, with both Guo in view of Hill and Zadeh disclosing querying images with natural language, and with Zadeh’s selecting image tags with a threshold for each distance for accuracy combined with Guo in view of Hill’s image searching mechanism for better structured prediction with multimodal data mining, the claimed limitations of displaying at least a portion of the image;
receiving information indicating the image has been selected; and
displaying the image and information related to the image would be obvious. The motivation to combine Guo in view of Hill and Zadeh is to provide efficient and fast algorithms for image searching process using Z-number fuzzy logics for machine learning. (See Zadeh [0182]).
As to claim 13, Guo in view of Hill and Zadeh discloses the method of claim 10, further comprising: 
receiving information indicating the position of a cursor with respect to the image, the cursor being controlled by a user; 
determining whether the cursor is positioned over an object identified in the image; and (Zadeh discloses [2786-2793] the user tells a story or dialog between selected images with marker or highlighter or clicks or mouse cursor movements (i.e. cursor is positioned and controlled by a user) to select a region or image or frame (i.e. over an object in the selected image)).
in response to determining the cursor is positioned over an object identified in the image, displaying a bounding box around the identified object. (Zadeh discloses [2786-2793] the mobile device or phone to take a picture or video, by the user, and then the picture is analyzed by a software in the phone or laptop, in response to find all the faces, and mark them by box or rectangle (around the face) (i.e. identified object) as box them (i.e. bounding box). Then, the user can click (cursor is positioned) on faces and comment or tag them by voice and other data).
As to claim 14, Guo in view of Hill and Zadeh discloses the method of claim 13, further comprising: receiving an indication that the bounding box has been selected; and 
updating the natural language query to include an image tag associated with the identified object corresponding to the bounding box. (Zadeh discloses [2793] that the user clicks on faces and comments on or tags them by voice and other data on the box (i.e. bounding box). The comment and content update the repository for further analysis, and a copy goes to the user's social site or album site. Then, the user updates the annotations of the image in the proper repository with the modified user query, for safekeeping or storage, from his phone or mobile device or tablet).
As to claim 17, Guo in view of Hill and Zadeh discloses the method of claim 1, further comprising: receiving data indicating the image is to be shared; and 
making the image available to one or more other parties. (Zadeh discloses [2689] receiving the location of a retrieved object (i.e. image) is used as a reference with the identification or URL of the image/video/audio for later usage, such as for sharing via email or .

Claims 11 and 12 are rejected under 35 U.S.C. § 103 as being unpatentable over Guo in view of Hill and Zadeh and further in view of Badaskar.
As to claim 11, Guo in view of Hill and Zadeh discloses the method of claim 10, wherein the similarity is based on image tags shared between the image and the one or more images that are similar to the image. (Guo discloses [col 3 ln 65-67] A clustering algorithm is applied to the whole feature space to group similar feature vectors together as a virtual VRep in the image space with image tag pairs like water and duck as a shared similarity cluster).
However, Guo in view of Hill and Zadeh may not explicitly disclose all aspects of the limitation wherein the information related to the image comprises one or more images that share a similarity with the image.
Badaskar discloses wherein the information related to the image comprises one or more images that share a similarity with the image, (Badaskar discloses [0121, 0183] that a plurality of media items are selected for presentation, ranked based on the confidence values, and displayed in an order based on the ranking, so that items that are more similar to (i.e. share a similarity with), or more closely match, the user's query are displayed to the user first… based on the tags or marks with appropriate labels on the images).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that, with both Guo in view of Hill and Zadeh and Badaskar disclosing querying the image with natural language, when Badaskar’s City Block Distance for the third distance is combined with Guo in view of Hill and Zadeh’s image searching mechanism for better structured prediction with multimodal data mining, the claimed limitation wherein the information related to the image comprises one or more images that share a similarity with the image would be obvious. The motivation to combine Guo in view of Hill and Zadeh and Badaskar is to provide a digital assistant system for efficient voice-based media search. (See Badaskar [0002]).
As to claim 12, Guo in view of Hill, Zadeh and Badaskar discloses the method of claim 11, wherein the similarity is further based on confidence values associated with the image tags shared between the image and the one or more images that are similar to the image. (Badaskar discloses [0097, 0121] that these tags are used as search parameters to locate other photographs with the same or similar metadata tags… tags that include time, date, or location information may share the same or similar information. A plurality of media items are selected for presentation, ranked based on the confidence values associated with the tags, and displayed in an order based on the ranking, so that items that are more similar to, or more closely match, the user's query are displayed to the user first).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
                                                                                                                                 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Padmanabhan, can be reached at 571-272-8352.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JENQ-KANG CHU/Examiner, Art Unit 2176
/KAVITA STANLEY/Supervisory Patent Examiner, Art Unit 2176