DETAILED ACTION
In response to communication filed on 2 February 2022, claims 1, 4-6 and 9-11 are amended. Claims 3 and 8 are canceled. Claims 1-2, 4-7 and 9-11 are pending. 
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments, see “Rejections under 35 U.S.C § 101”, filed 2 February 2022, have been fully considered, but are not persuasive. 

APPLICANT’S ARGUMENT: Applicant argues that the present claim 1 provides an improved method for image recognition, so as to improve the accuracy of delivering information. Applicant further argues that, based on the claim limitations from claim 1, the to-be-matched image corresponding to the image in the to-be-processed information is found, and the descriptive information of the image is determined at the same time, such that the computer can generate interaction information with respect to a to-be-recognized image correctly. Applicant also argues that the claim as a whole provides a specific improvement over the prior image recognition, resulting in an improved computer-related method of information interaction. 
EXAMINER’S RESPONSE: Examiner has carefully considered the argument but respectfully disagrees. The claim limitations “to obtain a search request for the image” have been identified as insignificant extra-solution activity. The claim limitations “importing the image into an image search model… and the image search model is configured to” and “importing the to-be-matched image into a semantic tagging model…. wherein the semantic tagging model is configured to” have been identified as conventional computer functionality. All the other claim limitations have been identified as a mental process related to the abstract idea, as explained in the rejection below. Also, per MPEP 2106.05(a)(II), “However, it is important to keep in mind that an improvement in the abstract idea itself (e.g. a recited fundamental economic concept) is not an improvement in technology”. Thus, the recited claim limitations related to searching have already been identified as an abstract idea (that can be performed by a human based on the mental process) and, as a result, an improvement in the searching of image data (the abstract idea itself) is not considered an improvement in the technology. As a result, the arguments above are not persuasive.

APPLICANT’S ARGUMENT: Applicant argues that, by means of the additional elements, the descriptive information of the image is determined correctly, which enables the computer to generate comprehensive interaction information with respect to a to-be-recognized image accurately, thereby improving the efficiency of information interaction.
EXAMINER’S RESPONSE: Examiner has carefully considered the argument but respectfully disagrees. Applicant has cited the entire claim 1 as additional elements and does not clarify which specific claim limitations are significantly more than the judicial exception and provide an inventive concept. As a result, the arguments above are not persuasive.

Applicant’s arguments, see “Rejections under 35 U.S.C § 103”, filed 2 February 2022, have been fully considered, but are not persuasive. 

APPLICANT’S ARGUMENT: Applicant argues that Lester fails to disclose "searching for descriptive information of the image within the to-be-processed information based on the feature word; wherein the descriptive information characterizes a textual description of the image" as recited in the amended claim 1.
EXAMINER’S RESPONSE: Examiner has carefully considered the argument but respectfully disagrees. In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). For the current application, Perone teaches to-be-processed information in [0030] as a query that includes text and images, and also teaches feature words in [0019] as textual feature vectors 134 and image feature vectors 136. Perone further mentions in [0019] that these feature vectors are compared to identify a matching image. To a person of ordinary skill in the art, based on the broadest reasonable interpretation in light of the specification, a feature word may be reasonably interpreted as a feature vector. Perone does not explicitly teach descriptive information of the image wherein the descriptive information characterizes a textual description of the image. Therefore, the Lester reference has been incorporated to teach that the images are identified based on features in [0024], and in [0032] Lester also teaches how the feature vectors are extracted from the images. To further clarify, Lester teaches in [0043] and [0033] a mapping that explains the relationship between visual words and feature information representing an object or a scene. Therefore, there is a relationship between visual words and feature information. As a result, the combination of Perone and Lester, and not Lester alone as argued above, teaches the argued limitation. 

APPLICANT’S ARGUMENT: Applicant argues that Lester also fails to disclose "constructing response information to the to-be-processed information from the descriptive information" as recited in the amended claim 1. Instead, Lester discloses providing a set of images (which are not descriptive information) as responsive to the search query.
EXAMINER’S RESPONSE: Examiner has carefully considered the argument but respectfully disagrees. The Lester reference teaches that the images are identified based on features in [0024], and in [0032] Lester also teaches how the feature vectors are extracted from the images. To further clarify, Lester teaches in [0043] and [0033] a mapping that explains the relationship between visual words and feature information representing an object or a scene. Constructing response information may be reasonably interpreted as determining a set of images identified based on the derived feature information. As a result, Lester teaches the above argued limitation and the arguments are not considered to be persuasive. 

APPLICANT’S ARGUMENT: Applicant argues that Perone and Lester also fail to disclose the concrete process of searching for descriptive information of the image within the to-be-processed information based on the feature word, i.e., they fail to disclose "importing the image into an image search model to obtain a to-be-matched image set corresponding to the image,… to-be-recognized semantic tag as the descriptive information" as recited in the amended claim 1. 
EXAMINER’S RESPONSE: Examiner has carefully considered the argument but respectfully disagrees. Based on additional amended claims, Marchesotti reference has been added. Therefore the arguments are not considered to be persuasive. 

APPLICANT’S ARGUMENT: Applicant argues that Marchesotti at no point teaches or suggests determining interpretive information of a noun corresponding to the image in the to-be-recognized semantic tag as the descriptive information. 
EXAMINER’S RESPONSE: Examiner has carefully considered the argument but respectfully disagrees. Marchesotti reference in [0072] teaches that tags include freeform text information and that textual information is analyzed to determine a specific category based on the noun extracted. Also, the example relates to extracting a noun and then determining a category of flower for the content. To a person of ordinary skill in the art and based on broadest reasonable interpretation in light of specification, interpretive information of a noun may be reasonably 

Claim Objections
Claims 4 and 9 are objected to because of the following informalities:  
Claim 4 recites “The method according to claim 3”, which should read --The method according to claim 1-- since claim 3 is a canceled claim.
Claim 9 recites “The apparatus according to claim 8”, which should read --The apparatus according to claim 6-- since claim 8 is a canceled claim.
Appropriate corrections are required.

Claim Rejections - 35 USC § 101
Claims 1-2, 4-7 and 9-11 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1:
Claims 1-2 and 4-5 are recited as being directed to a “method”. Claims 6-7 and 9-10 are recited as being directed to an “apparatus”. Claim 11 is recited as being directed to a “computer-readable form”. Thus, claims 1-2, 4-7 and 9-11 have been identified as directed towards an appropriate statutory category. Below is further analysis related to Step 2.

Regarding claim 1, 
Step 2A: Prong One: 
Claim 1 recites limitations:
extracting a feature word from the textual information of the to-be-processed information,… and searching for descriptive information of the image within the to-be-processed information based on the feature word by:… to obtain a to-be-matched image set corresponding to the image, wherein the to-be-matched image set comprises at least one to-be-matched image… to characterize a first corresponding relationship between the image and the to-be-matched image;… to obtain a semantic tag set corresponding to the to-be-matched image set,… to characterize a second corresponding relationship between the to-be-matched image and a semantic tag, and the semantic tag provides a textual description of the to-be-matched image; and selecting a to-be-recognized semantic tag from the semantic tag set, and determining interpretive information of a noun corresponding to the image in the to-be-recognized semantic tag as the descriptive information, wherein the descriptive information characterizes a textual description of the image; and
constructing response information to the to-be-processed information from the descriptive information.
These claim limitations appear to be reciting a “Mental Process” including evaluation which may be performed in a human mind. 
	A human being can mentally apply evaluation to extract feature words that characterize a search request from the textual information. A human being can apply evaluation to obtain matched image set and characterize relationships between the images. A human mind can also evaluate to characterize relationships between images and tags wherein tags provide a textual description. A human mind can evaluate to determine descriptive information from the image based on the feature words. A human mind can apply evaluation to construct response information from the descriptive information. 
Step 2A: Prong Two:
Claim 1 further recites limitations:
obtaining to-be-processed information, the to-be-processed information comprising textual information and an image;
… to obtain a search request for the image,… 

Claim 1 further recites limitations:
importing the image into an image search model… and the image search model is configured to…
importing the to-be-matched image into a semantic tagging model…. wherein the semantic tagging model is configured to
These claim limitations appear to be reciting the functionality of populating image information within specific models. The functionality of populating image information within specific models does not integrate the abstract idea into a practical application since it amounts to conventional computer functions. Per MPEP 2106.05(b)(I), a computer that applies a judicial exception, such as an abstract idea, by use of conventional computer functions does not qualify as a particular machine. As a result, these claim limitations do not integrate the abstract idea into a practical application.
Step 2B:
Claim 1 further recites limitations:
obtaining to-be-processed information, the to-be-processed information comprising textual information and an image;
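These claim limitations as a whole have been identified as insignificant extra-solution activity. Per MPEP 2106.05(g) “An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered information by a series of steps in order to detect whether the transactions were fraudulent”.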
Claim 1 further recites limitations:
importing the image into an image search model… and the image search model is configured to…
importing the to-be-matched image into a semantic tagging model…. wherein the semantic tagging model is configured to…
These claim limitations appear to be reciting the functionality of populating image information within specific models. The functionality of populating image information within specific models does not amount to significantly more since it consists of conventional computer functions. Per MPEP 2106.05(b)(I), a computer that applies a judicial exception, such as an abstract idea, by use of conventional computer functions does not qualify as a particular machine. As a result, these limitations are merely executing the abstract idea without being significantly more than the abstract idea. These references provide evidence that snapshots are conventional computer technology: Perone et al. (US 2021/0089571 A1 – [0012]), Marchesotti et al. (US 2012/0269441 A1 – [0067]), Hurley et al. (US 2004/0041846 A1 – [Abstract]) and Hu et al. (US 2013/0110484 A1 – [Abstract]).
Claims 6 and 11 incorporate substantively all the limitations of claim 1 in an apparatus and computer-readable storage form and are rejected under the same rationale.

Regarding claims 2 and 4, 
Step 2A-2B:
Claim 2 further recites limitations:
performing a semantic recognition on the textual information to obtain semantic information corresponding to the textual information; and 
extracting the feature word from the semantic information.
Claim 4 further recites limitations:
counting numbers of identical semantic tags within the semantic tag set, and determining the semantic tag having a maximum number as the to-be-recognized semantic tag.
These claim limitations appear to be reciting a “Mental Process” including evaluation which may be performed in a human mind. 
A human being can apply evaluation to perform a semantic recognition on the textual information and extract the feature words from the result, and can also count the numbers of identical semantic tags and use the semantic tag having the maximum number as the to-be-recognized semantic tag.
There are no additional claim limitations that integrate into a practical application or amount to significantly more than the abstract idea. 
Claims 7 and 9 incorporate substantively all the limitations of claims 2 and 4 respectively in system form and are rejected under the same rationale.
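
For illustration only, the counting step recited in claim 4 (counting identical semantic tags and using the tag with the maximum count) may be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the tie-breaking behavior, and the sample tags are hypothetical and are not drawn from the application or from any cited reference.

```python
from collections import Counter


def select_most_frequent_tag(semantic_tags):
    """Return the tag that occurs most often in the tag set.

    Hypothetical helper: ties resolve to the tag seen first, and an
    empty tag set yields None. Neither behavior is stated in the claim.
    """
    if not semantic_tags:
        return None
    counts = Counter(semantic_tags)      # tag -> number of occurrences
    tag, _count = counts.most_common(1)[0]
    return tag


if __name__ == "__main__":
    # Sample tags are illustrative only.
    print(select_most_frequent_tag(["rose", "tulip", "rose", "flower"]))
```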

Regarding claim 5, 
Step 2A: Prong One: 
Claim 5 recites limitations:
performing a semantic recognition on the feedback information to obtain the accuracy; 
choosing a secondary to-be-recognized tag from the semantic tags in the semantic tag set excluding the to-be-recognized semantic tag, in response to determining that the accuracy is below a preset threshold; 
using the interpretive information of the noun in the secondary to-be-recognized tag, the noun corresponding to the image, as secondary descriptive information; and 
constructing the response information to the to-be-processed information from the secondary descriptive information.
These claim limitations appear to be reciting a “Mental Process” including evaluation which may be performed in a human mind. 
	A human being can mentally apply evaluation to analyze feedback information based on semantic recognition to determine accuracy. A human mind can evaluate to choose a secondary tag if the accuracy is below a specific threshold. A human mind can mentally evaluate to determine noun information and construct secondary descriptive information based on the noun. 
Step 2A: Prong Two:
Claim 5 further recites limitations:
receiving feedback information corresponding to the response information, wherein the feedback information evaluates an accuracy of the response information;
These claim limitations as a whole have been identified as insignificant extra-solution activity. Per MPEP 2106.05(g) “An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered information by a series of steps in order to detect whether the transactions were fraudulent”. Similarly the claim limitations as a whole above appear to be gathering data in terms of feedback information that will be further processed and do not appear to integrate the abstract idea into a practical application.
Step 2B:
Claim 5 further recites limitations:
receiving feedback information corresponding to the response information, wherein the feedback information evaluates an accuracy of the response information;
These claim limitations as a whole have been identified as insignificant extra-solution activity. Per MPEP 2106.05(g) “An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered information by a series of steps in order to detect whether the transactions were fraudulent”. Similarly the claim limitations as a whole above appear to be gathering data in terms of feedback information that will be further processed and appear to be conventional computer functionality. Also, MPEP 2106.05(d)(II) has identified “Receiving or transmitting data over a network, e.g., using the Internet to gather data” as conventional computer technology. Similarly, the claim limitations identified above appear to be receiving data. As a result, these claim limitations as a whole do not appear to amount to significantly more than the abstract idea itself.
Claim 10 incorporates substantively all the limitations of claim 5 in an apparatus form and is rejected under the same rationale.
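
For illustration only, the threshold-based fallback recited in claim 5 may be sketched as follows. This is a minimal sketch under stated assumptions: the claim does not say how the secondary tag is chosen from the remaining tags, so picking the most frequent remaining tag is an assumption, as are the function name, the default threshold value, and the sample data.

```python
from collections import Counter


def choose_tag_after_feedback(tag_set, primary_tag, accuracy, threshold=0.5):
    """Keep the primary tag unless the feedback accuracy is below threshold.

    Hypothetical helper. When falling back, the secondary tag is taken as
    the most frequent tag among those remaining after the primary tag is
    excluded (an assumed selection rule, not one stated in the claim).
    """
    if accuracy >= threshold:
        return primary_tag
    remaining = [tag for tag in tag_set if tag != primary_tag]
    if not remaining:
        # Nothing else to fall back to; keep the primary tag.
        return primary_tag
    return Counter(remaining).most_common(1)[0][0]


if __name__ == "__main__":
    tags = ["rose", "rose", "tulip", "tulip", "daisy"]  # illustrative only
    print(choose_tag_after_feedback(tags, "rose", accuracy=0.2))
```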

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 6 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Perone et al. (US 2021/0089571 A1, hereinafter “Perone”) in view of Lester (US 2017/0249339 A1, hereinafter “Lester”) further in view of Marchesotti et al. (US 2012/0269441 A1, hereinafter “Marchesotti”).

Regarding claim 1, Perone teaches
A computer-implemented method for information interaction, the method comprising: (see Perone, [0048] “The method 500 may be performed by the system 100 shown in FIG. 1”; [0009] “to search the image feature vectors to identify an image matching the query”).
obtaining to-be-processed information, the to-be-processed information comprising textual information and an image; (see Perone, [0030] “the query 160 may be an image or a combination of an image, speech, and/or text… the system 100 may receive the query 160 stating "Find me a picture similar to the displayed photo." The encoder 122 encodes both the image and text of the query to perform the matching”).
extracting a feature word from the textual information of the to-be-processed information to obtain a search request for the image, and… (see Perone, [0011] “A textual feature vector may represent… contextual information”; [0026] “the query 160 may be natural language speech describing an image to be searched. The speech from the query 160 may be processed by the NLP 212 to obtain text describing the image to be searched”; [0025] “natural language processing (NLP) 212 may be applied to the query 160 to determine text for the query 160 that is applied as input to the encoder 122 to determine the textual feature vector 134”)
within the to-be-processed information (see Perone, [0030] “the query 160 may be an image or a combination of an image, speech, and/or text… the system 100 may receive the query 160 stating "Find me a picture similar to the displayed photo." The encoder 122 encodes both the image and text of the query to perform the matching”)
based on the feature word by: (see Perone, [0019] “The textual feature vector 134 and the image feature vectors 136 may be…”)
importing the image into an image search model to obtain a to-be-matched image set corresponding to the image, wherein the to-be-matched image set comprises at least one to-be-matched image, and (see Perone, [0012] “When the image and textual feature vectors are populated in the multimodal space, similar image features and textual features may be identified by comparing the distances of the feature vectors in the multimodal space to identify a matching image to the query”; [0030] “the query 160 may be an image or a combination of an image, speech, and/or text… the system 100 may receive the query 160 stating "Find me a picture similar to the displayed photo." The encoder 122 encodes both the image and text of the query to perform the matching”)
the image search model is configured to characterize a first corresponding relationship between the image and the to-be-matched image;… (see Perone, [0013] “One example of a distance comparison may include a cosine proximity, where the cosine angles between feature vectors in the multimodal space are compared to determine closest feature vectors. Cosine similar features may be proximate in the multimodal space, and dissimilar feature vectors may be distal. Feature vectors may have k-dimensions, or coordinates in a multimodal space. Feature vectors with similar features are embedded close to each other in the multimodal space in vector models”)
corresponding to the image (see Perone, [0030] “the query 160 may be an image”; [0014] “images may be manually tagged with a description, and matches may be found by searching the manually-added descriptions. The tags, including textual descriptions”). 
Perone does not explicitly teach searching for descriptive information of the image within the to-be-processed information; importing the to-be-matched image into a semantic tagging model to obtain a semantic tag set corresponding to the to-be-matched image set, wherein the semantic tagging model is configured to characterize a second corresponding relationship between the to-be-matched image and a semantic tag, and the semantic tag provides a textual description of the to-be-matched image; and selecting a to-be-recognized semantic tag from the semantic tag set, and determining interpretive information of a noun corresponding to the image in the to-be-recognized semantic tag as the descriptive information, wherein the descriptive information characterizes a textual description of the image; and constructing response information to the to-be-processed information from the descriptive information. 
However, Lester discloses feature extractor for images and also teaches
searching for descriptive information of the image within the query (see Lester, [0024] “a set of images identified as responsive to a selected image subset based search query from a user based on features of a cropped raw image represented as the image search query”; [0032] “extracts the feature vector of the cropped raw image”; [0043] “the mapping data 244 may include predetermined mapping information which identifies a mapping between a first visual word and first feature information representing an object or scene”; [0033] “an image depicting a sandy coastline may be associated with one or more visual words (e.g., beach, ocean, etc.)”) as the descriptive information being used to characterize a textual description of the image; and (see Lester, [0024] “a set of images identified as responsive to a selected image subset based search query from a user based on features of a cropped raw image represented as the image search query”; [0025] “identify features in images containing representations of one or more objects such as foreground objects and background objects”; [0029] “an image having a representation of multiple objects may be associated with multiple visual words”; [0043] “the mapping data 244 may include predetermined mapping information which identifies a mapping between a first visual word and first feature information representing an object or scene”; [0033] “an image depicting a sandy coastline may be associated with one or more visual words (e.g., beach, ocean, etc.)”). 
constructing response information to the to-be-processed information from the descriptive information (see Lester, [0024] “a set of images identified as responsive to a selected image subset based search query from a user based on features of a cropped raw image represented as the image search query”). 

The proposed combination of Perone and Lester does not explicitly teach importing the to-be-matched image into a semantic tagging model to obtain a semantic tag set corresponding to the to-be-matched image set, wherein the semantic tagging model is configured to characterize a second corresponding relationship between the to-be-matched image and a semantic tag, and the semantic tag provides a textual description of the to-be-matched image; and selecting a to-be-recognized semantic tag from the semantic tag set, and determining interpretive information of a noun corresponding to the image in the to-be-recognized semantic tag. 
However, Marchesotti discloses semantic content tags and also teaches
importing the to-be-matched image into a semantic tagging model to obtain a semantic tag set corresponding to the to-be-matched image set, wherein the semantic tagging model is configured to characterize a second corresponding relationship between the to-be-matched image and a semantic tag, and the semantic tag provides a textual description of the to-be-matched image; and (see Marchesotti, [0067] “The semantic content 14 of the image can be derived from one or more of manual annotations 23, textual tags produced by automatic models which populate an image with textual information from other  retrieved by a search engine in response to a query input by a user”) selecting a to-be-recognized semantic tag from the semantic tag set, and determining interpretive information of a noun for an image (see Marchesotti, [0072] “Where the tags 23, 40 are freeform, i.e., not restricted to any category, the system 10 may include a syntactic parser which analyzes the textual information 23, 40 to identify text which is recognized as referring to a content category. For example, given the sentence "this is a red rose" the system extracts rose(noun) and assigns the image to the content category "flower," with a feature weight wj of 1”) in the to-be-recognized semantic tag (see Marchesotti, [0072] “Where the tags 23, 40 are freeform, i.e., not restricted to any category, the system 10 may include a syntactic parser which analyzes the textual information 23, 40 to identify text which is recognized as referring to a content category”). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of analyzing text information to extract features as being disclosed and taught by Marchesotti in the system taught by the proposed combination of Perone and Lester to yield the predictable results of using content features in order to improve the assessment of its quality (see Marchesotti, [0067] “Thus, as demonstrated in the examples below, using content features 14 which describe the main subject of the image 12 can improve the assessment of its quality”).
Claims 6 and 11 incorporate substantively all the limitations of claim 1 in an apparatus form (see Perone, [0017] “image search system 100, referred to as system 100. The system 100 may include a processor 110 and a data storage 121… machine readable instructions 120 that are executable by the processor 110”; [0009] “to search the image feature vectors to identify an image matching the query”) and computer-readable storage form (see Perone, .
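
For illustration only, the cosine proximity comparison quoted from Perone [0013] in the mapping of claim 1 above may be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the dictionary of candidate vectors, and the sample values are hypothetical, and the code is not asserted to be Perone's implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for a zero vector
    return dot / (norm_u * norm_v)


def closest_image(query_vector, image_vectors):
    """Return the key of the candidate vector closest to the query.

    `image_vectors` is a hypothetical mapping of image name to feature
    vector; it must be non-empty.
    """
    return max(
        image_vectors,
        key=lambda name: cosine_similarity(query_vector, image_vectors[name]),
    )


if __name__ == "__main__":
    candidates = {"photo_a": [0.9, 0.1], "photo_b": [0.0, 1.0]}  # illustrative
    print(closest_image([1.0, 0.0], candidates))
```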

Claims 2 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Perone, Lester and Marchesotti in view of Levit et al. (US 2016/0336006 A1, hereinafter “Levit”).

Regarding claim 2, the proposed combination of Perone, Lester and Marchesotti teaches
wherein the extracting a feature word from the textual information of the to- be-processed information comprises: (see Perone, [0011] “A textual feature vector may represent… contextual information”; [0026] “the query 160 may be natural language speech describing an image to be searched. The speech from the query 160 may be processed by the NLP 212 to obtain text describing the image to be searched”; [0025] “natural language processing (NLP) 212 may be applied to the query 160 to determine text for the query 160 that is applied as input to the encoder 122 to determine the textual feature vector 134”).
extracting the feature word (see Perone, [0011] “A textual feature vector may represent… contextual information”; [0026] “the query 160 may be natural language speech describing an image to be searched. The speech from the query 160 may be processed by the NLP 212 to obtain text describing the image to be searched”; [0025] “natural language processing (NLP) 212 may be applied to the query 160 to determine text for the query 160 that is applied as input to the encoder 122 to determine the textual feature vector 134”). 
The proposed combination of Perone, Lester and Marchesotti does not explicitly teach performing a semantic recognition on the textual information to obtain semantic information corresponding to the textual information; and extracting the feature word from the semantic information. 
However, Levit teaches
performing a semantic recognition on the textual information to obtain semantic information corresponding to the textual information; and (see Levit, [0049] “feature extraction component 120 may parse and/or extract items (e.g., word n-grams, phrases, queries, sentences, etc.) of new typed text from new typed corpus 119 and may represent and/or convert each item of new typed text into a feature vector with respect to one or more features (e.g., lexical features, syntactic feature, semantic features, pronounceability features, contextual features, etc.) that can be used to characterize the item of new typed text”). 
extracting features from the semantic information (see Levit, [0049] “feature extraction component 120 may parse and/or extract items (e.g., word n-grams, phrases, queries, sentences, etc.) of new typed text from new typed corpus 119 and may represent and/or convert each item of new typed text into a feature vector with respect to one or more features (e.g., lexical features, syntactic feature, semantic features, pronounceability features, contextual features, etc.) that can be used to characterize the item of new typed text”). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of analyzing text information to extract features as being disclosed and taught by Levit in the system taught by the proposed combination of Perone, Lester and Marchesotti to yield the predictable results of effectively identifying patterns, clusters, and/or behaviors of typed text contained in unspeakable corpus (see Levit, [0040] “Classifier training component 117 may calculate and/or estimate statistical properties of individual features, combinations of features, and/or feature vectors to identify patterns, clusters, and/or behaviors that are exhibited by at least one, some, or all items of typed text contained in unspeakable corpus”).
Claim 7 incorporates substantively all the limitations of claim 2 in an apparatus form and is rejected under the same rationale.

Claims 4 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Perone, Lester and Marchesotti in view of Watanabe et al. (US 11,157,550 B2, hereinafter “Watanabe”).

Regarding claim 4, the proposed combination of Perone, Lester and Marchesotti teaches
wherein the selecting a to-be-recognized semantic tag from the semantic tag set comprises: (see Marchesotti, [0072] “Where the tags 23, 40 are freeform, i.e., not restricted to any category, the system 10 may include a syntactic parser which analyzes the textual information 23, 40 to identify text which is recognized as referring to a content category. For example, given the sentence "this is a red rose" the system extracts rose (noun) and assigns the image to the content category "flower," with a feature weight wj of 1”).
the to-be-recognized semantic tag (see Marchesotti, [0072] “the system extracts rose (noun)”).
The proposed combination of Perone, Lester and Marchesotti does not explicitly teach counting numbers of identical semantic tags within the semantic tag set, and using the semantic tag having a maximum number as the to-be-recognized semantic tag. 
However, Watanabe discloses images and tag information and also teaches
counting numbers of identical semantic tags within the semantic tag set, and using the semantic tag having a maximum number as tag information (see Watanabe, [col13 lines9-19] “a tag added to a query image used for a search key can be estimated by totaling information of tags added to the similar images… a result of estimating a tag added to a query image such as a score like the vehicle model A is 50%, a score like the vehicle model B is 30% and a score like the vehicle model C is 20%”; [col7 lines62-63] “having common tag information”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of analyzing tag information as being 
Claim 9 incorporates substantively all the limitations of claim 4 in an apparatus form and is rejected under the same rationale.
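For illustration only, the limitation discussed for claim 4 (counting identical semantic tags within the semantic tag set and using the tag with the maximum count) can be sketched as follows. This is a minimal sketch under stated assumptions: the tag set is assumed to be a simple list of strings, and the function name `select_tag_by_count` and the sample tags are hypothetical; none of them appear in the claims or in the cited references.

```python
from collections import Counter


def select_tag_by_count(semantic_tags):
    """Return the tag that occurs most often in the tag set.

    Hypothetical helper: returns None when the tag set is empty.
    If several tags share the maximum count, the one encountered
    first in the list is returned.
    """
    if not semantic_tags:
        return None
    counts = Counter(semantic_tags)       # tally identical tags
    tag, _count = counts.most_common(1)[0]  # tag with the maximum count
    return tag


# Assumed sample tag set, for illustration only.
tags = ["rose", "flower", "rose", "plant", "rose", "flower"]
selected = select_tag_by_count(tags)
```

In this sketch the tie-breaking rule (first tag encountered) is an arbitrary choice; the claim language as quoted does not address ties.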

Claims 5 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Perone, Lester, Marchesotti and Watanabe in view of Makeev (US 2008/0301089 A1, hereinafter “Makeev”).

Regarding claim 5, the proposed combination of Perone, Lester, Marchesotti and Watanabe teaches
the method further comprising correcting the descriptive information, wherein the correcting the descriptive information comprises: (see Perone, [0038] “the joint embeddings 220 may be enhanced by continuous training… train the encoder 122 to produce more accurate results”).
choosing a secondary to-be-recognized tag from the semantic tags in the semantic tag set excluding the to-be-recognized semantic tag, (see Watanabe, [col6 line11] “plural tags may also be registered for one registered image”; [col1 lines43-45] “various images can be obtained at a time by performing a similar image search using plural tagged images for a query” – there are plurality of tags).
determining the interpretive information of the noun (see Marchesotti, [0072] “Where the tags 23, 40 are freeform, i.e., not restricted to any category, the system 10 may include a syntactic parser which analyzes the textual information 23, 40 to identify text which is recognized as referring to a content category”) corresponding to the image (see Perone, [0030] “the query 160 may be an image”) in the secondary to-be-recognized tag (see Watanabe, [col6 line11] “plural tags may also be registered for one registered image”; [col1 lines43-45] “various images can be obtained at a time by performing a similar image search using plural tagged images for a query” – there are plurality of tags) as secondary descriptive information; and (see Lester, [0029] “an image having a representation of multiple objects may be associated with multiple visual words”).
constructing the response information to the to-be-processed information from the secondary descriptive information (see Lester, [0029] “includes a collection of images 254 and an image search engine 256 for searching the collection of images 254…  an image having a representation of multiple objects may be associated with multiple visual words”).
The proposed combination of Perone, Lester, Marchesotti and Watanabe does not explicitly teach receiving feedback information corresponding to the response information, wherein the feedback information evaluates an accuracy of the response information; performing a semantic recognition on the feedback information to obtain the accuracy; choosing a secondary to-be-recognized tag from the semantic tags in the semantic tag set excluding the to-be-recognized semantic tag in response to determining that the accuracy is below a preset threshold.
However, Makeev discloses feedback logs and also teaches
receiving feedback information corresponding to the response information, wherein the feedback information evaluates an accuracy of the response information; (see Makeev, [0024] “the processing device 102 may track the user 116 selection activities on the search results page and add this to a feedback log associated with the search term”; [0027] “The feedback logs 138 may include any suitable number of entries of search query sessions, 
performing a semantic recognition on the feedback information to obtain the accuracy; (see Makeev, [0054] “the feedback logs may include a counter value indicating the number of search sessions logged for a particular search term” – counter value is interpreted as accuracy). 
performing a specific function in response to determining that the accuracy is below a preset threshold; (see Makeev, [0054] “the feedback logs may include a counter value indicating the number of search sessions logged for a particular search term”; [0055] “If the counter value is below a threshold, step 232, the processing device may generate the search results without using relevancy factors”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of feedback information as disclosed and taught by Makeev in the system taught by the proposed combination of Perone, Lester, Marchesotti and Watanabe to yield the predictable result of improving the sequence and hence the effectiveness of the search result by placing more popular search result document identifiers higher in the search results page for the requesting user (see Makeev, [0051] “this methodology thereby improves the sequence and hence the effectiveness of the search result by placing more popular search result document identifiers higher in the search results page for the requesting user”).
Claim 10 incorporates substantively all the limitations of claim 5 in an apparatus form and is rejected under the same rationale.
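For illustration only, the correcting step discussed for claim 5 (choosing a secondary tag, excluding the tag already used, when the accuracy obtained from feedback is below a preset threshold) can be sketched as follows. This is a minimal sketch under stated assumptions: the accuracy is assumed to be already available as a number between 0 and 1, the threshold value of 0.5 is arbitrary, and choosing the secondary tag by the next-highest count is an assumption, since the claim language as quoted only recites "choosing". The names `choose_secondary_tag` and `correct_tag` are hypothetical and do not come from the claims or the cited references.

```python
from collections import Counter


def choose_secondary_tag(semantic_tags, primary_tag):
    """Pick the most frequent tag after excluding the primary tag.

    Hypothetical helper: returns None if no other tag remains.
    """
    remaining = [t for t in semantic_tags if t != primary_tag]
    if not remaining:
        return None
    return Counter(remaining).most_common(1)[0][0]


def correct_tag(semantic_tags, primary_tag, accuracy, threshold=0.5):
    """Return the tag to use after considering feedback accuracy.

    The primary tag is kept when accuracy meets the threshold;
    otherwise a secondary tag is chosen, falling back to the
    primary tag when no alternative exists (an assumed policy).
    """
    if accuracy < threshold:
        secondary = choose_secondary_tag(semantic_tags, primary_tag)
        return secondary if secondary is not None else primary_tag
    return primary_tag


# Assumed sample data, for illustration only.
tags = ["rose", "flower", "rose", "plant", "rose", "flower"]
low_accuracy_choice = correct_tag(tags, "rose", accuracy=0.2)
high_accuracy_choice = correct_tag(tags, "rose", accuracy=0.9)
```

The sketch omits the semantic recognition of the feedback itself; how an accuracy value is derived from feedback text is outside what this example attempts to show.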

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VAISHALI SHAH whose telephone number is (571)272-8532. The examiner can normally be reached Monday - Friday (7:30 AM to 4:00 PM).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, TAMARA KYLE, can be reached at (571)272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more 





/VAISHALI SHAH/
Primary Examiner, Art Unit 2156