DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office action is responsive to the following communication: Request for Continued Examination filed on 15 March 2021.
The instant application claims foreign priority to CN201710936315, filed on 10 October 2017.
Claims 1-20 are pending and presented for examination.  Claims 1, 2, and 13 are in independent form.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 15 March 2021 has been entered.
 
Response to Amendment
Claims 1, 2, and 13 have been amended.
No claims have been cancelled.
No claims have been newly added.

Claim Rejections - 35 USC § 112
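The following is a quotation of 35 U.S.C. 112(b):
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph: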
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

As per claims 2 and 13, the rejections under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, are withdrawn in view of Applicant’s Amendment.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
As per claim 1, the claim(s) recite(s) in part “converting the search click behavior data…” and “determining a text corresponding to the target image.”
The limitations directed towards “converting” and “determining” are interpreted as observations or judgments about correlations between vectors and therefore, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting “one or more processors” in claim 1, nothing in the claim elements precludes the steps from practically being performed in the mind.
This judicial exception is not integrated into a practical application by additional elements. In particular, the claim recites using a processor to perform the steps of “acquiring search click behavior data,” “performing training…,” and “extracting an image feature vector…,” which are considered additional elements.  These additional elements represent mere extra-solution activities to the judicial exception.  These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra-solution activity that is mere data gathering in conjunction with the abstract idea.
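The claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the “acquiring” data, “performing” training, and “extracting” vectors only add well-understood, routine and conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (See Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93). The claims provide that the said steps may be performed by processors. Therefore, the “acquiring,” “performing,” and “extracting” are nothing more than what can be handled by a conventional image search engine and do not provide significantly more than the judicial exception. The claim(s) is/are not patent eligible.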
As per claims 2 and 13, the claim(s) recite(s) in part “determining a text corresponding to the target image.”
The limitations directed towards “determining” are interpreted as observations or judgments about vectors and therefore, under their broadest reasonable interpretation, cover performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “one or more processors” in claim 13, nothing in the claim element precludes the step from practically being performed in the mind.
This judicial exception is not integrated into a practical application by additional elements. In particular, the claim recites using a processor to perform the step of “extracting an image feature vector…,” which is considered an additional element.  This additional element represents mere extra-solution activity to the judicial exception.  As per the limitations in claim 13 directed to “one or more processors” and “computer readable media,” these elements of the claims are recited at a high level of generality (i.e., as a generic processor performing a generic computer function) of searching and storing such that they amount to no more than mere instructions to apply the exception using a generic computer component.  These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra-solution activity that is mere data gathering in conjunction with the abstract idea.
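The claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the “extracting” of vectors only adds well-understood, routine and conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (See Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93). The claims provide that the said steps may be performed by processors. Therefore, the “extracting” is nothing more than what can be handled by a conventional image search engine and does not provide significantly more than the judicial exception. The claim(s) is/are not patent eligible.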
As per claims 3 and 14, the limitations are directed towards determining the correlation between the target image and the text according to a Euclidean distance, which is an additional element beyond the above identified judicial exception.  These additional elements represent mere extra-solution activities to the judicial exception.  These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra solution activity that is mere data processing in conjunction with the abstract idea.
The claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception because using mathematical formulas such as a Euclidean distance only adds well-understood, routine and conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (See Flook, 437 U.S. at 594, 198 USPQ2d at 199 (recomputing or readjusting alarm limit values); Bancorp Services v. Sun Life, 687 F.3d 1266, 1278, 103 USPQ2d 1425, 1433 (Fed. Cir. 2012) ("The computer required by some of Bancorp’s claims is employed only for its most basic function, the performance of repetitive calculations, and as such does not impose meaningful limits on the scope of those claims.")). The claim(s) is/are not patent eligible.
As per claims 4, 5, and 15, the limitations are directed towards further defining “selecting the text whose correlations between the text feature vector and the image feature vector of the target image is greater than a preset threshold” and “selecting the text whose 
As per claims 6 and 16, the limitations are directed towards further defining “determining a respective similarity between the image feature vector and a respective text feature vector of a respective text among a plurality of texts” and “determining the text corresponding to the target image based on the determined respective similarity.”  The limitations directed towards “determining” are interpreted as observations or judgments about a level of similarity and therefore, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. That is, nothing in the claim elements precludes the steps from practically being performed in the mind.  There are no additional elements that would tie the limitations to a practical application and/or that would amount to significantly more than the judicial exception.
As per claims 7 and 17,
As per claims 8 and 18, the claim(s) recite(s) in part “converting the search click behavior data….”
The limitations directed towards “converting” are interpreted as observations or judgments about the conversion of click behavior data into text pairs and therefore, under their broadest reasonable interpretation, cover performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting “one or more processors” in claim 1, nothing in the claim element precludes the step from practically being performed in the mind.
This judicial exception is not integrated into a practical application by additional elements. In particular, the claim recites using a processor to perform the steps of “acquiring search click behavior data,” “performing training…,” and “extracting an image feature vector…,” which are considered additional elements.  These additional elements represent mere extra-solution activities to the judicial exception.  These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra-solution activity that is mere data gathering in conjunction with the abstract idea.
The claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the “acquiring” data, “performing” training, and “extracting” vectors only add well-understood, routine and conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (See Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93). The claims provide that the said steps may be performed by processors. Therefore, the “acquiring,” “performing,” and “extracting” are nothing more than what can be handled by a conventional image search engine and do not provide significantly more than the judicial exception. The claim(s) is/are not patent eligible.
As per claims 9, 10, 11, and 19, the claim(s) recite(s) in part “performing segmentation processing and part-of-speech analysis on the search texts,” “determining texts from data obtained through the segmentation processing and the part-of-speech analysis,” and “performing deduplication processing on image data clicked based on the search texts.”

This judicial exception is not integrated into a practical application by additional elements. In particular, the claim recites using a processor to perform the step of “performing deduplication processing…,” which is considered an additional element.  This additional element represents mere extra-solution activity to the judicial exception.  This element does not integrate the abstract idea into a practical application because it does not impose a meaningful limit on the judicial exception and provides only insignificant extra-solution activity that is mere data gathering in conjunction with the abstract idea.
The claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the “performing deduplication processing” only adds well-understood, routine and conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (See Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93). The claims provide that the said steps may be performed by processors. Therefore, the “performing” is nothing more than what can be handled by a conventional image search engine and does not provide significantly more than the judicial exception. The claim(s) is/are not patent eligible.
As per claims 12 and 20, the limitations are directed towards further text pairs, which is an additional element beyond the above identified judicial exception.  These additional elements represent mere extra-solution activities to the judicial exception.  These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra solution activity that is mere data processing in conjunction with the abstract idea.
Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93). The claim(s) is/are not patent eligible.

	
Examiner’s Note
Examiner cites particular columns and/or paragraphs and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may be applied as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1 is/are rejected under 35 U.S.C. 103 as being unpatentable over Fu et al, USPGPUB No. 2017/0330054, filed on 30 September 2016, and published on 16 November 2017, in view of Mei et .
As per independent claim 1, Fu, in combination with Mei and Keisler, discloses:
One or more computer readable media storing thereon computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform acts comprising: 
acquiring search click behavior data, the search click behavior data including search texts and image data clicked based on the search texts {See Fu, [0093], wherein this reads over “Therefore, in a exemplified implementation of this embodiment, a positive sample image and a negative sample image corresponding to a query may be automatically determined according to image click logs of users.  For example, after a user enters a query to search for an image, an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query”}; 
converting the search click behavior data into a plurality of image text pairs, the plurality of image text pairs including a first image text pair including a first text and an image and a second image text pair including a second text and the image {See Mei, [0022], wherein this reads over “note that the click-through data 122 for the textual query 104(1)-visual image 108(1) pair shows 47 clicks. The click-through data 122 for the textual query 104(1)-visual image 108(2) pair shows 50 clicks. In this example, the click-through data can suggest a stronger association between textual query 104(1) and visual image 108(2) than with visual image 108(1)”}, the converting the search click behavior data into the plurality of image text pairs including performing deduplication processing on the image data clicked based on the search texts {See Keisler, column 24, lines 51-60, wherein this reads over “In some embodiments, merging of results is performed. This may include deduplication of tile images. For example, if tiles are returned that are geographically nearby to each other (e.g., adjacent and/or overlapping), the tiles may be merged together (e.g., by returning a stitched together version of overlapping tile results that is a larger, merged tile). As another example, for overlapping tiles that are returned, the center most tile in the group may be displayed as the result (while filtering out the other overlapping images).”};
performing training according to the plurality of image text pairs {See Fu, [0095], wherein this reads over “the training of the original deep neural network is accomplished by using the positive-negative training pair, and therefore, to improve the training efficiency, preferably, two completely identical original deep neural networks may be constructed for receiving a positive training pair consisting of the training query and the positive sample image and a negative training pair consisting of the training query and the negative sample image, respectively, thereby implementing quick and real-time model training”} to obtain a data model for extracting an image feature vector and a text feature vector {See Fu, [0054], wherein this reads over “The representation vector generation network may comprise four representation vector generation units that are separately configured to convert the input query, the image surrounding text data, the image content data and the image associated characteristic data into corresponding representation vectors, for conducting subsequent model training works”; and [0162], wherein this reads over “a representation vector generation network and a relevance calculation network, the representation vector generation network is used to convert different types of data in the training sample into representation vectors and input the representation vectors to the relevance calculation network, and the relevance calculation network is used to convert at least two input representation vectors into a relevance metric value”}; 
extracting an image feature vector of a target image, the image feature vector representing an image content of the target image {See Fu, [0057], wherein this reads over “The representation vector generation of the image pixel content is currently widely used in a CNN (Convolutional Neural Network) classification network.  The input of the network is a size-normalized image pixel matrix, and the output thereof is a classification representation vector of an image.”};  and 
determining a text corresponding to the target image according to a correlation between the image feature vector and a text feature vector of the text, the text feature vector representing semantics of the text, the image feature vector and the text feature vector being in a same vector space {See Fu, [0059], wherein this reads over “Because image surrounding text data and the query are both texts, representation vector generation manners thereof are the same, which are both text representation vector generation”; and [0064], wherein this reads over “The word vectors and the three networks may be trained separately, or word vectors or networks that have been trained in other tasks may be used, or the vectors and the three networks may be trained in this task together with the subsequent relevance calculation network”; and [0066], wherein this reads over “similar to the representation vector generation of the text, the representation vector generation of the image pixel content and the representation vector generation of the image associated characteristic data may be trained separately, or trained in this task together with subsequent networks”}. 
	Fu is directed to the invention of an image search method using a relevance prediction model.  Fu fails to disclose the claimed feature of “converting the search click behavior data into a plurality of image text pairs, the plurality of image text pairs including a first image text pair including a first text and an image and a second image text pair including a second text and the image.”  Mei is directed to the invention of click-through based learning for internet searches.  Specifically, Mei discloses that “the click-through data 122 for the textual query 104(1)-visual image 108(1) pair shows 47 clicks” and “[t]he click-through data 122 for the textual query 104(1)-visual image 108(2) pair shows 50 clicks.”  See Mei, [0022].  Additionally, Mei discloses that “the click-through data can suggest a stronger association between textual query 104(1) and visual image 108(2) than with visual image 108(1).”  See Mei, [0022].  Accordingly, it would have been obvious to one of ordinary skill in the art to improve the prior art of Fu with that of Mei for the predictable result of a system wherein the image search system of Fu may further include the feature of using user click behavior to determine image-text pairs as disclosed by the clickthrough processing of Mei. 
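For illustration only, the conversion of click logs into image text pairs discussed above may be sketched as follows. This is a minimal sketch under stated assumptions and is not a characterization of the implementation of Fu, Mei, or Applicant: the record format (search text, clicked image identifier) and the function name clicks_to_pairs are hypothetical. The sketch tallies clicks per pair, in the manner of the click counts quoted from Mei, [0022], and shows one image appearing in two pairs with different texts.

```python
from collections import Counter

def clicks_to_pairs(click_log):
    """Tally clicks per (search_text, image_id) pair.

    click_log is a hypothetical iterable of (search_text, image_id)
    records, one record per click.
    """
    return Counter((text, image_id) for text, image_id in click_log)

# Hypothetical click log: the same image is clicked under two different texts.
log = [
    ("red dress", "img_7"),
    ("evening gown", "img_7"),
    ("red dress", "img_7"),
]
pairs = clicks_to_pairs(log)
# pairs[("red dress", "img_7")] is 2; pairs[("evening gown", "img_7")] is 1
```

Note that the tally only collapses repeated identical records; it does not perform the deduplication of image data addressed separately with respect to Keisler.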
	As per the claimed feature of “the converting the search click behavior data into the plurality of image text pairs including performing deduplication processing on the image data clicked based on the search texts,” the combination of Fu and Mei fails to disclose said features.
Keisler is directed to the invention of a geo-visual search.  Specifically, Keisler discloses that “merging of results is performed” which “may include deduplication of tile images.”  See Keisler, column .
Claim(s) 2-3, 6-8, 12-14, 16-18, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Fu et al, USPGPUB No. 2017/0330054, filed on 30 September 2016, and published on 16 November 2017, in view of Hwang et al, USPGPUB No. 2015/0125073, filed on 4 November 2014, and published on 7 May 2015.
As per independent claims 2 and 13, Fu, in combination with Hwang, discloses:
A method comprising:
extracting an image feature vector of a target image, the image feature vector representing an image content of the target image {See Fu, [0057], wherein this reads over “The representation vector generation of the image pixel content is currently widely used in a CNN (Convolutional Neural Network) classification network.  The input of the network is a size-normalized image pixel matrix, and the output thereof is a classification representation vector of an image.”};  and
determining a text corresponding to the target image according to a correlation between the image feature vector and a text feature vector of the text, the text feature vector representing semantics of the text, the image feature vector and the text feature vector being in a same vector space {See Fu, [0059], wherein this reads over “Because image surrounding text data and the query are both texts, representation vector generation manners thereof are the same, which are both text representation vector generation”; and [0064], wherein this reads over “The word vectors and the three networks may be trained separately, or word vectors or networks that have been trained in other tasks may be used, or the vectors and the three networks may be trained in this task together with the subsequent relevance calculation network”; and [0066], wherein this reads over “similar to the representation vector generation of the text, the representation vector generation of the image pixel content and the representation vector generation of the image associated characteristic data may be trained separately, or trained in this task together with subsequent networks”}, the determining the text corresponding to the target image according to the correlation between the image feature vector and the text feature vector of the text including:
determining multiple tag categories corresponding to the target image {See Hwang, [0066], wherein this reads over “The controller 150 may control the storage unit 130 to store additional information by mapping the additional information with the target image. The additional information may include at least one of keywords related to the category including the target object”}; and
selecting a tag corresponding to a highest correlation with the target image among multiple tags under a respective tag category of the multiple tag categories {See Hwang, [0067], wherein this reads over “An operation of storing at least one of keywords related to an image by mapping the keyword with the image may be referred to "tagging an image". To tag the target image, a method including processes of a user selecting a certain area of the target image and inputting a keyword about the selected area is used”; and [0068], wherein this reads over “In the image processing apparatus 100 according to the present disclosure, for the category that is determined including the target object, the target image is automatically tagged and thus the target image may be automatically classified”; and [0095], wherein this reads over “The image processing apparatus may store the target image 501 by mapping the target image 501 with additional information including at least one of keywords related to "Namdaemun gate" and "Person". The at least one of keywords related to "Namdaemun gate" may include the name of the category, that is "Namdaemun gate".”; and [0185], wherein this reads over “The image processing apparatus may determine whether the target object is included in the first category, based on the matching score. For the shape of the target object to have a high matching score, the shape of the target object may have characteristics similar to the root model and part models of the first category. Also, for the shape of the target object to have a high matching score, the shapes of parts of the target object may not be separated far from positions learned with respect to the part models of the first category”}. 
Fu is directed to the invention of an image search method using a relevance prediction model.  Fu fails to disclose the claimed feature of “the determining the text corresponding to the target image according to the correlation between the image feature vector and the text feature vector of the text includes: determining multiple tag categories corresponding to the target image; and selecting a tag corresponding to a highest correlation with the target image among multiple tags under a respective tag category of the multiple tag categories.”
Hwang is directed to the invention of a system for processing images.  Hwang discloses that an image processing apparatus may determine a category of a target image.  Specifically, Hwang discloses that “[t]he controller 150 may control the storage unit 130 to store additional information by mapping the additional information with the target image” wherein “[t]he additional information may include at least one of keywords related to the category including the target object.”  See Hwang, [0066].  That is, Hwang would disclose a system wherein keywords related to a category (i.e. tag categories) may be associated to a target image.
Additionally, Hwang discloses that “an operation of storing at least one of keywords related to an image by mapping the keyword with the image may be referred to "tagging an image"” such that “a method including processes of a user selecting a certain area of the target image and inputting a keyword about the selected area is used.”  See Hwang, [0067].  Moreover, Hwang discloses that “for the category that is determined including the target object, the target image is automatically tagged and thus the target image may be automatically classified.”  See Hwang, [0068].  Furthermore, Hwang discloses that “[t]he image processing apparatus may store the target image 501 by mapping the target image 501 with additional information including at least one of keywords related to "Namdaemun gate" and "Person"” and “[t]he at least one of keywords related to "Namdaemun gate" may include the name of the category, that 
Moreover, Hwang discloses that “[t]he image processing apparatus may determine whether the target object is included in the first category, based on the matching score” and “[f]or the shape of the target object to have a high matching score, the shape of the target object may have characteristics similar to the root model and part models of the first category” such that “for the shape of the target object to have a high matching score, the shapes of parts of the target object may not be separated far from positions learned with respect to the part models of the first category.”  See Hwang, [0185].  Because Hwang discloses that a high matching score is utilized to determine a category, Hwang reads upon the claimed feature of selecting a tag with the highest correlation under “a respective tag category of the multiple tag categories.”
Accordingly, because Hwang discloses the aforementioned features, it would have been obvious to one of ordinary skill in the art to improve the prior art of Fu with that of Hwang for the predictable result of a system for categorizing images in view of the vectors provided by Fu.
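For illustration only, the claimed step of selecting, under each tag category, the tag having the highest correlation with the target image may be sketched as follows. This is a minimal sketch under stated assumptions: cosine similarity is used as a stand-in correlation measure, and the function names, category names, and vectors are hypothetical. It is not a characterization of the matching score of Hwang or of Applicant's implementation.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def best_tag_per_category(image_vec, tags_by_category):
    """For each category, pick the tag whose vector is most similar to the image."""
    return {
        category: max(tags, key=lambda tag: cosine_similarity(image_vec, tags[tag]))
        for category, tags in tags_by_category.items()
    }

# Hypothetical image vector and tag vectors in the same vector space.
image_vec = [1.0, 0.0]
tags_by_category = {
    "color": {"red": [0.9, 0.1], "blue": [0.0, 1.0]},
    "type": {"dress": [0.8, 0.2], "shoe": [0.1, 0.9]},
}
selected = best_tag_per_category(image_vec, tags_by_category)
# selected is {"color": "red", "type": "dress"}
```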
As per dependent claims 3 and 14, Fu, in combination with Hwang, discloses:
The method of claim 2, further comprising:
determining the correlation between the target image and the text {See Fu, [0072], wherein this reads over “The two new representation vectors not only have a uniform format, but also are in the same representation space, and thereby can be input to a vector distance calculating unit to calculate a relevance metric value”} according to a Euclidean distance between the image feature vector and the text feature vector {See Fu, [0073], wherein this reads over “Typically, the vector distance calculating unit may calculate a cosine distance between the two vectors output by the first standard vector representation unit and the second standard vector representation unit, to determine the relevance metric value between the two vectors, or calculate another vector distance for measuring the similarity between the two vectors, such as an Euclidean distance between the two vectors, which is not limited in this embodiment”}. 
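For illustration only, the two vector distances named in Fu, [0073] (a cosine distance and a Euclidean distance) may be sketched as follows for an image feature vector and a text feature vector lying in the same vector space. This is a minimal sketch; the function names and example vectors are hypothetical and are not drawn from Fu or from the instant application.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Hypothetical image and text feature vectors.
image_vec = [0.0, 0.0]
text_vec = [3.0, 4.0]
d = euclidean_distance(image_vec, text_vec)  # 5.0
```

Under either measure, a smaller distance indicates a stronger correlation between the image and the text.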
As per dependent claims 6 and 16, Fu, in combination with Hwang, discloses:
The method of claim 2, wherein the determining the text corresponding to the target image according to the correlation between the image feature vector and the text feature vector of the text includes:
determining a respective similarity between the image feature vector and a respective text feature vector of a respective text among a plurality of texts {See Fu, [0073], wherein this reads over “Typically, the vector distance calculating unit may calculate a cosine distance between the two vectors output by the first standard vector representation unit and the second standard vector representation unit, to determine the relevance metric value between the two vectors, or calculate another vector distance for measuring the similarity between the two vectors, such as an Euclidean distance between the two vectors, which is not limited in this embodiment”}; and
determining the text corresponding to the target image based on the determined respective similarity {See Fu, [0086], wherein this reads over “Inputs of the image search relevance prediction model are a query entered by a user and image data of a target image (for example, comprising: image surrounding text data, image content data, and image associated characteristic data), and an output thereof is a relevance metric value between the query and the target image”}. 
As per dependent claims 7 and 17, Fu, in combination with Hwang, discloses:
The method of claim 6, wherein the determining the respective similarity between the image feature vector and a respective text feature vector of a respective text among a plurality of texts includes: 
determining, one by one, the respective similarity between the image feature vector and the respective text feature vector of each of the plurality of texts {See Fu, [0087], wherein this reads over “After an image search engine receives an image query entered by a user, the image query and to-be-sorted images are input to the image search relevance prediction model to obtain relevance metric values between the to-be-sorted images and the image query.”}. 
As per dependent claims 8 and 18, Fu, in combination with Hwang, discloses:
The method of claim 2, further comprising:
acquiring search click behavior data, the search click behavior data including search texts and image data clicked based on the search texts {See Fu, [0093], wherein this reads over “Therefore, in a exemplified implementation of this embodiment, a positive sample image and a negative sample image corresponding to a query may be automatically determined according to image click logs of users.  For example, after a user enters a query to search for an image, an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query”}; 
converting the search click behavior data into a plurality of image text pairs {See Fu, [0091], wherein this reads over “In this embodiment, a query and a positive-negative sample image pair (also briefly referred to as a pair) under the query may be used as a training sample.  That is, a training sample consists of a query and a pair formed by two images.  In the pair, one image has better relevance with the Query than the other image, and the two images are referred to as a positive sample and a negative sample respectively”}; and 
performing training according to the plurality of image text pairs {See Fu, [0095], wherein this reads over “the training of the original deep neural network is accomplished by using the positive-negative training pair, and therefore, to improve the training efficiency, preferably, two completely identical original deep neural networks may be constructed for receiving a positive training pair consisting of the training query and the positive sample image and a negative training pair consisting of the training query and the negative sample image, respectively, thereby implementing quick and real-time model training”} to obtain a data model for extracting the image feature vectors and the text feature vector {See Fu, [0054], wherein this reads over “The representation vector generation network may comprise four representation vector generation units that are separately configured to convert the input query, the image surrounding text data, the image content data and the image associated characteristic data into corresponding representation vectors, for conducting subsequent model training works”; and [0162], wherein this reads over “a representation vector generation network and a relevance calculation network, the representation vector generation network is used to convert different types of data in the training sample into representation vectors and input the representation vectors to the relevance calculation network, and the relevance calculation network is used to convert at least two input representation vectors into a relevance metric value”}. 
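For illustration only, the conversion of image click logs into training samples described in Fu, [0091] and [0093], may be sketched as follows. This is a minimal sketch assuming that each log entry is a tuple of a query, the images shown for that query, and the images the user clicked; the log layout and the function name build_training_samples are hypothetical and are not taken from Fu.

```python
from itertools import product


def build_training_samples(click_log):
    """Turn click-log entries into (query, positive_image, negative_image) samples.

    click_log is an iterable of (query, shown_image_ids, clicked_image_ids).
    A clicked image is treated as a positive sample for the query and a shown
    but unclicked image as a negative sample, per Fu, [0093].
    """
    samples = []
    for query, shown, clicked in click_log:
        clicked_set = set(clicked)
        positives = [img for img in shown if img in clicked_set]
        negatives = [img for img in shown if img not in clicked_set]
        # Each positive image is paired with each negative image under the query.
        for pos, neg in product(positives, negatives):
            samples.append((query, pos, neg))
    return samples
```

Each resulting sample corresponds to a query together with a positive-negative sample image pair, which is the form of training sample that Fu, [0091], describes.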
As per dependent claims 12 and 20, Fu, in combination with Hwang, discloses:
The method of claim 8, wherein a respective image text pair of the plurality of image text pairs includes an image and a text {See Fu, [0091], wherein this reads over “In this embodiment, a query and a positive-negative sample image pair (also briefly referred to as a pair) under the query may be used as a training sample.  That is, a training sample consists of a query and a pair formed by two images.  In the pair, one image has better relevance with the Query than the other image, and the two images are referred to as a positive sample and a negative sample respectively”}. 
Claims 4, 5, and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Fu, in view of Hwang, and in further view of Li et al, USPGPUB No. 2002/0161747, filed on 13 March 2001, and published on 31 October 2002.
As per dependent claim 4, the combination of Fu and Hwang fails to expressly disclose the claimed feature of “selecting the text whose correlations between the text feature vector and the image feature vector of the target image is greater than a preset threshold.”  Li is directed to a media content search engine which incorporates text content and user log mining.  Specifically, Li discloses that “matching module 184 compares the similarity of each image to the query vectors to a threshold value--if the numerical value representing the similarity of an image to the query vectors exceeds the threshold value then the image is a "match" and returned to interface component 122, and if the numerical value does not exceed the threshold value then the image is not a match and is not returned to interface component 122.”  See Li, [0057].  Additionally, Li discloses that “user log 190 stores, for each image that was marked relevant or irrelevant, an indication of the image, an indication of whether it was marked as relevant or irrelevant, and the query that resulted in retrieval of the image (e.g., the text and/or image input by the user as the initial search criteria, or the query vectors generated therefrom).”  See Li, [0058].  That is, Li discloses a system wherein the query text which met the “match” threshold (i.e. a preset threshold) may be stored in a user log (i.e. selecting a text whose correlations between the text 
As per dependent claim 5, the combination of Fu and Hwang fails to expressly disclose the claimed feature of “selecting the text whose correlations between the text feature vector and the image feature vector of the target image is greater than a preset ranking threshold
As per dependent claim 15, the combination of Fu and Hwang fails to expressly disclose the claimed feature of “selecting the text whose correlations between the text feature vector and the image feature vector of the target image is greater than a preset threshold; or selecting the text whose correlations between the text feature vector and the image feature vector of the target image is greater than a preset ranking threshold.”  It is noted that the selecting steps are recited in optional form (i.e. via the “or” operator).  Accordingly, it is noted that only one of the two selecting steps need be disclosed by the prior art.   Li is directed to a media content search engine which incorporates text content and user log mining.  Specifically, Li discloses that “matching module 184 compares the similarity of each image to the query vectors to a threshold value--if the numerical value representing the similarity of an image to the query vectors exceeds the threshold value then the image is a "match" and returned to interface component 122, and if the numerical value does not exceed the threshold value then the image is not a match and is not returned to interface component 122.”  See Li, [0057].  Additionally, Li discloses that “user log 190 stores, for each image that was marked relevant or irrelevant, an indication of the image, an indication of whether it was marked as relevant or irrelevant, and the query that resulted in retrieval of the image (e.g., the text and/or image input by the user as the initial search criteria, or the query vectors generated therefrom).”  See Li, [0058].  Moreover, Li discloses that “matching module 184 compares the similarity of all images available from media content indexer 136 to the query vectors and returns the images with the highest similarities (the highest numerical values representing similarity) to interface component 122.”  See Li, [0057].  
That is, Li discloses a system wherein the query text which met the “match” threshold (i.e. a preset ranking threshold) may be stored in a user log (i.e. selecting a text whose correlations between the text feature vector and the image feature vector of the target image is greater than a preset ranking threshold).  Accordingly, it would have been obvious to one of ordinary skill in the art to improve the prior art of Fu with that of Li for the predictable result of a system wherein the image search system of Fu may further include the feature of selecting and storing those query texts for which the correlation between a plurality of vectors exceeded the match threshold.
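For illustration only, the two alternative selecting steps addressed above (a preset threshold and a preset ranking threshold) may be sketched as follows. This is a minimal sketch assuming that the correlations have already been computed and are held in a mapping from each candidate text to its correlation with the target image; the function names and the parameter k are hypothetical and are not drawn from Li, Fu, or the claims.

```python
def select_above_threshold(correlations, threshold):
    """Return the texts whose correlation exceeds a preset threshold."""
    return [text for text, score in correlations.items() if score > threshold]


def select_top_ranked(correlations, k):
    """Return the k texts with the highest correlations, best first."""
    ranked = sorted(correlations.items(), key=lambda item: item[1], reverse=True)
    return [text for text, _ in ranked[:k]]
```

The first function corresponds to the threshold comparison quoted from Li, [0057], and the second to returning the candidates with the highest similarities, also quoted from Li, [0057].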
Claim 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Fu, in view of Hwang, and in further view of MacGillivray et al, U.S. Patent No. 9,589,060, filed on 23 July 2014, and issued on 7 March 2017.
As per dependent claim 9, Fu fails to disclose the claimed feature of “performing segmentation processing and part-of-speech analysis on the search texts; and determining texts from data obtained through the segmentation processing and the part-of-speech analysis.”  MacGillivray is directed to the invention of a system for generating responses to natural language queries.  Specifically, MacGillivray discloses that “[t]he computerized system includes a memory device that stores a set of instructions and at least one processor that executes the set of instructions to receive a query from the user and divide the query into query segments based on a set of grammar rules.”  See MacGillivray, column 1, lines 45-59.  Additionally, MacGillivray discloses that “the at least one processor executes the set of instructions to receive information related to the first and second segments, and generate a response to the query based on the received information.”  See MacGillivray, column 2, lines 22-35.  Wherein MacGillivray discloses a system for dividing queries into segments according to natural language processing (i.e. performing segmentation processing and part-of-speech analysis on the search texts) and generating a response in view of select segments (i.e. determining texts from data obtained through the segmentation processing and the part-of-speech analysis), it would have been obvious to one of ordinary skill in the art to improve the prior art of Fu with that of MacGillivray for the predictable result of improving the media content search engine of Fu to further include segmentation processing and part-of-speech analysis.

Allowable Subject Matter
Claims 10, 11, and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Response to Arguments
Applicant's arguments filed 15 March 2021 have been fully considered but they are not persuasive.
Claim Rejection under 35 U.S.C. 101
Statutory Subject Matter Rejection Under §101 Principles of Law
Alice Corp. v. CLS Bank Int’l, 573 U.S. 208, 216 (2014). In determining whether a claim falls within an excluded category, we are guided by the Supreme Court’s two-step framework, described in Alice and Mayo. See id. at 217-18 (citing Mayo Collaborative Servs. v. Prometheus Labs., Inc., 566 U.S. 66, 75-77 (2012)). In accordance with that framework, we first determine what concept the claim is “directed to.”
Id. at 219 (“On their face, the claims before us are drawn to the concept of intermediated settlement, i.e., the use of a third party to mitigate settlement risk.”); see also Bilski v. Kappos, 561 U.S. 593, 611 (2010) (“Claims 1 and 4 in petitioners’ application explain the basic concept of hedging, or protecting against risk.”).
If the claim is “directed to” an abstract idea, we turn to the second step of the Alice and Mayo framework, where “we must examine the elements of the claim to determine whether it contains an ‘inventive concept’ sufficient to ‘transform’ the claimed abstract idea into a patent eligible application.” Alice, 573 U.S. at 221 (citation omitted). “A claim that recites an abstract idea must include ‘additional features’ to ensure ‘that the [claim] is more than a drafting effort designed to monopolize the [abstract idea].’” Id. (alterations in original) (quoting Mayo, 566 U.S. at 77). “[M]erely requir[ing] generic computer implementation[] fail[s] to transform that abstract idea into a patent-eligible invention.” Id. 
The PTO has published revised guidance on the application of § 101. See USPTO, 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (Jan. 7, 2019) (“Guidance”). Under that guidance, we first look to whether the claim recites:
(1) any judicial exceptions, including certain groupings of abstract ideas (i.e., mathematical concepts, certain methods of organizing human activity such as a fundamental economic practice, or a mental process); and 
(2) additional elements that integrate the judicial exception into a practical application (see MPEP §§ 2106.05(a)-(c), (e)-(h) (9th ed. Rev. 08.2017, Jan. 2018)).
Only if a claim (1) recites a judicial exception and (2) does not integrate that exception into a practical application, do we then look to whether the claim: 
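(3) adds a specific limitation beyond the judicial exception that is not “well-understood, routine, conventional” in the field (see MPEP § 2106.05(d)); or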
(4) simply appends well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception. See Guidance, 84 Fed. Reg. at 56.
The Examiner’s Rejection
The Examiner rejects independent claims 1, 2, and 13, finding that the features of those claims correspond to concepts identified as abstract ideas by the courts, such as delivering user-selected media content to portable devices. (citing Affinity Labs of Tex. v. Amazon.com, Inc., 838 F.3d 1266 (Fed. Cir. 2016)).  We find that the limitations of the current claims are performed by the generically recited computer/processor, and that the limitations are merely instructions to implement the abstract idea on a computer and require no more than a generic computer to perform generic computer functions that are well-understood, routine and conventional activities previously known to the industry.  We also find that these additional limitations are not sufficient to amount to significantly more than the judicial exception, whether considered individually or as an ordered combination.  Specifically, we find that the use of generic computer components to determine vectors does not impose any meaningful limit on the computer implementation of the abstract idea, and there is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Rather, their collective functions merely provide conventional computer implementation.
Judicial Exception (Step 2A, Prong 1)
As per the rejection of independent claims 1, 2, and 13 under 35 U.S.C. 101, viewing the rejection through the lens of the Guidance, we must first consider whether the claim recites a judicial exception. Guidance, 84 Fed. Reg. at 51. The USPTO has synthesized the key concepts identified by the courts as abstract ideas into three primary subject-matter groupings: mathematical concepts, certain methods of organizing human activity (e.g., a fundamental economic practice), and mental processes. Id. at 52. As explained below, the claims recite mental processes, which are identified by the Guidance as abstract ideas. Id.

The steps of claims 1, 2, and 13 could be performed by a user in his or her head, but for the claims’ recitations of generic computer hardware and instructions. For example, a user could visually observe and determine the related features of an image. Here, the claimed information associating vector information with the images can be held in the user’s memory. Thus, claims 1, 2, and 13 recite concepts that can be performed in the human mind (observation, evaluation, judgement), an example of a mental process. See Elec. Power Group, LLC v. Alstom S.A., 830 F.3d 1350, 1354 (Fed. Cir. 2016) (“[W]e have treated analyzing information by steps people go through in their minds, or by mathematical algorithms, without more, as essentially mental processes within the abstract-idea category.”). The Guidance lists a mental process as another example of an abstract idea. Guidance, 84 Fed. Reg. at 52.
Furthermore, it is noted that the claim fails to require that “the image feature extraction needs to be based on the pixels of images.”  Rather, the claim recites “extracting an image feature vector of a target image, the image feature vector representing an image content of the target image.”  In response to applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., image feature extraction based on the pixels of the images) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).  That is, while Applicant asserts that the Specification requires an extraction based on the pixels of images, it is noted that said features are not recited within the claims and so lack patentable weight.
Accordingly, we find that claims 1, 2, and 13 recite the abstract idea of a mental process. For the same reasons, claims 3-12 and 14-20 recite abstract ideas.
“Directed to” the Judicial Exception (Step 2A, Prong 2)
Because the claims recite an abstract idea, we now proceed to determine whether the recited judicial exception is integrated into a practical application. See Guidance, 84 Fed. Reg. at 51. Specifically, we look to whether the claim recites additional elements that integrate the exception into a practical 
improvement to other technology or technical field. When a claim recites a judicial exception and fails to integrate the exception into a practical application, the claim is directed to the judicial exception.
Applicant asserts the argument that “Claim 1 recites an improvement in the functioning of computer technology.”  See Amendment, page 11.
As previously provided, we note that the claims do not recite an improvement to another technology or technical field, an improvement to the functioning of the computer itself, or meaningful limitations beyond generally linking the use of an abstract idea to a particular technological environment. Specifically, we find that the limitations of the current claims are performed by the generically recited computer/processor and that the use of generic computer components to tag images does not impose any meaningful limit on the computer implementation of the abstract idea.  As to the claim as a whole, we find that there is no indication that the combination of elements improves the functioning of a computer or improves any other technology, as tagging images for transmission over a network is well-understood, routine, and conventional, and therefore does not add significantly more than the abstract idea (see MPEP 2106.05(d)(ii), "receiving or transmitting data over a network").
More broadly, we find that the instant claims merely recite abstract ideas implemented on generic computer hardware, with generic programming instructions. Unlike the claimed invention in McRO, for example, that improved how a physical display operated to produce better quality images, the claimed invention here merely uses generic computing components to evaluate images.
Simply reciting generic computer hardware for performing an abstract idea does not integrate that abstract idea into a practical application. See Alice, 573 U.S. at 225-26 (“Viewed as a whole, petitioner’s method claims simply recite the concept of intermediated settlement as performed by a generic computer. The method claims do not, for example, purport to improve the functioning of the computer itself. Nor do they effect an improvement in any other technology or technical field. Instead, the claims at issue amount to ‘nothing significantly more’ than an instruction to apply the abstract idea of intermediated settlement using some unspecified, generic computer.” (internal citations omitted)); Dealertrack, Inc. v. Huber, 674 F.3d 1315, 1333 (Fed. Cir. 2012) (“Simply adding a ‘computer aided’ limitation to a claim covering an See Elec. Power Group, 830 F.3d at 1354. The Guidance also discusses other ways that additional elements can integrate the judicial exception into a practical application, e.g., a particular machine or manufacture, a particular transformation, and a particular treatment of a disease. See Guidance, 84 Fed. Reg. at 55. The instant claims also lack such features.
Accordingly, claims 1, 2, and 13 do not integrate the recited abstract ideas into a practical application. We find that the additional limitations of the dependent claims do not integrate the abstract ideas into practical applications; rather, they simply recite the use of generic computer components and do not impose meaningful limits on the computer implementations of the abstract ideas.
Inventive Concept (Step 2B)
To determine whether a claim provides an inventive concept, we consider the additional elements - individually and in combination - to determine whether they (1) add a specific limitation beyond the judicial exception that is not “well-understood, routine, conventional” in the field or (2) simply append well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception. Guidance, 84 Fed. Reg. at 56. Also, we reevaluate our conclusions about the additional elements discussed in the previous step. Id.
We find that the limitations of the instant claims, individually and in an ordered combination, do not recite significantly more than the abstract idea because the limitations are merely instructions to implement the abstract idea on a computer and require no more than a generic computer to perform generic computer functions that are well-understood, routine and conventional activities previously known to the industry.  It is noted that Applicant fails to explain how the claimed invention improves the technical field of relevant technology.  Specifically, while Applicant asserts that the claimed method “provides payment information based on the time difference between two wireless connections,” it is noted that the claimed invention is directed to using search click behavior for determining a text that is relevant to an image.  It is unclear how the claimed method may improve the “payment information” when there is no reference to such within the recited claims. 

In sum, the limitations of claims 1, 2, and 13, considered individually and in combination, do not provide an inventive concept.  For the same reasons, claims 3-12 and 14-20 recite abstract ideas.
Applicant’s arguments with respect to the rejection of claim 1 under 35 U.S.C. 103 have been considered but are moot in view of the newly cited prior art combination.
Applicant's arguments with respect to the claim rejections under 35 U.S.C. 103 in view of Fu and Hwang have been fully considered but they are not persuasive.
Applicant asserts the argument that the cited prior art combination fails to disclose the newly recited claim feature of “selecting a tag corresponding to a highest correlation with the target image among multiple tags under a respective tag category of the multiple tag categories.”  See Amendment, page 16.  The Examiner respectfully disagrees.
Hwang is directed to the invention of a system for processing images.  Hwang discloses that an image processing apparatus may determine a category of a target image.  Specifically, Hwang discloses that “[t]he controller 150 may control the storage unit 130 to store additional information by mapping the additional information with the target image” wherein “[t]he additional information may include at least one of keywords related to the category including the target object.”  See Hwang, [0066].  That is, Hwang would disclose a system wherein keywords related to a category (i.e. tag categories) may be associated to a target image.
Applicant asserts the argument that “Hwang only achieves obtaining multiple tags for the same category.”  See Amendment, page 18.  The Examiner respectfully disagrees in that Hwang discloses that “the image analyzer 120 may determine in which of a plurality of categories an object presented by the target image is included.”  See Hwang, [0059].  Additionally, Hwang discloses that “[t]he image analyzer 120 may determine one or more categories including a plurality of target objects.”  See Hwang, [0060].
Additionally, Hwang discloses that “an operation of storing at least one of keywords related to an image by mapping the keyword with the image may be referred to "tagging an image"” such that “a 
Moreover, Hwang discloses that “[t]he image processing apparatus may determine whether the target object is included in the first category, based on the matching score” and “[f]or the shape of the target object to have a high matching score, the shape of the target object may have characteristics similar to the root model and part models of the first category” such that “for the shape of the target object to have a high matching score, the shapes of parts of the target object may not be separated far from positions learned with respect to the part models of the first category.”  See Hwang, [0185].  Wherein Hwang discloses that a high matching score is utilized to determine a category, Hwang would have indeed read upon the claimed feature of selecting a tag with the highest correlation under “a respective tag category of the multiple tag categories.”
Accordingly, the claim rejections under 35 U.S.C. 103 are maintained.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL KIM whose telephone number is (571)272-2737.  The examiner can normally be reached on Monday-Friday, 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/Paul Kim/
Examiner
Art Unit 2152



/PK/