DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings
Figures 4-5 and 8 have gray shading, which does not qualify as black ink on white paper. Examiner suggests either removing the shading, as it does not appear to be necessary to the drawings, or replacing the shading with black texturing.
Color photographs and color drawings are not accepted in utility applications unless a petition filed under 37 CFR 1.84(a)(2) is granted. Any such petition must be accompanied by the appropriate fee set forth in 37 CFR 1.17(h), one set of color drawings or color photographs, as appropriate, if submitted via EFS-Web or three sets of color drawings or color photographs, as appropriate, if not submitted via EFS-Web, and, unless already present, an amendment to include the following language as the first paragraph of the brief description of the drawings section of the specification:
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
Color photographs will be accepted if the conditions for accepting color drawings and black and white photographs have been satisfied. See 37 CFR 1.84(b)(2).

Claim Interpretation
Claim 9 recites “generating a translated tag in the target language for the input tag using a backup translation application”. The claim does not recite functionality but instead recites what the backup translation application is used for. Examiner suggests amending the claim to recite the functionality performed by the claimed method instead of reciting what the claim elements are used for.

Claim Objections
Claims 2-6, 13, 14, 18 and 20 are objected to because of the following informalities:
Claims 2, 3, 13 and 14 recite “a source language word”, which should read as --the source language word--, as it appears to be a typographical error and may cause an antecedent basis issue.
Claim 18 recites “comprising a source language word”, which should read as --comprising the source language word--, as it appears to be a typographical error and may cause an antecedent basis issue.
Claims 3, 14 and 18 recite “have source language tags”, which should read as --have the source language tags--, as it appears to be a typographical error and may cause an antecedent basis issue.
Claims 2, 3, 13, 14 and 18 recite “a target language word”, which should read as --the target language word--, as it appears to be a typographical error and may cause an antecedent basis issue.
Claim 4 recites “for each of the candidate translations, determining a context score based on co-occurrences of the candidate translation”, which should read as --for each of the candidate translations, determining a context score based on co-occurrences of each of the plurality of candidate translations--, and “selecting the candidate translation”, which should read as --selecting a first candidate translation--, as these appear to be typographical errors and may cause antecedent basis issues.
Claims 5-6 and 20 recite “candidate translation”, which should read as --first candidate translation--, as it appears to be a typographical error and may cause an antecedent basis issue.
Appropriate corrections are required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1:
Claims 1-11 are recited as being directed to a “method”. Claims 12-15 are recited as being directed to a “system”. Claims 16-20 are recited as being directed to a “computer-readable medium”. Thus claims 1-20 have been identified to be directed towards the appropriate statutory category. Below is further analysis related to step 2.

Regarding claim 1, 
Step 2A: Prong One: 
generating a co-occurrence data structure for a target language based on relevant images that are related to a set of target language words, wherein the co-occurrence data structure describes a co-occurrence of a target language word from the set of target language words and a source language word associated with the relevant images;
generating a plurality of candidate translations in the target language for the input tag; 
selecting a translated tag from the plurality of candidate translations based on the co-occurrence data structure indicating a higher relevance for the translated tag than a different translated tag from the plurality of candidate translations; and 
associating the translated tag with the input image.
These claim limitations appear to be reciting a “Mental Process” including evaluation which may be performed in a human mind. 
	A human being can mentally evaluate relevant images, source language words and target language words to generate a co-occurrence data structure. A human being can likewise evaluate to generate a plurality of candidate translations and select a translated tag from the plurality of candidate translations based on the generated co-occurrence data structure. A human mind can also evaluate to associate the translated tag with the input image.
Step 2A: Prong Two:
Claim 1 further recites limitations:
receiving an input tag for an input image in a source language;
This claim limitation as a whole has been identified as insignificant extra-solution activity. Per MPEP 2106.05(g) “Extra-solution activity includes both pre-solution and post-solution activity. An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered”. Similarly, the claim limitation above appears to amount to gathering data, namely receiving an input tag for an input image, and does not appear to integrate the abstract idea into a practical application.
Step 2B:
Claim 1 further recites limitations:
receiving an input tag for an input image in a source language;  
This claim limitation as a whole has been identified as insignificant extra-solution activity. Per MPEP 2106.05(g) “Extra-solution activity includes both pre-solution and post-solution activity. An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered”. Similarly, the claim limitation above appears to amount to gathering data, namely receiving an input tag for an input image; it does not appear to integrate the abstract idea into a practical application and appears to be conventional computer functionality. Also, MPEP 2106.05(d)(II) has identified “Receiving or transmitting data over a network, e.g., using the Internet to gather data” as conventional computer technology. The claim limitation identified above likewise appears to be receiving data. As a result, this claim limitation as a whole does not appear to amount to significantly more than the abstract idea itself.

Regarding claim 12, 
Step 2A: Prong One: 
generating a co-occurrence data structure for a target language based on relevant images that are related to a set of target language words, wherein the co-occurrence data structure describes a co-occurrence of a target language word in the set of target language words and a source language word in the relevant images;
obtaining a plurality of candidate translations in the target language for the input tag; 
determining a translated tag from the plurality of candidate translations based on the co-occurrence data structure; and 
associating the translated tag with the input image.
These claim limitations appear to be reciting a “Mental Process” including evaluation which may be performed in a human mind. 
	A human being can mentally evaluate relevant images, source language words and target language words to generate a co-occurrence data structure. A human being can likewise evaluate to obtain a plurality of candidate translations and determine a translated tag from the plurality of candidate translations based on the generated co-occurrence data structure. A human mind can also evaluate to associate the translated tag with the input image.
Step 2A: Prong Two:
Claim 12 further recites limitations:
A system comprising: 
a processing device; and 
a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising:
These claim limitations appear merely to add generic computer components that execute the abstract idea within a computer device (see MPEP 2106.05(b)) and do not appear to integrate the abstract idea into a practical application.
Claim 12 further recites limitations:
receiving an input tag for an input image in a source language;
This claim limitation as a whole has been identified as insignificant extra-solution activity. Per MPEP 2106.05(g) “Extra-solution activity includes both pre-solution and post-solution activity. An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered”. Similarly, the claim limitation above appears to amount to gathering data, namely receiving an input tag for an input image, and does not appear to integrate the abstract idea into a practical application.
Step 2B:
Claim 12 further recites limitations:
A system comprising: 
a processing device; and 
a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising:
These claim limitations appear merely to add generic computer components that execute the abstract idea within a computer device (see MPEP 2106.05(b)) and do not appear to amount to significantly more.
Claim 12 further recites limitations:
receiving an input tag for an input image in a source language;  
This claim limitation as a whole has been identified as insignificant extra-solution activity. Per MPEP 2106.05(g) “Extra-solution activity includes both pre-solution and post-solution activity. An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered”. Similarly, the claim limitation above appears to amount to gathering data, namely receiving an input tag for an input image; it does not appear to integrate the abstract idea into a practical application and appears to be conventional computer functionality. Also, MPEP 2106.05(d)(II) has identified “Receiving or transmitting data over a network, e.g., using the Internet to gather data” as conventional computer technology. The claim limitation identified above likewise appears to be receiving data. As a result, this claim limitation as a whole does not appear to amount to significantly more than the abstract idea itself.

Regarding claim 16, 
Step 2A: Prong One: 
generating a plurality of candidate translations in a target language for the input tag; 
determining a translated tag from the plurality of candidate translations based on the plurality of tags of the input image in the source language; and 
associating the translated tag with the input image.
These claim limitations appear to be reciting a “Mental Process” including evaluation which may be performed in a human mind. 
	A human being can evaluate to generate a plurality of candidate translations and determine a translated tag from the plurality of candidate translations based on the plurality of tags of the input image in the source language. A human mind can also evaluate to associate the translated tag with the input image.
Step 2A: Prong Two:
Claim 16 further recites limitations:
A non-transitory computer-readable medium having program code that is stored thereon, the program code executable by one or more processing devices for performing operations comprising:
These claim limitations appear merely to add generic computer components that execute the abstract idea within a computer device (see MPEP 2106.05(b)) and do not appear to integrate the abstract idea into a practical application.
Claim 16 further recites limitations:
receiving an input tag for an input image in a source language, wherein the input image has a plurality of tags in the source language other than the input tag;
These claim limitations as a whole have been identified as insignificant extra-solution activity. Per MPEP 2106.05(g) “Extra-solution activity includes both pre-solution and post-solution activity. An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered”. Similarly, the claim limitations above appear to amount to gathering data, namely receiving an input tag for an input image, and do not appear to integrate the abstract idea into a practical application.
Step 2B:
Claim 16 further recites limitations:
A non-transitory computer-readable medium having program code that is stored thereon, the program code executable by one or more processing devices for performing operations comprising:
These claim limitations appear merely to add generic computer components that execute the abstract idea within a computer device (see MPEP 2106.05(b)) and do not appear to amount to significantly more.
Claim 16 further recites limitations:
receiving an input tag for an input image in a source language, wherein the input image has a plurality of tags in the source language other than the input tag;  
These claim limitations as a whole have been identified as insignificant extra-solution activity. Per MPEP 2106.05(g) “Extra-solution activity includes both pre-solution and post-solution activity. An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered”. Similarly, the claim limitations above appear to amount to gathering data, namely receiving an input tag for an input image; they do not appear to integrate the abstract idea into a practical application and appear to be conventional computer functionality. Also, MPEP 2106.05(d)(II) has identified “Receiving or transmitting data over a network, e.g., using the Internet to gather data” as conventional computer technology. The claim limitations identified above likewise appear to be receiving data. As a result, these claim limitations as a whole do not appear to amount to significantly more than the abstract idea itself.

Regarding claims 2-8, 13-15, 17-20: 
Step 2A: 
Claim 2 recites:
wherein the co-occurrence of a target language word and a source language word is determined as a number of relevant images related to the target language word and associated with the source language word.
Claim 3 recites:
querying an image data store to identify a set of relevant images that are related to the target language word; 
determining source language tags of the set of relevant images, the source language tags comprising a source language word; and 
determining the co-occurrence of the target language word and the source language word as a number of images in the set of relevant images that have source language tags containing the source language word. 
Claim 4 recites:
for each of the candidate translations, determining a context score based on co-occurrences of the candidate translation and words in the plurality of tags of the input image other than the input tag; and 
selecting the candidate translation with a highest context score as the translated tag for the input tag.
	Claim 5 recites:
	wherein the context score of a candidate translation is further determined based on confidence scores associated with the plurality of tags of the input image other than the input tag.
Claim 6 recites:
	wherein the context score of a candidate translation is further determined based on a confidence score associated with the candidate translation. 
Claim 7 recites:
	wherein the input tag comprises a single word and wherein the plurality of candidate translations are generated based on a dictionary.
Claim 8 recites:
	further comprising determining that the input tag is a single-word tag, wherein generating the plurality of candidate translations in the target language for the input tag is performed in response to determining that the input tag is a single-word tag and based on a dictionary. 
Claim 17 recites:
	wherein the operations further comprise: generating a co-occurrence data structure for the target language based on relevant images that are related to a set of target language words, wherein the co-occurrence data structure describes a co-occurrence of a target language word in the set of target language words and a source language word in the relevant images; 
wherein determining the translated tag from the plurality of candidate translations is further based on the co-occurrence data structure.
These claim limitations appear to be reciting a “Mental Process” including evaluation which may be performed in a human mind. 
	A human being can evaluate relevant images to determine the co-occurrence of source language words and target language words. A human mind can evaluate to determine the relevant images that are related to the target language words, and the source language words in their tags, in order to determine the co-occurrence in the relevant images. A human being can evaluate to determine scores for the candidate translations based on the co-occurrence of each candidate translation and words in the plurality of tags of the input image, and to select the candidate translation with the highest context score. A human mind can evaluate to determine the context score based on the confidence scores associated with the plurality of tags of the input image. A human being can apply evaluation to determine the context score based on the confidence score associated with the candidate translation. A human being can evaluate to determine that the input tag comprises a single word and then generate the translations based on a dictionary. A human mind can evaluate to determine that the input tag is a single-word tag and generate candidate translations based on a dictionary.
Claim 13 incorporates substantively all the limitations of claim 2 in a system form and is rejected under the same rationale.
Claims 14 and 18 incorporate substantively all the limitations of claim 3 in a system and computer-readable medium form and are rejected under the same rationale.
Claims 15 and 19 incorporate substantively all the limitations of claim 4 in a system and computer-readable medium form and are rejected under the same rationale.
Claim 20 incorporates substantively all the limitations of claim 6 in computer-readable medium form and is rejected under the same rationale. 
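For illustration only, the counting recited in claims 2-3 and the selection recited in claim 4 can be sketched as follows. The sketch is not taken from the application. The image records, the field names `related_to` and `source_tags`, and the choice to sum co-occurrences into a context score are all assumptions made for the example; claim 4 only requires that the score be "based on" the co-occurrences.

```python
from collections import Counter

# Hypothetical image data store (all records invented for illustration).
# Each relevant image lists the target-language words it is related to
# and the source-language tags attached to it.
IMAGES = [
    {"related_to": {"Holz"}, "source_tags": {"wood", "desk", "table"}},
    {"related_to": {"Holz"}, "source_tags": {"wood", "desk"}},
    {"related_to": {"Wald"}, "source_tags": {"wood", "tree", "forest"}},
    {"related_to": {"Wald"}, "source_tags": {"forest", "tree"}},
]


def build_cooccurrence(images):
    """Count, per (target word, source word) pair, the relevant images that
    are related to the target word and carry the source word as a tag."""
    counts = Counter()
    for image in images:
        for target_word in image["related_to"]:
            for source_word in image["source_tags"]:
                counts[(target_word, source_word)] += 1
    return counts


def context_score(candidate, other_tags, cooccurrence):
    """Assumed scoring rule: sum the co-occurrences of the candidate
    translation with the other source-language tags of the input image."""
    return sum(cooccurrence[(candidate, tag)] for tag in other_tags)


def select_translation(candidates, other_tags, cooccurrence):
    """Return the candidate translation with the highest context score."""
    return max(candidates,
               key=lambda c: context_score(c, other_tags, cooccurrence))


cooccurrence = build_cooccurrence(IMAGES)
# Input tag "wood" on an image whose other tags are "desk" and "table".
translated_tag = select_translation(["Holz", "Wald"], ["desk", "table"],
                                    cooccurrence)
```

With this toy data the pair ("Holz", "desk") is counted in two images, so "Holz" outscores "Wald" for an image also tagged "desk" and "table".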

Regarding claim 9, 
Step 2A: Prong One: 
Claim 9 further recites limitations:
determining no candidate translations in the target language are generated for the input tag based on the dictionary; and responsive to determining that no candidate translations are generated based on the dictionary,…
These claim limitations appear to be reciting a “Mental Process” including evaluation which may be performed in a human mind. 
	A human being can apply evaluation to determine that no candidate translations in the target language are generated based on the dictionary. 
Step 2A: Prong Two:
Claim 9 further recites limitations:
…generating a translated tag in the target language for the input tag using a backup translation application.
These claim limitations appear to recite a translation application that translates information between languages. Language translation applications do not integrate the abstract idea into a practical application since they are conventional computer functions. Per MPEP 2106.05(b)(I), a computer that applies a judicial exception, such as an abstract idea, by use of conventional computer functions does not qualify as a particular machine. As a result, these claim limitations do not integrate the abstract idea into a practical application.
Step 2B:
Claim 9 further recites limitations:
… generating a translated tag in the target language for the input tag using a backup translation application.
These claim limitations appear to recite a translation application that translates information between languages. Language translation applications do not amount to significantly more since they are conventional computer functions. Per MPEP 2106.05(b)(I), a computer that applies a judicial exception, such as an abstract idea, by use of conventional computer functions does not qualify as a particular machine. As a result, these limitations merely execute the abstract idea without amounting to significantly more than the abstract idea. These references provide evidence that language translation applications are conventional computer technology: Blakely et al. (US 2002/0175937 A1 – [0003]), Ali et al. (US 2005/0198573 A1 – [0017]), Chu et al. (US 2006/0294463 A1 – [0031]), Bauman et al. (US 2007/0179775 A1 – [0054]), Alwan et al. (US 2007/0244691 A1 – [0029]) and Och et al. (US 2008/0262828 A1 – [0052]).

Regarding claim 10, 
Step 2A: Prong One: 
determining that the input tag is a multi-word tag; and responsive to determining that the input tag is a multi-word tag,…
 These claim limitations appear to be reciting a “Mental Process” including evaluation which may be performed in a human mind. 
	A human being can apply evaluation to determine that input tag is a multi-word tag. 
Step 2A: Prong Two:
Claim 10 further recites limitations:
…determining a translated tag for the input tag via a multi-word translation application.
These claim limitations appear to recite a translation application that translates information between languages. Language translation applications do not integrate the abstract idea into a practical application since they are conventional computer functions. Per MPEP 2106.05(b)(I), a computer that applies a judicial exception, such as an abstract idea, by use of conventional computer functions does not qualify as a particular machine. As a result, these claim limitations do not integrate the abstract idea into a practical application.
Step 2B:
Claim 10 further recites limitations:
…determining a translated tag for the input tag via a multi-word translation application.
These claim limitations appear to recite a translation application that translates information between languages. Language translation applications do not amount to significantly more since they are conventional computer functions. Per MPEP 2106.05(b)(I), a computer that applies a judicial exception, such as an abstract idea, by use of conventional computer functions does not qualify as a particular machine. As a result, these limitations merely execute the abstract idea without amounting to significantly more than the abstract idea. These references provide evidence that language translation applications are conventional computer technology: Blakely et al. (US 2002/0175937 A1 – [0003]), Ali et al. (US 2005/0198573 A1 – [0017]), Chu et al. (US 2006/0294463 A1 – [0031]), Bauman et al. (US 2007/0179775 A1 – [0054]), Alwan et al. (US 2007/0244691 A1 – [0029]) and Och et al. (US 2008/0262828 A1 – [0052]).
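For illustration only, the conditional flow recited across claims 8-10 (dictionary lookup for a single-word tag, a backup translation application when the dictionary yields no candidates, and a multi-word translation application for a multi-word tag) can be sketched as below. The function `translate_tag`, the toy dictionary, and the two stand-in translation callables are hypothetical and are not drawn from the application; whitespace splitting as the multi-word test is likewise an assumption.

```python
def translate_tag(input_tag, dictionary, backup_translate, multi_word_translate):
    """Route an input tag to a translation path and return candidate tags."""
    # Assumed multi-word test: more than one whitespace-separated word.
    if len(input_tag.split()) > 1:
        return [multi_word_translate(input_tag)]
    # Single-word tag: look up candidate translations in the dictionary.
    candidates = dictionary.get(input_tag, [])
    if not candidates:
        # No dictionary candidates: fall back to the backup application.
        return [backup_translate(input_tag)]
    return candidates


# Toy English-to-German dictionary and stand-in translation applications.
DICTIONARY = {"wood": ["Holz", "Wald"]}


def backup_app(tag):
    return f"backup:{tag}"


def multi_word_app(tag):
    return f"multi:{tag}"


single = translate_tag("wood", DICTIONARY, backup_app, multi_word_app)
missing = translate_tag("canon", DICTIONARY, backup_app, multi_word_app)
multi = translate_tag("wood desk", DICTIONARY, backup_app, multi_word_app)
```

Only the single-word dictionary path returns more than one candidate, which is the case in which a further selection step among candidates is needed.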
	
Regarding claim 11, 
Step 2A: 
Claim 11 further recites limitations:
wherein the multi-word translation application comprises a machine learning model trained to translate a multi-word tag from the source language to the target language.
These claim limitations appear to recite a machine learning translation application that translates information between languages. Machine learning language translation applications do not integrate the abstract idea into a practical application since they are conventional computer functions. Per MPEP 2106.05(b)(I), a computer that applies a judicial exception, such as an abstract idea, by use of conventional computer functions does not qualify as a particular machine. As a result, these claim limitations do not integrate the abstract idea into a practical application.
Step 2B:
Claim 11 further recites limitations:
wherein the multi-word translation application comprises a machine learning model trained to translate a multi-word tag from the source language to the target language.
These claim limitations appear to recite a translation application that translates information between languages. Machine learning language translation applications do not amount to significantly more since they are conventional computer functions. Per MPEP 2106.05(b)(I), a computer that applies a judicial exception, such as an abstract idea, by use of conventional computer functions does not qualify as a particular machine. As a result, these limitations merely execute the abstract idea without amounting to significantly more than the abstract idea. These references provide evidence that machine learning language translation applications are conventional computer technology: Menezes et al. (US 2009/0326911 A1 – [0002]), Drewes (US 2012/0284015 A1 – [0013]), Chen et al. (US 2018/0129972 A1 – [0035], [0047], [0063]) and Chung et al. (US 2019/0325308 A1 – [0047], [0048] and [0070]).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 4, 6, 12, 15-17 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Noh et al. (“An Automatic Translation of Tags for Multimedia Contents Using Folksonomy Networks”, July 19–23, 2009, download date 8 June 2020, hereinafter “Noh”) in view of Moore (US 2004/0098247 A1, hereinafter “Moore”).

Regarding claim 1, Noh teaches
	A method in which one or more processing devices perform operations comprising: (see Noh, [page1 col1 ¶1] “a novel method to translate tags attached to multimedia contents for cross-language retrieval”). 
generating a co-occurrence data structure for a target language (see Noh, [page1 col2 ¶3] “A cross-language image retrieval system can translate either the attached annotations or the queries… A translation candidate of a query word is assigned with a high coherence score when it co-occurs frequently with the translations of other query words”) based on relevant images (see Noh, [page1 col2 ¶2] “even when a relevant image annotated with one language is found, the images that are more relevant can be retrieved with other languages”) that are related to a set of target language words, (see Noh, [page2 col1 ¶2] “A tag set for each multimedia content can be represented as a single network that connects the tag words through folksonomy. A word can have several senses. Therefore, the network of the tag set in the source language is compared with multiple networks in the target language, since the structures of the networks in the target language are different from one another according to the translation candidates of the tag set. Among the possible networks in the target language, the most similar network to the tag network in the source language is chosen as the most probable translation of the tag set”; [page3 col1 ¶2] “this network, a node represents a tag, and an edge between two tags is made when they co-occur”) annotations associated with the relevant images; (see Noh, [page1 col2 ¶2] “even when a relevant image annotated with one language is found, the images that are more relevant can be retrieved with other languages”).
receiving an input tag for an input image in a source language; (see Noh, [page1 col1 ¶5] “Most of them are pictures and video clips uploaded by users of the services. Some websites such as Flickr, YouTube and Picasa encourage their users to upload and share their multimedia contents through folksonomy tagging systems”; [page2 col2 ¶2] “Many images are uploaded only with tags and titles, and few have other text annotations”).
generating a plurality of candidate translations in the target language for the input tag; (see Noh, [page4 col1 ¶1] “two English tag words “wood, desk” are attached to an object. Both words have more than one sense. Suppose that an English-German dictionary look-up shows two candidates for “wood” (Holz - wooden material, Wald - forest), and three candidates for “desk” (Schalter - as a counter, Schreibtisch - a place for reading and writing, Tisch - a table). If one source tag is translated into one target tag, there are six candidates: “Holz, Schalter”, “Holz, Schreibtisch”, “Holz, Tisch”, “Wald, Schalter”, “Wald, Schreibtisch” and “Wald, Tisch”.”).
selecting a translated tag from the plurality of candidate translations based on tag networks (see Noh, [page4 col1 ¶2] “By comparing the tag network of “wood, desk” one by one with tag networks of all six candidates, the most probable translation is chosen by picking up the most similar network”) indicating a higher relevance for the translated tag than a different translated tag from the plurality of candidate translations; and (see Noh, [page7 col2 ¶4] “a strong connection is established between the English networks of “music” and “underground”, while very few sub-structures are shared between the English tags “music” and “(a sense of) subway”. Thus, the similarity of “Untergrund, Musik” is far higher than that of “U-bahn, Musik””).
associating the translated tag with the input image (see Noh, [page6 col1 ¶1] “The answer set is prepared for each tag. For each image in the test set, both its original tags (English) and their translation candidates (German) are shown… selects one or more translation candidates as proper German tags for each image… a tag “canon” can be attached on an image as a company name”).
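For illustration only, the six candidate tag sets in the quoted “wood, desk” passage of Noh follow from taking one dictionary candidate per source tag. The sketch below reproduces only that enumeration; it does not implement Noh's tag-network comparison, and the function name `candidate_tag_sets` is hypothetical.

```python
from itertools import product

# Dictionary candidates as listed in the quoted "wood, desk" example.
CANDIDATES = {
    "wood": ["Holz", "Wald"],
    "desk": ["Schalter", "Schreibtisch", "Tisch"],
}


def candidate_tag_sets(source_tags, candidates):
    """Enumerate every combination of one target tag per source tag."""
    return list(product(*(candidates[tag] for tag in source_tags)))


tag_sets = candidate_tag_sets(["wood", "desk"], CANDIDATES)
# Two candidates for "wood" times three for "desk" gives six tag sets.
```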
Noh does not explicitly teach wherein the co-occurrence data structure describes a co-occurrence of a target language word from the set of target language words and a source language word; selecting a translated tag from the plurality of candidate translations based on the co-occurrence data structure. 
However, Moore discloses phrase translation relationships and also teaches
wherein the co-occurrence data structure describes a co-occurrence of a target language word from the set of target language words and a source language word (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are  computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”).
determining best phrase translations based on the co-occurrence data structure (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are  computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”; [0124] “Therefore, model 408 iterates on steps 754 and 756 until the best phrase translations remain the same, or stabilize”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of the co-occurrence data structure, as disclosed and taught by Moore, in the system taught by Noh to yield the predictable result of efficiently maintaining the safety, integrity, security and confidentiality of the data (see Moore, [0121] “Model 406 thus produces a new set of most likely translations which take into account the effect of translations across the entire corpus. This is indicated by block 711. This set of most likely translations is provided to model 408”).
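For illustration only, the word-association computation quoted from Moore [0076] can be sketched as follows. This is a minimal sketch under stated assumptions and is not Moore's implementation: the function names, the toy aligned corpus, and the use of a Dice coefficient as the association measure are all hypothetical stand-ins chosen for brevity. The sketch only shows the three quantities named in the quotation: how often Ws occurs in the source part of the corpus, how often Wt occurs in the target part, and how often the two co-occur in aligned sentences.

```python
from collections import Counter


def association_counts(aligned_pairs):
    """Count per-word sentence frequencies and co-occurrences in aligned pairs."""
    src_freq, tgt_freq, joint = Counter(), Counter(), Counter()
    for src_sentence, tgt_sentence in aligned_pairs:
        # Use sets so each word is counted once per sentence.
        src_words = set(src_sentence.lower().split())
        tgt_words = set(tgt_sentence.lower().split())
        src_freq.update(src_words)
        tgt_freq.update(tgt_words)
        joint.update((ws, wt) for ws in src_words for wt in tgt_words)
    return src_freq, tgt_freq, joint


def dice(ws, wt, src_freq, tgt_freq, joint):
    """Dice coefficient: an assumed stand-in for the association score."""
    denom = src_freq[ws] + tgt_freq[wt]
    return 2 * joint[(ws, wt)] / denom if denom else 0.0


# Hypothetical three-sentence aligned corpus (English source, German target).
corpus = [
    ("the wood desk", "der Schreibtisch aus Holz"),
    ("a desk", "ein Schreibtisch"),
    ("the wood", "das Holz"),
]
src_freq, tgt_freq, joint = association_counts(corpus)
desk_schreibtisch = dice("desk", "schreibtisch", src_freq, tgt_freq, joint)
desk_holz = dice("desk", "holz", src_freq, tgt_freq, joint)
```

In this toy corpus, "desk" scores higher with "schreibtisch" than with "holz" because the first pair co-occurs in every sentence where either word appears.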

Regarding claim 12, Noh teaches
generating a co-occurrence data structure for a target language (see Noh, [page1 col2 ¶3 “A cross-language image retrieval system can translate either the attached annotations or the queries… A translation candidate of a query word is assigned with a high coherence score when it co-occurs frequently with the translations of other query words”) based on relevant images (see Noh, [page1 col2 ¶2 “even when a relevant image annotated with one language is found, the images that are more relevant can be retrieved with other languages”) that are related to a set of target language words, (see Noh, [page2 col1 ¶2] “A tag set for each multimedia content can be represented as a single network that connects the tag words through folksonomy. A word can have several senses. Therefore, the network of the tag set in the source language is compared with multiple networks in the target language, since the structures of the networks in the target language are different from one another according to the translation candidates of the tag set. Among the possible networks in the target language, the most similar network to the tag network in the source language is chosen as the most probable translation of the tag set”; [page3 col1 ¶2] “this network, a node represents a tag, and an edge between two tags is made when they co-occur”) annotations in the relevant images; (see Noh, [page1 col2 ¶2 “even when a relevant image annotated with one language is found, the images that are more relevant can be retrieved with other languages”). 
receiving an input tag for an input image in the source language; (see Noh, [page1 col1 ¶5] “Most of them are pictures and video clips uploaded by users of the services. Some websites such as Flickr, YouTube and Picasa encourage their users to upload and share their multimedia contents through folksonomy tagging systems”; [page2 col2 ¶2 “Many images are uploaded only with tags and titles, and few have other text annotations”).
obtaining a plurality of candidate translations in the target language for the input tag; (see Noh, [page4 col1 ¶1 “two English tag words “wood, desk” are attached to an object. Both words have more than one sense. Suppose that an English-German dictionary look-up shows two candidates for “wood” (Holz - wooden material, Wald - forest), and three candidates for “desk” (Schalter - as a counter, Schreibtisch - a place for reading and writing, Tisch - a table). If one source tag is translated into one target tag, there are six candidates: “Holz, Schalter”, “Holz, Schreibtisch”, “Holz, Tisch”, “Wald, Schalter”, “Wald, Schreibtisch” and “Wald, Tisch”.”).
determining a translated tag from the plurality of candidate translations based on tag networks (see Noh, [page4 col1 ¶2 “By comparing the tag network of “wood, desk” one by one with tag networks of all six candidates, the most probable translation is chosen by picking up the most similar network”).
and associating the translated tag with the input image (see Noh, [page6 col1 ¶1] “The answer set is prepared for each tag. For each image in the test set, both its original tags (English) and their translation candidates (German) are shown… selects one or more translation candidates as proper German tags for each image… a tag “canon” can be attached on an image as a company name”).
Noh does not explicitly teach A system comprising: a processing device; and a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising: wherein the co-occurrence data structure describes a co-occurrence of a target language word in the set of target language words and a source language word; determining a translated tag from the plurality of candidate translations based on the co-occurrence data structure. 
However, Moore discloses phrase translation relationships and also teaches
A system comprising: (see Moore, [0043] “a computer 20 in accordance with one illustrative embodiment”). 
a processing device; and (see Moore, [0043] “a computer 20 in accordance with one illustrative embodiment”). 
a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising: (see Moore, [0044] “The drives and the associated computer-readable media provide nonvolatile storage of computer readable instructions, data structures, program modules and other data for the personal computer 20”).
wherein the co-occurrence data structure describes a co-occurrence of a target language word in the set of target language words and a source language word (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”).
determining best phrase translations based on the co-occurrence data structure (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are  computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”; [0124] “Therefore, model 408 iterates on steps 754 and 756 until the best phrase translations remain the same, or stabilize”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of the system and the co-occurrence data structure, as disclosed and taught by Moore, in the system taught by Noh to yield the predictable result of efficiently maintaining the safety, integrity, security and confidentiality of the data (see Moore, [0121] “Model 406 thus produces a new set of most likely translations which take into account the effect of translations across the entire corpus. This is indicated by block 711. This set of most likely translations is provided to model 408”).
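For illustration only, the candidate enumeration described in the cited Noh passage (page4 col1 ¶1) can be sketched as follows. The dictionary entries are taken from the quoted “wood, desk” example; the variable and function names are hypothetical, and the sketch assumes, as the quotation does, that each source tag is translated into exactly one target tag.

```python
from itertools import product

# English-German look-up from the quoted Noh example.
dictionary = {
    "wood": ["Holz", "Wald"],
    "desk": ["Schalter", "Schreibtisch", "Tisch"],
}


def candidate_tag_sets(tags, dictionary):
    """Enumerate every combination of one translation per source tag."""
    options = [dictionary[tag] for tag in tags]
    return [tuple(combo) for combo in product(*options)]


candidates = candidate_tag_sets(["wood", "desk"], dictionary)
```

With two senses for “wood” and three for “desk”, the enumeration yields the six candidate tag sets listed in the quotation.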

Regarding claim 16, Noh teaches
receiving an input tag for an input image in a source language, (see Noh, [page1 col1 ¶5] “Most of them are pictures and video clips uploaded by users of the services. Some websites such as Flickr, YouTube and Picasa encourage their users to upload and share their multimedia contents through folksonomy tagging systems”; [page2 col2 ¶2 “Many images are uploaded only with tags and titles, and few have other text annotations”) wherein the input image has a plurality of tags in the source language other than the input tag; (see Noh, [page4 col1 ¶1] “For example, two English tag words “wood, desk” are attached to an object”).  
generating a plurality of candidate translations in a target language for the input tag; (see Noh, [page4 col1 ¶1 “two English tag words “wood, desk” are attached to an object. Both words have more than one sense. Suppose that an English-German dictionary look-up shows two candidates for “wood” (Holz - wooden material, Wald - forest), and three candidates for “desk” (Schalter - as a counter, Schreibtisch - a place for reading and writing, Tisch - a table). If one source tag is translated into one target tag, there are six candidates: “Holz, Schalter”, “Holz, Schreibtisch”, “Holz, Tisch”, “Wald, Schalter”, “Wald, Schreibtisch” and “Wald, Tisch”.”).
determining a translated tag from the plurality of candidate translations based on the plurality of tags of the input image in the source language; and (see Noh, [page4 col1 ¶2 “By comparing the tag network of “wood, desk” one by one with tag networks of all six candidates, the most probable translation is chosen by picking up the most similar network”). 
associating the translated tag with the input image (see Noh, [page6 col1 ¶1 “The answer set is prepared for each tag. For each image in the test set, both its original tags (English) and their translation candidates (German) are shown… selects one or more translation candidates as proper German tags for each image… a tag “canon” can be attached on an image as a company name”).
Noh does not explicitly teach A non-transitory computer-readable medium having program code that is stored thereon, the program code executable by one or more processing devices for performing operations comprising:
However, Moore discloses phrase translation relationships and also teaches
A non-transitory computer-readable medium having program code that is stored thereon, the program code executable by one or more processing devices for performing operations comprising: (see Moore, [0043] “a computer 20 in accordance with one illustrative embodiment”; [0044] “The drives and the associated computer-readable media provide nonvolatile storage of computer readable instructions, data structures, program modules and other data for the personal computer 20”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of the computer-readable medium and the co-occurrence data structure, as disclosed and taught by Moore, in the system taught by Noh to yield the predictable result of efficiently maintaining the safety, integrity, security and confidentiality of the data (see Moore, [0121] “Model 406 thus produces a new set of most likely translations which take into account the effect of translations across the entire corpus. This is indicated by block 711. This set of most likely translations is provided to model 408”).

	Regarding claim 4, the proposed combination of Noh and Moore teaches
	wherein the input image has a plurality of tags in the source language including the input tag, and (see Noh, [page4 col1 ¶1 “two English tag words “wood, desk” are attached to an object”) wherein determining a translated tag from the plurality of candidate translations (see Noh, [page4 col1 ¶2 “By comparing the tag network of “wood, desk” one by one with tag networks of all six candidates, the most probable translation is chosen by picking up the most similar network”) based on the co-occurrence data structure comprises: (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are  computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”; [0124] “Therefore, model 408 iterates on steps 754 and 756 until the best phrase translations remain the same, or stabilize”).
for each of the candidate translations, determining a context score based on co-occurrences of the candidate translation and words in the plurality of tags of the input image other than the input tag; and (see Noh, [page1 col2 ¶3] “the coherence score of a translation candidate is computed using word co-occurrence statistics. A translation candidate of a query word is assigned with a high coherence score when it co-occurs frequently with the translations of other query words”).
selecting the candidate translation with a highest context score as the translated tag for the input tag (see Noh, [page1 col2 ¶3] “the coherence score of a translation candidate is computed using word co-occurrence statistics. A translation candidate of a query word is assigned with a high coherence score when it co-occurs frequently with the translations of other query words”; [page2 col1 ¶1] “in both tag translation and query translation is to select the most probable translation candidate of each tag or query word. This is often called translation selection”). The motivation for the proposed combination is maintained. 
Claims 15 and 19 incorporate substantively all the limitations of claim 4 in system and computer-readable medium form, respectively, and are rejected under the same rationale.
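For illustration only, the selection recited in claim 4 (a context score per candidate translation, computed from co-occurrences with the other tags of the input image, followed by selection of the highest-scoring candidate) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the summation used to combine co-occurrences, and every count in the table are hypothetical and are not taken from Noh, Moore, or the application.

```python
def context_score(candidate, other_tags, cooccurrence):
    """Sum the co-occurrence counts of a candidate with each other tag."""
    return sum(cooccurrence.get((candidate, tag), 0) for tag in other_tags)


def select_translation(candidates, other_tags, cooccurrence):
    """Return the candidate translation with the highest context score."""
    return max(
        candidates,
        key=lambda cand: context_score(cand, other_tags, cooccurrence),
    )


# Hypothetical co-occurrence table keyed by (target word, source word).
cooccurrence = {
    ("Schreibtisch", "wood"): 12,
    ("Schreibtisch", "office"): 30,
    ("Schalter", "wood"): 1,
    ("Schalter", "office"): 4,
    ("Tisch", "wood"): 15,
    ("Tisch", "office"): 6,
}

# Input tag "desk" on an image that also carries the tags "wood" and "office".
other_tags = ["wood", "office"]
best = select_translation(
    ["Schalter", "Schreibtisch", "Tisch"], other_tags, cooccurrence
)
```

Under these assumed counts, “Schreibtisch” is selected because it co-occurs most often with the remaining tags of the image.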

	Regarding claim 6, the proposed combination of Noh and Moore teaches
	wherein the context score of a candidate translation is further determined based on (see Noh, [page1 col2 ¶3] “the coherence score of a translation candidate is computed using word co-occurrence statistics. A translation candidate of a query word is assigned with a high coherence score when it co-occurs frequently with the translations of other query words”) a confidence score associated with the candidate translation (see Moore, [0086] “This generates a consistent set of log-likelihood-ratio scores to use as a confidence measure for the phrase translation pairs 410”; [claim 31] “to convert the modified score into a desired confidence metric indicative of a confidence level associated with the candidate phrase as a translation of the source language phrase”). The motivation for the proposed combination is maintained. 
	Claim 20 incorporates substantively all the limitations of claim 6 in computer-readable medium form and is rejected under the same rationale.

	Regarding claim 17, the proposed combination of Noh and Moore teaches
wherein the operations further comprise: generating a co-occurrence data structure for the target language (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”) based on relevant images (see Noh, [page1 col2 ¶2] “even when a relevant image annotated with one language is found, the images that are more relevant can be retrieved with other languages”) that are related to a set of target language words, wherein the co-occurrence data structure describes a co-occurrence of a target language word in the set of target language words and a source language word (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”) in the relevant images; (see Noh, [page1 col2 ¶2] “even when a relevant image annotated with one language is found, the images that are more relevant can be retrieved with other languages”).
wherein determining the translated tag from the plurality of candidate translations is further based on (see Noh, [page4 col1 ¶2 “By comparing the tag network of “wood, desk” one by one with tag networks of all six candidates, the most probable translation is chosen by picking up the most similar network”) the co-occurrence data structure (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are  computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”; [0124] “Therefore, model 408 iterates on steps 754 and 756 until the best phrase translations remain the same, or stabilize”). The motivation for the proposed combination is maintained.

Claims 2-3, 13-14 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Noh and Moore in view of Hewavitharana et al. (US 2018/0075508 A1, hereinafter “Hewavitharana”).

Regarding claim 2, the proposed combination of Noh and Moore teaches
wherein the co-occurrence of a target language word and a source language word is determined as (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are  computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”). 
The proposed combination of Noh and Moore does not explicitly teach a number of relevant images related to the target language word and associated with the source language word.
However, Hewavitharana discloses translating a listing from a first language to a second language and also teaches
a number of relevant images related to the target language word and associated with the source language word (see Hewavitharana, [0014] “the Listing Engine translates a first listing from a first language to a second language. The first listing includes at least one image of a first item… The Listing Engine provides as input to an encoded neural network model a translated first listing and a second listing in the second language. The second listing includes at least one image of a second item… The Listing Engine obtains from the encoded neural network model a first feature vector for the translated first listing and a second feature vector for the second listing. The first and the second feature vectors both include at least one type of image signature feature and at least one type of listing text-based feature”; [0053] “the Listing Engine 150 inputs image portions into an encoded neural network model. For each pairing of listings, image portions are input into a first encoded neural network model. For example, in reference to the particular listing pair, an image portion(s) from the first listing and an image portion(s) from the second listing is input into the first encoded neural network model. The first encoded neural network model returns as first output a first similarity score. The first similarity score is representative of a degree of similarity between the various input image portions of the first and second listing”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include functionality of determining images based on association between target and source language words as being disclosed and taught by Hewavitharana, in the system taught by the proposed combination of Noh and Moore to yield the predictable results of increasing efficiency in identifying listing pairs for training data based on similarities identified (see Hewavitharana, [0054] “the Listing Engine 150 determines if the first similarity score satisfies a first threshold score. If the first threshold score is satisfied, the Listing Engine 150 executes operation 520. As such, the Listing Engine 150 performs a first similarity pass between the first and second listing so as to increase the efficiency in identifying listings pairs for training data”). 
Claim 13 incorporates substantively all the limitations of claim 2 in a system form and is rejected under the same rationale. 

Regarding claim 3, the proposed combination of Noh and Moore teaches
wherein the co-occurrence of a target language word and a source language word is determined by: (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are  computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”).
to identify a set of relevant images (see Noh, [page1 col2 ¶2 “even when a relevant image annotated with one language is found, the images that are more relevant can be retrieved with other languages”).
of the set of relevant images, (see Noh, [page1 col2 ¶2 “even when a relevant image annotated with one language is found, the images that are more relevant can be retrieved with other languages”)
determining the co-occurrence of the target language word and the source language word (see Moore, [0076] “that computes word association scores for individual word pairs and multi-word pairs previously described, except that multi-words are broken down into their constituent single words before word association scores are  computed. In other words, the degree of association between a word (Ws) in the source language sentence and a word (Wt) in the target language sentence is computed in terms of the frequencies with which Ws occurs in sentences of the source language (S) part of the corpus and Wt occurs in sentences of the target language (T) part of the corpus, compared to the frequency with which Ws and Wt co-occur in aligned sentences of the corpus”) as a number of images in the set of relevant images (see Noh, [page1 col2 ¶2 “even when a relevant image annotated with one language is found, the images that are more relevant can be retrieved with other languages”).
The proposed combination of Noh and Moore does not explicitly teach querying an image data store to identify a set of relevant images that are related to the target language word; determining source language tags of the set of relevant images, the source language tags comprising a source language word; and a number of images in the set of relevant images that have source language tags containing the source language word.
However, Hewavitharana discloses translating a listing from a first language to a second language and also teaches
querying an image data store to access images (see Hewavitharana, [0024] “the databases 126 are storage devices that store information to be posted (e.g., publications or listings) to the publication system 120. The databases 126 may also store digital item information in accordance with example embodiments, such as a plurality of listings in various languages. Each listing may have one or more images”; [0027] “the Listing Engine 150 may access the listings from the databases 126”) that are related to the target language word; (see Hewavitharana, [0014] “The second listing includes at least one image of a second item”; [0052] “accesses… a second plurality of listings in a target language”).
determining source language tags to determine images, the source language tags comprising a source language word; and (see Hewavitharana, [0014] “The first listing includes at least one image of a first item”; [0052] “accesses a first plurality of listings in a source language”).
determining images that have source language tags containing the source language word (see Hewavitharana, [0014] “The first listing includes at least one image of a first item”; [0052] “accesses a first plurality of listings in a source language”). 
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include functionality of determining images based on association between target and source language words as being disclosed and taught by Hewavitharana, in the system taught by the proposed combination of Noh and Moore to yield the predictable results of increasing efficiency in identifying listing pairs for training data based on similarities identified (see Hewavitharana, [0054] “the Listing Engine 150 determines if the first similarity score satisfies a first threshold score. If the first threshold score is satisfied, the Listing Engine 150 executes operation 520. As such, the Listing Engine 150 performs a first similarity pass between the first and second listing so as to increase the efficiency in identifying listings pairs for training data”).
Claims 14 and 18 incorporate substantively all the limitations of claim 3 in system and computer-readable medium form, respectively, and are rejected under the same rationale.
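For illustration only, the counting recited in claim 3 (querying an image data store for images related to a target language word, then counting how many of those images have source language tags containing the source language word) can be sketched as follows. This is a minimal sketch under stated assumptions: the in-memory list standing in for the image data store, the `ImageRecord` type, the rule that an image is "related to" a target word when it carries that word as a target-language tag, and all sample records are hypothetical and are not drawn from the application or from the cited references.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ImageRecord:
    image_id: str
    target_tags: List[str]  # tags in the target language
    source_tags: List[str]  # tags in the source language


def relevant_images(store, target_word):
    """Assumed relevance rule: the image carries the target word as a tag."""
    return [img for img in store if target_word in img.target_tags]


def cooccurrence_count(store, target_word, source_word):
    """Count relevant images whose source tags contain the source word."""
    return sum(
        1
        for img in relevant_images(store, target_word)
        if any(source_word in tag.split() for tag in img.source_tags)
    )


# Hypothetical image data store.
store = [
    ImageRecord("img1", ["Schreibtisch"], ["desk", "wood"]),
    ImageRecord("img2", ["Schreibtisch"], ["office desk"]),
    ImageRecord("img3", ["Schreibtisch"], ["lamp"]),
    ImageRecord("img4", ["Wald"], ["wood"]),
]
schreibtisch_desk = cooccurrence_count(store, "Schreibtisch", "desk")
```

Splitting each tag on whitespace lets a multi-word source tag such as "office desk" count as containing the word "desk".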

Claims 5, 7-8 and 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Noh and Moore in view of Li et al. (US 10,242,034 A1, hereinafter “Li”).

Regarding claim 5, the proposed combination of Noh and Moore teaches
wherein the context score of a candidate translation is further determined based on (see Noh, [page1 col2 ¶3] “the coherence score of a translation candidate is computed using word co-occurrence statistics. A translation candidate of a query word is assigned with a high coherence score when it co-occurs frequently with the translations of other query words”).
The proposed combination of Noh and Moore does not explicitly teach confidence scores associated with the plurality of tags of the input image other than the input tag.
However, Li discloses determining related digital images and also teaches
	confidence scores associated with the plurality of tags of the input image other than the input tag (see Li, [col20 lines39-47] “the confidence score could be a confidence score for one tag that identifies a confidence that the tag is correct, or a confidence score for a combination of tags that identifies a confidence that the combination of tags are correct. For instance, a confidence score that an individual is included within one or more images may be high, but the confidence for the individual is attending a particular event or is at a particular location may be low” – Here the confidence score does not include the inputted tag).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include functionality of confidence scores associated with tags and tags being input as single or multiple words as being disclosed and taught by Li, in the system taught by the proposed combination of Noh and Moore to yield the predictable results of effectively determining images related to each other and of interest to a user (see Li, [col1 line59 – col2 line3] “The following detailed description is directed to technologies for the intelligent selection of images to create image narratives. As discussed above, while users can take, acquire, and store a large number of digital images, accessing these images at later points in times has proven difficult… the user may select an image narrative that includes images of the user that were programmatically determined to be of interest to the user. As used herein, "image narrative" may refer to a collection of digital images determined to be related to each other and of interest to a user”).

Regarding claim 7, the proposed combination of Noh and Moore teaches
and wherein the plurality of candidate translations are generated based on a dictionary (see Noh, [page6 col1 ¶6 “where n is the number of candidates listed in the dictionary for the translation of x”).
The proposed combination of Noh and Moore does not explicitly teach wherein the input tag comprises a single word.
However, Li discloses determining related digital images and also teaches
wherein the input tag comprises a single word (see Li, [col2 lines38-40] “a user may specify one or more keywords (e.g., tags) that may be used to generate an image narrative” - one keyword is interpreted as single-word tag). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include functionality of confidence scores associated with tags and tags being input as single or multiple words as being disclosed and taught by Li, in the system taught by the proposed combination of Noh and Moore to yield the predictable results of effectively determining images related to each other and of interest to a user (see Li, [col1 line59 – col2 line3] “The following detailed description is directed to technologies for the intelligent selection of images to create image narratives. As discussed above, while users can take, acquire, and store a large number of digital images, accessing these images at later points in times has proven difficult… the user may select an image narrative that includes images of the user that were programmatically determined to be of interest to the user. As used herein, "image narrative" may refer to a collection of digital images determined to be related to each other and of interest to a user”).

Regarding claim 8, the proposed combination of Noh and Moore teaches
wherein generating the plurality of candidate translations in the target language for the input tag is performed in response to determining that the input tag is a single-word tag and based on a dictionary (see Noh, [page6 col1 ¶6] “where n is the number of candidates listed in the dictionary for the translation of x”).
The proposed combination of Noh and Moore does not explicitly teach further comprising determining that the input tag is a single-word tag.
However, Li discloses determining related digital images and also teaches
further comprising determining that the input tag is a single-word tag (see Li, [col2 lines38-40] “a user may specify one or more keywords (e.g., tags) that may be used to generate an image narrative” – one keyword is interpreted as a single-word tag).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of confidence scores associated with tags, and of tags being input as single or multiple words, as disclosed and taught by Li, in the system taught by the proposed combination of Noh and Moore, to yield the predictable results of effectively determining images related to each other and of interest to a user (see Li, [col1 line59 – col2 line3] “The following detailed description is directed to technologies for the intelligent selection of images to create image narratives. As discussed above, while users can take, acquire, and store a large number of digital images, accessing these images at later points in times has proven difficult… the user may select an image narrative that includes images of the user that were programmatically determined to be of interest to the user. As used herein, "image narrative" may refer to a collection of digital images determined to be related to each other and of interest to a user”).

Regarding claim 10, the proposed combination of Noh and Moore teaches
and responsive to determining that the input tag is a multi-word tag, determining a translated tag for the input tag via a multi-word translation application (see Noh, [page4 col1 ¶1] “For example, two English tag words “wood, desk” are attached to an object. Both words have more than one sense. Suppose that an English-German dictionary look-up shows two candidates for “wood” (Holz - wooden material, Wald - forest), and three candidates for “desk” (Schalter - as a counter, Schreibtisch - a place for reading and writing, Tisch - a table). If one source tag is translated into one target tag, there are six candidates: “Holz, Schalter”, “Holz, Schreibtisch”, “Holz, Tisch”, “Wald, Schalter”, “Wald, Schreibtisch” and “Wald, Tisch””; [page4 col2 ¶2] “Kernel methods are popular in the machine learning community with numerous applications in data mining”).
The proposed combination of Noh and Moore does not explicitly teach determining that the input tag is a multi-word tag. 
However, Li discloses determining related digital images and also teaches
determining that the input tag is a multi-word tag; and (see Li, [col2 lines38-40] “a user may specify one or more keywords (e.g., tags) that may be used to generate an image narrative” – more than one keyword is interpreted as a multi-word tag).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of confidence scores associated with tags, and of tags being input as single or multiple words, as disclosed and taught by Li, in the system taught by the proposed combination of Noh and Moore, to yield the predictable results of effectively determining images related to each other and of interest to a user (see Li, [col1 line59 – col2 line3] “The following detailed description is directed to technologies for the intelligent selection of images to create image narratives. As discussed above, while users can take, acquire, and store a large number of digital images, accessing these images at later points in times has proven difficult… the user may select an image narrative that includes images of the user that were programmatically determined to be of interest to the user. As used herein, "image narrative" may refer to a collection of digital images determined to be related to each other and of interest to a user”).

Regarding claim 11, the proposed combination of Noh, Moore and Li teaches
wherein the multi-word translation application comprises a machine learning model trained to translate a multi-word tag from the source language to the target language (see Noh, [page4 col1 ¶1] “For example, two English tag words “wood, desk” are attached to an object. Both words have more than one sense. Suppose that an English-German dictionary look-up shows two candidates for “wood” (Holz - wooden material, Wald - forest), and three candidates for “desk” (Schalter - as a counter, Schreibtisch - a place for reading and writing, Tisch - a table). If one source tag is translated into one target tag, there are six candidates: “Holz, Schalter”, “Holz, Schreibtisch”, “Holz, Tisch”, “Wald, Schalter”, “Wald, Schreibtisch” and “Wald, Tisch””; [page4 col2 ¶2] “Kernel methods are popular in the machine learning community with numerous applications in data mining”).
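
For illustration only, the candidate enumeration described in the quoted Noh passage can be sketched as below. The dictionary entries are taken from the quotation; the names `DICTIONARY` and `candidate_tag_pairs` and the data layout are assumptions of this sketch and are not drawn from Noh or from the claims.

```python
from itertools import product

# English-German look-up entries as listed in the quoted Noh passage.
DICTIONARY = {
    "wood": ["Holz", "Wald"],
    "desk": ["Schalter", "Schreibtisch", "Tisch"],
}

def candidate_tag_pairs(tags, dictionary):
    """Enumerate every combination of one dictionary candidate per source tag.

    Hypothetical helper: returns an empty list if any tag has no dictionary entry.
    """
    per_word = [dictionary.get(tag, []) for tag in tags]
    return [", ".join(combo) for combo in product(*per_word)]

candidates = candidate_tag_pairs(["wood", "desk"], DICTIONARY)
for candidate in candidates:
    print(candidate)
```

Two candidates for "wood" and three for "desk" give the six combinations recited in the passage.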

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Noh, Moore and Li in view of Qian et al. (US 2009/0210214 A1, hereinafter “Qian”).

Regarding claim 9, the proposed combination of Noh, Moore and Li teaches
determining no candidate translations in the target language are generated for the input tag based on the dictionary; and (see Noh, [page6 col1 ¶1] “The selection of “none of the above” is also an option for the translator to indicate that none of the candidates are proper for the given tag… If a dictionary contains the translation candidates only for common nouns the translator marks it with “none of the above””). 
responsive to determining that no candidate translations are generated based on the dictionary, (see Noh, [page6 col1 ¶1] “The selection of “none of the above” is also an option for the translator to indicate that none of the candidates are proper for the given tag… If a dictionary contains the translation candidates only for common nouns the translator marks it with “none of the above””).
The proposed combination of Noh, Moore and Li does not explicitly teach responsive to determining that no candidate translations are generated based on the dictionary, generating a translated tag in the target language for the input tag using a backup translation application. 
However, Qian discloses receiving translations and also discloses
if the translations cannot be determined by dictionary, generating a translated tag in the target language for the input tag using a backup translation application (see Qian, [0022] “if the local translation model (e.g., translation model 209) cannot perform a particular translation (e.g., a dictionary of the translation model 209 does not include a particular word or phrase, or the translation model 209 does not include the appropriate language model), then the universal language input method editor 206 uses the translation server 212”; [0023] “for translating input into an application interface as the input is detected. For convenience, translation of input into an application interface as the input is detected will be described with respect to a system that performs the translation”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the functionality of utilizing translation applications, as disclosed and taught by Qian, in the system taught by the proposed combination of Noh, Moore and Li, to yield the predictable results of automatically providing one or more translations of the input (see Qian, [0014] “the user provides input to be translated 125 in user input area 120. A universal language input method editor can detect the input and automatically provide one or more translations of the input”).
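
For illustration only, the dictionary-first, backup-second control flow relied upon in the cited Qian passage can be sketched as below. The names `translate_tag` and `stub_backup` and the dictionary contents are hypothetical stand-ins assumed for this sketch; they are not Qian's translation model 209 or translation server 212, and they are not the claimed backup translation application.

```python
from typing import Callable, Dict, List

def translate_tag(
    tag: str,
    dictionary: Dict[str, List[str]],
    backup_translate: Callable[[str], str],
) -> List[str]:
    """Return dictionary candidates if any exist; otherwise use the backup translator."""
    candidates = dictionary.get(tag, [])
    if candidates:
        return candidates
    # No candidate translations were generated from the dictionary.
    return [backup_translate(tag)]

def stub_backup(tag: str) -> str:
    """Hypothetical placeholder for a remote or secondary translation service."""
    return f"<backup:{tag}>"

local_dictionary = {"wood": ["Holz", "Wald"]}
found = translate_tag("wood", local_dictionary, stub_backup)
fallback = translate_tag("xylophone", local_dictionary, stub_backup)
```

In this sketch the backup path is taken only when the dictionary look-up yields no candidates, mirroring the "responsive to determining" condition of claim 9.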

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VAISHALI SHAH whose telephone number is (571)272-8532. The examiner can normally be reached Monday - Friday (7:30 AM to 4:00 PM).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, TAMARA KYLE, can be reached at (571)272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/VAISHALI SHAH/
Primary Examiner, Art Unit 2156