DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 06/01/2022 has been entered.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 11, and 16 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 3, 9, 10, 11, 12, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Goyal et al. (US-20180203852-A1) in view of Josifovski et al. (US-20100106704-A1), Novak et al. (Iterative Refinement for Machine Translation), and Jagmohan et al. (US-20180052849-A1).
Regarding Claim 1,
Goyal teaches a computer-implemented method comprising: 
training, by the computing system, a first machine learning model based on a training set of embeddings of a first type… (para [0066] At S114, the dialog act is input to the trained model, e.g., as a sequence of characters (or as a matrix composed of one hot vectors corresponding to the characters).), …and the first machine learning model is trained to receive an embedding of a second type and to output a corresponding embedding of the first type (fig. 4; para [0050] This may be performed using a look-up table 73. Optionally, a further projection may be applied to these embeddings (multiplying each by a weight matrix that is learned during learning of the model 50). The character embeddings 71, denoted (x.sub.1.sup.s, x.sub.2.sup.s, . . . , x.sub.S.sup.s), are input to an encoder 74, which generates a representation 76 of the character sequence 52, based on the character embeddings. Character embeddings 71 (i.e., embedding of the second type). Encoder 74 (i.e., first machine learning model). Sequence representation 76 (i.e., first type of embedding).); 
determining, by the computing system, a given embedding of the second type as input (para [0050] This may be performed using a look-up table 73. Optionally, a further projection may be applied to these embeddings (multiplying each by a weight matrix that is learned during learning of the model 50). The character embeddings 71, denoted (x.sub.1.sup.s, x.sub.2.sup.s, . . . , x.sub.S.sup.s), are input to an encoder 74, which generates a representation 76 of the character sequence 52, based on the character embeddings. Character embeddings 71 (i.e., embedding of the second type). Encoder 74 (i.e., first machine learning model).), wherein the given embedding is based on features associated with a user (para [0031] The exemplary system sequentially updates its belief of the dialog state using information extracted from user (and agent) utterances, generates a semantic representation of a next utterance, and converts this to a natural language utterance.);
obtaining, by the computing system, an embedding of the first type from the first machine learning model (para [0089] The output 76 of the encoder is thus treated as a bag of hidden vectors 94 and the decoder is therefore not limited to drawing them in sequence. Sequence representation 76 (i.e. first type of embedding).), wherein the embedding of the first type corresponds to the given embedding of the second type (Fig. 4; para [0050] This may be performed using a look-up table 73. Optionally, a further projection may be applied to these embeddings (multiplying each by a weight matrix that is learned during learning of the model 50). The character embeddings 71, denoted (x.sub.1.sup.s, x.sub.2.sup.s, . . . , x.sub.S.sup.s), are input to an encoder 74, which generates a representation 76 of the character sequence 52, based on the character embeddings. Sequence representation 76 (i.e. first type of embedding) obtained from encoder model associated with character embeddings 71 (i.e. second embedding type).); and 
generating, by the computing system, a recommendation for the user associated with the given embedding of the second type based on a second machine learning model (para [0017] In accordance with one aspect of the exemplary embodiment, a method is provided for generating a system for generation of a target sequence of characters from an input semantic representation. And para [0068] At S118, the generated target sequence 54 is output as an utterance (as text or as generated speech), by the output component 42, e.g., to the client device 24.), wherein the second machine learning model receives the embedding of the first type that corresponds to the given embedding of the second type (Fig. 4; para [0054] In the exemplary embodiment, an attention mechanism 114 focuses the attention of the decoder on the character representation(s) 76 most likely to be the next to be input to the decoder RNN 102. In particular, the vector which is the current hidden state h.sub.t-1 of the decoder cell 104, 106, 108 is compared to the character representations 76 ( vectors of the same size as the hidden state) and the character representations 76 are accorded weights as a function of their similarity (affinity). Sequence representation 76 (i.e. first embedding type) is input into a decoder (i.e. second machine learning model) which corresponds to character embeddings 71 (i.e. second embedding type).).
Goyal does not explicitly disclose
generating, by a computing system, a second corpus from a first corpus comprising features associated with users, wherein the generating is based on an addition of a new feature to the first corpus or a removal of at least one of the features from the first corpus; 
training, by the computing system, a first machine learning model based on a training set of embeddings of a first type and a training set of embeddings of a second type,
…wherein the training set of embeddings of the first type is generated from the first corpus and the training set of embeddings of the second type is generated from the second corpus, the first corpus and the second corpus include a common feature, a training embedding of the first type for the common feature is different from a training embedding of the second type for the common feature…
generating, by the computing system, a lookup table that maps embeddings of the first type outputted by the first machine learning model to embeddings of the second type received by the first machine learning model;
determining, by the computing system, an embedding of the first type based on the lookup table, wherein the embedding of the first type corresponds to the given embedding of the second type;
However, Josifovski et al. (US 20100106704 A1) teaches 
generating, by a computing system, a second corpus from a first corpus comprising features associated with users (para [0027] Referring back to FIG. 1, at action 114, at least a portion of such a search result may be translated from a native language to a second language (also referred to herein as a target language).), wherein the generating is based on an addition of a new feature to the first corpus or a removal of at least one of the features from the first corpus (para [0025-0026] Such search results were crawled from the Web using the returned URLs. When a fresh copy was not available, a cached electronic document was retrieved with the cache header removed to ensure that these electronic documents were comparable to the original pages. Such crawled electronic documents were processed to remove tags, java scripts, and/or other non-content information. In cases where returned results were not HTML files (e.g., PDF files, MS Word documents, etc.), such files were removed from consideration.); 
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the method of machine translation of Goyal with the method of machine translation of Josifovski (para [0027] For example, such a translation of at least a portion of such a search result may be based at least in part on a machine translation by translation module 106. Translation module 106 may include an off-the-shelf machine translation system, specially developed machine translation system, the like, and/or combinations thereof.)
Doing so would allow for handling corpora in which more information is available in the target language than in the source language. Including target-language documents in the machine translation model may improve the accuracy of the model (para [0037] In operation, procedure 300 may prove useful in situation where there may be more and/or better information in electronic documents in such a target language (such as English electronic documents when a non-English native language query is submitted). In such a case, significant terms and/or concepts may be target language (such as English) in origin and accurately may be improved by including such a target language electronic document prior to voting.).
While Goyal teaches a first machine learning model that takes the input of an embedding of a second type and outputs an embedding of a first type as mentioned above, Goyal does not explicitly disclose a lookup table to map a second embedding to a first embedding.
However, Novak teaches
training, by the computing system, a first machine learning model based on a training set of embeddings of a first type and a training set of embeddings of a second type (pg. 3, section 3; Our model F takes as input a source sentence x and a target sentence y, and outputs a distribution over the vocabulary for each target position… These operations result in a vector S j representing each position j in the source sentence x and a vector Ti representing each target context y −i|k. Source and target sentences represented as embeddings are used as training data. pg. 3, section 3; The model is trained to maximize the (log) likelihood of the pairs (x, yref) from the training),
generating, by the computing system, a lookup table that maps embeddings of the first type outputted by the first machine learning model to embeddings of the second type received by the first machine learning model (pg. 2, section 2; The source and the target sequences are embedded via a lookup table that replace each word type with a learned vector. The resulting vector sequences are then processed by alternating convolutions and non-linearities. This results in a vector [media_image1.png] representing each position i in the source x and a vector [media_image2.png] representing each position j in the target yg. The source vector [media_image1.png] (i.e., second embedding type) is mapped via a position/index to the target vector [media_image2.png] (i.e., embedding of first type). pg. 2, section 2; We further denote x as the source sentence, yg as the guess translation from which we start and which was produced by a phrase-based translation system (§6.1), and yref as the reference translation. Source sentence inputted into phrase-based translation system to generate target sentence.); 
determining, by the computing system, an embedding of the first type based on the lookup table, wherein the embedding of the first type corresponds to the given embedding of the second type (pg. 2, section 2; The source and the target sequences are embedded via a lookup table that replace each word type with a learned vector. The resulting vector sequences are then processed by alternating convolutions and non-linearities. This results in a vector [media_image1.png] representing each position i in the source x and a vector [media_image2.png] representing each position j in the target yg. The source vector [media_image1.png] (i.e., second embedding type) corresponds to the target vector [media_image2.png] (i.e., embedding of first type).);
	It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the machine translation model of Goyal with the lookup table of Novak.
	Doing so would allow for predicting substitutions for current existing translations. Iterative improvements may be made to the current translation thus improving the accuracy of the translation (Abs.)
Jagmohan teaches 
…wherein the training set of embeddings of the first type is generated from the first corpus (para [0040] First corpus language embedding block 202 can analyze and learn the language embedding for the first domain corpus. By way of example, but not limitation, the first corpus language embedding block 202 can analyze and/or learn the language embedding for the first domain corpus by learning word embeddings for terms within the first domain corpus.) and the training set of embeddings of the second type is generated from the second corpus (para [0041] Second corpus language embedding block 204 can analyze and learn the language embedding for the second domain corpus. By way of example, but not limitation, the second corpus language embedding block 204 can analyze and/or learn the language embedding for the second domain corpus by learning word embeddings for terms within the first domain corpus.), the first corpus and the second corpus include a common feature (para [0023] In a comparable corpus, the texts are of the same kind and cover the same content, but they are not translations of each other. To exploit a parallel corpus, text alignment identifying substantially equivalent text segments (phrases or sentences) can be employed to facilitate analysis. And para [0033] Selected terms of the corpus pair can then be considered equivalent terms to learn joint embedding.), a training embedding of the first type for the common feature is different from a training embedding of the second type for the common feature (para [0037] The content mapping component 104 can be a processor that can perform mapping of terms associated with the first domain corpus to terms associated with the second domain corpus. 
In one embodiment one or more words can be mapped onto an n-dimensional real vector called the word embedding, wherein n can be the size of the layer just before the output layer… Unsupervised learning can stem from an association of frequent technical terms, which are common across both corpora. In some embodiments, a process can use a neural network in an unsupervised manner to map concepts from one domain (e.g., taxonomy) to another (e.g., internal corpus of a company). And para [0042] In some embodiments, as shown, both the first domain corpus embedding and the second domain corpus embedding can be output from the first corpus language embedding block 202 and/or the second corpus language embedding block 204 and received as inputs at first corpus and second corpus joint embedding learning (FSJEL) block 208. Furthermore, equivalent terms between the first domain corpus and the second domain corpus can be identified by equivalent terms identification block 206.)…
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the language model of Goyal with the language model of Jagmohan (para [0028] Language models can also be used by system 100 in information retrieval in a query likelihood model.).
Doing so would allow for unsupervised training. An advantage of unsupervised training is that it does not require labelled input data which may be useful in cases when there is limited labelled training data or when labelling training data takes too much time (para[0022] This disclosure describes systems, computer-implemented methods and/or computer program products that can leverage corpus pairs to learn outside-in term mappings for taxonomies and content in an automated and unsupervised manner (e.g., no labeling of terms are required).)
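For illustration only, the lookup-table limitation discussed above (a table relating the embeddings received by the first machine learning model to the embeddings it outputs) might be sketched as follows. This sketch is not taken from Goyal, Josifovski, Novak, or Jagmohan; the names `build_lookup_table` and `toy_model`, and the coordinate-doubling map standing in for a trained model, are assumptions made solely for the example.

```python
from typing import Callable, Dict, Iterable, Sequence, Tuple

Vector = Tuple[float, ...]


def build_lookup_table(
    model: Callable[[Sequence[float]], Sequence[float]],
    second_type_embeddings: Iterable[Sequence[float]],
) -> Dict[Vector, Vector]:
    """Map each second-type embedding (model input) to the first-type
    embedding that the model outputs for it."""
    return {tuple(e): tuple(model(e)) for e in second_type_embeddings}


def toy_model(embedding: Sequence[float]) -> Vector:
    # Hypothetical stand-in for a trained model: doubles each coordinate.
    return tuple(2.0 * x for x in embedding)


# Build the table once, then resolve a given second-type embedding by lookup
# instead of re-running the model.
table = build_lookup_table(toy_model, [(1.0, 2.0), (0.5, -1.0)])
first_type = table[(1.0, 2.0)]
```

Under these assumptions, the determining step reduces to a dictionary lookup keyed on the given embedding of the second type.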
Regarding Claim 2,
Goyal, Josifovski, Novak, and Jagmohan teach the computer-implemented method of claim 1. Jagmohan further teaches the method further comprising: 
determining, by the system device, the common feature based on an intersection operation (para [0042] In some embodiments, as shown, both the first domain corpus embedding and the second domain corpus embedding can be output from the first corpus language embedding block 202 and/or the second corpus language embedding block 204 and received as inputs at first corpus and second corpus joint embedding learning (FSJEL) block 208. Furthermore, equivalent terms between the first domain corpus and the second domain corpus can be identified by equivalent terms identification block 206. Equivalent terms identification block 206 can identify known equivalent terms between the first domain corpus and the second domain corpus by identifying technical terms that are common to both corpora, where technical terms can be identified by comparing to background non-technical corpus.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the language model of Goyal, Josifovski, and Novak with the language model of Jagmohan (para [0028] Language models can also be used by system 100 in information retrieval in a query likelihood model.).
Doing so would allow for unsupervised training. An advantage of unsupervised training is that it does not require labelled input data which may be useful in cases when there is limited labelled training data or when labelling training data takes too much time (para[0022] This disclosure describes systems, computer-implemented methods and/or computer program products that can leverage corpus pairs to learn outside-in term mappings for taxonomies and content in an automated and unsupervised manner (e.g., no labeling of terms are required).)
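For illustration only, the intersection operation recited in claim 2 might be sketched as a set intersection over two feature corpora, where the second corpus is derived from the first by adding or removing features as recited in claim 1. The function names `derive_second_corpus` and `common_features` and the feature labels are hypothetical and do not appear in Jagmohan or the other references.

```python
from typing import Iterable, Set


def derive_second_corpus(
    first_corpus: Iterable[str],
    add: Iterable[str] = (),
    remove: Iterable[str] = (),
) -> Set[str]:
    """Derive a second corpus by adding and/or removing features."""
    return (set(first_corpus) | set(add)) - set(remove)


def common_features(first: Iterable[str], second: Iterable[str]) -> Set[str]:
    """Features present in both corpora (set intersection)."""
    return set(first) & set(second)


# Hypothetical feature labels used only for this example.
first_corpus = {"page_a", "page_b", "page_c"}
second_corpus = derive_second_corpus(
    first_corpus, add=["page_d"], remove=["page_a"]
)
shared = common_features(first_corpus, second_corpus)
```

In this sketch the common features are those that survive the derivation, which loosely parallels Jagmohan's identification of terms common to both corpora.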
Regarding Claim 3,
Goyal, Josifovski, Novak, and Jagmohan teach the computer-implemented method of claim 2. Jagmohan further teaches the method further comprising:
generating a first common embedding of the first type based on the common feature (para [0037] In one embodiment one or more words can be mapped onto an n-dimensional real vector called the word embedding, wherein n can be the size of the layer just before the output layer. And para [0047] where w.sup.1 is a vector representing a word appearing in the corpus And para [0052] In one embodiment of equation 3, w.sub.1,t.about.w.sub.2,t can be known equivalent terms), and [Application Serial No. 15/649,492; Docket No. 36FB-180506]
generating a second common embedding of the second type based on the common feature (para [0037] In one embodiment one or more words can be mapped onto an n-dimensional real vector called the word embedding, wherein n can be the size of the layer just before the output layer. And para [0048] where w.sup.2 is a vector representing a word appearing in the corpus And para [0052] In one embodiment of equation 3, w.sub.1,t.about.w.sub.2,t can be known equivalent terms).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the language model of Goyal, Josifovski, and Novak with the language model of Jagmohan (para [0028] Language models can also be used by system 100 in information retrieval in a query likelihood model.).
Doing so would allow for unsupervised training. An advantage of unsupervised training is that it does not require labelled input data which may be useful in cases when there is limited labelled training data or when labelling training data takes too much time (para[0022] This disclosure describes systems, computer-implemented methods and/or computer program products that can leverage corpus pairs to learn outside-in term mappings for taxonomies and content in an automated and unsupervised manner (e.g., no labeling of terms are required).)
Regarding Claim 9,
Goyal, Josifovski, Novak, and Jagmohan teach the computer-implemented method of claim 1. Goyal further teaches wherein the computing device employs on-the-fly translation in obtaining the embedding of the first type (para [0042] The exemplary encoder-decoder RNN model 50 described herein is a type of sequence to sequence model (Seq-2-Seq). Such models have previously been used for machine translation, where the task is to translate a sentence from a source language to a target language, and developed by Sutskever 2014. It uses an RNN to encode a source sentence into some fixed vector representation which is later fed into another RNN to decode the target sentence.).
Regarding Claim 10,
Goyal, Josifovski, Novak, and Jagmohan teach the computer-implemented method of claim 1. Goyal (US 20180203852 A1) teaches
storing, by the computing device, in a lookup table, the embedding of the first type (Para [0095] x.sub.i.sup.T is the i.sup.th character embedding for the target sequence, (which can be converted to the character itself with the look-up table 73). .sigma.).
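For illustration only, a look-up table that stores character embeddings and converts an embedding back to its character, in the manner the cited passage attributes to look-up table 73, might be sketched as follows. The sketch assumes one-hot vectors rather than learned embeddings and an alphabet of distinct characters; the names `build_char_table`, `embed`, and `decode` are hypothetical and are not Goyal's.

```python
from typing import Dict, List, Sequence, Tuple

OneHot = Tuple[int, ...]


def build_char_table(alphabet: str) -> Dict[str, OneHot]:
    """Store one one-hot embedding per character (assumes distinct characters)."""
    size = len(alphabet)
    return {
        ch: tuple(1 if i == j else 0 for j in range(size))
        for i, ch in enumerate(alphabet)
    }


def embed(text: str, table: Dict[str, OneHot]) -> List[OneHot]:
    """Look up the stored embedding for each character."""
    return [table[ch] for ch in text]


def decode(vectors: Sequence[Sequence[int]], table: Dict[str, OneHot]) -> str:
    """Convert embeddings back to characters using the inverted table."""
    inverse = {vec: ch for ch, vec in table.items()}
    return "".join(inverse[tuple(v)] for v in vectors)


table = build_char_table("abc")
vectors = embed("cab", table)
restored = decode(vectors, table)
```

Because each stored vector is unique, the same table serves both directions: character to embedding and embedding to character.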
Regarding Claim 11,
Claim 11 is the system corresponding to the method of claim 1. Claim 11 is substantially similar to claim 1 and is rejected on the same grounds.
Regarding Claim 12,
Claim 12 is the system corresponding to the method of claim 2. Claim 12 is substantially similar to claim 2 and is rejected on the same grounds.
Regarding Claim 16,
Claim 16 is the computer-readable storage medium corresponding to the method of claim 1. Claim 16 is substantially similar to claim 1 and is rejected on the same grounds. 
Regarding Claim 17,
Claim 17 is the computer-readable storage medium corresponding to the method of claim 2. Claim 17 is substantially similar to claim 2 and is rejected on the same grounds. 

Claims 4, 13, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Goyal, Josifovski, Novak, and Jagmohan, as applied above, and further in view of Song et al. (US-20170060855-A1).
Regarding Claim 4,
Goyal, Josifovski, Novak, and Jagmohan teach the computer-implemented method of claim 1. 
	Goyal, Josifovski, Novak, and Jagmohan do not explicitly disclose
	further comprising: running, by the computing system, the first machine learning model as its inverse.
However, Song teaches running, by the computing system, the first machine learning model as its inverse (para [0081] If the language of the text to be quantized is the target language, the computing device may designate a reverse model of the encoding part of the trained bilingual encoding and decoding model for text vectors as the text vector prediction model of the first language.).
	It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the natural language model of Goyal, Josifovski, Novak, and Jagmohan with the natural language machine learning model of Song.
	Doing so would allow for a rule-based implementation to evaluate the quality of the candidate translations. The ability to evaluate and compare the quality of translations may allow the model to select the translation with the highest quality thus improving the accuracy of the translation (para [0240] The implementations enable rule-based translations of original fragments of text to reach a natural language semantic level to evaluate translation quality of the candidate translations, therefore improving quality of candidate translations.)
Regarding Claim 13,
Claim 13 is the system corresponding to the method of claim 4. Claim 13 is substantially similar to claim 4 and is rejected on the same grounds.
Regarding Claim 18,
Claim 18 is the computer-readable storage medium corresponding to the method of claim 4. Claim 18 is substantially similar to claim 4 and is rejected on the same grounds. 

Claims 5, 14, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Goyal, Josifovski, Novak, and Jagmohan, as applied above, and further in view of England et al. (US-20120290399-A1).
Regarding Claim 5,
Goyal, Josifovski, Novak, and Jagmohan teach the computer-implemented method of claim 1. Josifovski further teaches wherein the features associated with the users includes page features of pages (para [0025-0026] Such search results were crawled from the Web using the returned URLs. When a fresh copy was not available, a cached electronic document was retrieved with the cache header removed to ensure that these electronic documents were comparable to the original pages. Such crawled electronic documents were processed to remove tags, java scripts, and/or other non-content information.) with which the users interacted over a period of time.
	While Josifovski et al. teaches that the features associated with the users include page features (webpage features), Josifovski et al. does not explicitly disclose that the users interacted with these pages over a period of time. 
However, England et al. (US 20120290399 A1) teaches
wherein the features associated with the users includes page features of pages with which the users interacted over a period of time (para [0232] In this embodiment, social shopping operations are begun in step 2202, followed by the instrumentation of a webpage in step 2204. In various embodiments, tags (e.g., JavaScript tags) are automatically inserted in the webpage to identify visitors, dynamically interact with users, and to present predetermined social shopping features based upon their intent and need.).
	It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the search engine of Goyal, Josifovski, Novak, and Jagmohan with the search engine of England (para [0011] In one embodiment, the campaign data comprises a search term that returns a predetermined URL when used in a search engine.).
	Doing so would allow for optimization of the search engine using predetermined words. An optimized search engine allows for returning results faster and more efficiently. (para [0052] As yet another example, the keyword density 4022, placement 4023, and insertion 4024 management sub-modules may likewise be used by the merchant and the affiliates to optimize searches through the use of predetermined keywords within related social commerce content.)
Regarding Claim 14,
Claim 14 is the system corresponding to the method of claim 5. Claim 14 is substantially similar to claim 5 and is rejected on the same grounds.
Regarding Claim 19,
Claim 19 is the computer-readable storage medium corresponding to the method of claim 5. Claim 19 is substantially similar to claim 5 and is rejected on the same grounds. 

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Goyal, Josifovski, Novak, and Jagmohan, as applied above, and further in view of Liu (US-20170235824-A1).
Regarding Claim 6,
Goyal, Josifovski, Novak, and Jagmohan teach the computer-implemented method of claim 1.
	Goyal, Josifovski, Novak, and Jagmohan do not explicitly disclose
further comprising: 
employing, by the computing system, a third machine learning model in creating the training set of embeddings of the first type; and
employing, by the computing system, a fourth machine learning model in creating the training set of embeddings of the second type.
	However, Liu (US 20170235824 A1) teaches 
employing, by the computing system, a third machine learning model in creating the training set of embeddings of the first type (fig. 7; para [0088] At operation 613, the target SSE model is used to generate semantic vectors (also referred to as the semantic vector representations) of the target.); and
employing, by the computing system, a fourth machine learning model in creating the training set of embeddings of the second type (fig. 7; para [0093] At operation 616, the semantic vector of the source is generated using the source SSE model.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the method of calculating similarity scores with a language model of Goyal, Josifovski, Novak, and Jagmohan with the method of calculating similarity scores with a language model of Liu (para [0049] As indicated above, a language model for every category (e.g., LeafCat) is trained such that a classification recall candidate set can be re-ranked with gradient boosting machine (GBM) ensembled signals from sentence embedding similarity scores (derived using SSE modeling) and language model perplexity scores (derived using SLM).).
Doing so would allow for comparing similarities between semantic embeddings. The similarity between the source and target sequence helps indicate the accuracy of the translation from the source language (para [0029] Example embodiments, use sequence semantic embedding (SSE) methods to encode listing titles (e.g., title keywords for publications being listed) and category tree paths into semantic vector representations as <source sequence, target sequence> pairs. The vector distances of the source and target semantic vector representations can be used as a similarity measurement to get classification recall candidate sets.).

Claims 7, 15, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Goyal, Josifovski, Novak, and Jagmohan, as applied above, and further in view of Lee et al. (US-20180011843-A1).
Regarding Claim 7,
Goyal, Josifovski, Novak, and Jagmohan teach the computer-implemented method of claim 1. 
	Goyal, Josifovski, Novak, and Jagmohan do not explicitly disclose
further comprising: providing, by the computing system, the embedding of the first type to the second machine learning model, wherein the second machine learning model was trained using one or more embeddings of the first type.
However, Lee teaches
providing, by the computing system, the embedding of the first type to the second machine learning model, wherein the second machine learning model was trained using one or more embeddings of the first type (fig. 4; para [0120] In response to an input of the features extracted from the voice signal, the encoder 211 encodes the extracted features and generates a first feature vector, for example, a real number vector of [`2.542`, `0.827`, . . . , `5.936`]. The decoder 213 decodes the first feature vector generated by the encoder 211 and generates a first language sentence, for example, a sentence " ?" as a voice recognition result. The decoder 213 outputs the first language sentence sub-word or word units.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the machine translation model of Goyal, Josifovski, Novak, and Jagmohan with the machine translation model of Lee (para [0016] The generating of the second feature vector may include dividing the first language sentence into a plurality of sub-words, sequentially inputting input vectors respectively indicating the plurality of sub-words to an encoder used for machine translation).
Doing so would allow for collectively considering the current translation and previous translations. The translation with the highest score may be determined to be the final translation, thereby providing a more robust and accurate translation (para [0087]).
Regarding Claim 15,
Claim 15 is the system corresponding to the method of claim 7. Claim 15 is substantially similar to claim 7 and is rejected on the same grounds.
Regarding Claim 20,
Claim 20 is the computer-readable storage medium corresponding to the method of claim 7. Claim 20 is substantially similar to claim 7 and is rejected on the same grounds.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Goyal, Josifovski, Novak, Jagmohan, and Lee, as applied above, and further in view of He et al. (US-20120016657-A1; hereinafter He).
Regarding Claim 8,
Goyal, Josifovski, Novak, Jagmohan, and Lee teach the computer-implemented method of claim 7.
Goyal, Josifovski, Novak, Jagmohan, and Lee do not explicitly disclose
wherein the second machine learning model provides one or more insights using the embedding of the first type.
However, He (US 20120016657 A1) teaches 
wherein the second machine learning model provides one or more insights using the embedding of the first type (para [0016] The vector generator 115 generates a numerical vector from the combination of the TM features, the SMT features and the system independent features. The resultant vector is then fed into a recommender 120 which outputs a recommendation that grades the performance of each the TM module 105 and the SMT module 110 prior to post editing. The recommender 120 is programmed to predict the relative quality of the SMT output relative to the TM output and makes a recommendation on which output is better based on the comparison between the two outputs.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the natural language model of Goyal, Josifovski, Novak, Jagmohan, and Lee with the machine translation model of He (para [0016] A statistical machine translation (SMT) module 110 is also provided which is configured to generate translations on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora.).
Doing so would allow for predicting the quality of a translation. Predicting the quality of the translation allows for selecting the output with the highest quality, which improves the accuracy of the translation (para [0016] The recommender 120 is programmed to predict the relative quality of the SMT output relative to the TM output and makes a recommendation on which output is better based on the comparison between the two outputs.).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HENRY K NGUYEN whose telephone number is (571)272-0217. The examiner can normally be reached Mon - Fri 7:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen, can be reached at (571)272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/H.N./Examiner, Art Unit 2121
/NICHOLAS KLICOS/Primary Examiner, Art Unit 2145