DETAILED ACTION

Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
2.	The information disclosure statement (IDS) submitted on 02/23/2021 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 103
3.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

4.	Claims 1, 4, 7, 15, 21-24, 27, and 29-32 are rejected under 35 U.S.C. 103 as being unpatentable over Elfardy et al. (US 11,418,461 B1) in view of Zhang et al. (US 2019/0197119 A1).

	With respect to Claim 1, Elfardy et al. disclose 
 	A method for operating an electronic device, the method comprising: 
 	in accordance with receiving a first n-gram at the electronic device, employing one or more processors and a memory of the electronic device to perform operations (Elfardy et al. col. 15 lines 27-38 the computing system 600 may include: one or more computer processors 602...and/or other volatile non-transitory computer-readable media), wherein the first n-gram includes a first set tokens and represents a first semantic context within a first natural language associated with the first n-gram, the first set of tokens being an ordered set of tokens of the first natural language (Elfardy et al. col. 1 lines 63-66 Representative messages may be selected from each cluster and used in chat sessions according to the semantic context of the chat session, col. 13 lines 28-30 The message 500 may be parsed into a set of tokens 520, where each token corresponds to a word of message), and the operations comprising: 
 		employing a first language model to generate a token vector for each token in the first set of tokens (Elfardy et al. col. 13 lines 30-43 A separate word embedding 522 may be generated or otherwise obtained for each token ...The word embedding may be 300-length vectors. The word embeddings 522 may be obtained using a lookup of pre-generated word embeddings, or may be generated using any of a variety of algorithms, such as those that utilize a neural network to generate the vectors. For example, word embeddings may be obtained using Word2vec models, fast-Text models, GloVe models, or the like); 
 		employing a second language model and the token vector of each token in the first set of tokens to generate a first sentence vector for the first n-gram (Elfardy et al. col. 13 lines 44-63 To generate a sentence embedding 524 from a set of word embeddings 522, the encoding module 502 may implement any of a variety of techniques...For example, the word embeddings 522 may be passed through a neural network or layer thereof, such as a bi-directional long short-term memory (“BI-LSTM”) network or layer to generate the sentence embeddings 524) 
Elfardy et al. fail to explicitly teach 
 	 	a first sentence vector for the first n-gram that embeds the first semantic context within a vector space of the second language model; and 
 		selecting, from a plurality of other n-grams, a second n-gram based on a semantic relationship between the first semantic context and a second semantic context represented by the second n-gram, wherein the semantic relationship is based on the first sentence vector, and the second n-gram is associated with a second natural language. 
	However, Zhang et al. teach  
 		a first sentence vector for the first n-gram that embeds the first semantic context within a vector space of the second language model (Zhang et al. [0050] Words from different languages with identical semantic meanings may be mapped to the same vector or vectors which are very close to each other in the word embedding space. Words with similar semantic meanings are mapped to vectors that are close to each other. For example, the Spanish word “hijo” 106 (translating to “son” in English) is shown mapped to vector V3 118 via mapping M3 116. The proximity of vectors V1 and V3 to which the word “son” 102 and “hijo” 106 are mapped respectively indicates that the semantic context of these words is nearly identical. In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations); and 
 		selecting, from a plurality of other n-grams, a second n-gram based on a semantic relationship between the first semantic context and a second semantic context represented by the second n-gram, wherein the semantic relationship is based on the first sentence vector, and the second n-gram is associated with a second natural language (Zhang et al. [0050] Words from different languages with identical semantic meanings may be mapped to the same vector or vectors which are very close to each other in the word embedding space. Words with similar semantic meanings are mapped to vectors that are close to each other. For example, the Spanish word “hijo” 106 (translating to “son” in English) is shown mapped to vector V3 118 via mapping M3 116. The proximity of vectors V1 and V3 to which the word “son” 102 and “hijo” 106 are mapped respectively indicates that the semantic context of these words is nearly identical. In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations.) 	
Elfardy et al. and Zhang et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the embedding steps taught by Elfardy et al. using the teaching of mapping a word’s semantic meaning into a universal word embedding space, as taught by Zhang et al., for the benefit of performing machine translation (Zhang et al. [0016] The present application describes techniques for mapping a word's semantic meaning into a universal word embedding space. Words in different languages with similar semantic meanings map into the same embedding in the embedding space, [0050] In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations.)
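For illustration only, the combination applied above (token vectors, a sentence vector, and selection of a second-language n-gram by proximity in a shared embedding space) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of either reference: the vector values and the names TOKEN_VECTORS, sentence_vector, and select_ngram are hypothetical, and mean pooling stands in for the Bi-LSTM sentence encoder cited from Elfardy et al.

```python
import math

# Hypothetical toy token vectors in a shared cross-lingual space.
# The values are invented for illustration and come from neither reference.
TOKEN_VECTORS = {
    "my": [0.1, 0.9, 0.0],
    "son": [0.9, 0.1, 0.2],
    "mi": [0.1, 0.8, 0.1],
    "hijo": [0.8, 0.2, 0.2],
    "la": [0.0, 0.1, 0.9],
    "casa": [0.1, 0.0, 0.8],
}


def sentence_vector(tokens):
    """Mean-pool token vectors into one sentence vector (assumed stand-in
    for a learned sentence encoder such as a Bi-LSTM)."""
    vectors = [TOKEN_VECTORS[t] for t in tokens]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_ngram(query_tokens, candidates):
    """Return the candidate n-gram whose sentence vector is closest to the query."""
    query = sentence_vector(query_tokens)
    return max(candidates, key=lambda c: cosine(query, sentence_vector(c)))


# Usage: an English query selects the nearest Spanish candidate.
best = select_ngram(("my", "son"), [("mi", "hijo"), ("la", "casa")])
```

In this sketch the semantic relationship is reduced to cosine similarity between sentence vectors; other proximity measures could be substituted without changing the structure.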

	With respect to Claim 4, Elfardy et al. in view of Zhang et al. teach 
 	wherein the first n-gram is received through a spoken utterance of a user (Elfardy et al. col. 1 lines 7-24 Computing system can use communication...a chatbot is an automated dialog system that conducts on-line chat conversations using natural language (e.g., where both input and output are text, where the input to the chatbot is speech-to-text...to respond to the user’s input.)

	With respect to Claim 7, Elfardy et al. in view of Zhang et al. teach 
 	wherein the first and second natural languages are separate natural languages, the first n-gram is a first phrase in the first natural language, the second n-gram is a second phrase in the second natural language, and the semantic relationship between the first and second semantic contexts is a translation of the first phrase in the first natural language to the second phrase in the second natural language (Zhang et al. [0050] Words from different languages with identical semantic meanings may be mapped to the same vector or vectors which are very close to each other in the word embedding space. Words with similar semantic meanings are mapped to vectors that are close to each other. For example, the Spanish word “hijo” 106 (translating to “son” in English) is shown mapped to vector V3 118 via mapping M3 116. The proximity of vectors V1 and V3 to which the word “son” 102 and “hijo” 106 are mapped respectively indicates that the semantic context of these words is nearly identical. In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations.) 	

	With respect to Claim 15, Elfardy et al. in view of Zhang et al. teach 
 	wherein the first n-gram encodes a first phrase of the first natural language that is provided by a user, the second n-gram encodes an identifier for a first content collection included in a plurality of content collections, each of the plurality of other n-grams encodes natural language content that is included in one or more of a plurality of content collections that includes the first content collection, a plurality of other sentence vectors were generated by embedding each of the plurality of other n-grams in the vector space, and the operations further comprise: providing, to a user, at least one of an identifier for the first content collection or a portion of first natural language content included in the first content collection to the user (Zhang et al. [0070] After applying the translation model 214 and/or the language model 220 to the input 202, the translation system 200 may generate an output 208 in the destination language Lb. The output 208 may be in a textual format and may be presented on a display device. In some embodiments, the output 208 may be automatically presented (e.g., an automatic translation or “autotranslation”). In other embodiments, a prompt may be presented and the user may request that the translation be shown. The translation may remain hidden until the user manually requests that the translation be presented.)

	With respect to Claim 21, Elfardy et al. in view of Zhang et al. teach
 	further comprising: 
 	accessing a set of classifier training data that includes a plurality of n-grams, wherein each of the plurality of n-grams is labeled with a ground truth that classifies a corresponding n- gram as a classification of a plurality of classifications (Zhang et al. [0031] Machine learning may be used to train the classifier on available labeled data. Training the classifier may involve mapping words in the labeled data to their respective universal embeddings and creating a mapping or table for these correlations); 
 	employing the first and second language models to embed each of the plurality of n-grams in the vector space (Zhang et al. [0016] The present application describes techniques for mapping a word's semantic meaning into a universal word embedding space. Words in different languages with similar semantic meanings map into the same embedding in the embedding space. The embedding space is built using a concatenation of inputs from multiple different languages. Initially, the semantic meanings of similar words in different languages may or may not be connected or overlap in the embedding space. However, synthetically-generated code-switched sentences may be added to the training data to create connections between different languages); and 
 	employing the n-gram embeddings, the ground truth labels, and a supervised machine learning (ML) method to train a classifier model (Zhang et al. [0031] Machine learning may be used to train the classifier on available labeled data. Training the classifier may involve mapping words in the labeled data to their respective universal embeddings and creating a mapping or table for these correlations. When it comes time to classify new data (regardless of whether the new data is provided in a language that the classifier was trained on), the classifier can determine the intent of the new data by disregarding the language of the new data and replacing words in the new data with their universal embeddings. The words in the new data may be mapped, via the table/mapping, to their respective universal embeddings.)
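For illustration only, supervised training of a classifier on labeled embeddings, as mapped above, can be sketched as follows. This is a minimal sketch under stated assumptions: a nearest-centroid classifier is swapped in as the simplest supervised method, neither reference is represented as using it, and the names train_centroids and classify and the example vectors and labels are hypothetical.

```python
import math
from collections import defaultdict


def train_centroids(examples):
    """Compute one centroid per ground-truth label from (vector, label) pairs."""
    sums = {}
    counts = defaultdict(int)
    for vector, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(vector)
        sums[label] = [s + x for s, x in zip(sums[label], vector)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(centroids, vector):
    """Assign the label whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


# Hypothetical labeled n-gram embeddings (values invented for illustration).
training_data = [
    ([0.9, 0.1], "for_sale"),
    ([0.8, 0.2], "for_sale"),
    ([0.1, 0.9], "review"),
    ([0.2, 0.8], "review"),
]
model = train_centroids(training_data)
predicted = classify(model, [0.7, 0.3])
```

Because classification operates only on positions in the embedding space, an input embedded from a different language would be handled the same way, which is the property the cited passage of Zhang et al. relies on.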

 	With respect to Claim 22, Elfardy et al. in view of Zhang et al. teach
 	wherein the trained classifier model classifies a semantic relationship between the first and second semantic contexts based on a spatial relationship between the first sentence vector and a second sentence vector corresponding to the second n-gram (Zhang et al. [0018] A classifier does not necessarily need to understand the full informational content or meaning of a word to classify subject matter in which the word appears. For example, the system may not know the specific definition of the term IPHONE™, nor the difference between an IPHONE™ and a GALAXY NOTE™, but for classification purposes it is sufficient that the terms have similar semantic  meanings (both referring to types of cell phones) and thus appear in similar contexts. If the word “IPHONE™” is often used with certain intents (e.g., technology reviews, listings for sale, etc.), then other words mapped to the same embedding as the word “IPHONE™” (as “GALAXY NOTE™” might be) may also be strongly connected with those intents.)

	With respect to Claim 23, Elfardy et al. in view of Zhang et al. teach 
 	wherein the first language model is a pre-trained Embeddings from Language Models (ELMo) that is installed on the electronic device and the token vector for each token in the first set of tokens is based on each of the other tokens in the first set of tokens and the order of the first set of tokens (Elfardy et al. col. 13 lines 3-11 As described in greater detail below, the operations of the encoding module 502 and clustering module 504 may be performed separately and sequentially (e.g., the encoded representations are generated, tested, and optimized first prior to clustering), or they may be performed iteratively and jointly (e.g., the testing and optimization of the encoded representations and clusters may be performing using a joint training process, such as using a composite objective function, col. 13 lines 30-43 A separate word embedding 522 may be generated or otherwise obtained for each token ...The word embedding may be 300-length vectors. The word embeddings 522 may be obtained using a lookup of pre-generated word embeddings, or may be generated using any of a variety of algorithms, such as those that utilize a neural network to generate the vectors. For example, word embeddings may be obtained using Word2vec models, fast-Text models, GloVe models, or the like.)

 	With respect to Claim 24, Elfardy et al. in view of Zhang et al. teach
 	wherein the second language includes a Bi-directional long short-term memory (Bi- LSTM) neural network and a fully connected (FC) neural network, and the second language model is installed on the electronic device (Elfardy et al. col. 13 lines 56-63 additional or alternative operations may be performed on the word embeddings 522 to generate a sentence embedding 524. For example, the word embeddings 522 may be passed through a neural network or layer thereof, such as a bi-directional long short-term memory (“Bi-LSTM”) network or layer to generate the sentence embeddings 524, Fig. 5 element 502 Encoding Module 502.)

	With respect to Claim 27, Elfardy et al. in view of Zhang et al. teach
 	wherein the second language model was trained by employing a the plurality of semantic tasks includes that a next phrase inference task that classifies a relationship between an ordered pair of natural language phrases, wherein training data for the natural language inference task includes a plurality of ordered pairs of phrases, each ordered pair of phrases of the plurality of ordered phrases includes a first phrase and a second phrase and is labeled with a ground truth relationship between ordered pair of phrases, wherein the classification of the relationship between the ordered pair of phrases includes one of logically deductive or not logically deductive (Zhang et al. [0018] A classifier does not necessarily need to understand the full informational content or meaning of a word to classify subject matter in which the word appears. For example, the system may not know the specific definition of the term IPHONE™, nor the difference between an IPHONE™ and a GALAXY NOTE™, but for classification purposes it is sufficient that the terms have similar semantic meanings (both referring to types of cell phones) and thus appear in similar contexts. If the word “IPHONE™” is often used with certain intents (e.g., technology reviews, listings for sale, etc.), then other words mapped to the same embedding as the word “IPHONE™” (as “GALAXY NOTE™” might be) may also be strongly connected with those intents.) The classification in Zhang et al. includes “not logically deductive.”

 	With respect to Claim 29, Elfardy et al. in view of Zhang et al. teach
 	wherein the second n-gram is selected further based on a spatial relationship between the first sentence vector and a second sentence vector corresponding to the second n-gram and the spatial relationship is employed to determine the semantic relationship between the first and second semantic contexts (Zhang et al. [0050] Words from different languages with identical semantic meanings may be mapped to the same vector or vectors which are very close to each other in the word embedding space. Words with similar semantic meanings are mapped to vectors that are close to each other. For example, the Spanish word “hijo” 106 (translating to “son” in English) is shown mapped to vector V3 118 via mapping M3 116. The proximity of vectors V1 and V3 to which the word “son” 102 and “hijo” 106 are mapped respectively indicates that the semantic context of these words is nearly identical. In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations.)
 
 	With respect to Claim 30, Elfardy et al. in view of Zhang et al. teach
 	wherein the second language model was trained by employing the first language model, a plurality of semantic tasks, and a loss function that includes a combination of one or more metric associated with each of the plurality of semantic tasks (Elfardy et al. col. 13 lines 64-67, col. 14 lines 1-20 To test the sentence embeddings 524, the encoding module 502 may in some embodiments attempt to reconstruct the original messages 500, from which the sentence embeddings 524 were generated, into reconstructed messages 550. For example, the encoding module 502 may be implemented as an autoencoder, with the sentence embeddings 524 generated using the encoder portion of the autoencoder, and the message reconstruction performed using the decoder portion of the autoencoder to generate the reconstructed messages 550. The autoencoder may be trained to minimize reconstruction errors (such as squared errors), often referred to as the “loss,” by optimizing for a loss function  through backpropagation. A loss function  (also referred to as an “objective function”) is used to determine a degree to which output for a given input differs from the expected or desired output for the given input. A gradient of the loss function  with respect to the parameters of the autoencoder is computed, and the parameters are then modified to minimize the loss function and, therefore, minimize the degree to which output differs from expected or desired output. Once a sufficient degree of loss is achieved, the process may be stopped and the sentence embeddings 524 generated using the encoder portion with its updated parameters may be provided to the clustering module 504, see more col. 14, col. 15.)
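For illustration only, a loss function that combines one metric per task can be sketched as a weighted sum. This is a minimal sketch under stated assumptions: the squared-error term follows the reconstruction error mentioned in the cited passage of Elfardy et al., while the task names, the weights, and the function names squared_error and combined_loss are hypothetical.

```python
def squared_error(predicted, target):
    """Sum of squared differences between a prediction and its target."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


def combined_loss(task_losses, weights):
    """Weighted sum of per-task losses (one assumed form of a composite objective)."""
    return sum(weights[name] * loss for name, loss in task_losses.items())


# Hypothetical per-task losses and weights (values invented for illustration).
reconstruction = squared_error([1.0, 2.0], [1.0, 4.0])
total = combined_loss(
    {"reconstruction": reconstruction, "clustering": 2.0},
    {"reconstruction": 0.5, "clustering": 1.0},
)
```

During training, the gradient of the combined value with respect to the model parameters would drive the update, so the weights control how much each task influences the learned sentence vectors.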

	With respect to Claim 31, Elfardy et al. disclose 
 	An electronic device, comprising: 
 	one or more processors (Elfardy et al. col. 15 lines 27-38 the computing system 600 may include: one or more computer processors 602...and/or other volatile non-transitory computer-readable media); 
 	a memory (Elfardy et al. col. 15 lines 27-38 the computing system 600 may include: one or more computer processors 602...and/or other volatile non-transitory computer-readable media); and 
 	one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for operating the electronic device (Elfardy et al. col. 15 lines 27-38 the computing system 600 may include: one or more computer processors 602...and/or other volatile non-transitory computer-readable media), and the instructions include operations comprising: 
 		receiving a first n-gram at the electronic device, wherein the first n-gram includes a first set tokens and represents a first semantic context within a first natural language associated with the first n-gram, the first set of tokens being an ordered set of tokens of the first natural language (Elfardy et al. col. 1 lines 63-66 Representative messages may be selected from each cluster and used in chat sessions according to the semantic context of the chat session, col. 13 lines 28-30 The message 500 may be parsed into a set of tokens 520, where each token corresponds to a word of message); 
 		employing a first language model to generate a token vector for each token in the first set of tokens (Elfardy et al. col. 13 lines 30-43 A separate word embedding 522 may be generated or otherwise obtained for each token ...The word embedding may be 300-length vectors. The word embeddings 522 may be obtained using a lookup of pre-generated word embeddings, or may be generated using any of a variety of algorithms, such as those that utilize a neural network to generate the vectors. For example, word embeddings may be obtained using Word2vec models, fast-Text models, GloVe models, or the like); 
 		employing a second language model and the token vector of each token in the first set of tokens to generate a first sentence vector for the first n-gram (Elfardy et al. col. 13 lines 44-63 To generate a sentence embedding 524 from a set of word embeddings 522, the encoding module 502 may implement any of a variety of techniques...For example, the word embeddings 522 may be passed through a neural network or layer thereof, such as a bi-directional long short-term memory (“BI-LSTM”) network or layer to generate the sentence embeddings 524) 
Elfardy et al. fail to explicitly teach 
 	 	a first sentence vector for the first n-gram that embeds the first semantic context within a vector space of the second language model; and 
 		selecting, from a plurality of other n-grams, a second n-gram based on a semantic relationship between the first semantic context and a second semantic context represented by the second n-gram, wherein the semantic relationship is based on the first sentence vector, and the second n-gram is associated with a second natural language. 
	However, Zhang et al. teach  
 		a first sentence vector for the first n-gram that embeds the first semantic context within a vector space of the second language model (Zhang et al. [0050] Words from different languages with identical semantic meanings may be mapped to the same vector or vectors which are very close to each other in the word embedding space. Words with similar semantic meanings are mapped to vectors that are close to each other. For example, the Spanish word “hijo” 106 (translating to “son” in English) is shown mapped to vector V3 118 via mapping M3 116. The proximity of vectors V1 and V3 to which the word “son” 102 and “hijo” 106 are mapped respectively indicates that the semantic context of these words is nearly identical. In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations); and 
 		selecting, from a plurality of other n-grams, a second n-gram based on a semantic relationship between the first semantic context and a second semantic context represented by the second n-gram, wherein the semantic relationship is based on the first sentence vector, and the second n-gram is associated with a second natural language (Zhang et al. [0050] Words from different languages with identical semantic meanings may be mapped to the same vector or vectors which are very close to each other in the word embedding space. Words with similar semantic meanings are mapped to vectors that are close to each other. For example, the Spanish word “hijo” 106 (translating to “son” in English) is shown mapped to vector V3 118 via mapping M3 116. The proximity of vectors V1 and V3 to which the word “son” 102 and “hijo” 106 are mapped respectively indicates that the semantic context of these words is nearly identical. In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations.) 
Elfardy et al. and Zhang et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the embedding steps taught by Elfardy et al. using the teaching of mapping a word’s semantic meaning into a universal word embedding space, as taught by Zhang et al., for the benefit of performing machine translation (Zhang et al. [0016] The present application describes techniques for mapping a word's semantic meaning into a universal word embedding space. Words in different languages with similar semantic meanings map into the same embedding in the embedding space, [0050] In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations.)

 	With respect to Claim 32, Elfardy et al. disclose 
 	A non-transitory computer-readable storage medium storing one or more programs, the one or more programs comprising instructions for operating an electronic device, when the instructions are executed by one or more processors of the electronic device (Elfardy et al. col. 15 lines 27-38 the computing system 600 may include: one or more computer processors 602...and/or other volatile non-transitory computer-readable media), cause the electronic device performs operations comprising: 
 	receiving a first n-gram at the electronic device, wherein the first n-gram includes a first set tokens and represents a first semantic context within a first natural language associated with the first n-gram, the first set of tokens being an ordered set of tokens of the first natural language (Elfardy et al. col. 1 lines 63-66 Representative messages may be selected from each cluster and used in chat sessions according to the semantic context of the chat session, col. 13 lines 28-30 The message 500 may be parsed into a set of tokens 520, where each token corresponds to a word of message); 
 		employing a first language model to generate a token vector for each token in the first set of tokens (Elfardy et al. col. 13 lines 30-43 A separate word embedding 522 may be generated or otherwise obtained for each token ...The word embedding may be 300-length vectors. The word embeddings 522 may be obtained using a lookup of pre-generated word embeddings, or may be generated using any of a variety of algorithms, such as those that utilize a neural network to generate the vectors. For example, word embeddings may be obtained using Word2vec models, fast-Text models, GloVe models, or the like); 
 		employing a second language model and the token vector of each token in the first set of tokens to generate a first sentence vector for the first n-gram (Elfardy et al. col. 13 lines 44-63 To generate a sentence embedding 524 from a set of word embeddings 522, the encoding module 502 may implement any of a variety of techniques...For example, the word embeddings 522 may be passed through a neural network or layer thereof, such as a bi-directional long short-term memory (“BI-LSTM”) network or layer to generate the sentence embeddings 524) 
Elfardy et al. fail to explicitly teach 
 	 	a first sentence vector for the first n-gram that embeds the first semantic context within a vector space of the second language model; and 
 		selecting, from a plurality of other n-grams, a second n-gram based on a semantic relationship between the first semantic context and a second semantic context represented by the second n-gram, wherein the semantic relationship is based on the first sentence vector, and the second n-gram is associated with a second natural language. 
	However, Zhang et al. teach  
 		a first sentence vector for the first n-gram that embeds the first semantic context within a vector space of the second language model (Zhang et al. [0050] Words from different languages with identical semantic meanings may be mapped to the same vector or vectors which are very close to each other in the word embedding space. Words with similar semantic meanings are mapped to vectors that are close to each other. For example, the Spanish word “hijo” 106 (translating to “son” in English) is shown mapped to vector V3 118 via mapping M3 116. The proximity of vectors V1 and V3 to which the word “son” 102 and “hijo” 106 are mapped respectively indicates that the semantic context of these words is nearly identical. In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations); and 
 		selecting, from a plurality of other n-grams, a second n-gram based on a semantic relationship between the first semantic context and a second semantic context represented by the second n-gram, wherein the semantic relationship is based on the first sentence vector, and the second n-gram is associated with a second natural language (Zhang et al. [0050] Words from different languages with identical semantic meanings may be mapped to the same vector or vectors which are very close to each other in the word embedding space. Words with similar semantic meanings are mapped to vectors that are close to each other. For example, the Spanish word “hijo” 106 (translating to “son” in English) is shown mapped to vector V3 118 via mapping M3 116. The proximity of vectors V1 and V3 to which the word “son” 102 and “hijo” 106 are mapped respectively indicates that the semantic context of these words is nearly identical. In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations.) 	
Elfardy et al. and Zhang et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the embedding steps taught by Elfardy et al. using the teaching of mapping a word’s semantic meaning into a universal word embedding space, as taught by Zhang et al., for the benefit of performing machine translation (Zhang et al. [0016] The present application describes techniques for mapping a word's semantic meaning into a universal word embedding space. Words in different languages with similar semantic meanings map into the same embedding in the embedding space, [0050] In one embodiment, to create the mappings between semantic meanings of words in different languages in the embedding space, a code-switching data set may be used. Code-switching data sets may be created by performing machine translations on select words and phrases having a high confidence of being translated correctly, and creating mixed language content based on these translations.)
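For illustration only, the selection step discussed above (choosing a second n-gram whose vector lies close to the first sentence vector in a shared embedding space) can be sketched as a nearest-neighbor search. This is a minimal sketch and is not taken from Elfardy et al., Zhang et al., or the application: the use of cosine similarity, the function names, the three-dimensional vectors, and the candidate words other than “hijo” are all assumptions chosen for readability.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_closest(query_vec, candidates):
    """Return the candidate key whose vector is most similar to query_vec.

    `candidates` maps an n-gram (here, a single word) to its vector in
    the same embedding space as `query_vec`.
    """
    return max(
        candidates,
        key=lambda name: cosine_similarity(query_vec, candidates[name]),
    )


# Hypothetical toy vectors: V1 stands in for the English word "son";
# the candidates stand in for second-language n-grams.
V1 = [0.90, 0.10, 0.20]
candidates = {
    "hijo": [0.88, 0.12, 0.21],    # assumed to lie close to V1
    "coche": [0.10, 0.90, 0.30],   # unrelated meaning, far from V1
    "lluvia": [0.20, 0.20, 0.95],  # unrelated meaning, far from V1
}

best = select_closest(V1, candidates)
print(best)
```

Under these assumed vectors the candidate nearest to V1 is “hijo”, which mirrors the proximity of V1 and V3 described in the cited paragraph [0050].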

5. 	Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Elfardy et al. (US 11,418,461 B1) in view of Zhang et al. (US 2019/0197119 A1) and Hovestadt et al. (US 2008/0140422 A1).

 	With respect to Claim 2, Elfardy et al. in view of Zhang et al. teach all the limitations of Claim 1 upon which Claim 2 depends. Elfardy et al. in view of Zhang et al. fail to explicitly teach 
 	wherein the first n-gram encodes a first phrase in the first natural language that is non-executable by the electronic device, the first semantic context includes a user intent in accordance with the first phrase, and the second n-gram encodes an identifier for a command that is executable by the electronic device, such that executing command causes the electronic device to performs actions causing an accomplishment of at least a portion of the user intent.  
	However, Hovestadt et al. teach 
 	wherein the first n-gram encodes a first phrase in the first natural language that is non-executable by the electronic device, the first semantic context includes a user intent in accordance with the first phrase, and the second n-gram encodes an identifier for a command that is executable by the electronic device, such that executing command causes the electronic device to performs actions causing an accomplishment of at least a portion of the user intent (Hovestadt et al. [0006] A speech dialog control module enhances user operation of a speech dialog system by translating an input signal unrecognizable by a speech dialog system into a recognizable language. A speech dialog control module includes an input device that receives a speech signal in a first language. A controller receives the input signal and generates a control instruction that corresponds to the received input signal. The control instruction has a language that is different from the input signal. A speech-synthesis unit converts the control instruction into an output speech signal. An output device outputs the output speech signal, [0025] Once activated, the speech dialog control module 100 receives an input signal in a language that is different from a speech dialog system 102 recognizable language. For example, in the case where the speech dialog system 102 is configured to recognize English vocabulary, and the user only speaks German, the user may speak the German phrase "Scheibenwischer AN." The speech dialog control module 100 receives the German speech at the input device, processes the input signal to convert it to a command instruction in a recognizable language (e.g., English), and outputs a corresponding acoustic signal. 
The acoustic signal may be output through a speech dialog control module loudspeaker as the English phrase "windshield wiper on" The speech dialog system 100 may receive this command through microphone 218 and generate a command to switch on the vehicle's windshield wipers. The Examiner notes that the first natural language (e.g., German) is non-executable by the Speech Dialog System. The Speech Dialog System is able to recognize and execute the English command. The Speech Dialog system is not able to recognize and execute the German command.)
 	Elfardy et al., Zhang et al. and Hovestadt et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the embedding steps taught by Elfardy et al. using the teaching of mapping a word’s semantic meaning into a universal word embedding space, as taught by Zhang et al., for the benefit of performing machine translation, and further using the teaching of machine translation, as taught by Hovestadt et al., for the benefit of translating an input signal in a language unrecognizable by a speech dialog system into a recognizable language (Hovestadt et al. [0006] A speech dialog control module enhances user operation of a speech dialog system by translating an input signal unrecognizable by a speech dialog system into a recognizable language. A speech dialog control module includes an input device that receives a speech signal in a first language. A controller receives the input signal and generates a control instruction that corresponds to the received input signal. The control instruction has a language that is different from the input signal. A speech-synthesis unit converts the control instruction into an output speech signal. An output device outputs the output speech signal.)

	With respect to Claim 3, Elfardy et al. in view of Zhang et al. teach all the limitations of Claim 1 upon which Claim 3 depends. Elfardy et al. in view of Zhang et al. fail to explicitly teach 
 	wherein the first n-gram encodes a first phrase in the first natural language, the first phrase being non-recognizable by a digital assistant implemented by the electronic device, the first semantic context includes a user intent in accordance with the first phrase, the second n-gram encodes a second phrase in the second natural language, the second phrase being recognizable by the digital assistant, and the second semantic context includes the user intent, which is in accordance with the second phrase.  
	However, Hovestadt et al. teach 
 	wherein the first n-gram encodes a first phrase in the first natural language, the first phrase being non-recognizable by a digital assistant implemented by the electronic device, the first semantic context includes a user intent in accordance with the first phrase, the second n-gram encodes a second phrase in the second natural language, the second phrase being recognizable by the digital assistant, and the second semantic context includes the user intent, which is in accordance with the second phrase (Hovestadt et al. [0006] A speech dialog control module enhances user operation of a speech dialog system by translating an input signal unrecognizable by a speech dialog system into a recognizable language. A speech dialog control module includes an input device that receives a speech signal in a first language. A controller receives the input signal and generates a control instruction that corresponds to the received input signal. The control instruction has a language that is different from the input signal. A speech-synthesis unit converts the control instruction into an output speech signal. An output device outputs the output speech signal, [0025] Once activated, the speech dialog control module 100 receives an input signal in a language that is different from a speech dialog system 102 recognizable language. For example, in the case where the speech dialog system 102 is configured to recognize English vocabulary, and the user only speaks German, the user may speak the German phrase "Scheibenwischer AN." The speech dialog control module 100 receives the German speech at the input device, processes the input signal to convert it to a command instruction in a recognizable language (e.g., English), and outputs a corresponding acoustic signal. 
The acoustic signal may be output through a speech dialog control module loudspeaker as the English phrase "windshield wiper on" The speech dialog system 100 may receive this command through microphone 218 and generate a command to switch on the vehicle's windshield wipers.)
 	Elfardy et al., Zhang et al. and Hovestadt et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the embedding steps taught by Elfardy et al. using the teaching of mapping a word’s semantic meaning into a universal word embedding space, as taught by Zhang et al., for the benefit of performing machine translation, and further using the teaching of machine translation, as taught by Hovestadt et al., for the benefit of translating an input signal in a language unrecognizable by a speech dialog system into a recognizable language (Hovestadt et al. [0006] A speech dialog control module enhances user operation of a speech dialog system by translating an input signal unrecognizable by a speech dialog system into a recognizable language. A speech dialog control module includes an input device that receives a speech signal in a first language. A controller receives the input signal and generates a control instruction that corresponds to the received input signal. The control instruction has a language that is different from the input signal. A speech-synthesis unit converts the control instruction into an output speech signal. An output device outputs the output speech signal.)

6. 	Claims 5 and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Elfardy et al. (US 11,418,461 B1) in view of Zhang et al. (US 2019/0197119 A1) and Qian et al. (US 2022/0215159 A1).

 	With respect to Claim 5, Elfardy et al. in view of Zhang et al. teach all the limitations of Claim 1 upon which Claim 5 depends. Elfardy et al. in view of Zhang et al. fail to explicitly teach 
 	wherein the first and second natural languages are a same natural language, the second n-gram includes a second set of tokens and represents the second semantic context within the same natural language associated with the each of the first and second n-grams, the second set of tokens being an ordered set of tokens of the same natural language.  
	However, Qian et al. teach 
 	wherein the first and second natural languages are a same natural language, the second n-gram includes a second set of tokens and represents the second semantic context within the same natural language associated with the each of the first and second n-grams, the second set of tokens being an ordered set of tokens of the same natural language (Qian et al. [0096] A paraphrase is a different expression having same semantics as an input sentence. For example, if the input sentence is “What is the distance from the sun to the earth”, the input sentence may be paraphrased to obtain paraphrased sentences such as “How far is the sun from the earth”, “How many kilometers is it from the earth to the sun”, “How many kilometers is the earth from the sun”. These paraphrased sentences all express same or similar semantics as the input sentence, namely, “What is the distance between the sun and the earth”. Therefore, it may also be understood that these sentences are paraphrases of each other, [0243] For example, as shown in FIG. 11, for an input sentence (for example, “taiyang dao diqiu de juli shi duoshao”) and a paraphrased sentence (for example, “cong diqiu dao taiyang you duo yuan”) of the input sentence that are of a paraphrased sentence generator G.sub.j in the sentence paraphrased model, the similarity discriminator may first map each word in a word sequence of the input sentence to a word vector (word embedding), and then encode the word sequence by using an encoder, to convert the word sequence into a fixed-length sentence embedding, where j is a positive integer. The Examiner notes that the original sentence and the paraphrased sentence are the same natural language.)
Elfardy et al., Zhang et al. and Qian et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the embedding steps taught by Elfardy et al. using the teaching of mapping a word’s semantic meaning into a universal word embedding space, as taught by Zhang et al., for the benefit of performing machine translation, and further using the teaching of a sentence paraphrase model, as taught by Qian et al., for the benefit of improving the diversity and quality of the paraphrased sentences (Qian et al. [0008] In this embodiment of this disclosure, the one or more of the plurality of paraphrased sentence generators are trained by using the source information and the similarity information as the reward, the source information is used to indicate the probability that each paraphrased sentence generator generates the paraphrased sentence of the training sentence, and the similarity information is used to indicate the similarity between the paraphrased sentence and the training sentence. In the sentence paraphrase model trained by using the method, both quality of the paraphrased sentence and diversity of the paraphrased sentence may be considered. Therefore, diversity of the paraphrased sentence and quality of the paraphrased sentence can be improved.)

 	With respect to Claim 28, Elfardy et al. in view of Zhang et al. teach all the limitations of Claim 1 upon which Claim 28 depends. Elfardy et al. in view of Zhang et al. fail to explicitly teach 
 	wherein the semantic relationship between the first and second semantic contexts is such that the first semantic context is a paraphrasing of the second semantic context. 
	However, Qian et al. teach  
 	wherein the semantic relationship between the first and second semantic contexts is such that the first semantic context is a paraphrasing of the second semantic context (Qian et al. [0096] A paraphrase is a different expression having same semantics as an input sentence. For example, if the input sentence is “What is the distance from the sun to the earth”, the input sentence may be paraphrased to obtain paraphrased sentences such as “How far is the sun from the earth”, “How many kilometers is it from the earth to the sun”, “How many kilometers is the earth from the sun”. These paraphrased sentences all express same or similar semantics as the input sentence, namely, “What is the distance between the sun and the earth”. Therefore, it may also be understood that these sentences are paraphrases of each other.)
Elfardy et al., Zhang et al. and Qian et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the embedding steps taught by Elfardy et al. using the teaching of mapping a word’s semantic meaning into a universal word embedding space, as taught by Zhang et al., for the benefit of performing machine translation, and further using the teaching of a sentence paraphrase model, as taught by Qian et al., for the benefit of improving the diversity and quality of the paraphrased sentences (Qian et al. [0008] In this embodiment of this disclosure, the one or more of the plurality of paraphrased sentence generators are trained by using the source information and the similarity information as the reward, the source information is used to indicate the probability that each paraphrased sentence generator generates the paraphrased sentence of the training sentence, and the similarity information is used to indicate the similarity between the paraphrased sentence and the training sentence. In the sentence paraphrase model trained by using the method, both quality of the paraphrased sentence and diversity of the paraphrased sentence may be considered. Therefore, diversity of the paraphrased sentence and quality of the paraphrased sentence can be improved.)

7. 	Claim 25 is rejected under 35 U.S.C. 103 as being unpatentable over Elfardy et al. (US 11,418,461 B1) in view of Zhang et al. (US 2019/0197119 A1) and Johnson Premkumar et al. (US 2020/0342182 A1).

 	With respect to Claim 25, Elfardy et al. in view of Zhang et al. teach all the limitations of Claim 1 upon which Claim 25 depends. Elfardy et al. in view of Zhang et al. fail to explicitly teach 
 	wherein the second language model was trained by employing a plurality of semantic tasks that includes a natural language inference task that classifies a relationship between an ordered pair of natural language phrases, wherein training data for the natural language inference task includes a plurality of ordered pairs of phrases, each ordered pair of phrases of the plurality of ordered phrases includes a first phrase and a second phrase and is labeled with a ground truth relationship between ordered pair of phrases, wherein the classification of the relationship between the ordered pair of phrases includes one of entailment, contradiction, or neutrality.  
	However, Johnson Premkumar et al. teach 
 	wherein the second language model was trained by employing a plurality of semantic tasks that includes a natural language inference task that classifies a relationship between an ordered pair of natural language phrases, wherein training data for the natural language inference task includes a plurality of ordered pairs of phrases, each ordered pair of phrases of the plurality of ordered phrases includes a first phrase and a second phrase and is labeled with a ground truth relationship between ordered pair of phrases, wherein the classification of the relationship between the ordered pair of phrases includes one of entailment, contradiction, or neutrality.  
(Johnson Premkumar et al. [0043] Multilingual classification model 120 can be trained to solve a variety of NLP tasks including: semantic analysis, recognizing textual entailment, word sense disambiguation, topic recognition, sentiment analysis, etc. For example, recognizing textual entailment determines a relationship between sentences and/or phrases. In a variety of implementations, a multilingual classification model 120 trained for the NLP task of recognizing textual entailment can generate at least three classification labels: entailment, contradiction, and neutral. An entailment classification label indicates a second phrase and/or sentence is entailed (i.e., a consequence of) the first phrase and/or sentence. A contradiction classification label can indicate the second phrase and/or sentence is contradicted by the first phrase and/or sentence. Similarly, a neutral classification label can indicate a second sentence and/or phrase is neither entailed nor contradicted by the first sentence and/or phrase.)
 	Elfardy et al., Zhang et al. and Johnson Premkumar et al. are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the embedding steps taught by Elfardy et al. using the teaching of mapping a word’s semantic meaning into a universal word embedding space, as taught by Zhang et al., for the benefit of performing machine translation, and further using the teaching of a classification model, as taught by Johnson Premkumar et al., for the benefit of generating at least three classification labels: entailment, contradiction, and neutral (Johnson Premkumar et al. [0043] Multilingual classification model 120 can be trained to solve a variety of NLP tasks including: semantic analysis, recognizing textual entailment, word sense disambiguation, topic recognition, sentiment analysis, etc. For example, recognizing textual entailment determines a relationship between sentences and/or phrases. In a variety of implementations, a multilingual classification model 120 trained for the NLP task of recognizing textual entailment can generate at least three classification labels: entailment, contradiction, and neutral. An entailment classification label indicates a second phrase and/or sentence is entailed (i.e., a consequence of) the first phrase and/or sentence. A contradiction classification label can indicate the second phrase and/or sentence is contradicted by the first phrase and/or sentence. Similarly, a neutral classification label can indicate a second sentence and/or phrase is neither entailed nor contradicted by the first sentence and/or phrase.)

8. 	Claim 26 is rejected under 35 U.S.C. 103 as being unpatentable over Elfardy et al. (US 11,418,461 B1) in view of Zhang et al. (US 2019/0197119 A1) and Neumann (US 2021/0342212 A1).

 	With respect to Claim 26, Elfardy et al. in view of Zhang et al. teach all the limitations of Claim 1 upon which Claim 26 depends. Elfardy et al. in view of Zhang et al. fail to explicitly teach 
 	wherein the second language model was trained by employing a plurality of semantic tasks that includes a semantic similarity inference task that classifies a similarity for a pair of sentences, wherein training data for a semantic text similarity task includes a plurality of pairs of phrases, each pair of phrases of the plurality of phrases includes a first phrase and a second phrase and is labeled with a ground truth relationship between pair of phrases, wherein the classification of the similarity between the pair of phrases includes one of similar and not similar. 
	However, Neumann teaches 
	wherein the second language model was trained by employing a plurality of semantic tasks that includes a semantic similarity inference task that classifies a similarity for a pair of sentences, wherein training data for a semantic text similarity task includes a plurality of pairs of phrases, each pair of phrases of the plurality of phrases includes a first phrase and a second phrase and is labeled with a ground truth relationship between pair of phrases, wherein the classification of the similarity between the pair of phrases includes one of similar and not similar (Neumann [0027] With continued reference to FIG. 1, parsing module 112 may utilize, incorporate, or be a language processing module 116 as described above. Language processing module 116 may be configured to map at least a user input to at least a query, using any process as described above for a language processing module 116. Extraction and/or analysis may further involve polarity classification, in which parsing module 112 may determine, for instance, whether a phrase or sentence is a negation of the semantic content thereof, or a positive recitation of the semantic content; as a non-limiting example, polarity classification may enable parsing module 112 to determine that “my feet hurt” has a divergent meaning, or the opposite meaning, of the phrase “my feet don't hurt.” Polarity classification may be performed, without limitation, by consultation of a database of words that negate sentences, and/or geometrically within a vector space, where a negation of a given phrase may be distant from the non-negated version of the same phrase according to norms such as cosine similarity, see paragraphs [0029-0031])
 	Elfardy et al., Zhang et al. and Neumann are analogous art because they are from a similar field of endeavor, namely signal processing techniques and applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the embedding steps taught by Elfardy et al. using the teaching of mapping a word’s semantic meaning into a universal word embedding space, as taught by Zhang et al., for the benefit of performing machine translation, and further using the teaching of classification, as taught by Neumann, to determine whether two sentences have opposite or similar meanings (Neumann [0027] With continued reference to FIG. 1, parsing module 112 may utilize, incorporate, or be a language processing module 116 as described above. Language processing module 116 may be configured to map at least a user input to at least a query, using any process as described above for a language processing module 116. Extraction and/or analysis may further involve polarity classification, in which parsing module 112 may determine, for instance, whether a phrase or sentence is a negation of the semantic content thereof, or a positive recitation of the semantic content; as a non-limiting example, polarity classification may enable parsing module 112 to determine that “my feet hurt” has a divergent meaning, or the opposite meaning, of the phrase “my feet don't hurt.” Polarity classification may be performed, without limitation, by consultation of a database of words that negate sentences, and/or geometrically within a vector space, where a negation of a given phrase may be distant from the non-negated version of the same phrase according to norms such as cosine similarity, see paragraphs [0029-0031])

Allowable Subject Matter
9.	Claims 6, 8-14, and 16-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. 

Conclusion
10. 	The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. See PTO-892.
a.	He et al. (US 2018/0165278 A1.) In this reference, He et al. disclose a method for translating based on artificial intelligence. With the method, the text to be translated from the source language to the target language is acquired, in which, the text includes the target language term and the source language term. The candidate terms for translating the source language term and confidences of the candidate terms are determined. The candidate terms are used to replace the corresponding source language term, and each candidate term is combined with the target language term, so as to obtain each candidate translation. A probability of forming a smooth text when the candidate term is used in the candidate translation is predicted. Then the target term is chosen to be recommended according to the language probabilities of the candidate translations and the confidences of the candidate terms.
b.	Eck (US 2017/0371866 A1). In this reference, Eck discloses a method for improving a machine translation system. The machine translation system may apply one or more models for translating material from a source language into a destination language. The models are initially trained using training data. According to exemplary embodiments, supplemental training data is used to train the models, where the supplemental training data uses in-domain material to improve the quality of output translations. In-domain data may include data that relates to the same or similar topics as those expected to be encountered in a translation of material from the source language into the destination language. In-domain data may include material previously translated from the source language into the destination language, material similar to previous translations, and destination language material that has previously been the subject of a request for translation into the source language.
c. 	Eck et al. (US 2017/0371865 A1). In this reference, Eck et al. disclose techniques for detecting, removing, and/or replacing objectionable words and phrases in a machine-generated translation. A classifier identifies translations containing target words or phrases. The classifier may be applied to the output translation to remove target words and phrases from the translation, or to prevent target words and phrases from being automatically presented. Further, the classifier may be applied to a translation model to prevent the target words and phrases from appearing in the output translation. Still further, the classifier may be applied to training data so that the translation model is not trained using the target words or phrases. The classifier may remove target words or phrases only when the target words or phrases appear in the output translation but not the source language input data. The classifier may be provided as a standalone service, or may be employed in the context of a machine translation system.

11. 	Any inquiry concerning this communication or earlier communications from the examiner should be directed to THUYKHANH LE whose telephone number is (571)272-6429. The examiner can normally be reached Mon-Fri: 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew C. Flanders, can be reached on 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/THUYKHANH LE/Primary Examiner, Art Unit 2655