Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
2.	Claims 1-20 are present for examination.

Information Disclosure Statement
3.	The information disclosure statement (IDS) filed on 08/31/2021 was considered by the examiner.

Claim Rejections - 35 USC § 103
4.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
5.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

6.	Claims 1-2, 4-5, 14-16 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. 2018/0189259 (hereinafter Merl) in view of U.S. 2019/0349320 (hereinafter Karuppusamy), and further in view of U.S. 2019/0197106 (hereinafter Doggett).

receiving the user query ([0012]; “Language identification can be prerequisite, for many text processing and information retrieval applications…”);
separating the user query into a plurality of n-grams ([0016]; “An n-gram observation models and dictionary observation models can provide a prediction for a given token in a content item for whether the token is in a particular language. In various implementations, an observation model can include a probability distribution that provides, for a given input token, a likelihood that the input token is in a given language”); and
determining, by applying the machine learning model, a respective confidence score for each of the at least one candidate language ([0032]; “Given an input text (e.g., a social media post), the goal of the language inferring system can be to infer one or more languages of the input text (or at least the top language candidates), along with their associated confidence levels”).
Merl does not explicitly disclose the features of identifying at least one candidate language of the user query, by applying a machine learning model to the plurality of n-grams of the user query, wherein the machine learning model is trained with at least one multilingual text corpus and game-related data; and determining a respective match score for each of the one or more response matches.  However, Karuppusamy discloses that “The output from the trained category classifier 406 can include predicted probabilities for each category.  For example, the category classifier 406 can indicate that the user request 402 has a 90% probability of belonging to a first category and a 10% probability of belonging to a second category. A wide variety of categories can be utilized, depending on the types of requests received and/or the type of software application associated with the requests. Table 1 includes a listing of example user requests and categories related to a software application for a multiplayer online game” ([0031]).  Karuppusamy further discloses that “…the suggestion module 418 can utilize or implement various types of relevance models to rank Response IDs by relevance.  For example, a TF-IDF model can be used to weight each word (or each important word) in a query based on…” ([0045]) and it would have been obvious for one with ordinary skill in the art to utilize the teachings of Karuppusamy in the system of Merl in view of the desire to enhance the language detection system by utilizing the game-related text scheme resulting in improving the efficiency of utilizing a machine learning scheme.
The references do not explicitly disclose the features of identifying one or more response matches to the user query in language-specific game databases associated with each of the at least one candidate language that has a confidence score meeting a confidence threshold; and providing a response of search results including game information associated with particular response matches, based, at least in part, on respective weighted scores of at least the respective confidence score and the respective match score. However, Doggett discloses that “…In the case of the confidence level being non-determinative (e.g., the proposed language response is deemed to be 50/50 appropriate or inappropriate), the system may be configured to react in accordance with a variety of behaviors depending on the preference(s) of a system designer.  For example, in some embodiments, the system may move forward with using a proposed language response despite a lack of confidence in its appropriateness…” ([0044]).  In addition, Doggett discloses that “…classifier component 214 may be implemented as a ‘hardcoded’ algorithm/software in which matching operations can be performed to match one or more keywords resulting from the parsing process with keywords correlated to, in this example, either conversational response 

Regarding claims 2 and 15, Merl in view of Karuppusamy and Doggett discloses the method wherein providing the response includes ranking the response matches according to the respective weighted scores, and the method further comprises: generating the search results based, at least in part, on the one or more response matches that meet a ranking threshold (Doggett: [0044]).  Therefore, the limitations of claims 2 and 15 are rejected in the analysis of claims 1 or 14, and the claims are rejected on that basis.

Regarding claims 4 and 18, Merl in view of Karuppusamy and Doggett discloses the method wherein the game information includes connect information for online games and the response includes one or more user interface elements that, when activated by a user, navigate to digital information associated with playing a respective one of the online games (Karuppusamy: [0031]; Table 1).  Therefore, the limitations of claims 4 and 18 are rejected in the analysis of claims 1 or 14, and the claims are rejected on that basis.

Regarding claim 16, Merl in view of Karuppusamy and Doggett discloses the system wherein the ranking threshold includes at least one of a top pre-defined number of response matches or response matches with a weighted score over a predefined value (Doggett: [0044]).  Therefore, the limitations of claim 16 are rejected in the analysis of claim 14, and the claim is rejected on that basis. 
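For illustration only, the ranking limitation discussed for claims 2, 15 and 16 can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Merl, Karuppusamy, Doggett, or the application: the function name rank_matches, the dictionary keys, and the equal default weights are all hypothetical.

```python
def rank_matches(matches, conf_weight=0.5, match_weight=0.5,
                 top_n=None, min_score=None):
    """Rank response matches by a weighted score (hypothetical sketch).

    Each match is a dict with an "id", a language "confidence" score,
    and a "match" score. A match is kept if its weighted score exceeds
    min_score (when given); the result is cut to top_n (when given).
    """
    scored = []
    for m in matches:
        weighted = conf_weight * m["confidence"] + match_weight * m["match"]
        if min_score is None or weighted > min_score:
            scored.append((weighted, m["id"]))
    # Highest weighted score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n] if top_n is not None else scored


matches = [
    {"id": "a", "confidence": 0.9, "match": 0.5},
    {"id": "b", "confidence": 0.6, "match": 1.0},
    {"id": "c", "confidence": 0.2, "match": 0.2},
]
top_two = [match_id for _, match_id in rank_matches(matches, top_n=2)]
```

Either form of ranking threshold recited in claim 16 maps to one parameter here: top_n for a pre-defined number of matches, min_score for a weighted score over a predefined value.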

7.	Claims 3, 6, 17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Merl in view of Karuppusamy and Doggett, and further in view of U.S. 10,388,272 (hereinafter Thomson).

Regarding claims 3 and 17, Merl in view of Karuppusamy and Doggett does not explicitly disclose the method wherein generating the search results is further based on user profile information associated with a user of the user query.  However, Thomson discloses that “…The configuration service may store information on the individual devices or on a server in the transcription system 108” (col. 17, lns. 27-34) and it would have been obvious for one with ordinary skill in the art to utilize the teaching of Thomson in the modified system of Merl in view of the desire to enhance the language detection system by utilizing the user profile system resulting in improving the efficiency of utilizing a machine learning scheme.
For a word that is out-of-vocabulary, such as when the word is not part of a first lexicon or does not have a known probability in the first language model, the second language model may be used to estimate probabilities based on subword units” (col. 41, lns. 56-col. 42, lns. 3).  Thomson additionally discloses that “…a language manager tool 1210 may be configured to manage the candidate lexicon database 1208.  For example, in some embodiments, the language manager tool 1210 may manage the candidate lexicon database 1208 automatically or based on user input.  Management of the candidate lexicon database 1208 may include reviewing the term has been reviewed, the candidate lexicon database 1207 may be updated to either remove the term or mark the term as accepted or rejected” (col. 53, lns. 4-31) and it would have been obvious for one with ordinary skill in the art to utilize the teaching of Thomson in the modified system of Merl in view of the desire to enhance the language detection system by utilizing the unknown language data resulting in improving the efficiency of utilizing a machine learning scheme.

8.	Claims 7-8, 10 and 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Merl in view of Thomson.

Regarding claim 7, Merl discloses a method of training a machine learning model to identify candidate languages of a user query for online games, wherein the method comprises:
the observation model can be seeded with public, labeled data (e.g., online encyclopedia content) and then can be trained with unlabeled organic data (unlabeled means that the language of data has not been identified yet)”);
creating training text data by separating the text data into n-grams ([0016]; “An n-gram observation models and dictionary observation models can provide a prediction for a given token in a content item for whether the token is in a particular language. In various implementations, an observation model can include a probability distribution that provides, for a given input token, a likelihood that the input token is in a given language”); and
training the machine learning model to generate at least one candidate language identifier of the user query and a respective language confidence score, wherein the training comprises:
training with the training text data without the associated language labels to obtain current predicted language labels by the machine learning model ([0030]; “A ‘model’ as used herein, refers to a component that is trained using training data to make predictions or provide probabilities for new data items, whether or not the new data items were included in the training data”).
Merl does not explicitly disclose the features of determining associated language labels for the second text data, based on a threshold number of a plurality of non-specific language detectors identifying at least one language; and retraining the machine learning model with discrepancy information between the current predicted language labels and the associated language labels, to update the current predicted language labels.  However, Thomson discloses that “…the accuracy requirement may be associated with a selection threshold value.  In these and other embodiments, the selector 406 may compare the estimated accuracy of a first unit, such as one of the ASR systems 4240 or one of the transcription unit 414 to the selection threshold value” (col. 35, lns. 65-col. 36, lns. 17; table 4) and it would have been obvious for one with ordinary skill in the art to utilize the teaching of Thomson in the system of Merl in view of the desire to enhance the language detection system by retraining a model to update language predictions data resulting in improving the efficiency of utilizing a machine learning scheme.
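For illustration only, the limitation of determining language labels based on a threshold number of language detectors identifying a language can be sketched as a simple vote count. This is a hedged sketch, not the method of Merl, Thomson, or the application: the function name consensus_label and the min_agree parameter are hypothetical, and the detector outputs are assumed to be plain language codes.

```python
from collections import Counter


def consensus_label(detector_outputs, min_agree):
    """Return the language named by at least min_agree detectors, else None.

    detector_outputs is a list of language codes, one per detector
    (hypothetical input format).
    """
    if not detector_outputs:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, votes = Counter(detector_outputs).most_common(1)[0]
    return label if votes >= min_agree else None


agreed = consensus_label(["es", "es", "pt"], min_agree=2)
disputed = consensus_label(["es", "pt", "fr"], min_agree=2)
```

Text for which no language reaches the agreement threshold is left unlabeled in this sketch rather than being assigned a guess.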

Regarding claim 8, Merl in view of Thomson discloses the method wherein the training further comprises iteratively repeating the retraining until the discrepancy information meets a threshold of accuracy (Thomson: table 4).  Therefore, the limitations of claim 8 are rejected in the analysis of claim 7, and the claim is rejected on that basis. 

Regarding claim 10, Merl in view of Thomson discloses the method wherein creating the training text data further includes aggregating word strings of text data having a length shorter than a low threshold length, with related text to create extended strings of words (Merl: [0002 and 0018]).

Regarding claim 12, Merl in view of Thomson discloses the method wherein the n-grams include at least one of: one character, two characters, or three characters (Thomson: col. 41, lns. 56-col. 42, lns. 3).  Therefore, the limitations of claim 12 are rejected in the analysis of claim 7, and the claim is rejected on that basis.
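For illustration only, separating text into n-grams of one, two, or three characters, as recited in claim 12, can be sketched as follows. This is a minimal sketch and is not taken from any cited reference; the function name char_ngrams and its sizes parameter are hypothetical.

```python
def char_ngrams(text, sizes=(1, 2, 3)):
    """Return the character n-grams of text, grouped by size, then by position."""
    grams = []
    for n in sizes:
        # Slide a window of width n across the text; a text shorter
        # than n contributes no n-grams of that size.
        grams.extend(text[i:i + n] for i in range(len(text) - n + 1))
    return grams


grams = char_ngrams("game")
# grams == ['g', 'a', 'm', 'e', 'ga', 'am', 'me', 'gam', 'ame']
```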

9.	Claims 9 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Merl in view of Thomson, and further in view of U.S. 10,083,176 (hereinafter Desai).

Regarding claim 9, Merl in view of Thomson does not explicitly disclose the method wherein creating the training text data further includes splitting word strings of the text data having a length greater than an upper threshold length, into random word lengths of a predetermined length.  However, Desai discloses that “In step 1130, the document is tokenized.  For example, content of the document or other associated metadata may be tokenized.  Tokenization may occur using tokenizer to extract individual terms or sequences of terms” (col. 26, lns. 12-27) and it would have been obvious for one with ordinary skill in the art to utilize the teaching of Desai in the system of Merl in view of the desire to enhance the language detection system by utilizing the feature of splitting text based on its length resulting in improving the efficiency of utilizing a machine learning scheme.

Regarding claim 11, Merl in view of Thomson does not explicitly disclose the method wherein creating the training text data is performed by determining the n-grams that meet a commonality threshold across languages and within languages.  However, Desai discloses that “To further improve the quality of semantic vectors, semantic vector analysis module 340 may apply certain filters.  In one example, semantic vector analysis module 340 may apply one or more rules to remove terms with low Inverse Document Frequency (IDF)” (col. 15, lns. 58-col. 16, lns. 6) and it would have been obvious for one with ordinary skill in the art to utilize the teaching of Desai in the system of Merl in view of the desire to enhance the language detection system by utilizing the n-gram commonality resulting in improving the efficiency of utilizing a machine learning scheme.
 
Conclusion
10.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MONICA M PYO whose telephone number is (571)272-8192. The examiner can normally be reached Monday-Friday 8am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, APU MOFIZ can be reached on 571-272-4080. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.








/MONICA M PYO/Primary Examiner, Art Unit 2161