DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The Amendment filed October 10, 2022 has been entered.  Claims 1 – 20 are pending in the application.  Applicant’s amendments to the Specification and Claims have overcome each and every objection and 35 U.S.C. 112(a) rejection previously set forth in the Non-Final Office Action mailed July 21, 2022.
Response to Arguments
Applicant's arguments filed October 10, 2022 have been fully considered but they are not persuasive.
On page 22 of Applicant’s response, Applicant argues that Arnold et al. (US Patent Application Publication No. 2018/0101599), hereinafter Arnold, does not disclose “The predicting is based on the input phrase and the corpus of the documents in a search index for the specific domain, and the input phrase and a next word label for the input phrase for the prediction of the one or more next words is derived from each sentence in the corpus.”, as included in the amended independent claims 1, 8 and 15.
Arnold recites, in paragraph 0009, lines 1-12, “In general, the document context is a function of various features, including, but not limited to, previous documents (including document sentences and metadata), preceding sentences in the current document, and metadata of the current document. In general, the use of the term “previous documents” in terms of document context refers to either all previous documents (e.g., some or all of the documents used for training the language model), or to a subset of those previous documents that are determined to be relevant based on the document context (e.g., prior emails with the same recipient) and/or content of the current user document.”, disclosing basing the text completion prediction on sentences in previous documents, where “a subset of those previous documents that are determined to be relevant based on the document context” reads on a specific domain.
Arnold also recites, in paragraph 0036, lines 1-4, “the retrieval model is derived from a store of prior documents created or edited by the user (or any other desired source or corpus of existing documents)”, disclosing basing the text completion prediction on a corpus of documents, where “a store of prior documents” reads on documents in a search index.
Arnold further recites, in paragraph 0116, lines 11-18, “Similarly, at any time that the user enters text via the keyboard 215 (or other input mechanism) rather than selecting one of the completion suggestion controls (e.g., 220, 225 or 230), some or all of those completion suggestion controls may change in response to user this user text entry based the combination of the content of the document and the current document context, as discussed above.”, disclosing basing the text completion prediction on the input phrase, where “user text entry” reads on the input phrase.
Claim Objections
Claims 1, 8 and 15 are objected to because of the following informalities:
In Claim 1, line 8, “an search index” should read “a search index”.
In Claim 8, line 10, “an search index” should read “a search index”.
In Claim 15, line 13, “an search index” should read “a search index”.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1 – 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claim 1, the limitation “predicting, by the one or more computer processors, one or more next words for the input phrase, wherein the prediction is based on a deep neural network model that has been trained with a corpus of documents for a specific domain and the predicting being based on the input phrase and the corpus of the documents in a search index for the specific domain, the input phrase and a next word label for the input phrase for the prediction of the one or more next words being derived from each sentence in the corpus” is indefinite because it is not clear how the input phrase is being derived from sentences in the corpus.  For examination purposes, “the input phrase and a next word label for the input phrase for the prediction of the one or more next words being derived from each sentence in the corpus” will be interpreted to mean that the input phrase is matched to phrases contained in sentences in the corpus to predict the one or more next words.
Claims 2 – 7 depend from claim 1, and thus recite the limitations of claim 1, and do not resolve the indefinite language from claim 1.
Also, in claim 5, the limitation “wherein the prediction weight is a mean of a combined string similarity for each prediction pair of the plurality of prediction pairs” is indefinite because it is not clear how “combined” is being used to limit the “string similarity for each prediction pair”, and it is not clear whether “a mean of the combined string similarity for each prediction pair” is a mean of all the string similarity scores or a separate mean for each predicted pair.  If the mean is interpreted as a mean of all string similarity scores, the prediction weight would be the same for all predictions, and the claim 5 limitation “multiplying, by the one or more computer processors, a first normalized score of each first prediction of the one or more first predictions by the prediction weight to create a first weighted score for each first prediction” would not perform the function of generating weighted scores.  For examination purposes, the term “prediction weight” will be interpreted as a weight used for comparing similarity scores from the first system with similarity scores from the second system.
In addition, claim 6 recites the limitation “selecting, by the one or more computer processors, one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity”.  There is insufficient antecedent basis for “the vector similarity” in the claim limitation.
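For illustration only, the two readings of the claim 5 “prediction weight” discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application or from Arnold: the example predictions, the normalized scores, and the use of Python's difflib.SequenceMatcher as the string similarity measure are all hypothetical, since the claims do not name a specific measure.

```python
from difflib import SequenceMatcher
from statistics import mean


def string_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; the claims do not name a specific measure."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical predictions and normalized scores from a first system,
# and hypothetical predictions from a second system.
first_scores = {"reset my password": 0.7, "reset my router": 0.3}
second_predictions = ["reset my password now", "restart my router"]

# Reading A: one mean over the string similarity of every prediction pair.
all_pairs = [(f, s) for f in first_scores for s in second_predictions]
global_weight = mean(string_similarity(f, s) for f, s in all_pairs)
weighted_a = {f: score * global_weight for f, score in first_scores.items()}

# Reading B: a separate mean for each first prediction.
weights_b = {
    f: mean(string_similarity(f, s) for s in second_predictions)
    for f in first_scores
}
weighted_b = {f: first_scores[f] * weights_b[f] for f in first_scores}
```

Under reading A, every first prediction is multiplied by the same factor, so the relative ordering of the first predictions is unchanged. Under reading B, each first prediction receives its own factor, so the multiplication can change the ordering.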
Regarding claim 8, the limitation “predict one or more next words for the input phrase, wherein the prediction is based on a deep neural network model that has been trained with a corpus of documents for a specific domain, and the prediction being based on the input phrase and the corpus of the documents in a search index for the specific domain, the input phrase and a next word label for the input phrase for the prediction of the one or more next words being derived from each sentence in the corpus” is indefinite because it is not clear how the input phrase is being derived from sentences in the corpus.  For examination purposes, “the input phrase and a next word label for the input phrase for the prediction of the one or more next words being derived from each sentence in the corpus” will be interpreted to mean that the input phrase is matched to phrases contained in sentences in the corpus to predict the one or more next words.
Claims 9 – 14 depend from claim 8, and thus recite the limitations of claim 8, and do not resolve the indefinite language from claim 8.
Also, in claim 12, the limitation “wherein the prediction weight is a mean of a combined string similarity for each prediction pair of the plurality of prediction pairs” is indefinite because it is not clear how “combined” is being used to limit the “string similarity for each prediction pair”, and it is not clear whether “a mean of the combined string similarity for each prediction pair” is a mean of all the string similarity scores or a separate mean for each predicted pair.  If the mean is interpreted as a mean of all string similarity scores, the prediction weight would be the same for all predictions, and the claim 12 limitation “multiply a first normalized score of each first prediction of the one or more first predictions by the prediction weight to create a first weighted score for each first prediction” would not perform the function of generating weighted scores.  For examination purposes, the term “prediction weight” will be interpreted as a weight used for comparing similarity scores from the first system with similarity scores from the second system.
In addition, claim 13 recites the limitation “select one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity”.  There is insufficient antecedent basis for “the vector similarity” in the claim limitation.
Regarding claim 15, the limitation “predict one or more next words for the input phrase, wherein the prediction is based on a deep neural network model that has been trained with a corpus of documents for a specific domain, and the prediction being based on the input phrase and the corpus of the documents in a search index for the specific domain, the input phrase and a next word label for the input phrase for the prediction of the one or more next words being derived from each sentence in the corpus” is indefinite because it is not clear how the input phrase is being derived from sentences in the corpus.  For examination purposes, “the input phrase and a next word label for the input phrase for the prediction of the one or more next words being derived from each sentence in the corpus” will be interpreted to mean that the input phrase is matched to phrases contained in sentences in the corpus to predict the one or more next words.
Claims 16 – 20 depend from claim 15, and thus recite the limitations of claim 15, and do not resolve the indefinite language from claim 15.
Also, in claim 19, the limitation “wherein the prediction weight is a mean of a combined string similarity for each prediction pair of the plurality of prediction pairs” is indefinite because it is not clear how “combined” is being used to limit the “string similarity for each prediction pair”, and it is not clear whether “a mean of the combined string similarity for each prediction pair” is a mean of all the string similarity scores or a separate mean for each predicted pair.  If the mean is interpreted as a mean of all string similarity scores, the prediction weight would be the same for all predictions, and the claim 19 limitation “multiply a first normalized score of each first prediction of the one or more first predictions by the prediction weight to create a first weighted score for each first prediction” would not perform the function of generating weighted scores.  For examination purposes, the term “prediction weight” will be interpreted as a weight used for comparing similarity scores from the first system with similarity scores from the second system.
In addition, claim 20 recites the limitation “select one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity”.  There is insufficient antecedent basis for “the vector similarity” in the claim limitation.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 8 and 15 are rejected under 35 U.S.C. 102(a)(1) and 102(a)(2) as being anticipated by Arnold et al. (US Patent Application Publication No. 2018/0101599), hereinafter Arnold.
Regarding claim 1, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses a computer-implemented method for predictive auto completion, the computer-implemented method comprising:
receiving, by one or more computer processors (Figure 8, "Processing Unit(s) 810"), an input phrase for an inquiry, wherein the input phrase is a sequence of words (Paragraph 0001, lines 1-2, "A wide variety of applications have been implemented for autocompleting words or search queries."; Paragraph 0109, lines 1-10, "As illustrated by FIG. 2, in various implementations, the user interface of the Interactive Text Completion System includes a text input region 210 in which text is entered and/or edited by the user. For example, the text input region 210 of FIG. 2 illustrates user entered and/or selected text of “My address”. In addition, in various implementations, text, including whole words, individual characters, punctuation and symbols, may be entered into the text input region 210 and/or edited via a virtual keyboard 215 rendered on the display 200.");
predicting, by the one or more computer processors, one or more next words for the input phrase, wherein the prediction is based on a deep neural network model that has been trained with a corpus of documents for a specific domain (Paragraph 0007, lines 1-6, "In various implementations, the multi-word text completion suggestions are a combination of both predicted word completions (e.g., the remainder of a word that a user has begun entering, a spelling corrected word, or a complete next word) and sequences of multiple suggested follow-on words."; Paragraph 0038, lines 1-5, "Techniques applied to generate the retrieval model and the language model include, but are not limited to, statistical language models, contextual neural language models, recurrent neural networks, N-gram based language prediction models, etc."; Paragraph 0008, lines 1-3, "Both the retrieval model and the language model are trained, via any of a variety of machine-learning techniques, on a corpus of preexisting documents."; Paragraph 0039, lines 8-12, "The source documents from which the candidates are extracted may be the user's prior documents (or some context-based subset of those documents) or any other corpus of preexisting documents created by one or more authors."),
and the predicting being based on the input phrase and the corpus of the documents in a search index for the specific domain, the input phrase and a next word label for the input phrase for the prediction of the one or more next words being derived from each sentence in the corpus (Paragraph 0009, lines 1-12, “In general, the document context is a function of various features, including, but not limited to, previous documents (including document sentences and metadata), preceding sentences in the current document, and metadata of the current document. In general, the use of the term “previous documents” in terms of document context refers to either all previous documents (e.g., some or all of the documents used for training the language model), or to a subset of those previous documents that are determined to be relevant based on the document context (e.g., prior emails with the same recipient) and/or content of the current user document.”; Paragraph 0036, lines 1-4, “the retrieval model is derived from a store of prior documents created or edited by the user (or any other desired source or corpus of existing documents)”; Paragraph 0116, lines 11-18, “Similarly, at any time that the user enters text via the keyboard 215 (or other input mechanism) rather than selecting one of the completion suggestion controls (e.g., 220, 225 or 230), some or all of those completion suggestion controls may change in response to user this user text entry based the combination of the content of the document and the current document context, as discussed above.”; The store of prior documents reads on the corpus of the documents in a search index, the subset of previous documents that are determined to be relevant based on the document context reads on documents for the specific domain, and the user text entry reads on the input phrase.);
appending, by the one or more computer processors, the one or more next words to the input phrase to create one or more predicted phrases (Paragraph 0059, lines 11-15, "In various implementations, the multi-word text completion suggestions are a combination of both predicted word completions (e.g., the remainder of a word, or a spelling corrected word, that a user has begun entering) and sequences of multiple suggested follow-on words."; Paragraph 0128, lines 14-20, "If the user has not entered one or more letters of a partial word, then the first selectable word in completion suggestion control 420 will simply be a full word. In this case, this full word may be optionally highlighted to show that it is the next word that will be appended to the text in text input region 410 if it, or any subsequent word in the completion suggestion control, is selected by the user.");
and sorting, by the one or more computer processors, the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on a similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain (Paragraph 0007, lines 1-15, "In various implementations, the multi-word text completion suggestions are a combination of both predicted word completions (e.g., the remainder of a word that a user has begun entering, a spelling corrected word, or a complete next word) and sequences of multiple suggested follow-on words. When generating the multi-word completion suggestions, a machine-learned retrieval model is first applied to extract a plurality of candidate suggestions from a corpus of pre-existing documents (also referred to herein as “source documents”) based on a current content of the current user document. These candidate suggestions are then scored by a language model based on an automatically determined document context, with a plurality of highest scoring candidate suggestions then being presented or otherwise output as the aforementioned multi-word completion suggestions.").
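For illustration only, the flow quoted from Arnold's paragraph 0007 (extract candidate suggestions from a corpus, score them, output the highest scoring candidates) can be sketched as follows. This is a minimal sketch and not Arnold's implementation: the function names, the toy corpus, and the use of exact phrase matching and frequency counting in place of a machine-learned retrieval model and a language model are all assumptions.

```python
from collections import Counter


def retrieve_candidates(corpus, typed, max_words=3):
    """Collect the words that follow each occurrence of the typed text.

    Exact token matching is a stand-in for a machine-learned retrieval model.
    """
    typed_tokens = typed.lower().split()
    n = len(typed_tokens)
    candidates = []
    for sentence in corpus:
        tokens = sentence.lower().split()
        # Stop early enough that at least one word follows the match.
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == typed_tokens:
                candidates.append(" ".join(tokens[i + n:i + n + max_words]))
    return candidates


def top_suggestions(corpus, typed, k=2):
    """Score candidates by frequency (a stand-in for a language model score)
    and return the k highest scoring ones."""
    counts = Counter(retrieve_candidates(corpus, typed))
    return [candidate for candidate, _ in counts.most_common(k)]


# Hypothetical corpus of prior documents.
corpus = [
    "my address is 1 main street",
    "my address is 1 main street",
    "my address has changed",
]
suggestions = top_suggestions(corpus, "My address")
```

With this toy corpus the continuation seen twice ranks ahead of the continuation seen once.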
Regarding claim 8, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses a computer program product for predictive auto completion, the computer program product comprising one or more computer readable storage media and program instructions stored on the one or more computer readable storage media (Paragraph 0049, lines 3-7, “The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.”), the program instructions including instructions to:
receive an input phrase for an inquiry, wherein the input phrase is a sequence of words (Paragraph 0001, lines 1-2, "A wide variety of applications have been implemented for autocompleting words or search queries."; Paragraph 0109, lines 1-10, "As illustrated by FIG. 2, in various implementations, the user interface of the Interactive Text Completion System includes a text input region 210 in which text is entered and/or edited by the user. For example, the text input region 210 of FIG. 2 illustrates user entered and/or selected text of “My address”. In addition, in various implementations, text, including whole words, individual characters, punctuation and symbols, may be entered into the text input region 210 and/or edited via a virtual keyboard 215 rendered on the display 200.");
predict one or more next words for the input phrase, wherein the prediction is based on a deep neural network model that has been trained with a corpus of documents for a specific domain (Paragraph 0007, lines 1-6, "In various implementations, the multi-word text completion suggestions are a combination of both predicted word completions (e.g., the remainder of a word that a user has begun entering, a spelling corrected word, or a complete next word) and sequences of multiple suggested follow-on words."; Paragraph 0038, lines 1-5, "Techniques applied to generate the retrieval model and the language model include, but are not limited to, statistical language models, contextual neural language models, recurrent neural networks, N-gram based language prediction models, etc."; Paragraph 0008, lines 1-3, "Both the retrieval model and the language model are trained, via any of a variety of machine-learning techniques, on a corpus of preexisting documents."; Paragraph 0039, lines 8-12, "The source documents from which the candidates are extracted may be the user's prior documents (or some context-based subset of those documents) or any other corpus of preexisting documents created by one or more authors."),
and the prediction being based on the input phrase and the corpus of the documents in a search index for the specific domain, the input phrase and a next word label for the input phrase for the prediction of the one or more next words being derived from each sentence in the corpus (Paragraph 0009, lines 1-12, “In general, the document context is a function of various features, including, but not limited to, previous documents (including document sentences and metadata), preceding sentences in the current document, and metadata of the current document. In general, the use of the term “previous documents” in terms of document context refers to either all previous documents (e.g., some or all of the documents used for training the language model), or to a subset of those previous documents that are determined to be relevant based on the document context (e.g., prior emails with the same recipient) and/or content of the current user document.”; Paragraph 0036, lines 1-4, “the retrieval model is derived from a store of prior documents created or edited by the user (or any other desired source or corpus of existing documents)”; Paragraph 0116, lines 11-18, “Similarly, at any time that the user enters text via the keyboard 215 (or other input mechanism) rather than selecting one of the completion suggestion controls (e.g., 220, 225 or 230), some or all of those completion suggestion controls may change in response to user this user text entry based the combination of the content of the document and the current document context, as discussed above.”; The store of prior documents reads on the corpus of the documents in a search index, the subset of previous documents that are determined to be relevant based on the document context reads on documents for the specific domain, and the user text entry reads on the input phrase.);
append the one or more next words to the input phrase to create one or more predicted phrases (Paragraph 0059, lines 11-15, "In various implementations, the multi-word text completion suggestions are a combination of both predicted word completions (e.g., the remainder of a word, or a spelling corrected word, that a user has begun entering) and sequences of multiple suggested follow-on words."; Paragraph 0128, lines 14-20, "If the user has not entered one or more letters of a partial word, then the first selectable word in completion suggestion control 420 will simply be a full word. In this case, this full word may be optionally highlighted to show that it is the next word that will be appended to the text in text input region 410 if it, or any subsequent word in the completion suggestion control, is selected by the user.");
and sort the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on a similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain (Paragraph 0007, lines 1-15, "In various implementations, the multi-word text completion suggestions are a combination of both predicted word completions (e.g., the remainder of a word that a user has begun entering, a spelling corrected word, or a complete next word) and sequences of multiple suggested follow-on words. When generating the multi-word completion suggestions, a machine-learned retrieval model is first applied to extract a plurality of candidate suggestions from a corpus of pre-existing documents (also referred to herein as “source documents”) based on a current content of the current user document. These candidate suggestions are then scored by a language model based on an automatically determined document context, with a plurality of highest scoring candidate suggestions then being presented or otherwise output as the aforementioned multi-word completion suggestions.").
Regarding claim 15, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses a computer system for predictive auto completion, the computer system comprising:
one or more computer processors (Figure 8, "Processing Unit(s) 810");
one or more computer readable storage media (Paragraph 0049, lines 3-7, “The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.”);
and program instructions stored on the one or more computer readable storage media for execution by at least one of the one or more computer processors, the stored program instructions including instructions to:
receive an input phrase for an inquiry, wherein the input phrase is a sequence of words (Paragraph 0001, lines 1-2, "A wide variety of applications have been implemented for autocompleting words or search queries."; Paragraph 0109, lines 1-10, "As illustrated by FIG. 2, in various implementations, the user interface of the Interactive Text Completion System includes a text input region 210 in which text is entered and/or edited by the user. For example, the text input region 210 of FIG. 2 illustrates user entered and/or selected text of “My address”. In addition, in various implementations, text, including whole words, individual characters, punctuation and symbols, may be entered into the text input region 210 and/or edited via a virtual keyboard 215 rendered on the display 200.");
predict one or more next words for the input phrase, wherein the prediction is based on a deep neural network model that has been trained with a corpus of documents for a specific domain (Paragraph 0007, lines 1-6, "In various implementations, the multi-word text completion suggestions are a combination of both predicted word completions (e.g., the remainder of a word that a user has begun entering, a spelling corrected word, or a complete next word) and sequences of multiple suggested follow-on words."; Paragraph 0038, lines 1-5, "Techniques applied to generate the retrieval model and the language model include, but are not limited to, statistical language models, contextual neural language models, recurrent neural networks, N-gram based language prediction models, etc."; Paragraph 0008, lines 1-3, "Both the retrieval model and the language model are trained, via any of a variety of machine-learning techniques, on a corpus of preexisting documents."; Paragraph 0039, lines 8-12, "The source documents from which the candidates are extracted may be the user's prior documents (or some context-based subset of those documents) or any other corpus of preexisting documents created by one or more authors."),
and the prediction being based on the input phrase and the corpus of the documents in an search index for the specific domain, the input phrase and a next word label for the input phrase for the prediction of the one or more next words being derived from each sentence in the corpus (Paragraph 0009, lines 1-12, “In general, the document context is a function of various features, including, but not limited to, previous documents (including document sentences and metadata), preceding sentences in the current document, and metadata of the current document. In general, the use of the term “previous documents” in terms of document context refers to either all previous documents (e.g., some or all of the documents used for training the language model), or to a subset of those previous documents that are determined to be relevant based on the document context (e.g., prior emails with the same recipient) and/or content of the current user document.”; Paragraph 0036, lines 1-4, “the retrieval model is derived from a store of prior documents created or edited by the user (or any other desired source or corpus of existing documents)”; Paragraph 0116, lines 11-18, “Similarly, at any time that the user enters text via the keyboard 215 (or other input mechanism) rather than selecting one of the completion suggestion controls (e.g., 220, 225 or 230), some or all of those completion suggestion controls may change in response to user this user text entry based the combination of the content of the document and the current document context, as discussed above.”; The store of prior documents reads on the corpus of the documents in a search index, the subset of previous documents that are determined to be relevant based on the document context reads on documents for the specific domain, and the user text entry reads on the input phrase.);
append the one or more next words to the input phrase to create one or more predicted phrases (Paragraph 0059, lines 11-15, "In various implementations, the multi-word text completion suggestions are a combination of both predicted word completions (e.g., the remainder of a word, or a spelling corrected word, that a user has begun entering) and sequences of multiple suggested follow-on words."; Paragraph 0128, lines 14-20, "If the user has not entered one or more letters of a partial word, then the first selectable word in completion suggestion control 420 will simply be a full word. In this case, this full word may be optionally highlighted to show that it is the next word that will be appended to the text in text input region 410 if it, or any subsequent word in the completion suggestion control, is selected by the user.");
and sort the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on a similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain (Paragraph 0007, lines 1-15, "In various implementations, the multi-word text completion suggestions are a combination of both predicted word completions (e.g., the remainder of a word that a user has begun entering, a spelling corrected word, or a complete next word) and sequences of multiple suggested follow-on words. When generating the multi-word completion suggestions, a machine-learned retrieval model is first applied to extract a plurality of candidate suggestions from a corpus of pre-existing documents (also referred to herein as “source documents”) based on a current content of the current user document. These candidate suggestions are then scored by a language model based on an automatically determined document context, with a plurality of highest scoring candidate suggestions then being presented or otherwise output as the aforementioned multi-word completion suggestions.").
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 7, 9, 14 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Arnold in view of Chen et al. ("Gmail Smart Compose: Real-Time Assisted Writing"), hereinafter Chen.
Regarding claim 2, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer-implemented method as claimed in claim 1, but does not specifically disclose: wherein appending the one or more next words to the input phrase to create the one or more predicted phrases comprises: determining, by the one or more computer processors, whether any next word of the one or more next words for the input phrase is an end of sentence, wherein the end of sentence denotes that a specific phrase of the one or more predicted phrases is complete; responsive to determining that any specific phrase of the one or more predicted phrases is complete, storing, by the one or more computer processors, the completed any specific phrase in a list of completed phrases; responsive to determining that any specific phrase of the one or more predicted phrases is not complete, appending, by the one or more computer processors, the one or more next words to the any specific phrase of the one or more predicted phrases that is not complete; and predicting, by the one or more computer processors, one or more next words for the any specific phrase of the one or more predicted phrases that is not complete.
Chen teaches:
wherein appending the one or more next words to the input phrase to create the one or more predicted phrases comprises:
determining, by the one or more computer processors, whether any next word of the one or more next words for the input phrase is an end of sentence, wherein the end of sentence denotes that a specific phrase of the one or more predicted phrases is complete (Section 3.3, lines 5-15, "At each beam search step, new candidates are generated by extending each candidate by one token and adding them to the heap. Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates. Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length.");
responsive to determining that any specific phrase of the one or more predicted phrases is complete, storing, by the one or more computer processors, the completed any specific phrase in a list of completed phrases (Section 3.3, lines 12-16, "Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length. Upon completion, a candidate sequence will be added to the set of generated suggestions.");
responsive to determining that any specific phrase of the one or more predicted phrases is not complete, appending, by the one or more computer processors, the one or more next words to the any specific phrase of the one or more predicted phrases that is not complete (Section 3.3, lines 5-15, "At each beam search step, new candidates are generated by extending each candidate by one token and adding them to the heap. Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates. Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length."; In the case where the candidate sequence is not considered complete, extending a candidate by one token and adding it to the heap reads on appending the next word.);
and predicting, by the one or more computer processors, one or more next words for the any specific phrase of the one or more predicted phrases that is not complete (Section 3.3, lines 5-15, "At each beam search step, new candidates are generated by extending each candidate by one token and adding them to the heap. Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates. Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length."; In the case where the candidate sequence is not considered complete, generating a new candidate by extending a candidate by one token reads on predicting the next word.).
Chen teaches generating a next word, appending the next word to the candidate sequence to form a new candidate sequence, determining if the candidate sequence is complete, and adding the candidate sequence to a set of generated suggestions when the phrase is complete, in order to assist the user in writing by reducing repetitive typing (Abstract, lines 1-3, "In this paper, we present Smart Compose, a novel system for generating interactive, real-time suggestions in Gmail that assists users in writing mails by reducing repetitive typing.").
Arnold and Chen are considered to be analogous to the claimed invention because they are in the same field of text autocomplete systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Chen to generate a next word, append the next word to the candidate sequence to form a new candidate sequence, determine if the candidate sequence is complete, and add the candidate sequence to a set of generated suggestions when the phrase is complete.  Doing so would allow for assisting the user in writing by reducing repetitive typing.
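For illustration only, the beam search procedure quoted from Section 3.3 of Chen can be sketched as follows. This is a minimal sketch under stated assumptions and is not Chen's implementation: `TOY_MODEL` and `next_token_probs` are hypothetical stand-ins for the softmax output of a trained model, the parameters `k`, `m` and `max_len` correspond to the top k tokens, the m best candidates and the predefined maximum output sequence length in the quoted passage, and the prefix is assumed to be non-empty.

```python
import heapq
import math

EOS = "<EOS>"
PUNCT = {".", "?", "!"}

# Hypothetical toy model: maps the last token to next-token probabilities.
TOY_MODEL = {
    "thank": {"you": 0.9, "goodness": 0.1},
    "you": {"very": 0.6, ".": 0.4},
    "very": {"much": 1.0},
    "much": {".": 1.0},
    "goodness": {EOS: 1.0},
}


def next_token_probs(tokens):
    """Stand-in for a model's softmax output over the vocabulary."""
    return TOY_MODEL.get(tokens[-1], {EOS: 1.0})


def beam_search(prefix, k=2, m=2, max_len=6):
    """Return completed candidates as (log probability, tokens), best first."""
    beam = [(0.0, list(prefix))]
    completed = []
    while beam:
        # Extend every open candidate by each of its k most likely tokens.
        extensions = []
        for logp, tokens in beam:
            probs = next_token_probs(tokens)
            top_k = heapq.nlargest(k, probs.items(), key=lambda kv: kv[1])
            for tok, p in top_k:
                extensions.append((logp + math.log(p), tokens + [tok]))
        # Prune to the m best, then separate complete from incomplete candidates.
        beam = []
        for logp, tokens in heapq.nlargest(m, extensions, key=lambda c: c[0]):
            last = tokens[-1]
            if last == EOS or last in PUNCT or len(tokens) >= max_len:
                completed.append((logp, tokens))  # stored as a suggestion
            else:
                beam.append((logp, tokens))  # extended again in the next step
    return sorted(completed, key=lambda c: c[0], reverse=True)


for logp, tokens in beam_search(["thank"]):
    print(round(math.exp(logp), 2), " ".join(tokens))
```

In this sketch, a candidate that ends in a sentence punctuation token or the end-of-sequence token is moved to the list of completed suggestions, while every other surviving candidate receives a further predicted token in the next step.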
Regarding claim 7, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer-implemented method as claimed in claim 1, but does not specifically disclose: further comprising: selecting, by the one or more computer processors, a subset of the one or more predicted phrases, wherein a size of the subset is a predetermined number; and sending, by the one or more computer processors, the subset of the one or more predicted phrases to a user.
Chen teaches:
selecting, by the one or more computer processors, a subset of the one or more predicted phrases, wherein a size of the subset is a predetermined number (Section 3.3, lines 7-11, "Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates.");
and sending, by the one or more computer processors, the subset of the one or more predicted phrases to a user (Section 1, lines 13-15, "In this paper, we introduce Smart Compose, a system for providing real-time, interactive suggestions to help users compose messages").
Chen teaches keeping a fixed number of candidate sequences and providing the suggestions to the user in order to assist the user in writing by reducing repetitive typing (Abstract, lines 1-3, "In this paper, we present Smart Compose, a novel system for generating interactive, real-time suggestions in Gmail that assists users in writing mails by reducing repetitive typing.").
Arnold and Chen are considered to be analogous to the claimed invention because they are in the same field of text autocomplete systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Chen to keep a fixed number of candidate sequences and provide the suggestions to the user.  Doing so would allow for assisting the user in writing by reducing repetitive typing.
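For illustration only, selecting a subset of predetermined size from scored predicted phrases can be sketched as below. The function name `select_top_phrases`, the example phrases and their scores are assumptions made for the example and do not come from Arnold or Chen.

```python
import heapq


def select_top_phrases(scored_phrases, n=3):
    """Return the n highest-scoring phrases, best first.

    scored_phrases is a list of (phrase, score) pairs; n is the
    predetermined subset size.
    """
    best = heapq.nlargest(n, scored_phrases, key=lambda pair: pair[1])
    return [phrase for phrase, _ in best]


# Hypothetical scored candidates.
candidates = [
    ("thank you .", 0.36),
    ("thank you very much .", 0.54),
    ("thank goodness", 0.10),
    ("thanks again .", 0.22),
]
print(select_top_phrases(candidates, n=2))
```

The returned list is what would be presented to the user as suggestions in this sketch.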
Regarding claim 9, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer program product as claimed in claim 8, but does not specifically disclose: wherein append the one or more next words to the input phrase to create the one or more predicted phrases comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to: determine whether any next word of the one or more next words for the input phrase is an end of sentence, wherein the end of sentence denotes that a specific phrase of the one or more predicted phrases is complete; responsive to determining that any specific phrase of the one or more predicted phrases is complete, store the completed any specific phrase in a list of completed phrases; responsive to determining that any specific phrase of the one or more predicted phrases is not complete, append the one or more next words to the any specific phrase of the one or more predicted phrases that is not complete; and predict one or more next words for the any specific phrase of the one or more predicted phrases that is not complete.
Chen teaches:
wherein append the one or more next words to the input phrase to create the one or more predicted phrases comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to:
determine whether any next word of the one or more next words for the input phrase is an end of sentence, wherein the end of sentence denotes that a specific phrase of the one or more predicted phrases is complete (Section 3.3, lines 5-15, "At each beam search step, new candidates are generated by extending each candidate by one token and adding them to the heap. Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates. Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length.");
responsive to determining that any specific phrase of the one or more predicted phrases is complete, store the completed any specific phrase in a list of completed phrases (Section 3.3, lines 12-16, "Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length. Upon completion, a candidate sequence will be added to the set of generated suggestions.");
responsive to determining that any specific phrase of the one or more predicted phrases is not complete, append the one or more next words to the any specific phrase of the one or more predicted phrases that is not complete (Section 3.3, lines 5-15, "At each beam search step, new candidates are generated by extending each candidate by one token and adding them to the heap. Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates. Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length."; In the case where the candidate sequence is not considered complete, extending a candidate by one token and adding it to the heap reads on appending the next word.);
and predict one or more next words for the any specific phrase of the one or more predicted phrases that is not complete (Section 3.3, lines 5-15, "At each beam search step, new candidates are generated by extending each candidate by one token and adding them to the heap. Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates. Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length."; In the case where the candidate sequence is not considered complete, generating a new candidate by extending a candidate by one token reads on predicting the next word.).
Chen teaches generating a next word, appending the next word to the candidate sequence to form a new candidate sequence, determining if the candidate sequence is complete, and adding the candidate sequence to a set of generated suggestions when the phrase is complete, in order to assist the user in writing by reducing repetitive typing (Abstract, lines 1-3, "In this paper, we present Smart Compose, a novel system for generating interactive, real-time suggestions in Gmail that assists users in writing mails by reducing repetitive typing.").
Arnold and Chen are considered to be analogous to the claimed invention because they are in the same field of text autocomplete systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Chen to generate a next word, append the next word to the candidate sequence to form a new candidate sequence, determine if the candidate sequence is complete, and add the candidate sequence to a set of generated suggestions when the phrase is complete.  Doing so would allow for assisting the user in writing by reducing repetitive typing.
Regarding claim 14, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer program product as claimed in claim 8, but does not specifically disclose: further comprising one or more of the following program instructions, stored on the one or more computer readable storage media, to: select a subset of the one or more predicted phrases, wherein a size of the subset is a predetermined number; and send the subset of the one or more predicted phrases to a user.
Chen teaches:
select a subset of the one or more predicted phrases, wherein a size of the subset is a predetermined number (Section 3.3, lines 7-11, "Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates.");
and send the subset of the one or more predicted phrases to a user (Section 1, lines 13-15, "In this paper, we introduce Smart Compose, a system for providing real-time, interactive suggestions to help users compose messages").
Chen teaches keeping a fixed number of candidate sequences and providing the suggestions to the user in order to assist the user in writing by reducing repetitive typing (Abstract, lines 1-3, "In this paper, we present Smart Compose, a novel system for generating interactive, real-time suggestions in Gmail that assists users in writing mails by reducing repetitive typing.").
Arnold and Chen are considered to be analogous to the claimed invention because they are in the same field of text autocomplete systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Chen to keep a fixed number of candidate sequences and provide the suggestions to the user.  Doing so would allow for assisting the user in writing by reducing repetitive typing.
Regarding claim 16, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer system as claimed in claim 15, but does not specifically disclose: wherein append the one or more next words to the input phrase to create the one or more predicted phrases comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to: determine whether any next word of the one or more next words for the input phrase is an end of sentence, wherein the end of sentence denotes that a specific phrase of the one or more predicted phrases is complete; responsive to determining that any specific phrase of the one or more predicted phrases is complete, store the completed any specific phrase in a list of completed phrases; responsive to determining that any specific phrase of the one or more predicted phrases is not complete, append the one or more next words to the any specific phrase of the one or more predicted phrases that is not complete; and predict one or more next words for the any specific phrase of the one or more predicted phrases that is not complete.
Chen teaches:
wherein append the one or more next words to the input phrase to create the one or more predicted phrases comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to:
determine whether any next word of the one or more next words for the input phrase is an end of sentence, wherein the end of sentence denotes that a specific phrase of the one or more predicted phrases is complete (Section 3.3, lines 5-15, "At each beam search step, new candidates are generated by extending each candidate by one token and adding them to the heap. Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates. Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length.");
responsive to determining that any specific phrase of the one or more predicted phrases is complete, store the completed any specific phrase in a list of completed phrases (Section 3.3, lines 12-16, "Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length. Upon completion, a candidate sequence will be added to the set of generated suggestions.");
responsive to determining that any specific phrase of the one or more predicted phrases is not complete, append the one or more next words to the any specific phrase of the one or more predicted phrases that is not complete (Section 3.3, lines 5-15, "At each beam search step, new candidates are generated by extending each candidate by one token and adding them to the heap. Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates. Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length."; In the case where the candidate sequence is not considered complete, extending a candidate by one token and adding it to the heap reads on appending the next word.);
and predict one or more next words for the any specific phrase of the one or more predicted phrases that is not complete (Section 3.3, lines 5-15, "At each beam search step, new candidates are generated by extending each candidate by one token and adding them to the heap. Specifically, for each candidate sequence, we use the output of the softmax to get a probability distribution over the vocabulary, select the top k most likely tokens and add the k possible extensions into the heap. We do this for all candidate sequences. At the end of the step, the heap is always pruned to only keep m best candidates. Each candidate sequence is considered complete when a sentence punctuation token or a special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a predefined maximum output sequence length."; In the case where the candidate sequence is not considered complete, generating a new candidate by extending a candidate by one token reads on predicting the next word.).
Chen teaches generating a next word, appending the next word to the candidate sequence to form a new candidate sequence, determining if the candidate sequence is complete, and adding the candidate sequence to a set of generated suggestions when the phrase is complete, in order to assist the user in writing by reducing repetitive typing (Abstract, lines 1-3, "In this paper, we present Smart Compose, a novel system for generating interactive, real-time suggestions in Gmail that assists users in writing mails by reducing repetitive typing.").
Arnold and Chen are considered to be analogous to the claimed invention because they are in the same field of text autocomplete systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Chen to generate a next word, append the next word to the candidate sequence to form a new candidate sequence, determine if the candidate sequence is complete, and add the candidate sequence to a set of generated suggestions when the phrase is complete.  Doing so would allow for assisting the user in writing by reducing repetitive typing.
Claims 3, 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Arnold in view of Marey (US Patent Application Publication No. 2022/0012296).
Regarding claim 3, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer-implemented method as claimed in claim 1.  Arnold further discloses:
wherein predicting the one or more next words for the input phrase, wherein the prediction is based on the deep neural network model that has been trained with the corpus of documents for the specific domain further comprises:
generating, by the one or more computer processors, training data from a plurality of sentences in the corpus of documents for the specific domain (Paragraph 0097, lines 1-6, "In various implementations, training data (i.e., the corpus of existing documents) for learning the retrieval model and/or the language model was optionally preprocessed to remove duplicate messages, identify greetings and signatures, parse sentences, and tokenize document bodies into sentences.").
Arnold does not specifically disclose: creating, by the one or more computer processors, a word vector representation for each word of one or more words in a training phrase; inputting, by the one or more computer processors, the word vector representation for each word of one or more words in the training phrase into a deep learning sequence model to create a phrase vector representation of the training phrase; and inputting, by the one or more computer processors, the phrase vector representation of the training phrase into the deep neural network model.
Marey teaches:
creating, by the one or more computer processors, a word vector representation for each word of one or more words in a training phrase (Paragraph 0040, lines 1-9, "As another example, in generating and/or retrieving recommended social media posts 418, social media post analyzer 406 and/or social media post generator 416 may employ a word (or phrase or sentence) embedding machine learning model to recommend a semantically similar post (e.g., to posts 402 and/or 404). For example, a text corpus may be used to train a word embedding machine learning model, in order to represent each word as a vector in a vector space.");
inputting, by the one or more computer processors, the word vector representation for each word of one or more words in the training phrase into a deep learning sequence model to create a phrase vector representation of the training phrase (Paragraph 0041, lines 1-10, "To determine the similarity between sentences or phrases in social media posts, social media post analyzer 406 and/or social media post generator 416 may perform operations on word embeddings included in the phrase or sentence (e.g., compute an average or weighted average of word vectors in the sentence), and perform a cosine similarity operation as between the computed vectors to determine sentence similarity. In some embodiments, one or more machine learning models may be used by the system to obtain sentence or phrase embeddings of social media posts"; Paragraph 0041, lines 19-23, "Various machine learning models may be employed for this task (e.g., recurrent neural networks, bidirectional recurrent neural networks, LSTM-RNN models, encoder-decoder models, transformers, etc.).");
and inputting, by the one or more computer processors, the phrase vector representation of the training phrase into the deep neural network model (Paragraph 0041, lines 1-10, "To determine the similarity between sentences or phrases in social media posts, social media post analyzer 406 and/or social media post generator 416 may perform operations on word embeddings included in the phrase or sentence (e.g., compute an average or weighted average of word vectors in the sentence), and perform a cosine similarity operation as between the computed vectors to determine sentence similarity. In some embodiments, one or more machine learning models may be used by the system to obtain sentence or phrase embeddings of social media posts"; Paragraph 0043, lines 1-6, "In some embodiments, social media post generator 416 may generate text for recommended posts using NLP generation (e.g., using one or more machine learning models). For example, neural networks may be employed to generate text (e.g., using a technique such as autoregressive generation in conjunction with social media post context).").
Marey teaches generating word vector embeddings for the words in a sentence or phrase, using a machine learning model to generate sentence or phrase embeddings from the word vector embeddings, and using the sentence or phrase embeddings as the input to a neural network to generate text, in order to generate a response to a social media post (Paragraph 0003, lines 1-11, "systems and methods are provided herein for generating recommended social media posts for user selection, by identifying one or more content categories associated with a first social media post by a first user, parsing one or more social media posts (associated with the first social media post) by one or more other users, and generating for presentation to a second user, based on the one or more identified content categories of the first social media post and the one or more parsed social media posts, one or more recommended social media posts for the second user.").
Arnold and Marey are considered to be analogous to the claimed invention because they are in the same field of text generation systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Marey to generate word vector embeddings for the words in a sentence or phrase, use a machine learning model to generate sentence or phrase embeddings from the word vector embeddings, and use the sentence or phrase embeddings as the input to a neural network to generate text.  Doing so would allow for generating a response to a social media post.
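For illustration only, the operation quoted from paragraph 0041 of Marey (averaging the word vectors of a phrase and comparing the resulting vectors by cosine similarity) can be sketched as follows. The three-dimensional `EMBEDDINGS` table and the function names are hypothetical. The sketch uses a plain average, which is the simpler of the options Marey names, and does not use a recurrent or transformer sequence model.

```python
import math

# Hypothetical 3-dimensional word embeddings, for illustration only.
EMBEDDINGS = {
    "reset": [1.0, 0.0, 0.0],
    "password": [0.0, 1.0, 0.0],
    "change": [0.9, 0.1, 0.0],
    "weather": [0.0, 0.0, 1.0],
}


def phrase_vector(phrase):
    """Average the vectors of the known words in the phrase."""
    vectors = [EMBEDDINGS[w] for w in phrase.lower().split() if w in EMBEDDINGS]
    if not vectors:
        raise ValueError("no known words in phrase")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = phrase_vector("reset password")
for other in ("change password", "weather"):
    print(other, round(cosine_similarity(query, phrase_vector(other)), 3))
```

With these toy vectors, "change password" scores close to 1 against "reset password", while "weather" scores 0, which is the kind of ordering a similarity-based sort would rely on.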
Regarding claim 10, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer program product as claimed in claim 8.  Arnold further discloses:
wherein predict the one or more next words for the input phrase, wherein the prediction is based on the deep neural network model that has been trained with the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to:
generate training data from a plurality of sentences in the corpus of documents for the specific domain (Paragraph 0097, lines 1-6, "In various implementations, training data (i.e., the corpus of existing documents) for learning the retrieval model and/or the language model was optionally preprocessed to remove duplicate messages, identify greetings and signatures, parse sentences, and tokenize document bodies into sentences.").
Arnold does not specifically disclose: create a word vector representation for each word of one or more words in a training phrase; input the word vector representation for each word of one or more words in the training phrase into a deep learning sequence model to create a phrase vector representation of the training phrase; and input the phrase vector representation of the training phrase into the deep neural network model.
Marey teaches:
create a word vector representation for each word of one or more words in a training phrase (Paragraph 0040, lines 1-9, "As another example, in generating and/or retrieving recommended social media posts 418, social media post analyzer 406 and/or social media post generator 416 may employ a word (or phrase or sentence) embedding machine learning model to recommend a semantically similar post (e.g., to posts 402 and/or 404). For example, a text corpus may be used to train a word embedding machine learning model, in order to represent each word as a vector in a vector space.");
input the word vector representation for each word of one or more words in the training phrase into a deep learning sequence model to create a phrase vector representation of the training phrase (Paragraph 0041, lines 1-10, "To determine the similarity between sentences or phrases in social media posts, social media post analyzer 406 and/or social media post generator 416 may perform operations on word embeddings included in the phrase or sentence (e.g., compute an average or weighted average of word vectors in the sentence), and perform a cosine similarity operation as between the computed vectors to determine sentence similarity. In some embodiments, one or more machine learning models may be used by the system to obtain sentence or phrase embeddings of social media posts"; Paragraph 0041, lines 19-23, "Various machine learning models may be employed for this task (e.g., recurrent neural networks, bidirectional recurrent neural networks, LSTM-RNN models, encoder-decoder models, transformers, etc.).");
and input the phrase vector representation of the training phrase into the deep neural network model (Paragraph 0041, lines 1-10, "To determine the similarity between sentences or phrases in social media posts, social media post analyzer 406 and/or social media post generator 416 may perform operations on word embeddings included in the phrase or sentence (e.g., compute an average or weighted average of word vectors in the sentence), and perform a cosine similarity operation as between the computed vectors to determine sentence similarity. In some embodiments, one or more machine learning models may be used by the system to obtain sentence or phrase embeddings of social media posts"; Paragraph 0043, lines 1-6, "In some embodiments, social media post generator 416 may generate text for recommended posts using NLP generation (e.g., using one or more machine learning models). For example, neural networks may be employed to generate text (e.g., using a technique such as autoregressive generation in conjunction with social media post context).").
Marey teaches generating word vector embeddings for the words in a sentence or phrase, using a machine learning model to generate sentence or phrase embeddings from the word vector embeddings, and using the sentence or phrase embeddings as the input to a neural network to generate text, in order to generate a response to a social media post (Paragraph 0003, lines 1-11, "systems and methods are provided herein for generating recommended social media posts for user selection, by identifying one or more content categories associated with a first social media post by a first user, parsing one or more social media posts (associated with the first social media post) by one or more other users, and generating for presentation to a second user, based on the one or more identified content categories of the first social media post and the one or more parsed social media posts, one or more recommended social media posts for the second user.").
Arnold and Marey are considered to be analogous to the claimed invention because they are in the same field of text generation systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Marey to generate word vector embeddings for the words in a sentence or phrase, use a machine learning model to generate sentence or phrase embeddings from the word vector embeddings, and use the sentence or phrase embeddings as the input to a neural network to generate text.  Doing so would allow for generating a response to a social media post.
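For illustration only, the operations quoted from Marey, Paragraph 0041 (averaging or weighted-averaging the word vectors of a phrase, then comparing phrases by cosine similarity) can be sketched as follows.  The toy embedding table, its values, and the function names are assumptions made for this sketch and do not appear in Marey or Arnold.

```python
import math

# Hypothetical 3-dimensional word embeddings; the values are invented for illustration.
EMBEDDINGS = {
    "great": [0.9, 0.1, 0.0],
    "good": [0.8, 0.2, 0.0],
    "game": [0.1, 0.9, 0.1],
    "match": [0.2, 0.8, 0.1],
    "rain": [0.0, 0.1, 0.9],
}


def phrase_vector(words, weights=None):
    """Combine word vectors into one phrase vector by (weighted) averaging."""
    vectors = [EMBEDDINGS[w] for w in words]
    if weights is None:
        weights = [1.0] * len(vectors)  # plain average when no weights are given
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(wt * vec[i] for wt, vec in zip(weights, vectors)) / total
        for i in range(dim)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Compare one phrase against a related phrase and an unrelated one.
related = cosine_similarity(
    phrase_vector(["great", "game"]), phrase_vector(["good", "match"])
)
unrelated = cosine_similarity(
    phrase_vector(["great", "game"]), phrase_vector(["rain"])
)
```

In this sketch the related pair scores higher than the unrelated pair.  A learned sequence model (e.g., an LSTM-RNN, as listed in Paragraph 0041, lines 19-23) would replace the averaging step; that substitution is not shown here.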
Regarding claim 17, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer system as claimed in claim 15.  Arnold further discloses:
wherein predict the one or more next words for the input phrase, wherein the prediction is based on the deep neural network model that has been trained with the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to:
generate training data from a plurality of sentences in the corpus of documents for the specific domain (Paragraph 0097, lines 1-6, "In various implementations, training data (i.e., the corpus of existing documents) for learning the retrieval model and/or the language model was optionally preprocessed to remove duplicate messages, identify greetings and signatures, parse sentences, and tokenize document bodies into sentences.").
Arnold does not specifically disclose: create a word vector representation for each word of one or more words in a training phrase; input the word vector representation for each word of one or more words in the training phrase into a deep learning sequence model to create a phrase vector representation of the training phrase; and input the phrase vector representation of the training phrase into the deep neural network model.
Marey teaches:
create a word vector representation for each word of one or more words in a training phrase (Paragraph 0040, lines 1-9, "As another example, in generating and/or retrieving recommended social media posts 418, social media post analyzer 406 and/or social media post generator 416 may employ a word (or phrase or sentence) embedding machine learning model to recommend a semantically similar post (e.g., to posts 402 and/or 404). For example, a text corpus may be used to train a word embedding machine learning model, in order to represent each word as a vector in a vector space.");
input the word vector representation for each word of one or more words in the training phrase into a deep learning sequence model to create a phrase vector representation of the training phrase (Paragraph 0041, lines 1-10, "To determine the similarity between sentences or phrases in social media posts, social media post analyzer 406 and/or social media post generator 416 may perform operations on word embeddings included in the phrase or sentence (e.g., compute an average or weighted average of word vectors in the sentence), and perform a cosine similarity operation as between the computed vectors to determine sentence similarity. In some embodiments, one or more machine learning models may be used by the system to obtain sentence or phrase embeddings of social media posts"; Paragraph 0041, lines 19-23, "Various machine learning models may be employed for this task (e.g., recurrent neural networks, bidirectional recurrent neural networks, LSTM-RNN models, encoder-decoder models, transformers, etc.).");
and input the phrase vector representation of the training phrase into the deep neural network model (Paragraph 0041, lines 1-10, "To determine the similarity between sentences or phrases in social media posts, social media post analyzer 406 and/or social media post generator 416 may perform operations on word embeddings included in the phrase or sentence (e.g., compute an average or weighted average of word vectors in the sentence), and perform a cosine similarity operation as between the computed vectors to determine sentence similarity. In some embodiments, one or more machine learning models may be used by the system to obtain sentence or phrase embeddings of social media posts"; Paragraph 0043, lines 1-6, "In some embodiments, social media post generator 416 may generate text for recommended posts using NLP generation (e.g., using one or more machine learning models). For example, neural networks may be employed to generate text (e.g., using a technique such as autoregressive generation in conjunction with social media post context).").
Marey teaches generating word vector embeddings for the words in a sentence or phrase, using a machine learning model to generate sentence or phrase embeddings from the word vector embeddings, and using the sentence or phrase embeddings as the input to a neural network to generate text, in order to generate a response to a social media post (Paragraph 0003, lines 1-11, "systems and methods are provided herein for generating recommended social media posts for user selection, by identifying one or more content categories associated with a first social media post by a first user, parsing one or more social media posts (associated with the first social media post) by one or more other users, and generating for presentation to a second user, based on the one or more identified content categories of the first social media post and the one or more parsed social media posts, one or more recommended social media posts for the second user.").
Arnold and Marey are considered to be analogous to the claimed invention because they are in the same field of text generation systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Marey to generate word vector embeddings for the words in a sentence or phrase, use a machine learning model to generate sentence or phrase embeddings from the word vector embeddings, and use the sentence or phrase embeddings as the input to a neural network to generate text.  Doing so would allow for generating a response to a social media post.
Claims 4, 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Arnold in view of Marey, and further in view of Ehsani et al. (US Patent No. 10,552,533), hereinafter Ehsani, Boxwell et al. (US Patent Application Publication No. 2018/0240008), hereinafter Boxwell, Abuammar et al. (US Patent Application Publication No. 2020/0372217), hereinafter Abuammar, and Li et al. ("An Efficient Method for High Quality and Cohesive Topical Phrase Mining"), hereinafter Li.
Regarding claim 4, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold in view of Marey discloses the computer-implemented method as claimed in claim 3.  Arnold further discloses:
wherein generating training data from the plurality of sentences in the corpus of documents for the specific domain further comprises: receiving, by the one or more computer processors, a plurality of domain content sentences from the corpus of documents from the specific domain (Paragraph 0050, lines 1-7, "Further, in various implementations, a Model Update Module 155 optionally periodically updates (via the Learning Module 100) either or both the retrieval model 110 and the language model 110 as additional source documents become available, or whenever it is desired to train or retrain these models on a new or expanded corpus of preexisting documents."; Paragraph 0097, lines 1-6, "In various implementations, training data (i.e., the corpus of existing documents) for learning the retrieval model and/or the language model was optionally preprocessed to remove duplicate messages, identify greetings and signatures, parse sentences, and tokenize document bodies into sentences.").
Arnold in view of Marey does not specifically disclose: extracting, by the one or more computer processors, the plurality of domain content sentences as at least one of one or more n-grams, one of one or more first natural language phrases based on a deep parsing, one of one or more second natural language phrases based on a semantic role labeling, and one of one or more third natural language phrases based on an abstract meaning representation; filtering, by the one or more computer processors, the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model; and preparing, by the one or more computer processors, a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Ehsani teaches:
extracting, by the one or more computer processors, the plurality of domain content sentences as at least one of one or more n-grams, and one of one or more third natural language phrases based on an abstract meaning representation (Column 8, lines 43-44, "We begin by deriving n-gram statistics from a given corpus C1 using standard language modeling techniques."; Column 18, lines 42-44, "In this mode, the abstract meaning representation(s) for the selected phrases can be accessed and modified.").
Ehsani teaches determining n-grams and the abstract meaning representation in order to build a database of word combinations and phrases that can be used to model interactions across a variety of contexts, domains, and languages (Column 3, line 59 - Column 4, line 6, "In one aspect, the present invention concerns modeling generic aspects of interactive discourse based on statistical modeling of phrases in large amounts of conversational text data. It involves automatically extracting valid phrases from a given text corpus, and clustering these phrases into syntactically and/or semantically meaningful equivalent classes. Various existing statistical and computational techniques are combined in a new way to accomplish this end. The result is a large thesaurus of fixed word combinations and phrases. To the extent that this phrase thesaurus groups similar or semantically equivalent phrases into classes along with probabilities of their occurrence, it contains an implicit probabilistic model of generic structures found in interactive discourse, and thus can be used to model interactions across a large variety of different contexts, domains, and languages.").
Arnold, Marey, and Ehsani are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey to incorporate the teachings of Ehsani to determine n-grams and the abstract meaning representation.  Doing so would allow for building a database of word combinations and phrases that can be used to model interactions across a variety of contexts, domains, and languages.
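For illustration only, the step quoted from Ehsani, Column 8, lines 43-44 (deriving n-gram statistics from a given corpus) can be sketched as a simple count of word n-grams.  The whitespace tokenization, the two example sentences, and the function name are assumptions made for this sketch and are not taken from Ehsani.

```python
from collections import Counter


def ngram_counts(sentences, n):
    """Count every run of n consecutive tokens across the given sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()  # assumption: whitespace tokenization
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Hypothetical two-sentence corpus, invented for illustration.
corpus = ["I would like a ticket", "I would like a refund"]
bigrams = ngram_counts(corpus, 2)
```

Here `bigrams` maps each two-word tuple to its frequency in the corpus; sentences shorter than `n` contribute nothing.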
Arnold in view of Marey and further in view of Ehsani does not specifically disclose: extracting, by the one or more computer processors, the plurality of domain content sentences as at least one of one or more first natural language phrases based on a deep parsing, and one of one or more second natural language phrases based on a semantic role labeling; filtering, by the one or more computer processors, the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model; and preparing, by the one or more computer processors, a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Boxwell teaches:
extracting, by the one or more computer processors, the plurality of domain content sentences as at least one of one or more first natural language phrases based on a deep parsing, and one of one or more second natural language phrases based on a semantic role labeling (Paragraph 0023, lines 1-7, "In one embodiment, the question analysis module 204 may include instructions for performing natural language processing (NLP), decomposition, shallow parses, deep parses, logical forms, semantic role labels, coreference, relations (e.g., subject-verb-object predicates or semantic relationships between entities), named entities, and so on, as well as specific kinds of analysis for question classification.").
Boxwell teaches performing deep parsing and semantic role labeling in order to identify components of a question and generate candidate answers from a corpus of data (Paragraph 0019, lines 1-15, "In one embodiment, the QA system 100 parses the question to identify components of the question (e.g., subject, predicate, and object), uses the identified components to formulate queries, and then applies those queries to the corpus of data contained in the knowledge base 110. Based on the application of the queries to the corpus of data, the QA system 100 generates candidate answers to the input question. The QA system 100 may utilize various scoring algorithms in generating the candidate answers. For example, a scoring algorithm may look at the matching of terms and synonyms within the language of the input question and the found portions of the corpus of data. Other scoring algorithms may look at temporal or spatial features in the language, while others may evaluate the source of the portion of the corpus of data and evaluate its reliability.").
Arnold, Marey, Ehsani, and Boxwell are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey and further in view of Ehsani to incorporate the teachings of Boxwell to perform deep parsing and semantic role labeling.  Doing so would allow for identifying components of a question and generating candidate answers from a corpus of data.
Arnold in view of Marey and further in view of Ehsani and Boxwell does not specifically disclose: filtering, by the one or more computer processors, the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model; and preparing, by the one or more computer processors, a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Abuammar teaches:
filtering, by the one or more computer processors, the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model (Paragraph 0164, lines 1-8, "The training data selector 1530 may select data for training from among the pre-processed data. The selected data may be provided to the model trainer 1540. The training data selector 1530 may select data for training from among pre-processed data according to pre-set selection criteria for identifying a plurality of words or determining one or more paraphrased words thereof and a plurality of paraphrased sentences."; Paragraph 0030, lines 1-4, "FIG. 3 illustrates a block diagram showing a trained network model using a sequence-to-sequence encoder-decoder model, according to an embodiment of the disclosure"; Paragraph 0096, lines 1-3, "In operation S720, the server may assign a score to each of the plurality of paraphrased sentences corresponding to the source sentence by using a language model.").
Abuammar teaches selecting sentences for processing and performing processing with a sequence-to-sequence model and a language model in order to find paraphrased sentences for a source sentence based on similarity (Paragraph 0013, lines 1-11, "According to an embodiment of the disclosure, there is provided a method of processing a language based on a trained network model, the method including: obtaining a source sentence; obtaining a plurality of words constituting the source sentence; determining a plurality of paraphrased sentences including paraphrased words for each of the plurality of words constituting the source sentence and similarity levels between the plurality of paraphrased sentences and the source sentence; and obtaining a pre-set number of paraphrased sentences from among the plurality of paraphrased sentences based on the similarity levels.").
Arnold, Marey, Ehsani, Boxwell, and Abuammar are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey and further in view of Ehsani and Boxwell to incorporate the teachings of Abuammar to select sentences for processing and perform processing with a sequence-to-sequence model and a language model.  Doing so would allow for finding paraphrased sentences for a source sentence based on similarity.
Arnold in view of Marey and further in view of Ehsani, Boxwell, and Abuammar does not specifically disclose: preparing, by the one or more computer processors, a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Li teaches:
preparing, by the one or more computer processors, a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences (Section 1, lines 1-3, "Topical phrase mining refers to automatically extracting phrases which grouped by individual themes from given text corpora."; Section 2, lines 4-7, "A phrase Pr can be formally represented as a consecutive sequence of words from jth position in a document d: Pr = d[j, j + l - 1], where d[j, j + l - 1] = wj, wj+1, ..., wj+l-1, and l is the phrase length."; Section 8.2, lines 10-12, "Bigram model is a probabilistic generative model that conditions on the previous word and topic when drawing the next word.").  Drawing the next word reads on preparing an expected next word, and using a probabilistic generative model that conditions on the previous word and topic reads on not directly extracting the phrase and next word from the domain content sentences.
Li teaches extracting phrases of different lengths from text corpora and determining the next word in phrases, with words of the phrases selected based on the previous word and the topic, in order to perform topical phrase mining with improved phrase quality and topical cohesion (Abstract, lines 7-11, "In this paper, we propose an efficient method for high quality and cohesive topical phrase mining. A high quality phrase should satisfy frequency, phraseness, completeness, and appropriateness criteria. In our framework, we integrate quality guaranteed phrase mining method, a novel topic model incorporating the constraint of phrases, and a novel document clustering method into an iterative framework to improve both phrase quality and topical cohesion.").
Arnold, Marey, Ehsani, Boxwell, Abuammar, and Li are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey and further in view of Ehsani, Boxwell, and Abuammar to incorporate the teachings of Li to extract phrases of different lengths from text corpora and determine the next word in phrases, with words of the phrases selected based on the previous word and the topic.  Doing so would allow for performing topical phrase mining with improved phrase quality and topical cohesion.
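For illustration only, the phrase notation quoted from Li, Section 2 (Pr = d[j, j + l - 1], a consecutive run of l words starting at the jth position of document d) can be sketched as follows, together with one hypothetical way of pairing phrases of different length with the word that follows them.  The function names and the example document are assumptions made for this sketch.  Because the sketch slices phrases directly out of the document, it does not itself illustrate the "not directly extracted" limitation.

```python
def phrase(d, j, l):
    """Return Pr = d[j, j + l - 1], where j is a 1-indexed position in d."""
    if j < 1 or l < 1 or j + l - 1 > len(d):
        raise ValueError("phrase bounds fall outside the document")
    return d[j - 1:j - 1 + l]


def phrase_next_word_pairs(d, j=1):
    """Pair each phrase starting at position j with the word that follows it.

    Phrase lengths run from 1 up to the longest phrase that still leaves
    one following word in the document.
    """
    pairs = []
    for l in range(1, len(d) - j + 1):
        next_word = d[j - 1 + l]  # the word immediately after the phrase
        pairs.append((phrase(d, j, l), next_word))
    return pairs


# Hypothetical tokenized document, invented for illustration.
doc = ["topical", "phrase", "mining", "works"]
pairs = phrase_next_word_pairs(doc)
```

Each entry of `pairs` holds a phrase one word longer than the previous entry, along with its following word.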
Regarding claim 11, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold in view of Marey discloses the computer program product as claimed in claim 10.  Arnold further discloses:
wherein generate training data from the plurality of sentences in the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to: receive a plurality of domain content sentences from the corpus of documents from the specific domain (Paragraph 0050, lines 1-7, "Further, in various implementations, a Model Update Module 155 optionally periodically updates (via the Learning Module 100) either or both the retrieval model 110 and the language model 110 as additional source documents become available, or whenever it is desired to train or retrain these models on a new or expanded corpus of preexisting documents."; Paragraph 0097, lines 1-6, "In various implementations, training data (i.e., the corpus of existing documents) for learning the retrieval model and/or the language model was optionally preprocessed to remove duplicate messages, identify greetings and signatures, parse sentences, and tokenize document bodies into sentences.").
Arnold in view of Marey does not specifically disclose: extract the plurality of domain content sentences as at least one of one or more n-grams, one of one or more first natural language phrases based on a deep parsing, one of one or more second natural language phrases based on a semantic role labeling, and one of one or more third natural language phrases based on an abstract meaning representation; filter the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model; and prepare a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Ehsani teaches:
extract the plurality of domain content sentences as at least one of one or more n-grams, and one of one or more third natural language phrases based on an abstract meaning representation (Column 8, lines 43-44, "We begin by deriving n-gram statistics from a given corpus C1 using standard language modeling techniques."; Column 18, lines 42-44, "In this mode, the abstract meaning representation(s) for the selected phrases can be accessed and modified.").
Ehsani teaches determining n-grams and the abstract meaning representation in order to build a database of word combinations and phrases that can be used to model interactions across a variety of contexts, domains, and languages (Column 3, line 59 - Column 4, line 6, "In one aspect, the present invention concerns modeling generic aspects of interactive discourse based on statistical modeling of phrases in large amounts of conversational text data. It involves automatically extracting valid phrases from a given text corpus, and clustering these phrases into syntactically and/or semantically meaningful equivalent classes. Various existing statistical and computational techniques are combined in a new way to accomplish this end. The result is a large thesaurus of fixed word combinations and phrases. To the extent that this phrase thesaurus groups similar or semantically equivalent phrases into classes along with probabilities of their occurrence, it contains an implicit probabilistic model of generic structures found in interactive discourse, and thus can be used to model interactions across a large variety of different contexts, domains, and languages.").
Arnold, Marey, and Ehsani are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey to incorporate the teachings of Ehsani to determine n-grams and the abstract meaning representation.  Doing so would allow for building a database of word combinations and phrases that can be used to model interactions across a variety of contexts, domains, and languages.
Arnold in view of Marey and further in view of Ehsani does not specifically disclose: extract the plurality of domain content sentences as at least one of one or more first natural language phrases based on a deep parsing, and one of one or more second natural language phrases based on a semantic role labeling; filter the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model; and prepare a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Boxwell teaches:
extract the plurality of domain content sentences as at least one of one or more first natural language phrases based on a deep parsing, and one of one or more second natural language phrases based on a semantic role labeling (Paragraph 0023, lines 1-7, "In one embodiment, the question analysis module 204 may include instructions for performing natural language processing (NLP), decomposition, shallow parses, deep parses, logical forms, semantic role labels, coreference, relations (e.g., subject-verb-object predicates or semantic relationships between entities), named entities, and so on, as well as specific kinds of analysis for question classification.").
Boxwell teaches performing deep parsing and semantic role labeling in order to identify components of a question and generate candidate answers from a corpus of data (Paragraph 0019, lines 1-15, "In one embodiment, the QA system 100 parses the question to identify components of the question (e.g., subject, predicate, and object), uses the identified components to formulate queries, and then applies those queries to the corpus of data contained in the knowledge base 110. Based on the application of the queries to the corpus of data, the QA system 100 generates candidate answers to the input question. The QA system 100 may utilize various scoring algorithms in generating the candidate answers. For example, a scoring algorithm may look at the matching of terms and synonyms within the language of the input question and the found portions of the corpus of data. Other scoring algorithms may look at temporal or spatial features in the language, while others may evaluate the source of the portion of the corpus of data and evaluate its reliability.").
Arnold, Marey, Ehsani, and Boxwell are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey and further in view of Ehsani to incorporate the teachings of Boxwell to perform deep parsing and semantic role labeling.  Doing so would allow for identifying components of a question and generating candidate answers from a corpus of data.
Arnold in view of Marey and further in view of Ehsani and Boxwell does not specifically disclose: filter the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model; and prepare a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Abuammar teaches:
filter the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model (Paragraph 0164, lines 1-8, "The training data selector 1530 may select data for training from among the pre-processed data. The selected data may be provided to the model trainer 1540. The training data selector 1530 may select data for training from among pre-processed data according to pre-set selection criteria for identifying a plurality of words or determining one or more paraphrased words thereof and a plurality of paraphrased sentences."; Paragraph 0030, lines 1-4, "FIG. 3 illustrates a block diagram showing a trained network model using a sequence-to-sequence encoder-decoder model, according to an embodiment of the disclosure"; Paragraph 0096, lines 1-3, "In operation S720, the server may assign a score to each of the plurality of paraphrased sentences corresponding to the source sentence by using a language model.").
Abuammar teaches selecting sentences for processing and performing processing with a sequence-to-sequence model and a language model in order to find paraphrased sentences for a source sentence based on similarity (Paragraph 0013, lines 1-11, "According to an embodiment of the disclosure, there is provided a method of processing a language based on a trained network model, the method including: obtaining a source sentence; obtaining a plurality of words constituting the source sentence; determining a plurality of paraphrased sentences including paraphrased words for each of the plurality of words constituting the source sentence and similarity levels between the plurality of paraphrased sentences and the source sentence; and obtaining a pre-set number of paraphrased sentences from among the plurality of paraphrased sentences based on the similarity levels.").
Arnold, Marey, Ehsani, Boxwell, and Abuammar are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey and further in view of Ehsani and Boxwell to incorporate the teachings of Abuammar to select sentences for processing and perform processing with a sequence-to-sequence model and a language model.  Doing so would allow for finding paraphrased sentences for a source sentence based on similarity.
Arnold in view of Marey and further in view of Ehsani, Boxwell, and Abuammar does not specifically disclose: prepare a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Li teaches:
prepare a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences (Section 1, lines 1-3, "Topical phrase mining refers to automatically extracting phrases which grouped by individual themes from given text corpora."; Section 2, lines 4-7, "A phrase Pr can be formally represented as a consecutive sequence of words from jth position in a document d: Pr = d[j, j + l - 1], where d[j, j + l - 1] = wj, wj+1, ..., wj+l-1, and l is the phrase length."; Section 8.2, lines 10-12, "Bigram model is a probabilistic generative model that conditions on the previous word and topic when drawing the next word."; Drawing the next word reads on preparing an expected next word, and using a probabilistic generative model that conditions on the previous word and topic reads on not directly extracting the phrase and next word from the domain.).
Li teaches extracting phrases of different lengths from text corpora and determining the next word in phrases, with words of the phrases selected based on the previous word and the topic, in order to perform topical phrase mining with improved phrase quality and topical cohesion (Abstract, lines 7-11, "In this paper, we propose an efficient method for high quality and cohesive topical phrase mining. A high quality phrase should satisfy frequency, phraseness, completeness, and appropriateness criteria. In our framework, we integrate quality guaranteed phrase mining method, a novel topic model incorporating the constraint of phrases, and a novel document clustering method into an iterative framework to improve both phrase quality and topical cohesion.").
Arnold, Marey, Ehsani, Boxwell, Abuammar, and Li are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey and further in view of Ehsani, Boxwell, and Abuammar to incorporate the teachings of Li to extract phrases of different lengths from text corpora and determine the next word in phrases, with words of the phrases selected based on the previous word and the topic.  Doing so would allow for performing topical phrase mining with improved phrase quality and topical cohesion.
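For illustration only, the phrase definition quoted above from Li, Section 2 (a phrase of length l starting at position j of a document d) may be sketched as follows, pairing each phrase with the word that immediately follows it. This is a minimal sketch and not code from Li or from the application; the function name phrase_next_word_pairs, the max_len parameter, and the sample tokens are hypothetical, and the sketch does not model the topic-conditioned bigram drawing described in Li, Section 8.2.

```python
from typing import List, Tuple


def phrase_next_word_pairs(
    tokens: List[str], max_len: int
) -> List[Tuple[Tuple[str, ...], str]]:
    """Pair every consecutive phrase d[j, j + l - 1] (1 <= l <= max_len)
    with the token that follows it (hypothetical helper, for illustration)."""
    pairs = []
    for j in range(len(tokens)):
        for length in range(1, max_len + 1):
            end = j + length
            # Stop when no token follows the phrase.
            if end >= len(tokens):
                break
            pairs.append((tuple(tokens[j:end]), tokens[end]))
    return pairs


# Hypothetical sample document, already tokenized.
sample = ["topical", "phrase", "mining"]
pairs = phrase_next_word_pairs(sample, max_len=2)
```

Under these assumptions, phrases of length one and two are each paired with a following word, which yields examples of differing phrase length from a single sentence.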
Regarding claim 18, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold in view of Marey discloses the computer system as claimed in claim 17.  Arnold further discloses:
wherein generate training data from the plurality of sentences in the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to: receive a plurality of domain content sentences from the corpus of documents from the specific domain (Paragraph 0050, lines 1-7, "Further, in various implementations, a Model Update Module 155 optionally periodically updates (via the Learning Module 100) either or both the retrieval model 110 and the language model 110 as additional source documents become available, or whenever it is desired to train or retrain these models on a new or expanded corpus of preexisting documents."; Paragraph 0097, lines 1-6, "In various implementations, training data (i.e., the corpus of existing documents) for learning the retrieval model and/or the language model was optionally preprocessed to remove duplicate messages, identify greetings and signatures, parse sentences, and tokenize document bodies into sentences.").
Arnold in view of Marey does not specifically disclose: extract the plurality of domain content sentences as at least one of one or more n-grams, one of one or more first natural language phrases based on a deep parsing, one of one or more second natural language phrases based on a semantic role labeling, and one of one or more third natural language phrases based on an abstract meaning representation; filter the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model; and prepare a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Ehsani teaches:
extract the plurality of domain content sentences as at least one of one or more n-grams, and one of one or more third natural language phrases based on an abstract meaning representation (Column 8, lines 43-44, "We begin by deriving n-gram statistics from a given corpus C1 using standard language modeling techniques."; Column 18, lines 42-44, "In this mode, the abstract meaning representation(s) for the selected phrases can be accessed and modified.").
Ehsani teaches determining n-grams and the abstract meaning representation in order to build a database of word combinations and phrases that can be used to model interactions across a variety of contexts, domains, and languages (Column 3, line 59 - Column 4, line 6, "In one aspect, the present invention concerns modeling generic aspects of interactive discourse based on statistical modeling of phrases in large amounts of conversational text data. It involves automatically extracting valid phrases from a given text corpus, and clustering these phrases into syntactically and/or semantically meaningful equivalent classes. Various existing statistical and computational techniques are combined in a new way to accomplish this end. The result is a large thesaurus of fixed word combinations and phrases. To the extent that this phrase thesaurus groups similar or semantically equivalent phrases into classes along with probabilities of their occurrence, it contains an implicit probabilistic model of generic structures found in interactive discourse, and thus can be used to model interactions across a large variety of different contexts, domains, and languages.").
Arnold, Marey, and Ehsani are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey to incorporate the teachings of Ehsani to determine n-grams and the abstract meaning representation.  Doing so would allow for building a database of word combinations and phrases that can be used to model interactions across a variety of contexts, domains, and languages.
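For illustration only, deriving n-gram statistics from a corpus, as referenced in the passage quoted above from Ehsani (Column 8, lines 43-44), may be sketched as a simple count of consecutive word sequences. This is a minimal sketch under stated assumptions and is not Ehsani's implementation; the function name ngram_counts, the whitespace tokenization, the lowercasing, and the two sample sentences are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def ngram_counts(sentences: Iterable[str], n: int) -> Counter:
    """Count every run of n consecutive tokens in each sentence
    (hypothetical helper; whitespace tokenization is an assumption)."""
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n + 1):
            ngram: Tuple[str, ...] = tuple(tokens[i:i + n])
            counts[ngram] += 1
    return counts


# Hypothetical two-sentence corpus.
corpus = ["the cat sat", "the cat ran"]
bigrams = ngram_counts(corpus, n=2)
```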
Arnold in view of Marey and further in view of Ehsani does not specifically disclose: extract the plurality of domain content sentences as at least one of one or more first natural language phrases based on a deep parsing, and one of one or more second natural language phrases based on a semantic role labeling; filter the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model; and prepare a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Boxwell teaches:
extract the plurality of domain content sentences as at least one of one or more first natural language phrases based on a deep parsing, and one of one or more second natural language phrases based on a semantic role labeling (Paragraph 0023, lines 1-7, "In one embodiment, the question analysis module 204 may include instructions for performing natural language processing (NLP), decomposition, shallow parses, deep parses, logical forms, semantic role labels, coreference, relations (e.g., subject-verb-object predicates or semantic relationships between entities), named entities, and so on, as well as specific kinds of analysis for question classification.").
Boxwell teaches performing deep parsing and semantic role labeling in order to identify components of a question and generate candidate answers from a corpus of data (Paragraph 0019, lines 1-15, "In one embodiment, the QA system 100 parses the question to identify components of the question (e.g., subject, predicate, and object), uses the identified components to formulate queries, and then applies those queries to the corpus of data contained in the knowledge base 110. Based on the application of the queries to the corpus of data, the QA system 100 generates candidate answers to the input question. The QA system 100 may utilize various scoring algorithms in generating the candidate answers. For example, a scoring algorithm may look at the matching of terms and synonyms within the language of the input question and the found portions of the corpus of data. Other scoring algorithms may look at temporal or spatial features in the language, while others may evaluate the source of the portion of the corpus of data and evaluate its reliability.").
Arnold, Marey, Ehsani, and Boxwell are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey and further in view of Ehsani to incorporate the teachings of Boxwell to perform deep parsing and semantic role labeling.  Doing so would allow for identifying components of a question and generating candidate answers from a corpus of data.
Arnold in view of Marey and further in view of Ehsani and Boxwell does not specifically disclose: filter the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model; and prepare a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Abuammar teaches:
filter the plurality of domain content sentences for at least one of a sequence-to-sequence model and a language model (Paragraph 0164, lines 1-8, "The training data selector 1530 may select data for training from among the pre-processed data. The selected data may be provided to the model trainer 1540. The training data selector 1530 may select data for training from among pre-processed data according to pre-set selection criteria for identifying a plurality of words or determining one or more paraphrased words thereof and a plurality of paraphrased sentences."; Paragraph 0030, lines 1-4, "FIG. 3 illustrates a block diagram showing a trained network model using a sequence-to-sequence encoder-decoder model, according to an embodiment of the disclosure"; Paragraph 0096, lines 1-3, "In operation S720, the server may assign a score to each of the plurality of paraphrased sentences corresponding to the source sentence by using a language model.").
Abuammar teaches selecting sentences for processing and performing processing with a sequence-to-sequence model and a language model in order to find paraphrased sentences for a source sentence based on similarity (Paragraph 0013, lines 1-11, "According to an embodiment of the disclosure, there is provided a method of processing a language based on a trained network model, the method including: obtaining a source sentence; obtaining a plurality of words constituting the source sentence; determining a plurality of paraphrased sentences including paraphrased words for each of the plurality of words constituting the source sentence and similarity levels between the plurality of paraphrased sentences and the source sentence; and obtaining a pre-set number of paraphrased sentences from among the plurality of paraphrased sentences based on the similarity levels.").
Arnold, Marey, Ehsani, Boxwell, and Abuammar are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey and further in view of Ehsani and Boxwell to incorporate the teachings of Abuammar to select sentences for processing and perform processing with a sequence-to-sequence model and a language model.  Doing so would allow for finding paraphrased sentences for a source sentence based on similarity.
Arnold in view of Marey and further in view of Ehsani, Boxwell, and Abuammar does not specifically disclose: prepare a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences.
Li teaches:
prepare a training dataset with a plurality of phrases of different length and an expected next word, wherein the plurality of phrases and the next word are not directly extracted from the plurality of domain content sentences (Section 1, lines 1-3, "Topical phrase mining refers to automatically extracting phrases which grouped by individual themes from given text corpora."; Section 2, lines 4-7, "A phrase Pr can be formally represented as a consecutive sequence of words from jth position in a document d: Pr = d[j, j + l - 1], where d[j, j + l - 1] = wj, wj+1, ..., wj+l-1, and l is the phrase length."; Section 8.2, lines 10-12, "Bigram model is a probabilistic generative model that conditions on the previous word and topic when drawing the next word."; Drawing the next word reads on preparing an expected next word, and using a probabilistic generative model that conditions on the previous word and topic reads on not directly extracting the phrase and next word from the domain.).
Li teaches extracting phrases of different lengths from text corpora and determining the next word in phrases, with words of the phrases selected based on the previous word and the topic, in order to perform topical phrase mining with improved phrase quality and topical cohesion (Abstract, lines 7-11, "In this paper, we propose an efficient method for high quality and cohesive topical phrase mining. A high quality phrase should satisfy frequency, phraseness, completeness, and appropriateness criteria. In our framework, we integrate quality guaranteed phrase mining method, a novel topic model incorporating the constraint of phrases, and a novel document clustering method into an iterative framework to improve both phrase quality and topical cohesion.").
Arnold, Marey, Ehsani, Boxwell, Abuammar, and Li are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Marey and further in view of Ehsani, Boxwell, and Abuammar to incorporate the teachings of Li to extract phrases of different lengths from text corpora and determine the next word in phrases, with words of the phrases selected based on the previous word and the topic.  Doing so would allow for performing topical phrase mining with improved phrase quality and topical cohesion.                
Claims 5, 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Arnold in view of Bar-Yossef et al. (“Context-Sensitive Query Auto-Completion”), hereinafter Bar-Yossef.
Regarding claim 5, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer-implemented method as claimed in claim 1, but does not specifically disclose: wherein sorting the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises: receiving, by the one or more computer processors, one or more first predictions from a first system and one or more second predictions from a second system; normalizing, by the one or more computer processors, a first score for each first prediction of the one or more first predictions and a second score for each second prediction of the one or more second predictions; creating, by the one or more computer processors, a plurality of prediction pairs, wherein each prediction pair includes a first prediction from the one or more first predictions and a second prediction from the one or more second predictions; calculating, by the one or more computer processors, a prediction weight, wherein the prediction weight is a mean of a combined string similarity for each prediction pair of the plurality of prediction pairs; multiplying, by the one or more computer processors, a first normalized score of each first prediction of the one or more first predictions by the prediction weight to create a first weighted score for each first prediction; multiplying, by the one or more computer processors, a second normalized score of each second prediction of the one or more second predictions by the prediction weight to create a second weighted score for each second prediction; and merging, by the one or more computer processors, the one or more first predictions and the one or more second predictions into a prediction list, wherein the prediction list is sorted based on a descending order of the first weighted score and the second weighted score.
Bar-Yossef teaches:
wherein sorting the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises:
receiving, by the one or more computer processors, one or more first predictions from a first system and one or more second predictions from a second system (Section 5, lines 20-24, "Given a user input x and a context C, HybridCompletion produces two ranked lists of completions of x: LNC consists of the top ℓ completions returned by NearestCompletion and LMPC consists of the top ℓ completions returned by MostPopularCompletion."); 
normalizing, by the one or more computer processors, a first score for each first prediction of the one or more first predictions and a second score for each second prediction of the one or more second predictions (Section 5, lines 26-33, "The results in each of the two lists are ranked by quality scores: LNC is ranked by a similarity score, which we denote by simscore(·), and LMPC is ranked by a popularity score, which we denote by popscore(·). The aggregated list LHC is constructed by combining the two scoring functions into a single hybrid score, denote hybscore(·). As simscore and popscore use different units and scales, they need to be standardized before they can be combined.");
creating, by the one or more computer processors, a plurality of prediction pairs, wherein each prediction pair includes a first prediction from the one or more first predictions and a second prediction from the one or more second predictions (Section 5, lines 20-24, "Given a user input x and a context C, HybridCompletion produces two ranked lists of completions of x: LNC consists of the top ℓ completions returned by NearestCompletion and LMPC consists of the top ℓ completions returned by MostPopularCompletion."; The two ranked lists with ℓ completions demonstrate ℓ completion pairs that read on the prediction pairs.);
calculating, by the one or more computer processors, a prediction weight, wherein the prediction weight is a mean of a combined string similarity for each prediction pair of the plurality of prediction pairs (Section 5, lines 36-45, "The standard similarity score is then calculated as Zsimscore(q) = (simscore(q) − μ)/σ, where μ and σ are the estimated mean and standard deviation. The standard popularity score is calculated similarly. The hybrid score is defined as a convex combination of the two scores: hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q), where 0 ≤ α ≤ 1 is a tunable parameter determining the weight of the similarity score relative to the weight of the popularity score."; The parameters α, μ, and σ used to calculate the hybrid score from the standard similarity score and the standard popularity score read on the prediction weight.);
multiplying, by the one or more computer processors, a first normalized score of each first prediction of the one or more first predictions by the prediction weight to create a first weighted score for each first prediction (Section 5, lines 36-45, "The standard similarity score is then calculated as Zsimscore(q) = (simscore(q) − μ)/σ, where μ and σ are the estimated mean and standard deviation. The standard popularity score is calculated similarly. The hybrid score is defined as a convex combination of the two scores: hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q), where 0 ≤ α ≤ 1 is a tunable parameter determining the weight of the similarity score relative to the weight of the popularity score."; Using the coefficients 1/σ and α to calculate the hybrid score from the standard similarity score reads on multiplying a normalized score and a prediction weight to generate a weighted score for the first prediction.);
multiplying, by the one or more computer processors, a second normalized score of each second prediction of the one or more second predictions by the prediction weight to create a second weighted score for each second prediction (Section 5, lines 36-45, "The standard similarity score is then calculated as Zsimscore(q) = (simscore(q) − μ)/σ, where μ and σ are the estimated mean and standard deviation. The standard popularity score is calculated similarly. The hybrid score is defined as a convex combination of the two scores: hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q), where 0 ≤ α ≤ 1 is a tunable parameter determining the weight of the similarity score relative to the weight of the popularity score."; Using the coefficients 1/σ and (1-α) to calculate the hybrid score from the standard popularity score reads on multiplying a normalized score and a prediction weight to generate a weighted score for the second prediction.);
and merging, by the one or more computer processors, the one or more first predictions and the one or more second predictions into a prediction list, wherein the prediction list is sorted based on a descending order of the first weighted score and the second weighted score (Section 5, lines 20-25, "Given a user input x and a context C, HybridCompletion produces two ranked lists of completions of x: LNC consists of the top ℓ completions returned by NearestCompletion and LMPC consists of the top ℓ completions returned by MostPopularCompletion. The final ranked list of completions LHC is constructed by aggregating the two lists."; Section 5, lines 62-70, “It is important to note that HybridCompletion may rerank the original lists of completions it receives. For example, among the most popular completions, it will promote the ones that are more similar to the context, and, conversely, among the most similar completions, it will promote the more popular ones. This implies that HybridCompletion can dominate both MostPopularCompletion and NearestCompletion not only on average, but also on individual inputs.”; Aggregating the two lists reads on merging the first predictions and the second predictions into a prediction list, and reranking the original lists reads on sorting the prediction list.).
Bar-Yossef teaches using two methods to generate possible text completions, standardizing the scores from the two methods, and using the standardized scores to generate a list of completions ranked based on the combined scores, in order to improve the performance of query auto completion that incorporates query context (Abstract, lines 1-9, “Query auto completion is known to provide poor predictions of the user’s query when her input prefix is very short (e.g., one or two characters). In this paper we show that context, such as the user’s recent queries, can be used to improve the prediction quality considerably even for such short prefixes. We propose a context-sensitive query auto completion algorithm, NearestCompletion, which outputs the completions of the user’s input that are most similar to the context queries.”; Abstract, lines 16-28, "We demonstrate that when the recent user’s queries are relevant to the current query she is typing, then after typing a single character, NearestCompletion’s MRR is 48% higher relative to the MRR of the standard MostPopularCompletion algorithm on average. When the context is irrelevant, however, NearestCompletion’s MRR is essentially zero. To mitigate this problem, we propose HybridCompletion, which is a hybrid of NearestCompletion with MostPopularCompletion. HybridCompletion is shown to dominate both NearestCompletion and MostPopularCompletion, achieving a total improvement of 31.5% in MRR relative to MostPopularCompletion on average.").
Arnold and Bar-Yossef are considered to be analogous to the claimed invention because they are in the same field of text generation systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Bar-Yossef to use two methods to generate possible text completions, standardize the scores from the two methods, and use the standardized scores to generate a list of completions ranked based on the combined scores.  Doing so would allow for improving the performance of query auto completion that incorporates query context.
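For illustration only, the scoring quoted above from Bar-Yossef, Section 5, may be sketched as follows: each list's scores are standardized as Z(q) = (score(q) − μ)/σ, combined as hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q), and the merged candidates are sorted in descending order of the hybrid score. This is a minimal sketch and not Bar-Yossef's implementation. The function names standardize and hybrid_rank are hypothetical, μ and σ are estimated here from the list itself using the population standard deviation, and a candidate missing from one list is assigned that list's lowest standardized score; both choices are assumptions not stated in the quoted passages.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def standardize(scores: Dict[str, float]) -> Dict[str, float]:
    """Z-score each value: (score - mu) / sigma (hypothetical helper)."""
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        # Assumption: identical scores all standardize to zero.
        return {q: 0.0 for q in scores}
    return {q: (s - mu) / sigma for q, s in scores.items()}


def hybrid_rank(
    simscores: Dict[str, float],
    popscores: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Merge two scored completion lists, sorted by descending hybrid score."""
    zsim = standardize(simscores)
    zpop = standardize(popscores)
    # Assumption: a candidate absent from a list gets that list's lowest z-score.
    sim_floor = min(zsim.values())
    pop_floor = min(zpop.values())
    hybrid = {
        q: alpha * zsim.get(q, sim_floor) + (1 - alpha) * zpop.get(q, pop_floor)
        for q in set(zsim) | set(zpop)
    }
    return sorted(hybrid.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for two candidate completions.
ranked = hybrid_rank({"a": 1.0, "b": 3.0}, {"a": 10.0, "b": 30.0}, alpha=0.5)
```

With α = 1 the ordering in this sketch follows the similarity scores alone, and with α = 0 it follows the popularity scores alone, consistent with the quoted description of α as the weight of the similarity score relative to the popularity score.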
Regarding claim 12, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer program product as claimed in claim 8, but does not specifically disclose: wherein sort the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to: receive one or more first predictions from a first system and one or more second predictions from a second system; normalize a first score for each first prediction of the one or more first predictions and a second score for each second prediction of the one or more second predictions; create a plurality of prediction pairs, wherein each prediction pair includes a first prediction from the one or more first predictions and a second prediction from the one or more second predictions; calculate a prediction weight, wherein the prediction weight is a mean of a combined string similarity for each prediction pair of the plurality of prediction pairs; multiply a first normalized score of each first prediction of the one or more first predictions by the prediction weight to create a first weighted score for each first prediction; multiply a second normalized score of each second prediction of the one or more second predictions by the prediction weight to create a second weighted score for each second prediction; and merge the one or more first predictions and the one or more second predictions into a prediction list, wherein the prediction list is sorted based on a descending order of the first weighted score and the second weighted score.
Bar-Yossef teaches:
wherein sort the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to:
receive one or more first predictions from a first system and one or more second predictions from a second system (Section 5, lines 20-24, "Given a user input x and a context C, HybridCompletion produces two ranked lists of completions of x: LNC consists of the top ℓ completions returned by NearestCompletion and LMPC consists of the top ℓ completions returned by MostPopularCompletion."); 
normalize a first score for each first prediction of the one or more first predictions and a second score for each second prediction of the one or more second predictions (Section 5, lines 26-33, "The results in each of the two lists are ranked by quality scores: LNC is ranked by a similarity score, which we denote by simscore(·), and LMPC is ranked by a popularity score, which we denote by popscore(·). The aggregated list LHC is constructed by combining the two scoring functions into a single hybrid score, denote hybscore(·). As simscore and popscore use different units and scales, they need to be standardized before they can be combined.");
create a plurality of prediction pairs, wherein each prediction pair includes a first prediction from the one or more first predictions and a second prediction from the one or more second predictions (Section 5, lines 20-24, "Given a user input x and a context C, HybridCompletion produces two ranked lists of completions of x: LNC consists of the top ℓ completions returned by NearestCompletion and LMPC consists of the top ℓ completions returned by MostPopularCompletion."; The two ranked lists with ℓ completions demonstrate ℓ completion pairs that read on the prediction pairs.);
calculate a prediction weight, wherein the prediction weight is a mean of a combined string similarity for each prediction pair of the plurality of prediction pairs (Section 5, lines 36-45, "The standard similarity score is then calculated as Zsimscore(q) = (simscore(q) − μ)/σ, where μ and σ are the estimated mean and standard deviation. The standard popularity score is calculated similarly. The hybrid score is defined as a convex combination of the two scores: hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q), where 0 ≤ α ≤ 1 is a tunable parameter determining the weight of the similarity score relative to the weight of the popularity score."; The parameters α, μ, and σ used to calculate the hybrid score from the standard similarity score and the standard popularity score read on the prediction weight.);
multiply a first normalized score of each first prediction of the one or more first predictions by the prediction weight to create a first weighted score for each first prediction (Section 5, lines 36-45, "The standard similarity score is then calculated as Zsimscore(q) = (simscore(q) − μ)/σ, where μ and σ are the estimated mean and standard deviation. The standard popularity score is calculated similarly. The hybrid score is defined as a convex combination of the two scores: hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q), where 0 ≤ α ≤ 1 is a tunable parameter determining the weight of the similarity score relative to the weight of the popularity score."; Using the coefficients 1/σ and α to calculate the hybrid score from the standard similarity score reads on multiplying a normalized score and a prediction weight to generate a weighted score for the first prediction.);
multiply a second normalized score of each second prediction of the one or more second predictions by the prediction weight to create a second weighted score for each second prediction (Section 5, lines 36-45, "The standard similarity score is then calculated as Zsimscore(q) = (simscore(q) − μ)/σ, where μ and σ are the estimated mean and standard deviation. The standard popularity score is calculated similarly. The hybrid score is defined as a convex combination of the two scores: hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q), where 0 ≤ α ≤ 1 is a tunable parameter determining the weight of the similarity score relative to the weight of the popularity score."; Using the coefficients 1/σ and (1-α) to calculate the hybrid score from the standard popularity score reads on multiplying a normalized score and a prediction weight to generate a weighted score for the second prediction.);
and merge the one or more first predictions and the one or more second predictions into a prediction list, wherein the prediction list is sorted based on a descending order of the first weighted score and the second weighted score (Section 5, lines 20-25, "Given a user input x and a context C, HybridCompletion produces two ranked lists of completions of x: LNC consists of the top ℓ completions returned by NearestCompletion and LMPC consists of the top ℓ completions returned by MostPopularCompletion. The final ranked list of completions LHC is constructed by aggregating the two lists."; Section 5, lines 62-70, “It is important to note that HybridCompletion may rerank the original lists of completions it receives. For example, among the most popular completions, it will promote the ones that are more similar to the context, and, conversely, among the most similar completions, it will promote the more popular ones. This implies that HybridCompletion can dominate both MostPopularCompletion and NearestCompletion not only on average, but also on individual inputs.”; Aggregating the two lists reads on merging the first predictions and the second predictions into a prediction list, and reranking the original lists reads on sorting the prediction list.).
Bar-Yossef teaches using two methods to generate possible text completions, standardizing the scores from the two methods, and using the standardized scores to generate a list of completions ranked based on the combined scores, in order to improve the performance of query auto completion that incorporates query context (Abstract, lines 1-9, “Query auto completion is known to provide poor predictions of the user’s query when her input prefix is very short (e.g., one or two characters). In this paper we show that context, such as the user’s recent queries, can be used to improve the prediction quality considerably even for such short prefixes. We propose a context-sensitive query auto completion algorithm, NearestCompletion, which outputs the completions of the user’s input that are most similar to the context queries.”; Abstract, lines 16-28, "We demonstrate that when the recent user’s queries are relevant to the current query she is typing, then after typing a single character, NearestCompletion’s MRR is 48% higher relative to the MRR of the standard MostPopularCompletion algorithm on average. When the context is irrelevant, however, NearestCompletion’s MRR is essentially zero. To mitigate this problem, we propose HybridCompletion, which is a hybrid of NearestCompletion with MostPopularCompletion. HybridCompletion is shown to dominate both NearestCompletion and MostPopularCompletion, achieving a total improvement of 31.5% in MRR relative to MostPopularCompletion on average.").
Arnold and Bar-Yossef are considered to be analogous to the claimed invention because they are in the same field of text generation systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Bar-Yossef to use two methods to generate possible text completions, standardize the scores from the two methods, and use the standardized scores to generate a list of completions ranked based on the combined scores.  Doing so would allow for improving the performance of query auto completion that incorporates query context.
Regarding claim 19, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer system as claimed in claim 15, but does not specifically disclose: wherein sort the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to: receive one or more first predictions from a first system and one or more second predictions from a second system; normalize a first score for each first prediction of the one or more first predictions and a second score for each second prediction of the one or more second predictions; create a plurality of prediction pairs, wherein each prediction pair includes a first prediction from the one or more first predictions and a second prediction from the one or more second predictions; calculate a prediction weight, wherein the prediction weight is a mean of a combined string similarity for each prediction pair of the plurality of prediction pairs; multiply a first normalized score of each first prediction of the one or more first predictions by the prediction weight to create a first weighted score for each first prediction; multiply a second normalized score of each second prediction of the one or more second predictions by the prediction weight to create a second weighted score for each second prediction; and merge the one or more first predictions and the one or more second predictions into a prediction list, wherein the prediction list is sorted based on a descending order of the first weighted score and the second weighted score.
Bar-Yossef teaches:
wherein sort the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to:
receive one or more first predictions from a first system and one or more second predictions from a second system (Section 5, lines 20-24, "Given a user input x and a context C, HybridCompletion produces two ranked lists of completions of x: LNC consists of the top ℓ completions returned by NearestCompletion and LMPC consists of the top ℓ completions returned by MostPopularCompletion."); 
normalize a first score for each first prediction of the one or more first predictions and a second score for each second prediction of the one or more second predictions (Section 5, lines 26-33, "The results in each of the two lists are ranked by quality scores: LNC is ranked by a similarity score, which we denote by simscore(·), and LMPC is ranked by a popularity score, which we denote by popscore(·). The aggregated list LHC is constructed by combining the two scoring functions into a single hybrid score, denote hybscore(·). As simscore and popscore use different units and scales, they need to be standardized before they can be combined.");
create a plurality of prediction pairs, wherein each prediction pair includes a first prediction from the one or more first predictions and a second prediction from the one or more second predictions (Section 5, lines 20-24, "Given a user input x and a context C, HybridCompletion produces two ranked lists of completions of x: LNC consists of the top ℓ completions returned by NearestCompletion and LMPC consists of the top ℓ completions returned by MostPopularCompletion."; The two ranked lists with ℓ completions demonstrate ℓ completion pairs that read on the prediction pairs.);
calculate a prediction weight, wherein the prediction weight is a mean of a combined string similarity for each prediction pair of the plurality of prediction pairs (Section 5, lines 36-45, "The standard similarity score is then calculated as Zsimscore(q) = (simscore(q) − μ)/σ, where μ and σ are the estimated mean and standard deviation. The standard popularity score is calculated similarly. The hybrid score is defined as a convex combination of the two scores: hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q), where 0 ≤ α ≤ 1 is a tunable parameter determining the weight of the similarity score relative to the weight of the popularity score."; The parameters α, μ, and σ used to calculate the hybrid score from the standard similarity score and the standard popularity score read on the prediction weight.);
multiply a first normalized score of each first prediction of the one or more first predictions by the prediction weight to create a first weighted score for each first prediction (Section 5, lines 36-45, "The standard similarity score is then calculated as Zsimscore(q) = (simscore(q) − μ)/σ, where μ and σ are the estimated mean and standard deviation. The standard popularity score is calculated similarly. The hybrid score is defined as a convex combination of the two scores: hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q), where 0 ≤ α ≤ 1 is a tunable parameter determining the weight of the similarity score relative to the weight of the popularity score."; Using the coefficients 1/σ and α to calculate the hybrid score from the standard similarity score reads on multiplying a normalized score and a prediction weight to generate a weighted score for the first prediction.);
multiply a second normalized score of each second prediction of the one or more second predictions by the prediction weight to create a second weighted score for each second prediction (Section 5, lines 36-45, "The standard similarity score is then calculated as Zsimscore(q) = (simscore(q) − μ)/σ, where μ and σ are the estimated mean and standard deviation. The standard popularity score is calculated similarly. The hybrid score is defined as a convex combination of the two scores: hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q), where 0 ≤ α ≤ 1 is a tunable parameter determining the weight of the similarity score relative to the weight of the popularity score."; Using the coefficients 1/σ and (1-α) to calculate the hybrid score from the standard popularity score reads on multiplying a normalized score and a prediction weight to generate a weighted score for the second prediction.);
and merge the one or more first predictions and the one or more second predictions into a prediction list, wherein the prediction list is sorted based on a descending order of the first weighted score and the second weighted score (Section 5, lines 20-25, "Given a user input x and a context C, HybridCompletion produces two ranked lists of completions of x: LNC consists of the top ℓ completions returned by NearestCompletion and LMPC consists of the top ℓ completions returned by MostPopularCompletion. The final ranked list of completions LHC is constructed by aggregating the two lists."; Section 5, lines 62-70, “It is important to note that HybridCompletion may rerank the original lists of completions it receives. For example, among the most popular completions, it will promote the ones that are more similar to the context, and, conversely, among the most similar completions, it will promote the more popular ones. This implies that HybridCompletion can dominate both MostPopularCompletion and NearestCompletion not only on average, but also on individual inputs.”; Aggregating the two lists reads on merging the first predictions and the second predictions into a prediction list, and reranking the original lists reads on sorting the prediction list.).
Bar-Yossef teaches using two methods to generate possible text completions, standardizing the scores from the two methods, and using the standardized scores to generate a list of completions ranked based on the combined scores, in order to improve the performance of query auto completion that incorporates query context (Abstract, lines 1-9, “Query auto completion is known to provide poor predictions of the user’s query when her input prefix is very short (e.g., one or two characters). In this paper we show that context, such as the user’s recent queries, can be used to improve the prediction quality considerably even for such short prefixes. We propose a context-sensitive query auto completion algorithm, NearestCompletion, which outputs the completions of the user’s input that are most similar to the context queries.”; Abstract, lines 16-28, "We demonstrate that when the recent user’s queries are relevant to the current query she is typing, then after typing a single character, NearestCompletion’s MRR is 48% higher relative to the MRR of the standard MostPopularCompletion algorithm on average. When the context is irrelevant, however, NearestCompletion’s MRR is essentially zero. To mitigate this problem, we propose HybridCompletion, which is a hybrid of NearestCompletion with MostPopularCompletion. HybridCompletion is shown to dominate both NearestCompletion and MostPopularCompletion, achieving a total improvement of 31.5% in MRR relative to MostPopularCompletion on average.").
Arnold and Bar-Yossef are considered to be analogous to the claimed invention because they are in the same field of text generation systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Bar-Yossef to use two methods to generate possible text completions, standardize the scores from the two methods, and use the standardized scores to generate a list of completions ranked based on the combined scores.  Doing so would allow for improving the performance of query auto completion that incorporates query context.
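For illustration only, the score standardization and convex combination quoted from Section 5 of Bar-Yossef can be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the function names `standardize` and `hybrid_rank`, the use of the population standard deviation as the estimate of σ, and the handling of a completion that appears in only one list (it is given that list's lowest standardized score) are all hypothetical choices made for this example.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Z-score each value: (score - mu) / sigma, as in the quoted Zsimscore formula."""
    mu = mean(scores.values())
    sigma = pstdev(scores.values())  # assumption: population std. dev. as the estimate
    if sigma == 0:
        return {q: 0.0 for q in scores}
    return {q: (s - mu) / sigma for q, s in scores.items()}


def hybrid_rank(sim_scores, pop_scores, alpha=0.5):
    """Merge two scored completion lists and sort by descending hybrid score.

    hybscore(q) = alpha * Zsimscore(q) + (1 - alpha) * Zpopscore(q), 0 <= alpha <= 1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    z_sim = standardize(sim_scores)
    z_pop = standardize(pop_scores)
    # Assumption: a completion absent from one list gets that list's lowest z-score.
    sim_floor = min(z_sim.values())
    pop_floor = min(z_pop.values())
    merged = []
    for q in set(z_sim) | set(z_pop):
        score = alpha * z_sim.get(q, sim_floor) + (1 - alpha) * z_pop.get(q, pop_floor)
        merged.append((q, score))
    # Descending by hybrid score; ties broken alphabetically for determinism.
    return sorted(merged, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    sim = {"a": 3.0, "b": 1.0}    # hypothetical similarity scores
    pop = {"b": 10.0, "c": 30.0}  # hypothetical popularity scores
    for completion, score in hybrid_rank(sim, pop, alpha=0.7):
        print(completion, round(score, 3))
```

With α close to 1 the merged list follows the similarity ranking, and with α close to 0 it follows the popularity ranking, consistent with the quoted description of α as the weight of the similarity score relative to the popularity score.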
Claims 6, 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Arnold in view of Gaskill (“The Bitwise Hashing Trick for Personalized Search”) and Chechik (US Patent No. 9,594,851).
Regarding claim 6, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer-implemented method as claimed in claim 1, but does not specifically disclose: wherein sorting the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises: converting, by the one or more computer processors, each document in the corpus of documents into a fixed length floating point vector, wherein the conversion is performed using a text encoder; and selecting, by the one or more computer processors, one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity.
Gaskill teaches:
wherein sorting the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises: converting, by the one or more computer processors, each document in the corpus of documents into a fixed length floating point vector, wherein the conversion is performed using a text encoder (Page 830, lines 11-12, "A common method for representing and comparing items such as documents or listing titles is with a vector created by feature hashing"; Page 830, lines 19-21, "In our initial approach, we found that adequate performance could be obtained only with vectors with a minimum length of 8,000 32-bit floating point dimensions.").
Gaskill teaches representing documents as floating-point vectors in order to improve the performance of a search result (Page 834, lines 10-11, "For our experiment we make a rough simulation of a search result ranking task, without using actual search recall sets."; Page 834, lines 29-35, "Tables 1 and 2 contain a summary of results from our experiment. The columns include the dimension of the arrays, the storage size (assuming 32-bit floats), and the execution time in seconds. Finally, the accuracy of the method in identifying the purchased item by a sorted ranking of the recall set in the 1st position, top five positions, or top ten positions. Our best performing method (pairwise float) could predict the bought item 33.93% of the time in the top-1 position.").
Arnold and Gaskill are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Gaskill to represent documents as floating-point vectors.  Doing so would allow for improving the performance of a search result.
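For illustration only, converting a document into a fixed-length floating-point vector by feature hashing, as referenced in the quoted passages of Gaskill, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `hash_vector`, the whitespace tokenizer, the MD5-based bucket assignment, the signed-update variant of feature hashing, and the small default dimension are hypothetical choices for this example and are not taken from the reference.

```python
import hashlib


def hash_vector(text, dim=64):
    """Map text to a fixed-length list of floats using the hashing trick.

    Each token is hashed to one of `dim` buckets; a second hash-derived bit
    picks the sign of the update (a common variant, assumed here).
    """
    if dim <= 0:
        raise ValueError("dim must be positive")
    vec = [0.0] * dim
    for token in text.lower().split():  # assumption: whitespace tokenization
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    return vec


if __name__ == "__main__":
    doc = "vintage leather camera bag"  # hypothetical listing title
    vector = hash_vector(doc, dim=16)
    print(len(vector), vector)
```

The output length depends only on `dim`, not on the length of the input text, which is what makes the representation "fixed length".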
Arnold in view of Gaskill does not specifically disclose: selecting, by the one or more computer processors, one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity.
Chechik teaches:
selecting, by the one or more computer processors, one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity (Column 13, lines 14-20, "Similarity scores can be generated using the query suggestion rule without necessarily pre-filtering potential query suggestions. The scoring rule naturally distinguishes between highly relevant and irrelevant query suggestions. The system selects query suggestions for each document 620. It uses the scores to select the query suggestions.").
Chechik teaches calculating a similarity score between documents and query suggestions and selecting query suggestions based on the similarity scores in order to efficiently and accurately learn anticipated queries from a large collection of pairs of documents and user queries (Column 3, lines 5-14, "Particular implementations of the technology disclosed can be implemented to realize one or more of the following advantages. A system can learn anticipated queries from a large collection of pairs of documents and user queries efficiently and with high accuracy. Experience with queries subsequent to visiting of documents can be generalized for documents with content similar to documents. Infrequent queries can be evaluated for use as suggested queries, even if they did not appear frequently enough in a training set to be used during training.").
Arnold, Gaskill, and Chechik are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Gaskill to incorporate the teachings of Chechik to calculate a similarity score between documents and query suggestions and select query suggestions based on the similarity scores.  Doing so would allow for efficiently and accurately learning anticipated queries from a large collection of pairs of documents and user queries.
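For illustration only, selecting the top predictions by vector similarity, in the manner of the combination discussed above, can be sketched as follows. This is a minimal sketch under stated assumptions: cosine similarity is assumed as the similarity measure, the corpus is assumed to be summarized by a single vector, and the names `cosine_similarity` and `select_top_predictions` and all sample data are hypothetical. None of these details is taken from Chechik or Gaskill.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_top_predictions(prediction_vectors, corpus_vector, k=1):
    """Return the k predicted phrases whose vectors are most similar to corpus_vector."""
    scored = [
        (phrase, cosine_similarity(vec, corpus_vector))
        for phrase, vec in prediction_vectors.items()
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [phrase for phrase, _ in scored[:k]]


if __name__ == "__main__":
    predictions = {  # hypothetical phrase vectors
        "reset password": [1.0, 0.0],
        "cook pasta": [0.0, 1.0],
        "password reset help": [0.9, 0.1],
    }
    domain = [1.0, 0.0]  # hypothetical vector summarizing the domain corpus
    print(select_top_predictions(predictions, domain, k=2))
```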
Regarding claim 13, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer program product as claimed in claim 8, but does not specifically disclose: wherein sort the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to: convert each document in the corpus of documents into a fixed length floating point vector, wherein the conversion is performed using a text encoder; and select one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity.
Gaskill teaches:
wherein sort the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to: convert each document in the corpus of documents into a fixed length floating point vector, wherein the conversion is performed using a text encoder (Page 830, lines 11-12, "A common method for representing and comparing items such as documents or listing titles is with a vector created by feature hashing"; Page 830, lines 19-21, "In our initial approach, we found that adequate performance could be obtained only with vectors with a minimum length of 8,000 32-bit floating point dimensions.").
Gaskill teaches representing documents as floating-point vectors in order to improve the performance of a search result (Page 834, lines 10-11, "For our experiment we make a rough simulation of a search result ranking task, without using actual search recall sets."; Page 834, lines 29-35, "Tables 1 and 2 contain a summary of results from our experiment. The columns include the dimension of the arrays, the storage size (assuming 32-bit floats), and the execution time in seconds. Finally, the accuracy of the method in identifying the purchased item by a sorted ranking of the recall set in the 1st position, top five positions, or top ten positions. Our best performing method (pairwise float) could predict the bought item 33.93% of the time in the top-1 position.").
Arnold and Gaskill are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Gaskill to represent documents as floating-point vectors.  Doing so would allow for improving the performance of a search result.
Arnold in view of Gaskill does not specifically disclose: select one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity.
Chechik teaches:
select one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity (Column 13, lines 14-20, "Similarity scores can be generated using the query suggestion rule without necessarily pre-filtering potential query suggestions. The scoring rule naturally distinguishes between highly relevant and irrelevant query suggestions. The system selects query suggestions for each document 620. It uses the scores to select the query suggestions.").
Chechik teaches calculating a similarity score between documents and query suggestions and selecting query suggestions based on the similarity scores in order to efficiently and accurately learn anticipated queries from a large collection of pairs of documents and user queries (Column 3, lines 5-14, "Particular implementations of the technology disclosed can be implemented to realize one or more of the following advantages. A system can learn anticipated queries from a large collection of pairs of documents and user queries efficiently and with high accuracy. Experience with queries subsequent to visiting of documents can be generalized for documents with content similar to documents. Infrequent queries can be evaluated for use as suggested queries, even if they did not appear frequently enough in a training set to be used during training.").
Arnold, Gaskill, and Chechik are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Gaskill to incorporate the teachings of Chechik to calculate a similarity score between documents and query suggestions and select query suggestions based on the similarity scores.  Doing so would allow for efficiently and accurately learning anticipated queries from a large collection of pairs of documents and user queries.
Regarding claim 20, as best understood based on the 35 U.S.C. 112(b) issues identified above, Arnold discloses the computer system as claimed in claim 15, but does not specifically disclose: wherein sort the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to: convert each document in the corpus of documents into a fixed length floating point vector, wherein the conversion is performed using a text encoder; and select one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity.
Gaskill teaches:
wherein sort the one or more predicted phrases, wherein the one or more predicted phrases are sorted based on the similarity computation between the one or more predicted phrases and the corpus of documents for the specific domain further comprises one or more of the following program instructions, stored on the one or more computer readable storage media, to: convert each document in the corpus of documents into a fixed length floating point vector, wherein the conversion is performed using a text encoder (Page 830, lines 11-12, "A common method for representing and comparing items such as documents or listing titles is with a vector created by feature hashing"; Page 830, lines 19-21, "In our initial approach, we found that adequate performance could be obtained only with vectors with a minimum length of 8,000 32-bit floating point dimensions.").
Gaskill teaches representing documents as floating-point vectors in order to improve the performance of a search result (Page 834, lines 10-11, "For our experiment we make a rough simulation of a search result ranking task, without using actual search recall sets."; Page 834, lines 29-35, "Tables 1 and 2 contain a summary of results from our experiment. The columns include the dimension of the arrays, the storage size (assuming 32-bit floats), and the execution time in seconds. Finally, the accuracy of the method in identifying the purchased item by a sorted ranking of the recall set in the 1st position, top five positions, or top ten positions. Our best performing method (pairwise float) could predict the bought item 33.93% of the time in the top-1 position.").
Arnold and Gaskill are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold to incorporate the teachings of Gaskill to represent documents as floating-point vectors.  Doing so would allow for improving the performance of a search result.
Arnold in view of Gaskill does not specifically disclose: select one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity.
Chechik teaches:
select one or more top query predictions, wherein the one or more top query predictions are the one or more predicted phrases that have a highest similarity based on the vector similarity (Column 13, lines 14-20, "Similarity scores can be generated using the query suggestion rule without necessarily pre-filtering potential query suggestions. The scoring rule naturally distinguishes between highly relevant and irrelevant query suggestions. The system selects query suggestions for each document 620. It uses the scores to select the query suggestions.").
Chechik teaches calculating a similarity score between documents and query suggestions and selecting query suggestions based on the similarity scores in order to efficiently and accurately learn anticipated queries from a large collection of pairs of documents and user queries (Column 3, lines 5-14, "Particular implementations of the technology disclosed can be implemented to realize one or more of the following advantages. A system can learn anticipated queries from a large collection of pairs of documents and user queries efficiently and with high accuracy. Experience with queries subsequent to visiting of documents can be generalized for documents with content similar to documents. Infrequent queries can be evaluated for use as suggested queries, even if they did not appear frequently enough in a training set to be used during training.").
Arnold, Gaskill, and Chechik are considered to be analogous to the claimed invention because they are in the same field of natural language processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Arnold in view of Gaskill to incorporate the teachings of Chechik to calculate a similarity score between documents and query suggestions and select query suggestions based on the similarity scores.  Doing so would allow for efficiently and accurately learning anticipated queries from a large collection of pairs of documents and user queries.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to James Boggs whose telephone number is (571)272-2968. The examiner can normally be reached M-F 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JAMES BOGGS/Examiner, Art Unit 2657

/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657