Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Application 16/660,330 filed 10/22/2019 has been examined.
Claims 1-20 are currently pending.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an
abstract idea without significantly more.
Claim 1 recites:
analyzing historical user queries with an analysis model.
The limitation of analyzing historical user queries with an analysis model, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than reciting a computing system, nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the computing system language, analyzing in the context of this claim encompasses the user manually analyzing queries using generic “analysis models”. Similarly, the limitations of retrieving; analyzing; generating; generating; generating; and training, as drafted, recite a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. For example, but for the databases language, retrieving; analyzing; generating; generating; generating; and training in the context of this claim encompass the user manually generating a listing of topic distribution data based on generic “analysis”. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas (concepts performed in the human mind, including an observation, evaluation, judgment, or opinion).
Further, these concepts also recite “Certain Methods of Organizing Human Activity” (such as
commercial or legal interactions, including agreements in the form of contracts; legal
obligations; advertising, marketing or sales activities or behaviors; and business relations), because
generating a listing of topic distribution data based on generic “analysis” is a method of human activity in advertising/marketing activities.
Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim only
recites one additional element – using a computer system to perform both the retrieving; analyzing; generating; generating; generating; and training steps and the analyzing step. The databases/processor in both sets of steps is recited at a high level of generality (i.e., as a generic processor performing a generic computer function of analyzing and generating topic distribution data) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more
than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a computer system to perform
both the retrieving; analyzing; generating; generating; generating; and training steps and the analyzing step amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Dependent claims 2-13 merely add further details of the abstract steps/elements recited in
claim 1 without integrating the idea into a practical application; or including an improvement to
another technology or technical field, an improvement to the functioning of the computer itself,
or meaningful limitations beyond generally linking the use of an abstract idea to a particular
technological environment. Therefore, dependent claims 2-13 are also directed towards
non-statutory subject matter.

Independent claims 14 and 19 are also rejected as ineligible subject matter under 35
U.S.C. 101 for substantially the same reasons as method claim 1. The components (i.e.,
the system/medium) described in independent claims 14 and 19 do not provide for integrating the
abstract idea into a practical application. At best, the claims merely provide alternate
environments in which to implement the abstract idea.

Dependent claims 15-18 and 20 merely add further details of the abstract steps/elements
recited in claim 1 without integrating the idea into a practical application; or including an
improvement to another technology or technical field, an improvement to the functioning of the
computer itself, or meaningful limitations beyond generally linking the use of an abstract idea to
a particular technological environment. Therefore, dependent claims 15-18 and 20 are also
directed towards non-statutory subject matter.



Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1, 6-7, 9-13 is/are rejected under 35 U.S.C. 103 as being unpatentable over 
Yuan et al., US Pub. No. 2019/0043379 A1, in view of Goel et al., US Patent No. 11,194,973.

As to claim 1, Yuan discloses a computing system method 
(Yuan Fig. 13B)
comprising:

retrieving, from an assistance document database of a data management system, a plurality of assistance documents each including a historical user query and an answer to the historical user query;
(Yuan [0039] To remedy this, an example aspect of the present application provides a neural
model trained from scratch to extract all answer key phrases in a particular document.;
see also [0106] Described herein is a multi-stage (e.g., two-stage) framework to address the problem of question generation from documents. First, a question answering corpus is used
to train a neural model to estimate the distribution of key phrases that are interesting to question-asking humans.)

analyzing the answers with a training model;  
(Yuan abstract: In one example embodiment herein, the estimating includes predicting the interesting phrases as answers, and the estimating is performed by a neural model.; 
see also [0039] To remedy this, an example aspect of the present application provides a neural
model trained from scratch to extract all answer key phrases in a particular document.;
see also [0106] Described herein is a multi-stage (e.g., two-stage) framework to address the problem of question generation from documents. First, a question answering corpus is used
to train a neural model to estimate the distribution of key phrases that are interesting to question-asking humans.)

generating, with the training model, first topic distribution data indicating, for each answer, how relevant each of a plurality of topics is to the answer;
(Yuan [0038] The result outputted from MLP 22 is a value of P(e_i | D) that represents a probability that a particular entity e_i is relevant given the document D.;
See also [0050] According to an example embodiment herein, during inference, selection can be made of the top k entities with highest likelihood as being relevant in the document given
by the model, where, in one example embodiment, k=6 as determined by a hyper-parameter search.)

analyzing the historical user queries with an analysis model;
(Yuan [0027] Given a set of extracted key phrases, question generation is performed by modeling the conditional probability of a question given a document-answer pair, i.e.,
P(q | a, d). For this a sequence-to-sequence model with attention is employed, as is a pointer-softmax mechanism. This component is also trained on SQuAD, according to an example embodiment, by maximizing the likelihood of questions in the dataset.;
see also [0070] An amount of 5,158 question-answer pairs (self-contained in 23 Wikipedia articles) from the training set was used as a validation set.
[0071] All models were trained using stochastic gradient descent with a minibatch size of 32 using the ADAM optimization algorithm.)

generating, with the analysis model for each historical user query, embedding data including a matrix embedding of the historical user query;
(Yuan [0074] In question generation, the decoder vocabulary used the top 2000 words sorted by their frequency in the gold questions in the training data. The word embedding matrix was initialized with the 300-dimensional GloVe vectors.; see also [0034] In one example embodiment, the lookup table is parameterized by a matrix, θ, with a row for each word:)


and
training the analysis model with a machine learning process to generate the
second topic distribution data, for each historical user query, convergent to the first
topic distribution data associated with the corresponding answer
(Yuan [0008] the method further comprises conditioning a question generation
model based on the interesting phrases predicted in the predicting, the question generation model generating the question. The method also can include training the neural model.;
See also [0077] The metric is calculated as follows. Given the prediction sequence e_i and the gold label sequence e_j, first there is constructed a pairwise, token-level F1 score matrix f_{i,j} between the two phrases e_i and e_j. Max-pooling along the gold-label axis essentially assesses the precision of each prediction, with partial matches accounted for by the pairwise F1 (identical to evaluation of a single answer in SQuAD) in the cells: p_i = max_j(f_{i,j}). Analogously, recall for label e_j can be defined by max-pooling along the prediction axis: r_j = max_i(f_{i,j}). The hierarchical F1 is defined by the mean precision p = avg(p_i) and recall r = avg(r_j):;
see also [0079] Evaluation results are listed in the Table immediately below).


Yuan does not disclose:
generating, with the analysis model for each historical user query, second topic
distribution data based on the matrix embedding data and including a distribution of
topics that are relevant to the historical user query; 

however, Goel discloses:
generating, with the analysis model for each historical user query, second topic
distribution data based on the matrix embedding data and including a distribution of
topics that are relevant to the historical user query; 
(Goel teaches various second topic distributions/scores for embedding data/potential responses; see
col. 18 ln. 7-16: The system may then use some combination of scores (e.g., comprehensible score 972, on-topic score 974, coherence score 981, interesting score 976, continue score 978, and/or engagement score 986) to determine which potential response to a user utterance should be selected. For example the system may process a coherence score 981 and engagement score 986 to arrive at an overall score, which may be a sum of those scores, some weighted combination of those scores, etc. The potential system response with the highest score may be selected.;
see also col. 4 ln. 54-58: The system may also determine (136), using at least one second trained model, a first engagement score for the first response. The engagement score may correspond to an estimate as to how well the user will react to the dialog response;
see also col. 20 ln. 24-32: To make the loss differentiable, the system may use the output of the softmax layer 1240 (distribution of likelihood across entire vocabulary for output length k, i.e., |V|×k) and use this to do a weighted embedding lookup across the entire vocabulary to get the same D×k matrix as an input to rest of the evaluator network. The system may keep the rest of the input (context and features) for the evaluator required to predict the scores as is.)

It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to apply training the neural model to predict question-answer pairs as taught by Goel, since it was known in the art that, for response systems to perform various operations effectively, certain techniques may be employed to incorporate certain information in a form that can be considered by a trained model. Such information may include, for example, dialog data from user inputs and system responses. One such technique for putting information in a form operable by a trained model is use of an encoder. Encoding is a general technique for projecting a sequence of features into a vector space, and one goal of encoding is to project data points into a multi-dimensional vector space so that various operations can be performed on the vector combinations to determine how they (or the data they contain) relate to each other (Goel col. 10 ln. 16-27).
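For illustration of the claim 1 limitation of training the analysis model so that the second topic distribution data is "convergent to" the first topic distribution data, the following is a minimal sketch. It assumes, hypothetically, that convergence is measured by a Kullback-Leibler divergence between the two distributions; neither the claim language nor the cited references specify this loss, and all names and values below are invented for the example.

```python
import math

def softmax(scores):
    """Turn raw model scores into a probability distribution over topics."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); 0 when the two distributions are identical."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )

# Hypothetical first topic distribution for an answer (used as the label).
first_topic_distribution = [0.7, 0.2, 0.1]

# Hypothetical raw scores produced by an analysis model for the matching query.
query_scores = [2.0, 0.5, 0.1]
second_topic_distribution = softmax(query_scores)

# A training process would adjust the model to drive this value toward zero.
loss = kl_divergence(first_topic_distribution, second_topic_distribution)
```

Under this assumption, the loss shrinks as the distribution generated for a query approaches the distribution generated for its answer.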


As to claim 6, Yuan discloses the method of claim 1, wherein the analysis model is sensitive to the sequence of words in a historical user query (Yuan [0033] Referring to FIGS. 2, 4, and 6, the training method starts in 402 and then in procedure 404 each word w_i is embedded in an embedding layer 18 using an embedding lookup table, to generate a corresponding distributed vector v_i^{emb_w}, such that, for a sequence of the words w_1 ... w_n, the embedding results in the generation of corresponding vectors).

As to claim 7, Yuan discloses the method of claim 6, wherein the analysis model includes a
sequential neural model (Yuan [0039] This model is parameterized as a
pointer network to point sequentially to start and end locations of all key phrase answers.).

As to claim 9, Goel discloses, under the rationale above, the method of claim 6, further comprising converting each historical user query to a series of vectors in which each vector is a one hot encoding of one of the words in the historical user query, wherein analyzing the historical user queries includes analyzing the series of vectors (Goel col. 11 ln. 25-29: A word sequence is usually represented as a series of one-hot vectors (i.e., a N-sized vector representing the N available words in a lexicon, with one bit high to represent the particular word in the sequence).).
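As an illustrative sketch of the one-hot representation described in the cited passage, the following converts a word sequence into a series of N-sized vectors with a single bit set per word. The lexicon and query are hypothetical and are not drawn from Goel or the application.

```python
def one_hot_encode(words, lexicon):
    """Return one N-sized vector per word, with a single 1 at the word's lexicon index."""
    index = {word: position for position, word in enumerate(lexicon)}
    vectors = []
    for word in words:
        vector = [0] * len(lexicon)
        vector[index[word]] = 1  # raises KeyError for out-of-lexicon words
        vectors.append(vector)
    return vectors

# Hypothetical lexicon of N = 5 available words.
lexicon = ["how", "do", "i", "file", "taxes"]

# Hypothetical user query, already split into words.
query = ["how", "do", "i", "file"]

encoded = one_hot_encode(query, lexicon)
```

Because the vectors are emitted in query order, the series preserves the sequence of words in the query.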

As to claim 10, Goel discloses under the rationale above the method of claim 1, wherein the machine learning process is a semi-supervised machine learning process in which the first topic distribution data is used as labels for the semi-supervised machine learning process (Goel col. 13 ln. 31-35: Various techniques may be used to train the models including backpropagation, statistical learning, supervised learning, semi-supervised learning, stochastic learning, or
other known techniques.).

As to claim 11, Goel discloses under the rationale above the method of claim 1, further comprising, after the machine learning
process:
receiving a new user query from a user of the data management system;
(Goel col. 4 ln. 40-43: The user input may be processed (132) by a dialog management component or other component of the system(s) 120 to determine one or more potential system
responses to the user input.)

generating embedding data for the new user query by analyzing the new user
query with the analysis model;
(Goel col. 15 ln. 52-55: User input feature vector 827 may correspond to a word embedding or sentence embedding using embedding techniques discussed above.;
See also col. 11 ln. 20-25: For various operations, such as selecting and/or scoring potential dialog responses, a system may be configured to encode text data that may include one or more word
sequences (for example dialog data from one or more previous exchanges with the system during a dialog) and use that encoded text data to score potential dialog responses;
see also col. 11 ln. 29-33: The one-hot vector is often augmented with information from other models, which have been trained on large amounts of generic data, including but not limited to word embeddings that represent how individual words are used in a text corpus,) 

selecting support data based on the embedding data; 
(Goel col. 15 ln. 57-59: Response feature vector 837 may correspond to a word embedding or sentence embedding using embedding techniques discussed above.;
See also col. 11 ln. 52-55: Thus the word embedding data represented in the individual values of a word embedding data vector may correspond to how the respective word is used in the corpus.;)
and
providing the support data to the user in response to the new user query
(Goel col. 4 ln. 40-43: The user input may be processed (132) by a dialog management component or other component of the system(s) 120 to determine one or more potential system
responses to the user input.;
see also col. 3 ln. 34-40: To address this concern, a system may choose a system response based on an analysis of whether the system response will be deemed by a user to be relevant and interesting as part of the dialog. Such an analysis of relevance and interest may be made at each turn to mitigate the general response problem and generally provide a more desirable user experience.).

As to claim 12, Goel discloses under the rationale above the method of claim 11, further comprising providing the support data to the user via an automated support system of the data management system (Goel col. 4 ln. 40-43: The user input may be processed (132) by a dialog
management component or other component of the system(s) 120 to determine one or more potential system responses to the user input.;
See also col. 3 ln. 2-8: Systems configured to engage in dialogs with a user may use the dialog session identifier or other data to track the progress of the dialog to select system responses in a way that tracks the previous user-system exchanges, thus moving the dialog along in a manner that results in a desirable user experience.;
See also col. 2 ln. 46-49: A dialog may be goal-oriented, meaning the dialog is directed to the system performing a specific action requested by a user (such as figuring out what music the system should play).).

As to claim 13, Goel discloses under the rationale above the method of claim 12, wherein the automated support system includes an assistance document search function or a support chat bot
(Goel col. 2 ln. 54-57: System components that control what actions the system takes in
response to various user inputs of a dialog may sometimes be referred to as chatbots.;
see also col. 3 ln. 34-40: To address this concern, a system may choose a system response based on an analysis of whether the system response will be deemed by a user to be relevant and interesting as part of the dialog. Such an analysis of relevance and interest may be made at each turn to mitigate the general response problem and generally provide a more desirable user experience.).


Claim(s) 2-5, 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over 
Yuan et al., US Pub. No. 2019/0043379 A1, in view of Goel et al., US Patent No. 11,194,973, in view of Hewitt et al., US Pub. No. 2020/0401885 A1.

As to claim 2, Yuan/Goel does not disclose:
wherein the training model includes a latent Dirichlet allocation model;

However, Hewitt discloses the method of claim 1, wherein the training model includes a latent
Dirichlet allocation model (Hewitt [0021] Solution efficacy model 126, hereinafter SEM 126,
contains one or more models, containers, documents, subdocuments, matrices, vectors, and associated data, modeling one or more feature sets such as results from linguistic
analysis, topic characterization/representations, and intraarrival time of post frequency. In an embodiment, SEM 126 contains one or more generative (e.g., latent Dirichlet allocation
(LDA), etc.) or discriminative (e.g., support vector machine (SVM), etc.) statistical models utilized to categorize one or more communications. For example, SEM 126 may train and utilize one or more generative statistical models to calculate the conditional probability of an observable X,;
See also [0030] In various embodiment, program 150 utilizes latent Dirichlet allocation (LDA) to identify one or more topics that may be contained within a discussion. LDA allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. For example, if observations are words collected into documents, LDA posits that
each document is a mixture of a small number of topics and that each word's presence is attributable to one of the document's topics.).

It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to train using latent Dirichlet allocation (LDA) as taught by Hewitt, since it was known in the art that solution effectiveness systems provide one or more generative (e.g., latent Dirichlet allocation (LDA), etc.) or discriminative (e.g., support vector machine (SVM), etc.) statistical models utilized to categorize one or more communications (Hewitt abstract, [0021]).

As to claim 3, Goel discloses under the rationale above the method of claim 2, wherein each answer is modeled as a bag of words (Goel col. 16 ln. 40-45: The various data 845 may also include parts-of-speech (POS) data which may be determined by a POS tagger or other component. For example a bag-of-words representation of POS tags may be produced for a
user utterance and/ or system response and considered as part of the various data 845.).
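As a minimal sketch of a bag-of-words representation, the following counts tokens while discarding their order. It assumes simple lowercase whitespace tokenization, which neither Goel nor the claim specifies, and the answer text is hypothetical.

```python
from collections import Counter

def bag_of_words(text):
    """Count how often each token occurs, ignoring the order of the tokens."""
    return Counter(text.lower().split())

# Hypothetical answer text from an assistance document.
answer = "Enter the form total then enter the date"

bag = bag_of_words(answer)
```

Two texts containing the same words in a different order yield the same bag, which is the property that distinguishes this representation from a sequence-sensitive model.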



As to claim 4, Yuan discloses the method of claim 3, wherein each of the plurality of topics is
modeled as a distribution over words (Yuan [0074] In question generation, the decoder vocabulary used the top 2000 words sorted by their frequency in the gold questions in the training data;
see also [0067] The result of the iterative process is a sequence of words that form the question, wherein each word is generated from vocabulary or copied from the document. The start and end of the determined question can be determined based on the results of the pointer network 600.).

As to claim 5, Yuan discloses the method of claim 4, wherein the training model generates the first topic distribution data based on cooccurrence of words in the answers (Yuan [0074] In question generation, the decoder vocabulary used the top 2000 words sorted by their frequency in the gold questions in the training data.).


As to claim 8, Hewitt discloses under the rationale above the method of claim 7, wherein the sequential neural model includes
one or more of:
a long short-term memory model;
a gated recurrent unit model; and
a convolution neural network
(Hewitt [0024] In an embodiment, SEM 126 utilizes deep learning techniques to pair problems to relevant solutions. In various embodiments, SEM 126 utilizes transferrable neural network algorithms and models (e.g., long short-term memory (LSTM), deep stacking network (DSN), deep belief network (DBN), convolutional neural networks (CNN), compound hierarchical deep models, etc.) that can be trained with supervised and/or unsupervised methods.;
See also [0024] SEM 126 utilizes gated recurrent units (GRU). GRUs simplify the training process while reducing the amount of necessary computational resources. In another
embodiment, SEM 126 utilizes LSTM.).

Claim(s) 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over 
Goel et al., US Patent No. 11,194,973.

As to claim 14, Goel discloses a method comprising:
training an analysis model with a machine learning process to generate
embedding data for user queries;
(Goel col. 3 ln. 59-col. 4 ln. 2: The system may also train and operate a model that
may evaluate as the potential engagement of the potential dialog response (e.g., how well the user will react to the dialog response). Other models may also be trained and operated to evaluate the quality of a potential dialog response. Training of such models may occur using many training dialog turns and conversations between users and chatbots, where each training dialog turn includes user evaluation data (to be used as ground truth data) corresponding to the desired quality to be evaluated (e.g., coherence and/or engagement).)

receiving a user query from a user of a data management system;
(Goel col. 4 ln. 40-43: The user input may be processed (132) by a dialog management
component or other component of the system(s) 120 to determine one or more potential system
responses to the user input.;
see also col. 2 ln. 46-49: A dialog may be goal-oriented, meaning the dialog is directed to the system performing a specific action requested by a user (such as figuring out what music the system should play).)

generating, with the analysis model, embedding data for the user query by
embedding the user query as matrix in a vector space based on a sequence of words in
the user query;
(Goel col. 15 ln. 50-55: The user input data 825 may be encoded using user input encoder 820 that can output user input feature vector (FV) 827. User input feature vector 827 may correspond to a word embedding or sentence embedding using embedding techniques
discussed above.;
See also col. 10 ln. 20-24: One such technique for putting information in a form operable by a trained model, for example, is use of an encoder. Encoding is a general technique for projecting a sequence of features into a vector space.;
See also col. 16 ln. 1-3: Entity grid information may include data corresponding
to a grid representation of dialog turns and entities as a
matrix (e.g., DAs x entities).)

selecting support data for the user based on the embedding data; 
(Goel col. 15 ln. 57-59: Response feature vector 837 may correspond to a word embedding or sentence embedding using embedding techniques discussed above.;
See also col. 11 ln. 52-55: Thus the word embedding data represented in the individual values of a word embedding data vector may correspond to how the respective word is used in the corpus.;)
and
providing the support data to the user responsive to the user query
(Goel col. 4 ln. 40-43: The user input may be processed (132) by a dialog management component or other component of the system(s) 120 to determine one or more potential system
responses to the user input.;
see also col. 3 ln. 34-40: To address this concern, a system may choose a system response based on an analysis of whether the system response will be deemed by a user to be relevant and interesting as part of the dialog. Such an analysis of relevance and interest may be made at each turn to mitigate the general response problem and generally provide a more desirable user experience.).

It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to apply training the neural model to predict question-answer pairs as taught by Goel, since it was known in the art that, for response systems to perform various operations effectively, certain techniques may be employed to incorporate certain information in a form that can be considered by a trained model. Such information may include, for example, dialog data from user inputs and system responses. One such technique for putting information in a form operable by a trained model is use of an encoder. Encoding is a general technique for projecting a sequence of features into a vector space, and one goal of encoding is to project data points into a multi-dimensional vector space so that various operations can be performed on the vector combinations to determine how they (or the data they contain) relate to each other (Goel col. 10 ln. 16-27).
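To illustrate how operations on vectors in a shared space can indicate how the underlying data relate to each other, the following is a minimal sketch of selecting support data by comparing a query embedding against document embeddings. It assumes cosine similarity as the comparison, which Goel does not mandate; the document names and embedding values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def select_support(query_embedding, document_embeddings):
    """Return the name of the document whose embedding is closest to the query."""
    return max(
        document_embeddings,
        key=lambda name: cosine_similarity(query_embedding, document_embeddings[name]),
    )

# Hypothetical embeddings for two assistance documents.
document_embeddings = {
    "reset_password": [0.9, 0.1, 0.0],
    "export_report": [0.1, 0.8, 0.3],
}

# Hypothetical embedding generated for a new user query.
query_embedding = [0.8, 0.2, 0.1]

selected = select_support(query_embedding, document_embeddings)
```

In this sketch the selected document is the one projected nearest to the query in the vector space.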

Claim(s) 15-19 is/are rejected under 35 U.S.C. 103 as being unpatentable over 
Goel et al., US Patent No. 11,194,973, in view of Yuan et al., US Pub. No. 2019/0043379 A1.

As to claim 15, Goel discloses the method of claim 14, wherein training the analysis model includes:
retrieving, from an assistance document database of the data management
system, a plurality of assistance documents each including a historical user query and
an answer to the historical user query;
(Goel col. 3 ln. 3-13: Systems configured to engage in dialogs with a user may
use the dialog session identifier or other data to track the progress of the dialog to select system responses in a way that tracks the previous user-system exchanges, thus moving the dialog along in a manner that results in a desirable user experience. Systems may incorporate information such as the dialog history (which may include user inputs, system responses, or other data relevant to the dialog) in the natural language understanding (NLU) operations when interpreting user inputs so the system can select an appropriate response to what the user said.;
see also col. 14 ln. 11-18: Such an evaluator may be trained on a large annotated
corpus of example training chatbot conversations. Such a corpus may include text data corresponding to user inputs and system generated responses, as well as annotation data regarding how the human participants in such conversations engaged with the system as it provided its responses to the training conversations.;
see also col. 3 ln. 34-40: To address this concern, a system may choose a system response based on an analysis of whether the system response will be deemed by a user to be relevant and interesting as part of the dialog. Such an analysis of relevance and interest may be made at each turn to mitigate the general response problem and generally provide a more desirable user experience.)

analyzing the answers with a training model;
(Goel col. 14 ln. 18-20: Such training data may result in a better trained model for open dialog management as it consists of utterances exchanged during a conversation;
See also col. 14 ln. 30-43: The training data (which may include text corresponding to a user input, text corresponding to a system generated response the user input, annotation data, or other data relevant to the training dialog/conversation) may be used to train one or more models (such as those used by a dialog management component 265) which may be used at runtime
to evaluate the potential success (as measured by coherence and engagement) of a potential system response in a dialog;
see also col. 14 ln. 42-43: feature vector for processing by the model to evaluate one or more potential responses.)

and
generating, with the analysis model for each historical query, second topic
distribution data based on the matrix embedding data and including a distribution of
topics that are relevant to the query, the second topic distribution data matching the
first topic distribution data.
(Goel teaches various second topic distributions/scores for embedding data/potential responses see
Col. 18 ln. 7-16: The system may then use some combination of scores (e.g., comprehensible score 972, on-topic score 974, coherence score 981, interesting score 976, continue score 978, and/or engagement score 986) to determine which potential response to a user utterance should be selected. For example the system may process a coherence score 981 and engagement score 986 to arrive at an overall score, which may be a sum of those scores, some weighted combination of those scores, etc. The potential system response with the highest score may be selected.;
see also Col. 4 ln. 54-58: The system may also determine (136), using at least one second trained model, a first engagement score for the first response. The engagement score may correspond to an estimate as to how well the user will react to the dialog response;
see also col. 20 ln. 24-32: To make the loss differentiable, the system may use the output of the softmax layer 1240 (distribution of likelihood across entire vocabulary for output length k, i.e., |V|×k) and use this to do a weighted embedding lookup across the entire vocabulary to get the same D×k matrix as an input to rest of the evaluator network. The system may keep the rest of the input (context and features) for the evaluator required to predict the scores as is.)
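For illustration only, and not as part of Goel or of the record, the score-combination passage quoted above (col. 18 ln. 7-16) can be sketched as follows. The names `overall_score` and `select_response`, the weights, and the sample scores are hypothetical; the quoted text says only that the overall score may be a sum or some weighted combination and that the highest-scoring response may be selected.

```python
from typing import Dict, List, Tuple

Candidate = Tuple[str, Dict[str, float]]  # (response text, named scores)


def overall_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted combination of named scores; scores without a weight are ignored."""
    return sum(weights.get(name, 0.0) * value for name, value in scores.items())


def select_response(candidates: List[Candidate], weights: Dict[str, float]) -> str:
    """Return the candidate response with the highest overall score."""
    best_text, _ = max(candidates, key=lambda c: overall_score(c[1], weights))
    return best_text


# Hypothetical candidates and scores, chosen only to exercise the functions.
candidates = [
    ("Tell me more about that.", {"coherence": 0.5, "engagement": 0.25}),
    ("That film won three awards last year.", {"coherence": 0.75, "engagement": 0.5}),
]
weights = {"coherence": 1.0, "engagement": 1.0}  # equal weights reduce to a plain sum
best = select_response(candidates, weights)
```

With equal weights the combination reduces to the plain sum mentioned in the quoted passage; unequal weights give the "weighted combination" variant.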

Goel does not disclose:
generating, with the training model, first topic distribution data indicating, for each answer, how relevant the answer is to each of a plurality of topics; 

However, Yuan discloses:
generating, with the training model, first topic distribution data indicating, for each answer, how relevant the answer is to each of a plurality of topics; 
(Yuan [0038] The result outputted from MLP 22 is a value of P(e_i|D) that represents a probability that a particular entity e_i is relevant given the document D.;
See also [0050] According to an example embodiment herein, during inference, selection can be made of the top k entities with highest likelihood as being relevant in the document given
by the model, where, in one example embodiment, k=6 as determined by a hyper-parameter search.)
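As a hedged illustration only, the top-k selection described in the quoted Yuan [0050] can be sketched as below. The function name `top_k_entities` and the sample probabilities are hypothetical; only the default k=6 comes from the quoted passage.

```python
from typing import Dict, List


def top_k_entities(relevance: Dict[str, float], k: int = 6) -> List[str]:
    """Return up to k entities with the highest relevance probability."""
    ranked = sorted(relevance.items(), key=lambda item: item[1], reverse=True)
    return [entity for entity, _ in ranked[:k]]


# Hypothetical per-entity probabilities standing in for the model's relevance outputs.
probs = {"Paris": 0.9, "1889": 0.7, "Eiffel": 0.8, "iron": 0.2}
top2 = top_k_entities(probs, k=2)
```

If fewer than k entities are available, the sketch simply returns all of them in ranked order.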


It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to apply relevancy predictions as taught by Yuan, since it was known in the art that response systems provide for a model that can flexibly select an arbitrary number of key phrases from a document. To teach the model to assign high probability to interesting answers, the model is trained, according to an example embodiment, on human-selected answers from a large-scale, crowd-sourced question-answering dataset (SQuAD). As such, a data-driven approach to the concept of interestingness is employed, based on the premise that crowdworkers tend to select entities or events that interest them when they formulate their own comprehension questions, so that a growing collection of crowd-sourced question-answering datasets can be harnessed to learn models for key phrases of interest to human readers (Yuan [0026]).

As to claim 16, Goel discloses the method of claim 14, wherein the support data includes one or more assistance documents selected from an assistance document database
(Goel col. 3 ln. 3-13: Systems configured to engage in dialogs with a user may
use the dialog session identifier or other data to track the progress of the dialog to select system responses in a way that tracks the previous user-system exchanges, thus moving the dialog along in a manner that results in a desirable user experience. Systems may incorporate information such as the dialog history (which may include user inputs, system responses, or other data relevant to the dialog) in the natural language understanding (NLU) operations when interpreting user inputs so the system can select an appropriate response to what the user said.;
see also col. 14 ln. 11-18: Such an evaluator may be trained on a large annotated
corpus of example training chatbot conversations. Such a corpus may include text data corresponding to user inputs and system generated responses, as well as annotation data regarding how the human participants in such conversations engaged with the system as it provided its responses to the training conversations.;
see also col. 3 ln. 34-40: To address this concern, a system may choose a system response based on an analysis of whether the system response will be deemed by a user to be relevant and interesting as part of the dialog. Such an analysis of relevance and interest may be made at each turn to mitigate the general response problem and generally provide a more desirable user experience.).


As to claim 17, Goel discloses the method of claim 14, wherein providing the support data to the user includes providing the support data to the user with an automated support system
(Goel col. 4 ln. 40-43: The user input may be processed (132) by a dialog management component or other component of the system(s) 120 to determine one or more potential system
responses to the user input.;
see also col. 3 ln. 34-40: To address this concern, a system may choose a system response based on an analysis of whether the system response will be deemed by a user to be relevant and interesting as part of the dialog. Such an analysis of relevance and interest may be made at each turn to mitigate the general response problem and generally provide a more desirable user experience.).

As to claim 18, Yuan discloses under the rationale above the method of claim 14, further comprising generating, with the
analysis model based on the embedding data, topic distribution data indicating how
relevant the answer is to each of a plurality of topics (Yuan [0038] The result outputted from MLP 22 is a value of P(e_i|D) that represents a probability that a particular entity e_i is relevant given the document D.;
See also [0050] According to an example embodiment herein, during inference, selection can be made of the top k entities with highest likelihood as being relevant in the document given
by the model, where, in one example embodiment, k=6 as determined by a hyper-parameter search.).





As to claim 19, Goel discloses a method comprising:
retrieving, from an assistance document database of the data management
system, a plurality of assistance documents each including a historical user query and
an answer to the historical user query;
(Goel col. 3 ln. 3-13: Systems configured to engage in dialogs with a user may
use the dialog session identifier or other data to track the progress of the dialog to select system responses in a way that tracks the previous user-system exchanges, thus moving the dialog along in a manner that results in a desirable user experience. Systems may incorporate information such as the dialog history (which may include user inputs, system responses, or other data relevant to the dialog) in the natural language understanding (NLU) operations when interpreting user inputs so the system can select an appropriate response to what the user said.;
see also col. 14 ln. 11-18: Such an evaluator may be trained on a large annotated
corpus of example training chatbot conversations. Such a corpus may include text data corresponding to user inputs and system generated responses, as well as annotation data regarding how the human participants in such conversations engaged with the system as it provided its responses to the training conversations.;
see also col. 3 ln. 34-40: To address this concern, a system may choose a system response based on an analysis of whether the system response will be deemed by a user to be relevant and interesting as part of the dialog. Such an analysis of relevance and interest may be made at each turn to mitigate the general response problem and generally provide a more desirable user experience.)



and
generating, with an analysis model for historical each query, matrix embedding
data and second topic distribution data based on the matrix embedding data and
including a distribution of topics that are relevant to the query, the second topic
distribution data converging with the first topic distribution data
(Goel teaches various second topic distributions/scores for embedding data/potential responses see
Col. 18 ln. 7-16: The system may then use some combination of scores (e.g., comprehensible score 972, on-topic score 974, coherence score 981, interesting score 976, continue score 978, and/or engagement score 986) to determine which potential response to a user utterance should be selected. For example the system may process a coherence score 981 and engagement score 986 to arrive at an overall score, which may be a sum of those scores, some weighted combination of those scores, etc. The potential system response with the highest score may be selected.;
see also Col. 4 ln. 54-58: The system may also determine (136), using at least one second trained model, a first engagement score for the first response. The engagement score may correspond to an estimate as to how well the user will react to the dialog response;
see also col. 20 ln. 24-32: To make the loss differentiable, the system may use the output of the softmax layer 1240 (distribution of likelihood across entire vocabulary for output length k, i.e., |V|×k) and use this to do a weighted embedding lookup across the entire vocabulary to get the same D×k matrix as an input to rest of the evaluator network. The system may keep the rest of the input (context and features) for the evaluator required to predict the scores as is.).
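For illustration only, the weighted embedding lookup described in the quoted passage (col. 20 ln. 24-32) can be sketched as a matrix product: a softmax distribution over the vocabulary for each of k output positions, multiplied against an embedding table, yields a D-by-k matrix. The D-by-|V| orientation of the table, the function names, and the toy values are assumptions made for this sketch and are not taken from Goel.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax_columns(logits: Matrix) -> Matrix:
    """Column-wise softmax over a |V| x k matrix of logits."""
    n_rows, n_cols = len(logits), len(logits[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [logits[i][j] for i in range(n_rows)]
        peak = max(column)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in column]
        total = sum(exps)
        for i in range(n_rows):
            out[i][j] = exps[i] / total
    return out


def soft_embedding_lookup(embeddings: Matrix, probs: Matrix) -> Matrix:
    """Multiply a D x |V| embedding table by |V| x k probabilities, giving D x k."""
    vocab, k = len(probs), len(probs[0])
    return [
        [sum(row[i] * probs[i][j] for i in range(vocab)) for j in range(k)]
        for row in embeddings
    ]


# Toy sizes (assumed): D = 2, |V| = 3, k = 2.
embeddings = [[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]]
logits = [[0.0, 5.0], [0.0, 0.0], [0.0, 0.0]]
probs = softmax_columns(logits)
soft = soft_embedding_lookup(embeddings, probs)
```

Because every step is a smooth function of the logits, the resulting matrix is differentiable with respect to them, which is the property the quoted passage relies on.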


Goel does not disclose:
generating, with a training model, first topic distribution data indicating, for each answer, how relevant the answer is to each of a plurality of topics;

However, Yuan discloses:
generating, with a training model, first topic distribution data indicating, for each answer, how relevant the answer is to each of a plurality of topics; 
(Yuan [0038] The result outputted from MLP 22 is a value of P(e_i|D) that represents a probability that a particular entity e_i is relevant given the document D.;
See also [0050] According to an example embodiment herein, during inference, selection can be made of the top k entities with highest likelihood as being relevant in the document given
by the model, where, in one example embodiment, k=6 as determined by a hyper-parameter search.)


It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to apply relevancy predictions as taught by Yuan, since it was known in the art that response systems provide for a model that can flexibly select an arbitrary number of key phrases from a document. To teach the model to assign high probability to interesting answers, the model is trained, according to an example embodiment, on human-selected answers from a large-scale, crowd-sourced question-answering dataset (SQuAD). As such, a data-driven approach to the concept of interestingness is employed, based on the premise that crowdworkers tend to select entities or events that interest them when they formulate their own comprehension questions, so that a growing collection of crowd-sourced question-answering datasets can be harnessed to learn models for key phrases of interest to human readers (Yuan [0026]).

Claim(s) 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Goel et al., US Patent No. 11,194,973, in view of Yuan et al., US Pub. No. 2019/0043379, in view of Hewitt et al., US Pub. No. 2020/0401885 A1.


As to claim 20, Hewitt discloses under the rationale above the method of claim 19, wherein the analysis model includes one or
more of:
a long short-term memory model;
a gated recurrent unit model; and
a convolution neural network.
(Hewitt [0024] In an embodiment, SEM 126 utilizes deep learning techniques to pair problems to relevant solutions. In various embodiments, SEM 126 utilizes transferrable neural network algorithms and models (e.g., long short-term memory (LSTM), deep stacking network (DSN), deep belief network (DBN), convolutional neural networks (CNN), compound hierarchical deep models, etc.) that can be trained with supervised and/or unsupervised methods.;
See also [0024] SEM 126 utilizes gated recurrent units (GRU). GRUs simplify the training process while reducing the amount of necessary computational resources. In another embodiment, SEM 126 utilizes LSTM.).







CONTACT INFORMATION
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EVAN S ASPINWALL whose telephone number is (571)270-7723. The examiner can normally be reached Monday-Friday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Neveen Abel-Jalil, can be reached at 571-270-0474. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/Evan Aspinwall/
Primary Examiner, Art Unit 2152