DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The following title is suggested: Adapting Dialog Models by Relevance Value for Concepts to Complete a Task.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5, 7 to 10, 14, and 16 to 20 are rejected under 35 U.S.C. 103 as being unpatentable over Larson et al. (U.S. Patent No. 10,679,150) in view of Farhady Ghalaty et al. (U.S. Patent No. 10,867,245).
Concerning independent claims 1, 10, and 19, Larson et al. discloses a method, apparatus, and computer instructions for automatically configuring training data for a dialogue system, comprising:
e.g., Income, Balance, Spending, Investment, Location, etc. (column 7, lines 32 to 39: Figure 1); slot identification engine 130 functions to implement one or more machine learning models to identify slots or meaningful segments (“identify one or more concepts”) of user queries or user commands, and to assign a slot classification label for each identified slot (column 8, lines 22 to 35: Figure 1); a corpora of raw machine learning training data from one or more sources of utterance examples define a scope for sourcing and generating suitable training data for a given intent classification task; a corpora of raw machine learning training data includes a plurality of distinct corpora of machine learning data; sourcing machine learning training data includes defining a set of example utterances for each of a plurality of intent classification tasks (“information on completing tasks”) (column 11, line 8 to column 12, line 28: Figure 2: Step 210); broadly, a training corpus can be construed as “a document”; Compare Specification, ¶[0021], describing a document as a text side of a user/dialog system interaction, and ¶[0025], describing a document corpus 204 as chat logs in a dialog; 

“removing the utterances from a dialog model to be used for completing the task when the [relevance] value of the utterance is below a given threshold value” – in one embodiment, a plurality of seed training samples comprise a plurality of example utterances and/or prompts for a specific dialog intent of a machine learning-based automated dialogue system (column 3, lines 31 to 35); if one or more metrics of the corpora of raw machine learning training data do not satisfy one or more training data quality threshold, e.g., a minimum coverage threshold, a minimum diversity threshold, etc. (“when the [relevance] value of the utterance is below a given threshold value”), or if one or more performance metrics of the one or more machine learning algorithms trained using the corpora of training data do not satisfy performance metrics, e.g., accuracy metrics (“when the [relevance] value of the utterance is below a given 
Concerning independent claims 1, 10, and 19, the only element that is not expressly disclosed by Larson et al. is “computing a relevance value of an utterance with respect to completing the task using the one or more identified concepts”.  That is, Larson et al. compares quality metrics, diversity metrics, performance metrics, and coverage metrics to thresholds to determine if training data representing utterances should be removed, but does not expressly disclose “a relevance value of an utterance . . . using the one or more identified concepts”.  Still, Larson et al. discloses using a dialog model to complete a task using identified concepts, even if it does not expressly disclose “a relevance value”.

Farhady Ghalaty et al. teaches a system and method for facilitating training of a prediction model, where a determination via a relevancy model, based on training data, of whether a feature type satisfies a first condition may be made, where a first condition may relate to whether the feature type has a threshold amount of influence on the prediction model.  (Abstract)  The feature types represented by a dataset may relate to a particular field used to train a particular type of prediction model.  A dataset may be used to train a prediction model for pricing and/or selecting an automobile or for approving/disapproving a loan.  (Column 3, Lines 49 to 60)  A relevancy model may be used to determine how relevant one or more feature types are to the output of the prediction model.  If the relevancy model indicates that the datasets used to generate the training data include a feature type that has an amount of influence equal to or greater than a threshold amount of influence, then the training data may be updated.  (Column 4, Lines 12 to 34)  The relevancy model may compute a relevancy score for a feature type, and if the relevancy score exceeds the relevancy threshold score, the feature type satisfies the condition.  (Column 5, Lines 49 to 54)  Farhady Ghalaty et al., then, teaches a model for ‘tasks’ including pricing/selecting an automobile or approving/disapproving a loan, and “computing a relevancy value” to determine if training data should be included to train a model by comparing a relevancy score to a threshold.  Implicitly, if training data does not satisfy a threshold, then this might be understood to imply that it is not to be used for training, which appears equivalent to “removing” that training data from a model.  An objective is to determine whether feature types included in training data influence or impact results of the prediction model.  (Column 1, Lines 15 to 30)  It would have been obvious to one having ordinary skill in the art to compute a relevancy score and compare it to a threshold as taught by Farhady Ghalaty et al. to train a model in a dialogue system that removes training samples that do not meet a threshold in Larson et al. for a purpose of determining whether feature types included in training data influence or impact results.

Concerning claims 5 and 14, Larson et al. discloses that a corpus of training data includes sentences, words, and/or phrases, where training data is converted to vector values.  (Column 14, Line 61 to Column 15, Line 54: Figure 2)  Here, sentences, words, and phrases are “linguistic constituents”.
Concerning claims 7 to 8 and 16 to 17, Larson et al. discloses that slot identification engine 130 may function to decompose a query or command into defined, essential components that implicate meaningful information to be used when generating a response (column 8, lines 49 to 53: Figure 1); example utterances and/or prompts (seed samples) are defined for sourcing raw machine learning training data for each of a plurality of intent classification tasks (column 12, lines 13 to 28: Figure 2); queries/commands are removed that do not satisfy data quality thresholds or performance thresholds (column 14, lines 21 to 36: Figure 2).  Larson et al., then, broadly determines that an utterance for a training sample is “non-essential” when it is below a threshold, and “essential” when it is above a threshold, as training data that is retained is implicitly “essential” and training data that is removed is implicitly “non-essential”.  Similarly, Farhady Ghalaty et al. teaches that a prediction model is associated with a “task”, e.g., pricing/selecting an automobile or approving/disapproving 
Concerning claims 9, 18, and 20, Farhady Ghalaty et al. teaches that a prediction model is associated with a “task”, e.g., pricing/selecting an automobile or approving/disapproving a loan.  (Column 3, Lines 49 to 63)  A relevancy model may determine how relevant a feature type is to the output of the prediction model.  (Column 4, Lines 12 to 32)  If a relevancy score for a feature type exceeds a relevancy threshold score, then a feature type satisfies a condition.  (Column 5, Lines 49 to 54)  This relevancy score (“relevance value”) “is a measure of whether or not the utterance contains information relevant to completing the task.”  
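By way of illustration only, the comparison of a relevance value to a threshold discussed above can be sketched as follows.  This is a minimal sketch under stated assumptions and is not taken from either reference or from the application: the function names, the concept list, the sample utterances, and the particular relevance measure (the fraction of task concepts that appear in an utterance) are all hypothetical.

```python
def relevance_value(utterance, task_concepts):
    """Hypothetical measure: fraction of task concepts found in the utterance."""
    if not task_concepts:
        return 0.0
    tokens = set(utterance.lower().split())
    hits = sum(1 for concept in task_concepts if concept.lower() in tokens)
    return hits / len(task_concepts)


def prune_dialog_model(utterances, task_concepts, threshold):
    """Retain only utterances whose relevance value meets the threshold."""
    return [u for u in utterances
            if relevance_value(u, task_concepts) >= threshold]


# Assumed concepts for an illustrative banking task.
concepts = ["balance", "account", "transfer"]
utterances = [
    "what is my account balance",
    "nice weather today",
    "transfer money to my savings account",
]
kept = prune_dialog_model(utterances, concepts, threshold=0.5)
```

Under these assumptions, the second utterance mentions none of the task concepts and would be the one removed, while the other two are retained.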

Claims 2 to 3 and 11 to 12 are rejected under 35 U.S.C. 103 as being unpatentable over Larson et al. (U.S. Patent No. 10,679,150) in view of Farhady Ghalaty et al. (U.S. Patent No. 10,867,245) as applied to claims 1 and 10 above, and further in view of Pyati (U.S. Patent Publication 2019/0311301).
Larson et al. discloses a variety of ways to perform machine learning.  (Column 9, Lines 3 to 54)  However, Larson et al. does not disclose “identifying the one or more concepts from the document further comprises a graph-based decomposition process” and “wherein computing the relevance value comprises determining a measure of centrality associated with a text graph”.  Pyati, however, teaches a dynamically generated hierarchical structure, e.g., tree, weighted graph (“graph-based”), etc., based on a similarity measure.  Divisive hierarchical clustering involves splitting or decomposing (“a decomposition process”) ‘central’ nodes of the hierarchical structure (‘a graph’) where the measure of ‘centrality’ can be based on ‘degree’ centrality, e.g., a node having the greatest number of edges incident on the node or the greatest number of edges to and/or from the node, ‘betweenness’ centrality, e.g., a node operating the greatest number of times as a bridge along the shortest path between two other nodes, ‘closeness’ centrality, e.g., a node having the minimum average length of the shortest path between the node and all other nodes of the graph, eigenvalue centrality, percolation centrality, cross-clique centrality, etc.  (¶[0063])  Pyati, then, teaches identifying ‘concepts’ using “a graph-based decomposition process” and “determining a measure of centrality associated with a text graph.”  An objective is to provide real-time machine learning that enables enterprises to make sense of voluminous amounts of data.  (¶[0002])  It would have been obvious to one having ordinary skill in the art to use a graph-based decomposition and a measure of centrality to identify features and relationships in text as taught by Pyati to train a machine learning model of Larson et al. for a purpose of enabling enterprises to make sense of voluminous amounts of data.
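For illustration only, two of the centrality measures named above (‘degree’ and ‘closeness’) can be computed on a small undirected text graph as sketched below.  The graph, its node labels, and the function names are assumptions made for this example and do not come from Pyati; the sketch also assumes a connected graph with at least two nodes.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Number of incident edges, normalized by the maximum possible (n - 1)."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Inverse of the average shortest-path length to all reachable nodes."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives shortest paths in hops
            current = queue.popleft()
            for nbr in graph[current]:
                if nbr not in dist:
                    dist[nbr] = dist[current] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical term co-occurrence edges for a loan-related document.
edges = [("loan", "rate"), ("loan", "approval"),
         ("loan", "income"), ("income", "balance")]
graph = build_graph(edges)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
most_central = max(degree, key=degree.get)
```

In this assumed graph, the node "loan" has the highest degree and closeness centrality, which is the sense in which a term may be treated as ‘central’ to a text graph.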

Claims 4 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Larson et al. (U.S. Patent No. 10,679,150) in view of Farhady Ghalaty et al. (U.S. Patent No. 10,867,245) as applied to claims 1 and 10 above, and further in view of Srinivasan et al. (U.S. Patent Publication 2018/0060287).
Larson et al. omits identifying concepts from a document by computation of one or more of a term frequency, an inverse document frequency, and a term frequency-inverse document frequency.  However, term frequency-inverse document frequency is known in the art of natural language processing as one of the most common ways of determining a significance of a term in a set of documents.  Specifically, Srinivasan et al. teaches a relevance score used by a content retrieval engine 208 to identify and return relevant content.  A term frequency/inverse document frequency (TF/IDF) algorithm can be utilized to identify relevant content 210 from a content repository.  TF is a statistical measurement of how often a term appears within a given document, e.g., the more often a term appears the more relevant a document is.  IDF is a statistical measurement of how often a term appears across the index of documents, e.g., the more often a term appears the less relevant it becomes as terms that appear in many documents will have a lower weight than those with uncommon terms.  A relevance score can identify to what degree content is relevant to an input query 206.  (¶[0031] -- ¶[0032])  An objective is to expand content delivered to authoring users to enable efficient creation of new or expanded content.  (¶[0002])  It would have been obvious to one having ordinary skill in the art to identify relevant concepts from a document by computation of a term frequency-inverse document frequency as taught by Srinivasan et al. to train a machine learning model of Larson et al. for a purpose of returning relevant content according to a relevance score.     
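For illustration only, the TF and IDF measurements described above can be sketched as follows.  The sample documents and the function name are hypothetical, and the sketch uses one common unsmoothed variant (TF as a term's share of a document's tokens, IDF as the natural logarithm of the number of documents divided by the number of documents containing the term); Srinivasan et al. is not relied upon for this particular formula.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} map per document (unsmoothed TF-IDF variant)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical, non-empty documents.
docs = [
    "transfer funds to savings account",
    "check account balance",
    "open a new account",
]
scores = tf_idf(docs)
```

In this assumed corpus, "account" appears in every document and so receives a score of zero, whereas "balance" appears in only one document and receives a positive score, consistent with the weighting described above.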

Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Larson et al. (U.S. Patent No. 10,679,150) in view of Farhady Ghalaty et al. (U.S. Patent No. 10,867,245) as applied to claims 1 and 10 above, and further in view of Acero et al. (U.S. Patent Publication 2004/0148154).
Larson et al. discloses various machine learning models including support vector machines.  (Column 9, Lines 26 to 27)  However, Larson et al. omits computing a relevance score by computing one or more of a maximum marginal relevance value, an integer linear programming value, and a sequential minimal optimization value.  Still, these techniques all appear to be known in the art of natural language processing.  Generally, Acero et al. teaches statistical classifiers for spoken language understanding performing task classification on natural language inputs.  (Abstract)  Specifically, one embodiment provides sequential minimal optimization as a fast method to train support vector machines.  (¶[0065])  One technique for performing training of support vector machines uses sequential minimal optimization.  (¶[0067])  An objective is to improve machine learning by identifying anomalous instances of training data in a corpus of machine learning training data based on identified statistical characteristics of the corpus.  (Abstract)  It would have been obvious to one having ordinary skill in the art to use sequential minimal optimization for training a support vector machine as taught by Acero et al. in training of a machine learning model in Larson et al. for a purpose of identifying anomalous instances of training data in a corpus.

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicants’ disclosure.
Jheeta, McAteer et al., Aït-Mokhtar et al., and Maheshwari et al. disclose related prior art. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608.  The examiner can normally be reached Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571) 272-5551.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center.  Unpublished application information in Patent Center is available to registered users.  To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.  Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.  For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 





/MARTIN LERNER/
Primary Examiner, Art Unit 2657
February 18, 2022