DETAILED ACTION
This is responsive to the application filed 14 February 2020.
Claims 1-20 are pending and considered below.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 2-3, 9-10, 13, 15-16 and 19 are objected to because of the following informalities: claim 2, in lines 1-2, recites the limitation “identifying in the sentences slots based in part on the dialog act classifications” which should read ‘identifying, in the sentences, slots based in part on the dialog act classifications’ for clarity. Claims 9 and 15 suffer from similar deficiencies and are likewise objected to. The remaining claims are objected to for depending upon an objected to claim without providing a remedy.
Claims 7 and 20 are objected to because of the following informalities: claim 7, in line 2, recites the limitation “the next cluster label” which lacks proper antecedent basis in the claim. The limitation will be interpreted as ‘[[the]] a next cluster label’ for clarity. Claim 20 suffers from a similar deficiency and is likewise objected to.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-4, 7-11, 14-17 and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
In claims 1 and 8, the limitations receiving a set of conversations, each conversation comprising sentences; classifying each sentence into a dialog act of a plurality of dialog acts; for each set of sentences classified into a dialog act, clustering the set of sentences into clusters based on the content of the sentences, each cluster having a cluster label; and generating a language model based on the cluster labels, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. 
In claim 14, the limitations for a set of transcripts comprising sentences, grouping the sentences into dialog acts; grouping the sentences into clusters, each cluster having a cluster label; and training a language model based on the cluster labels, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. 
That is, other than reciting a “system, comprising: a memory; and one or more processors” (claim 8) nothing in the claims precludes the steps from practically being performed in the mind. For example, a person may receive a set of conversations comprising sentences; classify each sentence; for each classified set of sentences, cluster the set of sentences into clusters based on the content of the sentences, each cluster having a cluster label; and generate a language model based on the cluster labels. Or a person may group sentences into dialog acts; group the sentences into clusters, each cluster having a cluster label; and train a language model based on the cluster labels.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea. 
This judicial exception is not integrated into a practical application. In particular, the claims recite the additional elements – “system, comprising: a memory; and one or more processors” (claim 8) which are recited at a high level of generality (i.e., as generic processors performing generic computer functions) such that they amount to no more than mere instructions to apply the exception using generic computer components. 
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements when considered both individually and as an ordered combination do not amount to significantly more than the abstract idea. As stated above, the claims recite the additional limitations of a “system, comprising: a memory; and one or more processors” (claim 8). However, these are recited at a high level of generality and are recited as performing generic computer functions routinely used in computer applications (see Applicant’s specification [0060], [0062] and [0063]). Generic computer components recited as performing generic computer functions that are well-understood, routine and conventional activities amount to no more than implementing the abstract idea with a computerized system. The claims also recite the additional element “displaying the determined related sample sentence”. This limitation represents the extra-solution activity of displaying data which is a well-understood, routine and conventional activity. Thus, taken alone, the additional elements do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.
The dependent claims, when analyzed as a whole, are held to be patent ineligible under 35 U.S.C. 101 because the additional recited limitations fail to establish that the claims are not directed to an abstract idea. 
The dependent claims recite:
comprising identifying in the sentences slots based in part on the dialog act classifications, the slots used as input to a conversational bot creation process;
comprising identifying for a slot a set of possible slot values;
wherein the language model comprises a neural network; and
wherein the language model is to take as input a series of cluster labels and predict the next cluster label.
The additional recited limitations further narrow the steps of the independent claims without however providing “a practical application of” or "significantly more than" the underlying “Mental Processes” abstract idea. Therefore, the dependent claims are also not patent eligible.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 8 and 14 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Duta (USPN 8,515,736).
Claim 1:
Duta discloses a method for creating input data to be used to train a conversational bot the method comprising: 
receiving a set of conversations, each conversation comprising sentences; classifying each sentence into a dialog act of a plurality of dialog acts (“The training manager labels sentences from the database of sentences using the multiple different text classifiers, wherein each of the multiple different text classifiers produces a respective label and classification pairing for each respective sentence from the database of sentences. The training manager also labels sentences from the first set of sentences using the multiple different text classifiers so that each of the multiple different text classifiers produces a respective label and classification pairing for each respective sentence from the first set of sentences”, col. 14, lines 11-37, see also “general conversation data”, col. 9, lines 8-11); 
for each set of sentences classified into a dialog act, clustering the set of sentences into clusters based on the content of the sentences, each cluster having a cluster label (“The training manager computes a semantic similarity for each labeled sentence from the first group of sentences as compared to each labeled sentence from the database of sentences. For each labeled sentence from the first set of sentences, the training manager identifies labeled sentences from the database of sentences that meet a predetermined measure of similarity, such as by reference a specific threshold”, col. 14, lines 11-37, see also “a clustering algorithm can be a hierarchical "complete linkage" algorithm, which aims at minimizing a cluster diameter, that is, maximizing the semantic similarity between sample utterances in a same cluster. The clustering process can be stopped when the intra-cluster similarity reaches a lower-bound threshold. Each cluster can be labeled by a most frequent label of clustered utterances according to any included database routers”, col. 4, lines 32-44); and 
generating a language model based on the cluster labels (“The training manager can then use the identified labeled sentences from the database of sentences and the first set of sentences to train a statistical language model according to the first set of semantic labels”, col. 14, lines 11-37).
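For illustration only, the clustering passage of Duta quoted above (col. 4, lines 32-44) can be sketched as follows. This is a minimal sketch under stated assumptions, not Duta's implementation: the token-overlap (Jaccard) similarity, the function names and the sample utterances are all hypothetical, and the threshold value is arbitrary. It shows hierarchical complete-linkage merging that stops at a lower-bound similarity threshold, with each cluster labeled by the most frequent label of its utterances.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Token-overlap similarity (an assumed stand-in for semantic similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def complete_linkage(utterances, min_similarity):
    """Merge clusters until the best merge would fall below min_similarity."""
    clusters = [[i] for i in range(len(utterances))]

    def link(c1, c2):
        # Complete linkage: cluster similarity is the WORST pairwise similarity,
        # so a merge is only allowed when every pair is similar enough.
        return min(jaccard(utterances[i], utterances[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: link(clusters[p[0]], clusters[p[1]]),
        )
        if link(clusters[a], clusters[b]) < min_similarity:
            break  # lower-bound threshold reached, stop clustering
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


def label_clusters(clusters, labels):
    """Label each cluster by the most frequent label among its members."""
    return [Counter(labels[i] for i in c).most_common(1)[0][0] for c in clusters]


# Hypothetical sample data, not taken from any reference of record.
utterances = [
    "check my balance",
    "check my account balance",
    "reset my password",
    "reset password",
]
labels = ["balance", "balance", "password", "password"]
clusters = complete_linkage(utterances, min_similarity=0.5)
cluster_labels = label_clusters(clusters, labels)
```

With the sample data, the two balance utterances and the two password utterances end up in separate clusters, because merging them would drop the minimum pairwise similarity below the threshold.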
Claim 8:
Duta discloses a system for creating input data to be used to train a conversational bot, the system, comprising: a memory; and one or more processors (col. 6, lines 3-49) configured to perform the steps of process claim 1 as shown above.
Claim 14:
Duta discloses a method for generating data to be used to generate an automatic conversational bot, the method comprising: 
for a set of transcripts comprising sentences, grouping the sentences into dialog acts (“The training manager labels sentences from the database of sentences using the multiple different text classifiers, wherein each of the multiple different text classifiers produces a respective label and classification pairing for each respective sentence from the database of sentences. The training manager also labels sentences from the first set of sentences using the multiple different text classifiers so that each of the multiple different text classifiers produces a respective label and classification pairing for each respective sentence from the first set of sentences”, col. 14, lines 11-37, see also “general conversation data”, col. 9, lines 8-11); 
grouping the sentences into clusters, each cluster having a cluster label (“The training manager computes a semantic similarity for each labeled sentence from the first group of sentences as compared to each labeled sentence from the database of sentences. For each labeled sentence from the first set of sentences, the training manager identifies labeled sentences from the database of sentences that meet a predetermined measure of similarity, such as by reference a specific threshold”, col. 14, lines 11-37, see also “a clustering algorithm can be a hierarchical "complete linkage" algorithm, which aims at minimizing a cluster diameter, that is, maximizing the semantic similarity between sample utterances in a same cluster. The clustering process can be stopped when the intra-cluster similarity reaches a lower-bound threshold. Each cluster can be labeled by a most frequent label of clustered utterances according to any included database routers”, col. 4, lines 32-44); and 
training a language model based on the cluster labels (“The training manager can then use the identified labeled sentences from the database of sentences and the first set of sentences to train a statistical language model according to the first set of semantic labels”, col. 14, lines 11-37).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-3, 6, 9-10, 13, 15-16 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Duta (USPN 8,515,736) in view of Dobrynin et al. (US PGPub 2019/0205322).
Claim 2:
Duta discloses the method of claim 1, but does not explicitly disclose identifying, in the sentences, slots based in part on the dialog act classifications, the slots used as input to a conversational bot creation process.
In a similar system using sentences to generate a language model, Dobrynin discloses identifying, in the sentences (sentences in “a plurality of electronic documents”, see also [0082]), slots (parameter fields) based in part on the dialog act classifications (“the synthetic document generating component 520 can obtain (e.g., extract, retrieve) a plurality of command templates or at least portions thereof (e.g., with or without parameter fields) based on the action datasets stored by digital assistant server 302. For each of the command templates, the synthetic document generating component 520 can generate a query including at least a portion of the command template, and submit the query to a search engine or remote data repository to retrieve a corresponding search result. The search result for the query can include a plurality of electronic documents, such as content from webpages, social media feeds, PDFs, Internet forum entries, and the like”, [0081]), the slots used as input to a conversational bot creation process (“The digital assistant server 302 can also include a language model generating component 310, which can be employed to efficiently generate an intelligent language model specific to a desired language space, such as commands received from a digital assistant device”, [0065]).
It would have been obvious to one with ordinary skill in the art before the effective date of the claimed invention to have combined the references to yield the predictable result of identifying, in Duta’s sentences, slots based in part on the dialog act classifications, the slots used as input to a conversational bot creation process, in order to retrieve sentences relevant to a command template or terms included therein (see Dobrynin, [0081]).
Claim 3:
Duta in view of Dobrynin discloses the method of claim 2 comprising identifying for a slot a set of possible slot values (Dobrynin, [0065] and [0081]).
Claim 6:
Duta in view of Dobrynin discloses the method of claim 2, comprising training a conversational bot using the language model, clusters and slots (Duta, col. 14, lines 11-37, note that since the combination teaches the language model being based on clusters and slots, training using the language model necessarily implies training using the clusters and slots).
Claims 9-10 and 13:
Duta in view of Dobrynin discloses the system of claim 8, wherein the one or more processors are configured to perform the steps of process claims 2-3 and 6 as shown above.
Claim 15:
Duta discloses the method of claim 14, but does not explicitly disclose identifying, in the sentences, slots based in part on the dialog act classifications, the slots used as input to a conversational bot creation process.
In a similar system using sentences to generate a language model, Dobrynin discloses identifying, in the sentences (sentences in “a plurality of electronic documents”, see also [0082]), slots (parameter fields) based in part on the dialog act classifications (“the synthetic document generating component 520 can obtain (e.g., extract, retrieve) a plurality of command templates or at least portions thereof (e.g., with or without parameter fields) based on the action datasets stored by digital assistant server 302. For each of the command templates, the synthetic document generating component 520 can generate a query including at least a portion of the command template, and submit the query to a search engine or remote data repository to retrieve a corresponding search result. The search result for the query can include a plurality of electronic documents, such as content from webpages, social media feeds, PDFs, Internet forum entries, and the like”, [0081]), the slots used as input to a conversational bot creation process (“The digital assistant server 302 can also include a language model generating component 310, which can be employed to efficiently generate an intelligent language model specific to a desired language space, such as commands received from a digital assistant device”, [0065]).
It would have been obvious to one with ordinary skill in the art before the effective date of the claimed invention to have combined the references to yield the predictable result of identifying, in Duta’s sentences, slots based in part on the dialog act classifications, the slots used as input to a conversational bot creation process, in order to retrieve sentences relevant to a command template or terms included therein (see Dobrynin, [0081]).
Claim 16:
Duta in view of Dobrynin discloses the method of claim 15 comprising identifying for a slot a set of possible slot values (Dobrynin, [0065] and [0081]).
Claim 19:
Duta in view of Dobrynin discloses the method of claim 15, comprising training a conversational bot using the language model, clusters and slots (Duta, col. 14, lines 11-37, note that since the combination teaches the language model being based on clusters and slots, training using the language model necessarily implies training using the clusters and slots).

Claims 4-5, 7, 11-12, 17-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Duta (USPN 8,515,736) in view of Magliozzi.
Claim 4:
Duta discloses the method of claim 1 but does not explicitly disclose wherein the language model comprises a neural network.
In a system similarly generating a language model, Magliozzi discloses wherein the language model comprises a neural network (“the chatbot generates a language model using NLP techniques based on the first set of data. In one instance, the first set of data can be used as initial training data to train a neural network. Some non-limiting examples of language models that can be generated include unigram models, n-gram models, exponential language models, neural language models”, [0045]).
It would have been obvious to one with ordinary skill in the art before the effective date of the claimed invention to have combined the references to yield the predictable result of Duta’s language model comprising a neural network because such a model is a well-known standard as evidenced by the list of models listed by Magliozzi (see [0045]).
Claim 5:
Duta discloses the method of claim 1 but does not explicitly disclose training a conversational bot using the language model.
In a system similarly generating a language model, Magliozzi discloses training a conversational bot using the language model (“The language model that is generated based on the first set of data (e.g., knowledge seed) forms the foundational framework based on which the chatbot is initially trained”, [0045]).
It would have been obvious to one with ordinary skill in the art before the effective date of the claimed invention to have combined the references to yield the predictable result of training a conversational bot using Duta’s language model in order to implement machine learning to train the bot (see Magliozzi, [0028]).
Claim 7:
Duta discloses the method of claim 1, but does not explicitly disclose wherein the language model is to take as input a series of cluster labels and predict the next cluster label.
In a system similarly generating a language model, Magliozzi discloses wherein the language model is to take as input a series of cluster labels and predict the next cluster label (“Some non-limiting examples of language models that can be generated include unigram models, n-gram models, exponential language models, neural language models”, [0045], note that n-grams predict the nth term based on the (n-1) prior terms).
It would have been obvious to one with ordinary skill in the art before the effective date of the claimed invention to have combined the references to yield the predictable result of Duta’s language model taking as input a series of cluster labels and predicting the next cluster label (e.g. n-gram) because n-grams are well-known standards for language modeling (see Magliozzi, [0045]).
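For illustration only, the note that n-grams predict the nth term based on the (n-1) prior terms can be sketched over sequences of cluster labels as follows. This is a minimal count-based sketch under stated assumptions, not the method of any reference of record: the function names and the sample label sequences are hypothetical.

```python
from collections import Counter, defaultdict


def train_ngram(sequences, n=2):
    """Count how often each label follows each (n-1)-label context."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            context = tuple(seq[i:i + n - 1])
            counts[context][seq[i + n - 1]] += 1
    return counts


def predict_next(counts, history, n=2):
    """Return the most frequent label after the last (n-1) labels, or None."""
    context = tuple(history[-(n - 1):]) if n > 1 else ()
    if context not in counts:
        return None  # context never seen in training
    return counts[context].most_common(1)[0][0]


# Hypothetical conversations, each reduced to its series of cluster labels.
sequences = [
    ["greeting", "ask_balance", "give_balance", "closing"],
    ["greeting", "ask_balance", "give_balance", "thanks", "closing"],
    ["greeting", "reset_password", "confirm", "closing"],
]
model = train_ngram(sequences, n=2)
next_label = predict_next(model, ["greeting", "ask_balance"], n=2)
```

With n=2 the prediction depends only on the single preceding label; a larger n conditions on a longer series of prior labels at the cost of sparser counts.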
Claims 11-12:
Duta in view of Magliozzi discloses the system of claim 8, wherein the one or more processors are configured to perform the steps of process claims 4-5 as shown above.
Claim 17:
Duta discloses the method of claim 14 but does not explicitly disclose wherein the language model comprises a neural network.
In a system similarly generating a language model, Magliozzi discloses wherein the language model comprises a neural network (“the chatbot generates a language model using NLP techniques based on the first set of data. In one instance, the first set of data can be used as initial training data to train a neural network. Some non-limiting examples of language models that can be generated include unigram models, n-gram models, exponential language models, neural language models”, [0045]).
It would have been obvious to one with ordinary skill in the art before the effective date of the claimed invention to have combined the references to yield the predictable result of Duta’s language model comprising a neural network because such a model is a well-known standard as evidenced by the list of models listed by Magliozzi (see [0045]).
Claim 18:
Duta discloses the method of claim 14 but does not explicitly disclose training a conversational bot using the language model.
In a system similarly generating a language model, Magliozzi discloses training a conversational bot using the language model (“The language model that is generated based on the first set of data (e.g., knowledge seed) forms the foundational framework based on which the chatbot is initially trained”, [0045]).
It would have been obvious to one with ordinary skill in the art before the effective date of the claimed invention to have combined the references to yield the predictable result of training a conversational bot using Duta’s language model in order to implement machine learning to train the bot (see Magliozzi, [0028]).
Claim 20:
Duta discloses the method of claim 14, but does not explicitly disclose wherein the language model is to take as input a series of cluster labels and predict the next cluster label.
In a system similarly generating a language model, Magliozzi discloses wherein the language model is to take as input a series of cluster labels and predict the next cluster label (“Some non-limiting examples of language models that can be generated include unigram models, n-gram models, exponential language models, neural language models”, [0045], note that n-grams predict the nth term based on the (n-1) prior terms).
It would have been obvious to one with ordinary skill in the art before the effective date of the claimed invention to have combined the references to yield the predictable result of Duta’s language model taking as input a series of cluster labels and predicting the next cluster label (e.g. n-gram) because n-grams are well-known standards for language modeling (see Magliozzi, [0045]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Balchandran et al. (US PGPub 2007/0156392) discloses, during development of an NLU application, a developer categorizes sentences of the NLU application and associates each sentence with targets that represent a correct interpretation of the sentence. The association is performed during a classification process which results in the automatic training of language models. The language models learn multiple associations between sentences and targets for correctly responding to a language input request. This allows multiple pieces of information to be provided by the language models when responding to language input requests. For example, at runtime, a language input request (e.g. a spoken utterance) is passed through an optimal configuration specified by the language model representation and a set of targets are identified. The language models identify targets with the highest corresponding interpretation accuracy. The language model representation can identify an action for responding to the language input request. The automatic process can reduce the time to generate the language models in comparison to manually generating the language models.
Rao et al. (US PGPub 2014/0358539) discloses performing categorized sentence mining in acquired data samples to obtain categorized training samples for multiple categories; building a text classifier based on the categorized training samples; classifying the data samples using the text classifier to obtain a class vocabulary and a corpus for each category; mining the corpus for each category according to the class vocabulary for the category to obtain a respective set of high-frequency language templates; training on the templates for each category to obtain a template-based language model for the category; training on the corpus for each category to obtain a class-based language model for the category; training on the class vocabulary for each category to obtain a lexicon-based language model for the category; building a speech decoder according to an acoustic model, the class-based language model and the lexicon-based language model for any given field, and the data samples.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAMUEL G NEWAY whose telephone number is (571)270-1058. The examiner can normally be reached Monday-Friday 9:00am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SAMUEL G NEWAY/            Primary Examiner, Art Unit 2657