Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Priority
Acknowledgement is made of applicant’s claim for domestic priority based on provisional application 63/048532 filed on 07/06/2020.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
(a) NOVELTY; PRIOR ART.—A person shall be entitled to a patent unless— 
(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention; or 
(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention. 

(b) EXCEPTIONS.— 
(1) DISCLOSURES MADE 1 YEAR OR LESS BEFORE THE EFFECTIVE FILING DATE OF THE CLAIMED INVENTION.—A disclosure made 1 year or less before the effective filing date of a claimed invention shall not be prior art to the claimed invention under subsection (a)(1) if— 
(A) the disclosure was made by the inventor or joint inventor or by another who obtained the subject matter disclosed directly or indirectly from the inventor or a joint inventor; or 
(B) the subject matter disclosed had, before such disclosure, been publicly disclosed by the inventor or a joint inventor or another who obtained the subject matter disclosed directly or indirectly from the inventor or a joint inventor. 
(2) DISCLOSURES APPEARING IN APPLICATIONS AND PATENTS.—A disclosure shall not be prior art to a claimed invention under subsection (a)(2) if— 
(A) the subject matter disclosed was obtained directly or indirectly from the inventor or a joint inventor;
(B) the subject matter disclosed had, before such subject matter was effectively filed under subsection (a)(2), been publicly disclosed by the inventor or a joint inventor or another who obtained the subject matter disclosed directly or indirectly from the inventor or a joint inventor; or
(C) the subject matter disclosed and the claimed invention, not later than the effective filing date of the claimed invention, were owned by the same person or subject to an obligation of assignment to the same person.

Claims 1-2, 8-9, and 15-16 are rejected under 35 USC 102(a)(1)-(a)(2) as being anticipated by Hakkani-Tur et al. (US 2015/0227845 A1).
Regarding Claims 1, 8, and 15, Hakkani-Tur discloses an apparatus comprising: at least one processor (Fig. 12, computing functionality 1202 comprising CPU 1204 and ¶122, implementation of computer system 102) configured to: 
apply a natural language understanding (NLU) model to an input utterance in order to obtain initial slot probability distributions (¶36 and ¶68, apply a preliminary intent labeling system to linguistic items (¶32, each linguistic item corresponds to an utterance made up of one or more words) to determine an intent of each linguistic item with a level of confidence satisfying an application specific threshold, yielding a first set 202 of linguistic items having known intents and a second set 204 of linguistic items without known intents (i.e., confidence not meeting the threshold)); 
perform a confidence calibration through application of a calibration probability distribution to the initial slot probability distributions in order to generate calibrated slot probability distributions (¶¶34-35 and ¶43, for a linguistic item lacking a known intent, use a generative model to infer the intent, where the model treats user intent as a probabilistic distribution over K possible semantic intent classes; ¶¶88-89, generative model 702 specifies, for the first set 202, intents deterministically assigned based on the known label and specifies, for the second set 204, an intent drawn from a multinomial distribution of intents associated with hyper-parameters; ¶¶107-110, determine the unknown intent via a variational Bayesian method by generating a fully factored variational distribution q, which is an approximation of the posterior distribution p), the calibration probability distribution having a higher number of dimensions than the initial slot probability distributions (the dimensions of q and p being greater than the one-dimensional confidence value of the L-A model for unknown intents; see ¶109, equation (2) for q based on posterior distribution p; in view of ¶92 for p, equation (1) for p (Id = k | c, W, I-d, α, β1, β2); the intent is drawn from (1) a multinomial distribution of intents (θ) associated with hyper-parameter α per ¶88, (2) a multinomial distribution Φ1Id associated with hyper-parameter β1 per ¶90, and (3) a multinomial distribution Φ2Id associated with hyper-parameter β2 per ¶90); 
identify uncertainties associated with words in the input utterance based on the calibrated slot probability distributions (¶110, determine posterior distribution p and the variational distribution q at each iteration); and 
identify a new concept contained in the input utterance that is not recognized by the NLU model based on the identified uncertainties (¶107 and ¶110, reduce the divergence between p and q until it reaches a convergence target to determine the unknown (latent) intent variables; in view of ¶91, ¶¶96-97 and Fig. 1, using generative model 802 to discover new intents such as the new “play movie” intent, for which the L-A model was unable to determine an intent with a level of confidence satisfying the application specific threshold per ¶68).
Regarding Claim 15, Hakkani-Tur discloses a non-transitory computer readable medium containing instructions that when executed cause at least one processor to perform the steps of claims 1 and 8 (¶124 and Fig. 12, storage resources 1206 storing instructions for processing device / CPU 1204 to carry out).
Regarding Claims 2, 9, and 16, Hakkani-Tur discloses wherein the at least one processor is further configured to: identify a word or phrase from the input utterance associated with the new concept (¶40 and ¶91, and Fig. 1, “play movie” is a new intent not represented by the original knowledge graph); and label the identified word or phrase with an unknown slot type (¶104, identify the presence of an entity name in an input linguistic item and replace the entity name with the entity type associated with the name; ¶105, e.g., “Tell me about Tom Cruise in Mission Impossible” -> “Tell me about [Actor] in [Movie Name]”; ¶107, determine the unknown (latent) intent variables).  
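For illustration only, and not as a characterization of how Hakkani-Tur implements ¶¶107-110, the notion of a divergence between a variational distribution q and a posterior distribution p that shrinks toward a convergence target can be sketched as follows. The sketch assumes the divergence is a Kullback-Leibler divergence over discrete intent classes, which is the usual choice in variational Bayesian methods but is not stated in the citations above; the function name and the example distributions are hypothetical.

```python
import math

def kl_divergence(q, p):
    """Return KL(q || p) in nats for two discrete distributions on the same support.

    Assumption: the "divergence" in the cited passages is taken here to be a
    KL divergence; the cited reference may use a different measure.
    """
    if len(q) != len(p):
        raise ValueError("q and p must have the same number of classes")
    total = 0.0
    for qi, pi in zip(q, p):
        if qi > 0.0:
            if pi <= 0.0:
                return math.inf  # q puts mass where p has none
            total += qi * math.log(qi / pi)
    return total

# Hypothetical posterior over K = 3 intent classes and two candidate approximations.
p = [0.7, 0.2, 0.1]
q_far = [1 / 3, 1 / 3, 1 / 3]   # uninformed starting point
q_near = [0.68, 0.21, 0.11]     # after adjusting the variational parameters

divergence_far = kl_divergence(q_far, p)
divergence_near = kl_divergence(q_near, p)
```

In this sketch an iterative procedure would keep adjusting q and stop once the computed divergence falls below a prescribed target.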
Claim Rejections - 35 USC § 103
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 103 that form the basis for the rejections under this section made in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3-5, 10-12, and 17-19 are rejected under 35 USC 103 for being unpatentable over Hakkani-Tur et al. (US 2015/0227845 A1) in view of Pitschel et al. (US 9922642 B2).
Regarding Claims 3-4, 10-11, and 17-18, Hakkani-Tur does not disclose wherein the at least one processor is further configured to: prompt for clarification of the new concept; and perform slot filling for the identified word or phrase based at least partially on the clarification of the new concept.  
Pitschel discloses a computer system with a natural language processor that attempts to associate a token sequence, produced by speech-to-text processing of user speech input, with one or more actionable intents (Col 10, Rows 6-11) by prompting the user for clarification of a new concept and performing an action / slot filling for the identified word or phrase based at least partially on the clarification of the new concept (Col 20, Rows 26-36, NLP 332 cannot infer an actionable intent for “Ring my wife” and asks the user for clarification, “Did you mean to call your wife?”; in response to the clarification “Yes, I did”, it alters a relationship between nodes within ontology 360 to relate the word “ring” to the “call” property node).
It would have been obvious to one ordinarily skilled in the art before the effective filing date of the invention to prompt for clarification of the new concept in order to add new intents (Pitschel, Col 11, Rows 45-51; Hakkani-Tur, ¶119, infer intent for a linguistic item whose intent is not known) and thereby further expand the capabilities of the computer system to successfully respond to user requests (Pitschel, Col 16, Rows 26-39).
Regarding Claims 5, 12, and 19, Hakkani-Tur as modified by Pitschel discloses wherein the at least one processor is further configured to: prompt for clarification of the new concept (Pitschel, Col 16, Rows 26-34); and retrain at least one of the NLU model and the confidence calibration based at least partially on the clarification of the new concept (Pitschel, Col 16, Rows 26-34, training module configured to alter or adjust parameters of natural language processing; Hakkani-Tur, ¶57, ¶60, and ¶120, a model building module can train a language understanding model using the new intent and use it to assign new labels to input data to yield a new and more robust version of the first set 202 of linguistic items / user speech input having known intent labels).
Claims 6, 13, and 20 are rejected under 35 USC 103 for being unpatentable over Hakkani-Tur et al. (US 2015/0227845 A1) in view of Elbayad et al. (“Token-level and sequence-level loss smoothing for RNN language models”) and Makashir et al. (US 11195522 B1).
Regarding claims 6, 13, and 20, Hakkani-Tur discloses retraining the NLU model (¶60 and ¶120, using new intent information to progressively improve the accuracy of the language understanding model to generate a new and more robust version of the first set 202 of linguistic items having known intent labels) and retraining the confidence calibration based on the calibrated slot probability distributions (¶110, determines the divergence between the posterior distribution p and the variational distribution q, and adjusts the variational parameters based on the divergence until the divergence measure reaches a prescribed convergence target).
Hakkani-Tur does not disclose retraining the NLU model to minimize sequence loss in the NLU model and retraining the confidence calibration to maximize entropy loss based on the calibrated slot probability distributions.
Elbayad teaches implementing recurrent neural network language models for natural language generation tasks (Abstract, 1. Introduction) by training the RNN models to minimize sequence loss (3.2 Sequence-level loss smoothing, see equation (9) for sequence level smoothed loss function).
Hakkani-Tur discloses that the machine learning techniques used to generate the L-A model / language understanding model include neural networks (¶56 and ¶79).
It would have been obvious to one ordinarily skilled in the art before the effective filing date of the invention to generate or retrain the L-A / language understanding models implemented in neural networks using the sequence loss smoothing / minimization of Elbayad in order to significantly speed up training (Elbayad, 1. Introduction, “For sequence level smoothing, we propose to use restricted token replacement vocabularies, and a “lazy evaluation” method that significantly speeds up training”).
Further, Makashir teaches a machine learning model that determines natural language understanding data (Abstract) comprising a maximum entropy intent classifier to determine an intent classification score / probability distribution (Col 21, Row 66 – Col 22, Rows 50-67, using a helper maximum entropy intent classifier to generate intent classification scores; in view of Col 8, Rows 38-41, NLU confidence indicates a top ranked intent determined probabilistically by an NLU component), where the machine learning model may be updated to maximize cost (Col 9, Rows 7-14), such as a binary cross entropy loss function for the NLU model based on a calibrated slot probability distribution (Col 20, Rows 55-67, equation (1), predicted outcome p and ground truth outcome y).
It would have been obvious to one ordinarily skilled in the art before the effective filing date of the invention to retrain the confidence calibration to maximize entropy loss based on the calibrated slot probability distributions in order to train an intent classifier model / NLU model to generate intent classification scores (Makashir, Col 21, Row 66 – Col 22, Row 50).
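For reference, the standard textbook form of a binary cross entropy loss for a predicted outcome p and a ground truth outcome y is reproduced below. It is offered only as an aid to reading the citation above; that equation (1) of Makashir takes exactly this form is an assumption and is not confirmed by the citation itself.

```latex
% Standard binary cross entropy for a predicted probability p in (0, 1)
% and a ground truth label y in {0, 1}; assumed form, not quoted from Makashir.
\[
  \mathcal{L}(p, y) = -\bigl[\, y \log p + (1 - y) \log (1 - p) \,\bigr]
\]
```

Under this form the loss is zero when the prediction matches the ground truth with full confidence and grows without bound as the prediction approaches the opposite label.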
Claims 7 and 14 are rejected under 35 USC 103 for being unpatentable over Hakkani-Tur et al. (US 2015/0227845 A1) in view of Makashir et al. (US 11195522 B1).
Regarding claims 7 and 14, Hakkani-Tur discloses wherein: 
the initial slot probability distributions identify likelihoods of different words in the input utterance belonging to different slot types (¶33, in one example, “Who starred in the movie Mission Impossible?”, computer system 102 assigns a descriptive label corresponding to the identified intent such as “lead actor”; ¶68, L-A model / SLU assigns a label to an input linguistic item if it can determine the intent of the linguistic item with a level of confidence that satisfies an application specific threshold); and 
the calibration probability distribution comprises a Dirichlet distribution (¶109, q(IIk; pk) are each a Dirichlet distribution).
Hakkani-Tur does not disclose the uncertainties associated with the words in the input utterance comprise entropies of the calibrated slot probability distributions associated with the words in the input utterance.  
Makashir teaches a machine learning model that determines natural language understanding data (Abstract) comprising a maximum entropy intent classifier to determine an intent classification score / probability distribution (Col 21, Row 66 – Col 22, Rows 50-67, using a helper maximum entropy intent classifier to generate intent classification scores; in view of Col 8, Rows 38-41, NLU confidence indicates a top ranked intent determined probabilistically by an NLU component), where the machine learning model may be updated to maximize cost (Col 9, Rows 7-14), such as a binary cross entropy loss function for the NLU model based on a calibrated slot probability distribution (Col 20, Rows 55-67, equation (1), predicted outcome p and ground truth outcome y).
Further, Makashir teaches that the uncertainties associated with the words in the input utterance comprise entropies of the calibrated slot probability distributions associated with the words in the input utterance (Col 20, Rows 54-67).
It would have been obvious to one ordinarily skilled in the art before the effective filing date of the invention to determine uncertainties comprising entropies of the calibrated slot probability distributions associated with the words in the input utterance in order to generate a cost or loss function that describes the difference between the expected output of the machine learning model and the actual output (Makashir, Col 9, Rows 7-11).
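For illustration only, the claimed notion of an uncertainty expressed as the entropy of a per-word slot probability distribution can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application, Hakkani-Tur, or Makashir: it uses Shannon entropy in nats, and the function names, example words, slot distributions, and threshold are all hypothetical.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in nats of a discrete distribution.

    Zero-probability entries contribute nothing to the sum.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def flag_uncertain_words(words, slot_distributions, threshold):
    """Return (word, entropy) pairs whose entropy exceeds the threshold.

    Assumption: a high-entropy slot distribution is treated as a sign that
    the word may belong to a concept the model does not recognize.
    """
    flagged = []
    for word, dist in zip(words, slot_distributions):
        h = shannon_entropy(dist)
        if h > threshold:
            flagged.append((word, h))
    return flagged

# Hypothetical utterance with one distribution over four slot types per word.
words = ["play", "the", "zorbix"]
slot_distributions = [
    [0.97, 0.01, 0.01, 0.01],  # confident
    [0.01, 0.97, 0.01, 0.01],  # confident
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain over four slot types
]

flagged = flag_uncertain_words(words, slot_distributions, threshold=1.0)
```

Only the word whose distribution is spread evenly over the slot types exceeds the hypothetical threshold; the two confidently labeled words do not.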
Conclusion

Prior art made of record and not relied upon is considered pertinent to applicant's disclosure: 
US 9830315 B1 discloses a system that employs a neural network model which has been trained to predict a sequentialized form for an input text sequence in order to map natural language utterances to logical forms (semantic parsing). The training regimen attempts to minimize the cross-entropy of the model relative to logical forms in a training set.
US 2016/0027434 A1 teaches a training module for producing spoken language understanding models to understand the meaning of transcribed speech. If sufficient transcribed data is available, a new model is computed by applying a maximum a posteriori adaptation based on a prior distribution modeled using a Dirichlet density. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to examiner Richard Z. Zhu whose telephone number is 571-270-1587 or examiner’s supervisor King Poon whose telephone number is 571-272-7440. Examiner Richard Zhu can normally be reached on M-Th, 0730-1700.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/RICHARD Z ZHU/ Primary Examiner, Art Unit 2675
06/09/2022