DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-6, 8-16, and 18-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Independent claims 1 and 11 recite:
obtaining an input utterance, wherein the input utterance is obtained from computerized text;
generating a predicted value of an information amount of each word in the input utterance according to the context of the input utterance using a pre-trained general model; and
determining redundant words according to the predicted value of the information amount of each word, and determining whether to remove the redundant words from the input utterance. 

The limitations of “obtaining…”, “generating…”, and “determining…”, as drafted, cover a human organizing of activities. More specifically, the limitations read on a human: receiving an utterance (e.g., written text) from another human; assigning a value to each word in the written text using a predefined known criterion; and identifying words that are not significant (i.e., redundant) based on the assigned value and deciding whether it is appropriate to remove said words.
This judicial exception is not integrated into a practical application because, for example, [0086] of the as-filed specification states: “In which, the corpus entry system 200 can be a system or an apparatus with corpus entry function, for example, a mobile phone or a computer, which is not limited herein.” Therefore, a general-purpose computer or computing device is described and mainly used as an application thereof. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using a computer is described as a general computing device. The claims are not patent eligible.

With respect to claims 2 and 12, the claims recite:
obtaining a plurality of sample utterances;
obtaining the information amount of each word in the sample utterance according to the context of the input utterance; and
training the general model through the plurality of sample utterances and the information amount of each word in the sample utterances.

The claims relate to a human organizing of ideas. This reads on a human receiving utterances (e.g., written texts/transcripts) from another human; assigning a value to each word based on the context of the received utterances; and adjusting/re-defining the predefined known criterion using the received utterances and the assigned values. No additional limitations are present. 	

With respect to claims 3 and 13, the claims recite:
predicting a probability of the sample utterance on each intention category through a pre-trained intention identification model to obtain a first probability distribution vector;
predicting another probability of the sample utterance on each intention category after removing each word through the intention identification model to obtain a second probability distribution vector corresponding to the word;
obtaining an information gain rate of each word according to the first probability distribution vector and the second probability distribution vector corresponding to the word; and
performing a normalization process on a sequence comprising the information gain rates of all the words to obtain the information amounts of the words.

The claims relate to a human organizing of ideas. This reads on a human manually calculating (using pen and paper) the probability (a ratio comparing how many times an outcome can occur to all possible outcomes) of the received text being classified under certain categories using the predefined known criterion and writing down the results in a list; repeating the step after removing each word of the received written text; manually calculating a relationship (e.g., difference) between the obtained results (i.e., probabilities); and normalizing the results by dividing them by a specific value (e.g., the highest value). No additional limitations are present.
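
For illustration only, the steps recited in claims 3 and 13 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: `toy_intent_model` is a hypothetical stand-in for the pre-trained intention identification model, the two intention categories are invented, the gain is taken as the sum of absolute differences, and normalization divides by the highest value, as in the example given above.

```python
from typing import Callable, List, Sequence

Distribution = List[float]


def toy_intent_model(words: Sequence[str]) -> Distribution:
    """Hypothetical stand-in for a pre-trained intention identification model.

    Returns a probability distribution over two invented intention categories
    (index 0: weather, index 1: music).
    """
    keywords = {"weather": 0, "music": 1}
    scores = [1.0, 1.0]
    for word in words:
        if word in keywords:
            scores[keywords[word]] += 4.0
    total = sum(scores)
    return [s / total for s in scores]


def absolute_difference(p: Distribution, q: Distribution) -> float:
    """Assumed gain measure: sum of absolute differences between two vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def information_amounts(
    words: Sequence[str],
    model: Callable[[Sequence[str]], Distribution],
    gain: Callable[[Distribution, Distribution], float],
) -> List[float]:
    # First probability distribution vector: the full sample utterance.
    first = model(words)
    gains = []
    for i in range(len(words)):
        # Second probability distribution vector: the utterance without word i.
        reduced = list(words[:i]) + list(words[i + 1:])
        second = model(reduced)
        gains.append(gain(first, second))
    # Assumed normalization: divide every gain by the highest gain.
    peak = max(gains, default=0.0)
    if peak == 0.0:
        return [0.0 for _ in gains]
    return [g / peak for g in gains]
```

With the toy model, only the word whose removal changes the predicted distribution receives a non-zero information amount.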

With respect to claims 4 and 14, the claims recite:
obtaining the information gain rate of each word by calculating the Euclidean distance between the first probability distribution vector and the second probability distribution vector corresponding to the word.

The claims relate to a human organizing of ideas. This reads on a human manually calculating a relationship (e.g., difference) between the obtained results (i.e., probabilities) using the Euclidean distance formula. No additional limitations are present.
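
For illustration only, the Euclidean distance between two probability distribution vectors can be computed as in the following Python sketch. The function name `euclidean_gain` and the length check are assumptions made for this example and do not appear in the claims.

```python
import math
from typing import Sequence


def euclidean_gain(first: Sequence[float], second: Sequence[float]) -> float:
    """Euclidean distance between two probability distribution vectors."""
    if len(first) != len(second):
        raise ValueError("vectors must cover the same intention categories")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(first, second)))
```

Identical vectors give a distance of zero, meaning the removed word did not change the predicted distribution.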

With respect to claims 5 and 15, the claims recite:
obtaining the information gain rate of each word by calculating a relative entropy of the first probability distribution vector and the second probability distribution vector corresponding to the word.

The claims relate to a human organizing of ideas. This reads on a human manually calculating a relationship (e.g., difference) between the obtained results (i.e., probabilities) using the relative entropy formula. No additional limitations are present.
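
For illustration only, relative entropy (Kullback-Leibler divergence) between two probability distribution vectors can be computed as in the following Python sketch. The direction of the divergence (first vector relative to second) and the small `eps` guard against division by zero are assumptions; the claims do not specify either.

```python
import math
from typing import Sequence


def relative_entropy_gain(
    first: Sequence[float], second: Sequence[float], eps: float = 1e-12
) -> float:
    """Relative entropy D(first || second), using the natural logarithm."""
    if len(first) != len(second):
        raise ValueError("vectors must cover the same intention categories")
    total = 0.0
    for p, q in zip(first, second):
        if p > 0.0:  # terms with p == 0 contribute nothing by convention
            total += p * math.log(p / max(q, eps))
    return total
```

Unlike the Euclidean distance, this measure is not symmetric, so swapping the two vectors generally changes the result.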

With respect to claims 6 and 16, the claims recite:
obtaining an initial utterance from a plurality of corpora;
selecting a word from a dictionary corresponding to each word slot in the initial utterance in a random manner to fill the word slot to obtain the sample utterance, in response to the initial utterance comprising a word slot; and
using the initial utterance as the sample utterance, in response to the initial utterance not comprising the word slot;
after the step of performing the normalization process on the sequence comprising the information gain rates of all the words to obtain the information amounts of the words further comprises:
determining whether the word in the sample utterance is obtained by filling the word slot in the initial utterance; and
if yes, updating the information amount of the word to 1.

The claims relate to a human organizing of ideas. This reads on a human receiving an utterance from another human; randomly selecting a word from a list of words for each word space/slot present in the received utterance; using the originally received utterance; determining if the utterance with the randomly selected word matches the original utterance, and if so, assigning a value of 1 to the probability. No additional limitations are present.
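
For illustration only, the limitations recited in claims 6 and 16 can be sketched in Python as follows. The `{slot}` brace notation for word slots, the function names, and the example dictionary are all assumptions made for this sketch; the claims do not specify how a word slot is marked.

```python
import random
from typing import Dict, List, Sequence, Tuple


def fill_slots(
    initial: Sequence[str],
    slot_dictionary: Dict[str, List[str]],
    rng: random.Random,
) -> Tuple[List[str], List[bool]]:
    """Build a sample utterance and record which words came from a word slot."""
    words: List[str] = []
    filled: List[bool] = []
    for token in initial:
        if token.startswith("{") and token.endswith("}"):
            # Word slot: pick a word at random from the matching dictionary.
            words.append(rng.choice(slot_dictionary[token[1:-1]]))
            filled.append(True)
        else:
            # Ordinary word: carried over from the initial utterance unchanged.
            words.append(token)
            filled.append(False)
    return words, filled


def override_slot_amounts(amounts: Sequence[float], filled: Sequence[bool]) -> List[float]:
    """After normalization, set the information amount of slot-filled words to 1."""
    return [1.0 if from_slot else a for a, from_slot in zip(amounts, filled)]
```

An initial utterance without any word slot passes through `fill_slots` unchanged, which corresponds to using the initial utterance as the sample utterance.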



With respect to claims 8 and 18, the claims recite:
displaying the utterance, wherein a background color depth of each word in the corpus corresponds to the predicted value of the information amount of the word.

The claims relate to a human organizing of ideas. This reads on a human writing down the text on a piece of paper and highlighting each word with a different color based on its value (e.g., probability). No additional limitations are present.
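
For illustration only, one way to make a background color depth correspond to a predicted value is sketched below in Python. The blue color scale, the HTML output, and the function names are assumptions made for this sketch and are not taken from the specification.

```python
import html
from typing import Sequence


def background_color(amount: float) -> str:
    """Map a predicted value in [0, 1] to a hex color; higher values are deeper blue."""
    clamped = min(1.0, max(0.0, amount))
    level = round(255 * (1.0 - clamped))  # 255 gives white, 0 gives saturated blue
    return f"#{level:02x}{level:02x}ff"


def render_html(words: Sequence[str], amounts: Sequence[float]) -> str:
    """Wrap each word in a span whose background reflects its predicted value."""
    return " ".join(
        f'<span style="background-color:{background_color(a)}">{html.escape(w)}</span>'
        for w, a in zip(words, amounts)
    )
```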

With respect to claims 9 and 19, the claims recite:
removing a word selected by a user from the utterance.

The claims relate to a human organizing of ideas. This reads on a human deleting a word from the text on the piece of paper, which is selected by another human. No additional limitations are present.

With respect to claims 10 and 20, the claims recite:
removing a word with the predicted value of the information amount lower than a predicted value threshold in response to a trigger instruction from the utterance.

The claims relate to a human organizing of ideas. This reads on a human deleting a word from the text on the piece of paper, whose value (e.g., probability) is below a predefined threshold, after being instructed by another human. No additional limitations are present. 	
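
For illustration only, the threshold-based removal recited in claims 10 and 20 can be sketched in Python as follows. The boolean `triggered` parameter standing in for the trigger instruction, and the function name, are assumptions made for this sketch.

```python
from typing import List, Sequence


def remove_low_information(
    words: Sequence[str],
    amounts: Sequence[float],
    threshold: float,
    triggered: bool,
) -> List[str]:
    """Drop words whose predicted value is below the threshold, only when triggered."""
    if not triggered:
        # No trigger instruction received: leave the utterance unchanged.
        return list(words)
    return [w for w, a in zip(words, amounts) if a >= threshold]
```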

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Shamsi; Davood et al. (US 20180307724 A1; hereinafter referred to as Shamsi et al.) in view of KANG; Pil Goo et al. (US 20190392836 A1; hereinafter referred to as Kang et al.).

As to independent claims 1 and 11, Shamsi et al. teaches a method comprising:
A computer-implemented corpus cleaning method / system (see ¶ [ 0001 ]: “… the present disclosure relates to systems and methods for interpreting a query related to a dataset by reducing ambiguity in the query.”), comprising executing on a processor steps of:
obtaining an input (see ¶ [ 0027 ]: “At a high level, NLU system 204 can analyze text, such as words and/or phrases, of an input natural language query and reduce ambiguity before executing the query to produce a response to a user. A user can access such an NLU system using, for example, user interface 206.”);
generating a predicted value of an information amount of each word in the input utterance according to the context of the input utterance using a pre-trained general model (see ¶ [ 0045 ]: “When a query contains stop words, those words can be integrated into similarity scores. Stop words usually refer to common words in a language (e.g., the). When a phrase contains a stop word, a predetermined amount (e.g., small amount) can be added to the similarity score. One exemplary way for this to occur is to determine a total sum distance by adding a similarity score and the score for a stop word, determining a total word distance by adding scores for stop words, and then determining if total sum distance/max (0.01, total word distance) exceeds a predetermined threshold. When the stop word determination exceeds a predetermined threshold, the meaning is assigned for that word.” 
Here, the predicted value of the information amount is interpreted as the predetermined amount that can be added to the similarity score associated with each stop word. Also, the association of the word and the context of the input is interpreted as associated with the meaning assigned to the word, and the pre-trained general model is interpreted as associated with the determination of whether the similarity score has exceeded the predetermined threshold, in order to determine whether the meaning is assigned to the word (i.e., stop word).); and
determining redundant words according to the predicted value of the information amount of each word, and determining whether to remove the redundant words from the input utterance (see ¶ [ 0045 (citation as in limitation above) and 0047 ]: “At block 410, ambiguity can be reduced for the query as generated at block 408. Ambiguity can be reduced by assigning meanings to words in one or more ways. First, meanings associated with identified key words within the query can be assigned. This resolves already matched words, for example, in Query 1, “pull,” “cpm,” “for each,” and “for” were already matched words. As such, no further determination of their meaning is needed to resolve ambiguity. Next, a potential meaning for a word can be selected when the similarity score of the meaning is above a predetermined threshold (e.g., score 1.0 matching). When an edge of the graph is above a predetermined threshold, lower scored edges can be removed to reduce ambiguity in the query. Using Query 1 as an example, 728×90 was matched with creative size name=728×90 with a similarity score of 1.0 and to APP standard 728×90 with a similarity score of 0.5, because the first match has a score of 1.0, that potential meaning can be selected and the potential meaning with the 0.5 similarity score can be removed.[…]”).
However, Shamsi et al. does not explicitly teach, but Kang et al. teaches:
obtaining an input utterance (see ¶ [ 0065 ]: “…the speech processing apparatus 100 may collect a user's spoken utterance including a query, and generate a query text as a text conversion result for the user's spoken utterance including a query.”).
Shamsi et al. and Kang et al. are considered to be analogous to the claimed invention because they are in the same field of endeavor in data processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have combined Shamsi et al. and Kang et al. to incorporate the teachings of Kang et al. of obtaining an input utterance in order to yield predictable results of receiving/processing a user’s spoken utterance. (See KSR v. Teleflex).

Claims 2 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Shamsi; Davood et al. (US 20180307724 A1; hereinafter referred to as Shamsi et al.) and KANG; Pil Goo et al. (US 20190392836 A1; hereinafter referred to as Kang et al.) as applied to claim 1 above, and further in view of MNIH; Andriy et al. (US 20150095017 A1; hereinafter referred to as Mnih et al.). 

Regarding claims 2 and 12, Shamsi et al. in combination with Kang et al. teaches the limitations as in claims 1 and 11, above.
However, Shamsi et al. in combination with Kang et al. do not teach, but Mnih et al. teaches:
wherein before the step of generating the predicted value of the information amount of each word in the utterance according to the context of the input utterance using the pre-trained general model further comprises:
obtaining a plurality of sample utterances (see ¶ [ 0008 ]: “According to one aspect of the present invention, a system and computer-implemented method are provided of learning natural language word associations, embeddings, and/or similarities, using a neural network architecture, comprising storing data defining a word dictionary comprising words identified from training data consisting a plurality of sequences of associated words, selecting a predefined number of data samples from the training data, the selected data samples defining positive examples of word associations, generating a predefined number of negative samples for each selected data sample, the negative samples defining negative examples of word associations, wherein the number of negative samples generated for each data sample is a statistically small proportion of the number of words in the word dictionary, and training a neural probabilistic language model using the data samples and the generated negative samples. [0013] The neural language model may be configured to receive a representation of the target word and representations of the plurality of context words of an input sample, and to output a probability value indicative of the likelihood that the target word is associated with the context words.” Here, it is interpreted that the obtained sample utterances are obtained before generating probability values for each word using the neural probabilistic language model.);
obtaining the information amount of each word in the sample utterance according to the context of the input utterance (see ¶ [ 0013 ]: “[0013] The neural language model may be configured to receive a representation of the target word and representations of the plurality of context words of an input sample, and to output a probability value indicative of the likelihood that the target word is associated with the context words.”); and
training the general model through the plurality of sample utterances and the information amount of each word in the sample utterances (see ¶ [ 0031 and 0035 ]: “[0031] As will be described in more detail below, the training engine 3 is configured to apply a noise contrastive estimation technique to the process of training the neural language model 11, whereby the model is trained using positive samples from the training data defining positive examples of word associations, as well as a predetermined number of generated negative samples (noise samples) defining negative examples of word associations. A predetermined number of negative samples are generated from each positive sample. In one embodiment, each positive sample is modified to generate a plurality of negative samples, by replacing one or more words in the positive sample with a pseudo-randomly selected word from the word dictionary 15. The replacement word may be pseudo-randomly selected, for example based on the stored associated frequencies of occurrences. [0035] […] The neural language model training module 23 is configured to learn the parameters defining the neural language model based on the training samples and the negative samples, by recursively adjusting the parameters based on the calculated error or discrepancy between the predicted probability of word association of the input sample output by the model compared to the actual label of the sample.”).
Shamsi et al. in combination with Kang et al. and Mnih et al. are considered to be analogous to the claimed invention because they are in the same field of endeavor in data processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Shamsi et al. in combination with Kang et al. to incorporate the teachings of Mnih et al. of obtaining a plurality of sample utterances; obtaining the information amount of each word in the sample utterance according to the context of the input utterance; and training the general model through the plurality of sample utterances and the information amount of each word in the sample utterances, which provides the benefit of having an improved system and method to enable efficient representation and retrieval of word embeddings based on a neural language model ([0002] of Mnih et al.).

Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Shamsi; Davood et al. (US 20180307724 A1; hereinafter referred to as Shamsi et al.) in view of KANG; Pil Goo et al. (US 20190392836 A1; hereinafter referred to as Kang et al.) and MNIH; Andriy et al. (US 20150095017 A1; hereinafter referred to as Mnih et al.) as applied to claims 2 and 12 above, and further in view of Ruvini; Jean-David (US 20200065873 A1; hereinafter referred to as Ruvini) and STOILOS; Georgios et al. (US 20200242133 A1; hereinafter referred to as Stoilos et al.).

Regarding claims 7 and 17, Shamsi et al. in combination with Kang et al. and Mnih et al. teaches the limitations as in claims 2 and 12, above.
However, Shamsi et al. in combination with Kang et al. and Mnih et al. do not teach, but Ruvini teaches:
wherein the general model is a deep learning model inputting a word sequence and outputting real numbers between 0 and 1 corresponding to each word in the input sequence (see ¶ [ 0050 ]: “In operation 406, the normalization module 206 normalizes the extracted textual statements across all reviews or product guides of a given product. In some embodiments, the normalization module 206 approximates a function F of two arguments which takes two of the textual statements as input and computes their similarity as a number between 0 and 1 (e.g., a probability). Continuing with the example F(“The razor is easy to manipulate”, “It fits well into the hand and is easy to use”) would be high (e.g., close to 1), while F(“The razor is easy to manipulate”, “It excels at giving a close shave”) would be low (e.g., close to 0). In one embodiment, a machine learning task is used to leverage deep learning to approximate F. For example, Siamese Neural Networks can be used to train function F with pairs of very similar sentences (e.g., positive examples) and pairs of very dissimilar sentences (e.g., negative examples). Thus for the example, statement A, B, and C can be normalized to “ease of use.”).
Shamsi et al. in combination with Kang et al. and Mnih et al. and Ruvini are considered to be analogous to the claimed invention because they are in the same field of endeavor in data processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Shamsi et al. in combination with Kang et al. and Mnih et al. to incorporate the teachings of Ruvini of wherein the general model is a deep learning model inputting a word sequence and outputting real numbers between 0 and 1 corresponding to each word in the input sequence, which provides the benefit of improving an information provisioning system, to provide the user information based on the user request (e.g., response to a query) (abstract and [0014] of Ruvini).

However, Shamsi et al. in combination with Kang et al., Mnih et al. and Ruvini do not teach, but Stoilos et al. teaches:
the general model (as disclosed in ¶ [ 0050 ] by Ruvini in limitation above) uses binary cross entropy as a loss function during training (see ¶ [0134 and 0137] of Stoilos et al.: “[0134] In an example, 265 user text queries are collected from the above described SCDS and medical doctors were asked to map each of them to the most relevant concept in PGM; these concepts will be termed the user intended concepts.  […] [0137] […] The training and test sets were of sizes 192 and 48, respectively, with the network trained using a binary cross-entropy loss function.”).
Shamsi et al. in combination with Kang et al., Mnih et al. and Ruvini and Stoilos et al. are considered to be analogous to the claimed invention because they are in the same field of endeavor in data processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Shamsi et al. in combination with Kang et al., Mnih et al. and Ruvini to incorporate the teachings of Stoilos et al. of the general model uses binary cross entropy as a loss function during training, which provides the benefit of having a high accuracy (0.84) and working better than other approaches, such as KB embedding in terms of area under the curve ([0136] and [0137] of Stoilos et al.).
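
For illustration only, the binary cross entropy loss named in claims 7 and 17 can be written in Python as follows for a model that outputs one real number between 0 and 1 per word. This is a minimal sketch of the standard formula, not code from any cited reference; the function name, the averaging over words, and the `eps` clamp are assumptions.

```python
import math
from typing import Sequence


def binary_cross_entropy(
    targets: Sequence[float], predictions: Sequence[float], eps: float = 1e-12
) -> float:
    """Mean binary cross entropy between target values and predicted values in [0, 1]."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("targets and predictions must be non-empty and equal in length")
    total = 0.0
    for y, p in zip(targets, predictions):
        p = min(1.0 - eps, max(eps, p))  # clamp to keep the logarithms finite
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(targets)
```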

Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Shamsi; Davood et al. (US 20180307724 A1; hereinafter referred to as Shamsi et al.) in combination with KANG; Pil Goo et al. (US 20190392836 A1; hereinafter referred to as Kang et al.) as applied to claims 1 and 11 above, and further in view of ISHIKAWA, YUKI (WO 2020261479 A1; hereinafter referred to as Ishikawa).

Regarding claims 8 and 18, Shamsi et al. in combination with Kang et al. teaches the limitations as in claims 1 and 11, above.
Shamsi et al. in combination with Kang et al. further teach:
wherein before the step of determining the redundant words according to the predicted value of the information amount of each word in the specific context, and determining whether to remove the redundant words from the input utterance (see ¶ [0045 and 0047] citations of Shamsi et al. as in claims 1 and 11 above and Fig. 6 and ¶ [ 0051 ] of Shamsi et al.: Fig. 6 (604) – GUI (user input displayed). [0051]: “User input 604 illustrates a user-inputted query to the dataset chatbot: “I want to know about feelings campaign if February.””) further comprises:
displaying the utterance (see ¶ [ 0096 ] of Kang et al.: “The controller 170 may transmit utterance information received through the audio input unit 141 to the information processor 150, and provide the speech recognition processing result from the information processor 150 through the display 121 as visual information.”), 
Shamsi et al. and Kang et al. are both considered to be analogous to the claimed invention because they are in the same field of endeavor in data processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have combined Shamsi et al. and Kang et al. to incorporate the teachings of Kang et al. of displaying the utterance in order to yield predictable results of displaying a user’s spoken utterance. (See KSR v. Teleflex).

However, Shamsi et al. in combination with Kang et al. do not teach, but Ishikawa teaches:
wherein a background color depth of each word in the corpus corresponds to the predicted value of the information amount of the word (see Fig. 7 and ¶ 8 and 12 (page 3): “As shown in FIG. 7, the display control unit 17 corresponds to the distance between the vector representation of the search query and the vector representation of the word and the vector representation of the search query for each of the plurality of words included in the related document D1. Then, the word is highlighted by changing the color of the peripheral area of the word in the search result window Wn2. […] In addition, by changing the highlight color of the word according to the distance between the vector representation of each of the plurality of words and the vector representation of the search query, each of the plurality of words included in the searched related document and the search query can be obtained.” ).
Shamsi et al. in combination with Kang et al. and Ishikawa are considered to be analogous to the claimed invention because they are in the same field of endeavor in data processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Shamsi et al. in combination with Kang et al. to incorporate the teachings of Ishikawa of wherein a background color depth of each word in the corpus corresponds to the predicted value of the information amount of the word, which provides the benefit of allowing the user to confirm the relevance of words included in a document as a difference in highlight color (¶ 12 (page 3) of Ishikawa).

Claims 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Shamsi; Davood et al. (US 20180307724 A1; hereinafter referred to as Shamsi et al.) in combination with KANG; Pil Goo et al. (US 20190392836 A1; hereinafter referred to as Kang et al.) and ISHIKAWA, YUKI (WO 2020261479 A1; hereinafter referred to as Ishikawa) as applied to claims 8 and 18 above, and further in view of Hurvitz; Eyal et al. (US 20120209605 A1; hereinafter referred to as Hurvitz et al.).

Regarding claims 9 and 19, Shamsi et al. in combination with Kang et al. and Ishikawa teach the limitations as in claims 8 and 18, above.
Shamsi et al. further teaches:
wherein the step of determining the redundant words and whether to remove the redundant words from the input utterance according to the predicted value of the information amount of each word in the specific context (see ¶ [0045 and 0047] citations as in claim 1) comprises:
However, Shamsi et al. in combination with Kang et al. and Ishikawa do not teach, but Hurvitz et al. teaches:
removing a word selected by a user from the utterance (see ¶ [ 0028 and 0087]: “Optional audio analysis engines 136 may be used for processing received audio interactions, or interactions that comprise an audio component. Audio analysis engines 136 receive vocal data of one or more interactions and process it using audio analysis tools, such as speech-to-text (S2T) engine which provides continuous text of an interaction, a word spotting engine which searches for particular words said in an interaction, emotion analysis, or the like. […] [0087] The selected topics, and optionally output of the other components such as audio analysis engines 136, keyphrase extraction component 332 or others is then passed to user interface component 348, which presents the data to a user, and optionally enables a user to manipulate, add or delete any data item, such as delete an irrelevant or erroneous term, indicate a connection between terms, or the like.”).
Shamsi et al. in combination with Kang et al. and Ishikawa and Hurvitz et al. are considered to be analogous to the claimed invention because they are in the same field of endeavor in data processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have combined Shamsi et al. in combination with Kang et al. and Ishikawa with Hurvitz et al. to incorporate the teachings of Hurvitz et al. of removing a word selected by a user from the utterance in order to yield predictable results of enabling users to manipulate data/terms. (See KSR v. Teleflex).

Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Shamsi; Davood et al. (US 20180307724 A1; hereinafter referred to as Shamsi et al.) in combination with KANG; Pil Goo et al. (US 20190392836 A1; hereinafter referred to as Kang et al.) and ISHIKAWA, YUKI (WO 2020261479 A1; hereinafter referred to as Ishikawa) as applied to claims 8 and 18 above, and further in view of Gardiol; Natalia et al. (US 20120191694 A1; hereinafter referred to as Gardiol et al.).

Regarding claims 10 and 20, Shamsi et al. in combination with Kang et al. and Ishikawa teach the limitations as in claims 8 and 18, above.
Shamsi et al. further teaches:
wherein the step of determining the redundant words and whether to remove the redundant words from the input utterance according to the predicted value of the information amount of each word in the specific context (see ¶ [0045 and 0047] citations as in claim 1) comprises:
However, Shamsi et al. in combination with Kang et al. and Ishikawa do not teach, but Gardiol et al. teaches:
removing a word with the predicted value of the information amount lower than a predicted value threshold in response to a trigger instruction from the utterance (see ¶ [ 0011 ]: “In the step 114, poor-quality topics are removed. For example, topics that have probabilities below a designated threshold may be removed. In another example, if a human editor finds what are deemed "poor quality topics," the editor is able to remove them.”).
Shamsi et al. in combination with Kang et al., Ishikawa and Gardiol et al. are considered to be analogous to the claimed invention because they are in the same field of endeavor in data processing. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Shamsi et al. in combination with Kang et al. and Ishikawa to incorporate the teachings of Gardiol et al. of removing a word with the predicted value of the information amount lower than a predicted value threshold in response to a trigger instruction from the utterance which provides the benefit of having an improved process of generating topic model-based word probabilities ([0009] of Gardiol et al.).

Allowable Subject Matter
Claims 3-6 and 13-16 would be allowable if rewritten to overcome the rejections under 35 U.S.C. 101 set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
Regarding claims 3 and 13, Shamsi et al. in combination with Kang et al. and Mnih et al. teach the limitations as in claims 2 and 12, above.
However, Shamsi et al. in combination with Kang et al. and Mnih et al. fail to teach:
wherein the step of obtaining the information amount of each word in the sample utterance according to the context of the input utterance comprises:
predicting a probability of the sample utterance on each intention category through a pre-trained intention identification model to obtain a first probability distribution vector;
predicting another probability of the sample utterance on each intention category after removing each word through the intention identification model to obtain a second probability distribution vector corresponding to the word;
obtaining an information gain rate of each word according to the first probability distribution vector and the second probability distribution vector corresponding to the word; and
performing a normalization process on a sequence comprising the information gain rates of all the words to obtain the information amounts of the words.
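For illustration only, the steps recited above can be sketched as a leave-one-out computation. This is a minimal sketch under stated assumptions, not the applicant's implementation: `toy_predict` is a hypothetical stand-in for the pre-trained intention identification model, `l1_distance` is a placeholder for the information gain rate, and dividing by the largest gain is an assumed form of the normalization process, which the claim does not specify.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def information_amounts(
    words: Sequence[str],
    predict: Callable[[List[str]], Vector],
    distance: Callable[[Vector, Vector], float],
) -> List[float]:
    """Leave-one-out information amounts, normalized by the largest gain."""
    first = predict(list(words))  # first probability distribution vector
    gains = []
    for i in range(len(words)):
        reduced = [w for j, w in enumerate(words) if j != i]
        second = predict(reduced)  # second vector, with word i removed
        gains.append(distance(first, second))
    peak = max(gains, default=0.0)
    if peak == 0.0:
        return [0.0] * len(gains)
    return [g / peak for g in gains]  # assumed normalization: divide by max


def toy_predict(words: List[str]) -> Vector:
    """Hypothetical two-intent model (music, weather) using keyword counts."""
    music = 1.0 + sum(w in {"play", "jazz"} for w in words)
    weather = 1.0 + sum(w in {"weather", "rain"} for w in words)
    total = music + weather
    return [music / total, weather / total]


def l1_distance(p: Vector, q: Vector) -> float:
    """Placeholder gain measure: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


amounts = information_amounts(["please", "play", "jazz"], toy_predict, l1_distance)
print(amounts)
```

In the toy run, removing "please" leaves the predicted distribution unchanged, so that word receives the lowest information amount, while removing "play" or "jazz" shifts the distribution and yields the highest.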

Regarding claims 4 and 14, Shamsi et al. in combination with Kang et al. and Mnih et al. teach the limitations as in claim 2, above.
However, Shamsi et al. in combination with Kang et al. and Mnih et al. fail to teach the limitations of claims 3 and 13 and further:
wherein the step of obtaining the information gain rate of each word according to the first probability distribution vector and the second probability distribution vector corresponding to the word comprises:
obtaining the information gain rate of each word by calculating the Euclidean distance between the first probability distribution vector and the second probability distribution vector corresponding to the word.
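For illustration only, the Euclidean distance between two probability distribution vectors can be computed as sketched below. The function name and the sample vectors are hypothetical and are not taken from the specification.

```python
import math
from typing import Sequence


def euclidean_gain(first: Sequence[float], second: Sequence[float]) -> float:
    """Euclidean distance between two equal-length probability vectors."""
    if len(first) != len(second):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(first, second)))


# Hypothetical first vector (full utterance) and second vector (one word removed).
gain = euclidean_gain([0.75, 0.25], [0.5, 0.5])
print(gain)
```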

Regarding claims 5 and 15, Shamsi et al. in combination with Kang et al. and Mnih et al. teach the limitations as in claim 2, above.
However, Shamsi et al. in combination with Kang et al. and Mnih et al. fail to teach the limitations of claims 3 and 13 and further:
wherein the step of obtaining the information gain rate of each word according to the first probability distribution vector and the second probability distribution vector corresponding to the word comprises:
obtaining the information gain rate of each word by calculating a relative entropy of the first probability distribution vector and the second probability distribution vector corresponding to the word.
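For illustration only, relative entropy (Kullback-Leibler divergence) between two probability distribution vectors can be computed as sketched below. The direction of the divergence (first vector relative to second) and the small `eps` floor that avoids division by zero are assumptions of this sketch; the claim language does not specify either.

```python
import math
from typing import Sequence


def relative_entropy_gain(
    first: Sequence[float], second: Sequence[float], eps: float = 1e-12
) -> float:
    """KL divergence D(first || second), using natural logarithms."""
    if len(first) != len(second):
        raise ValueError("vectors must have the same length")
    # Terms with p == 0 contribute nothing; q is floored at eps (assumption).
    return sum(
        p * math.log(p / max(q, eps)) for p, q in zip(first, second) if p > 0
    )


# Hypothetical first vector (full utterance) and second vector (one word removed).
gain = relative_entropy_gain([0.75, 0.25], [0.5, 0.5])
print(gain)
```

Unlike the Euclidean distance, relative entropy is not symmetric, so swapping the two vectors generally changes the result.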

Regarding claims 6 and 16, Shamsi et al. in combination with Kang et al. and Mnih et al. teach the limitations as in claim 2, above.
However, Shamsi et al. in combination with Kang et al. and Mnih et al. fail to teach the limitations of claims 3 and 13 and further:
wherein the step of obtaining the plurality of sample utterances comprises:
obtaining an initial utterance from a plurality of;
selecting a word from a dictionary corresponding to each word slot in the initial utterance in a random manner to fill the word slot to obtain the sample utterance, in response to the initial utterance comprising a word slot; and
using the initial utterance as the sample utterance, in response to the initial utterance not comprising the word slot;
after the step of performing the normalization process on the sequence comprising the information gain rates of all the words to obtain the information amounts of the words further comprises:
determining whether the word in the sample utterance is obtained by filling the word slot in the initial utterance; and
if yes, updating the information amount of the word to.
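For illustration only, the slot-filling steps recited above can be sketched as follows. This is a minimal sketch under stated assumptions: the brace notation for word slots (e.g. `{artist}`), the dictionary contents, and the function name are hypothetical. The sketch only records which words were obtained by filling a word slot; it does not perform the subsequent update of the information amount.

```python
import random
from typing import Dict, List, Sequence, Tuple


def fill_slots(
    initial: Sequence[str],
    dictionaries: Dict[str, List[str]],
    rng: random.Random,
) -> Tuple[List[str], List[bool]]:
    """Fill each word slot at random; flag which words came from a slot."""
    words: List[str] = []
    from_slot: List[bool] = []
    for token in initial:
        if token.startswith("{") and token.endswith("}"):
            slot = token[1:-1]  # hypothetical slot syntax: {slot_name}
            words.append(rng.choice(dictionaries[slot]))
            from_slot.append(True)
        else:
            words.append(token)  # ordinary word, kept unchanged
            from_slot.append(False)
    return words, from_slot


rng = random.Random(0)  # seeded only to make the sketch repeatable
dictionaries = {"artist": ["Coltrane", "Ellington", "Monk"]}
sample, flags = fill_slots(["play", "{artist}"], dictionaries, rng)
plain, plain_flags = fill_slots(["play", "music"], dictionaries, rng)
print(sample, flags)
```

An initial utterance without any word slot passes through unchanged, which corresponds to using the initial utterance as the sample utterance.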

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Keisha Y Castillo-Torres whose telephone number is (571)272-3975. The examiner can normally be reached Monday - Friday, 9:00 am - 4:00 pm (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached on (571)272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Keisha Y. Castillo-Torres
Examiner
Art Unit 2659



/Keisha Y. Castillo-Torres/Examiner, Art Unit 2659
/Paras D Shah/Primary Examiner, Art Unit 2659
06/13/2022