DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 08 January 2020 in reference to application 16/732,645.  Claims 21-40 are pending and have been examined.

Response to Amendment
The preliminary amendment filed 08 January 2020 has been accepted and considered in this office action.  Claims 1-20 have been cancelled and claims 21-40 added.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 21, 23, 24, 27-29, 31, 33, 34, and 37-39 is/are rejected under 35 U.S.C. 103 as being unpatentable over Attwater et al. (US PAP 2003/0091163) in view of Lavallee (US PAP 2015/005190).

Consider claim 21, Attwater teaches a computer-implemented method (abstract, figure 1), comprising: 
obtaining transcriptions of voice inputs from a training set of voice inputs, wherein each voice input in the training set of voice inputs is directed to one of a plurality of stages of a multi-stage voice activity (0032, 35, corpus of call examples, and transcriptions); 
generating a plurality of groups of transcriptions, wherein each group of transcriptions includes a different subset of the transcriptions of voice inputs from the training set of voice inputs (0034, clustering groups of training tokens (i.e. n-grams)); 
assigning each group of transcriptions to a different dialog state of a dialog-state model that includes a plurality of dialog states, wherein each dialog state of the plurality of dialog states corresponds to a different stage of the multi-stage voice activity (0034, each cluster represents a different dialog function, 0027 multi turn dialogs for example); 
for each group of transcriptions, determining a set of n-grams for the group, and associating the set of n-grams for the group with the corresponding dialog state of the dialog-state model to which the group is assigned (0034, clustering groups of training sentence tokens (i.e. n-grams) to represent each dialog act); and 
processing, with a speech recognizer, a subsequent voice input directed to a particular stage of the multi-stage voice activity, including biasing the speech recognizer using the representative set of n-grams associated with the dialog state in the dialog-state model that corresponds to the particular stage of the voice activity (0095, 0024, using generated n-gram based dialog model to facilitate utterance interpretation in speech recognition).
Attwater does not specifically teach for each group of transcriptions, determining a representative set of n-grams for the group, and associating the representative set of n-grams for the group with the corresponding dialog state of the dialog-state model to which the group is assigned.
In the same field of dialog act clustering, Lavallee teaches determining a representative set of n-grams for the group, and associating the representative set of n-grams for the group with the corresponding dialog state of the dialog-state model to which the group is assigned (0026, determining a representative sample of the dialog acts within a cluster).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to determine a representative group of items within the cluster as taught by Lavallee in the system of Attwater in order to reduce the size of clusters and ease processing requirements when using the clustered model (Lavallee 0026).
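
For illustration only, the biasing step recited in the final limitation of claim 21 can be pictured with a short sketch. It assumes a recognizer that returns scored text hypotheses. It is not the implementation of Attwater, Lavallee, or the present application; the function name bias_hypotheses, the bigram matching, the example dialog state, and the boost value are all hypothetical.

```python
def bias_hypotheses(hypotheses, state_ngrams, boost=0.1):
    """Re-rank (text, score) hypotheses, adding `boost` per matching state bigram.

    Hypothetical helper: `state_ngrams` is the set of bigram strings associated
    with the dialog state that the voice input is assumed to be directed to.
    """
    def rescored(item):
        text, score = item
        tokens = text.lower().split()
        bigrams = {" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)}
        return score + boost * len(bigrams & state_ngrams)

    return sorted(hypotheses, key=rescored, reverse=True)


# Assumed data: one dialog state and two competing recognizer hypotheses.
dialog_state_ngrams = {"pizza_toppings": {"extra cheese", "no onions"}}
hypotheses = [("extra keys please", 0.52), ("extra cheese please", 0.50)]
biased = bias_hypotheses(hypotheses, dialog_state_ngrams["pizza_toppings"])
```

In this sketch the hypothesis containing an n-gram tied to the current dialog state is promoted over an acoustically similar competitor with a slightly higher raw score.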

Consider claim 23, Attwater teaches the computer-implemented method of claim 21, wherein generating the plurality of groups of transcriptions comprises clustering transcriptions based on similarities among the transcriptions (0034, clustering based on similarities of strings of tokens, also see 0036-93 for detail).

Consider claim 24, Attwater teaches the computer-implemented method of claim 23, wherein clustering transcriptions based on similarities among the transcriptions comprises: 
extracting respective sets of n-grams from the transcriptions (0034 extracting sequences of tokens (i.e. n-grams) from the transcriptions based on white space etc.); 
comparing the respective sets of n-grams from the transcriptions with each other to determine levels of similarity between the respective sets of n-grams (0034 and 0036-95 for detail, comparing sequences of tokens based on edit distances and alignments); and 
grouping transcriptions based on the determined levels of similarity between the respective sets of n-grams for the transcriptions (0034, grouping similar sequences of tokens).
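
For illustration only, the three steps recited in claim 24 (extracting, comparing, grouping) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Attwater, which is described as relying on edit distances and alignments: here word bigrams, Jaccard similarity, a greedy single-pass grouping, and the 0.3 threshold are all assumptions chosen for brevity, and the function names are hypothetical.

```python
def extract_ngrams(text, n=2):
    """Return the set of word n-grams in a whitespace-tokenized transcription."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Similarity of two n-gram sets: |intersection| / |union| (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_transcriptions(transcriptions, threshold=0.5, n=2):
    """Greedy grouping: a transcription joins the first group whose seed is similar enough."""
    groups = []  # each group is a list of (text, ngram_set); element 0 is the seed
    for text in transcriptions:
        grams = extract_ngrams(text, n)
        for group in groups:
            if jaccard(grams, group[0][1]) >= threshold:
                group.append((text, grams))
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([(text, grams)])
    return [[text for text, _ in group] for group in groups]


# Assumed example transcriptions.
examples = [
    "set an alarm for six",
    "set an alarm for seven",
    "call my mom",
    "call my dad",
]
groups = group_transcriptions(examples, threshold=0.3)
```

With these assumed inputs the two alarm requests fall into one group and the two call requests into another.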

Consider claim 27, Lavallee teaches the computer-implemented method of claim 21, wherein generating the plurality of groups of transcriptions comprises: 
generating a preliminary set of groups of transcriptions (0022, clustering); and 
merging particular groups from the preliminary set of groups of transcriptions to generate a final set of groups of transcriptions (0022-23, merging and eliminating clusters into parent clusters).

Consider claim 28, Lavallee teaches the computer-implemented method of claim 27, wherein merging particular groups from the preliminary set of groups of transcriptions to generate the final set of groups of transcriptions comprises: 
determining to merge at least two of the particular groups to generate a merged group of transcriptions based on a level of similarity between the representative sets of n-grams of the at least two of the particular groups (0022-23 merging clusters into more generic clusters based on distance between them).
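
For illustration only, the merge decision recited in claims 27 and 28 can be sketched as a loop that repeatedly merges the most similar pair of preliminary groups. This is a sketch under stated assumptions, not Lavallee's procedure: Jaccard similarity over representative n-gram sets stands in for the cluster distance mentioned in the citation, and the function names, group labels, and 0.5 threshold are hypothetical.

```python
def jaccard(a, b):
    """Similarity of two n-gram sets: |intersection| / |union| (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_groups(rep_sets, threshold=0.5):
    """Merge the most similar pair of groups until no pair reaches `threshold`.

    `rep_sets` maps a group label to its set of representative n-grams.
    Returns a dict mapping a tuple of merged labels to the union of their sets.
    """
    merged = {(label,): set(grams) for label, grams in rep_sets.items()}
    while len(merged) > 1:
        keys = sorted(merged)
        # Score every unordered pair of current groups and keep the best one.
        best_score, best_pair = max(
            ((jaccard(merged[a], merged[b]), (a, b))
             for i, a in enumerate(keys) for b in keys[i + 1:]),
            key=lambda item: item[0],
        )
        if best_score < threshold:
            break
        a, b = best_pair
        merged[a + b] = merged.pop(a) | merged.pop(b)
    return merged


# Assumed preliminary groups with their representative n-grams.
preliminary = {
    "g1": {"set alarm", "alarm for"},
    "g2": {"set alarm", "alarm for", "wake me"},
    "g3": {"call mom"},
}
final = merge_groups(preliminary, threshold=0.5)
```

Here groups g1 and g2 share two of three n-grams and are merged, while g3 shares none and remains separate.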

Consider claim 29, Lavallee teaches the computer-implemented method of claim 21, wherein determining a representative set of n-grams for a group of transcriptions comprises selecting n-grams from transcriptions in the group for inclusion in the representative set that are determined to be more prominent in a language than other n-grams from transcriptions in the group (0026-27, determining the representative features of a cluster based on distribution of features i.e. prominence).
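
For illustration only, the selection recited in claim 29 can be sketched as ranking a group's n-grams by how prominent they are in a background language and keeping the top few. This is a sketch under stated assumptions, not Lavallee's feature-distribution method: the frequency table, its values, the function name representative_ngrams, and the cutoff k are all hypothetical.

```python
def representative_ngrams(group_ngrams, language_freq, k=2):
    """Pick the k n-grams in the group with the highest background-language frequency.

    `language_freq` is an assumed lookup of n-gram -> relative frequency in the
    language; n-grams missing from it are treated as having frequency 0.0.
    """
    candidates = sorted(set(group_ngrams))  # de-duplicate, deterministic order
    ranked = sorted(candidates, key=lambda g: language_freq.get(g, 0.0), reverse=True)
    return ranked[:k]


# Assumed background frequencies and one group's n-grams.
language_freq = {"set alarm": 0.004, "alarm for": 0.003, "for six": 0.001}
group = ["for six", "set alarm", "alarm for", "zz top"]
representative = representative_ngrams(group, language_freq, k=2)
```

With these assumed values the two most frequent n-grams, "set alarm" and "alarm for", are selected as the representative set.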

Consider claim 31, Attwater teaches a system (abstract), comprising: 
one or more processors (0022, figure 1); and 
one or more computer-readable media having instructions stored thereon that, when executed by the one or more processors, cause the one or more processors to perform operations (0022 figure 1) comprising: 
obtaining transcriptions of voice inputs from a training set of voice inputs, wherein each voice input in the training set of voice inputs is directed to one of a plurality of stages of a multi-stage voice activity (0032, 35, corpus of call examples, and transcriptions); 
generating a plurality of groups of transcriptions, wherein each group of transcriptions includes a different subset of the transcriptions of voice inputs from the training set of voice inputs (0034, clustering groups of training tokens (i.e. n-grams)); 

assigning each group of transcriptions to a different dialog state of a dialog-state model that includes a plurality of dialog states, wherein each dialog state of the plurality of dialog states corresponds to a different stage of the multi-stage voice activity (0034, each cluster represents a different dialog function, 0027 multi turn dialogs for example); 
for each group of transcriptions, determining a set of n-grams for the group, and associating the set of n-grams for the group with the corresponding dialog state of the dialog-state model to which the group is assigned (0034, clustering groups of training sentence tokens (i.e. n-grams) to represent each dialog act); and 
processing, with a speech recognizer, a subsequent voice input directed to a particular stage of the multi-stage voice activity, including biasing the speech recognizer using the representative set of n-grams associated with the dialog state in the dialog-state model that corresponds to the particular stage of the voice activity (0095, 0024, using generated n-gram based dialog model to facilitate utterance interpretation in speech recognition.).
Attwater does not specifically teach for each group of transcriptions, determining a representative set of n-grams for the group, and associating the representative set of n-grams for the group with the corresponding dialog state of the dialog-state model to which the group is assigned.
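In the same field of dialog act clustering, Lavallee teaches determining a representative set of n-grams for the group, and associating the representative set of n-grams for the group with the corresponding dialog state of the dialog-state model to which the group is assigned (0026, determining a representative sample of the dialog acts within a cluster).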
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to determine a representative group of items within the cluster as taught by Lavallee in the system of Attwater in order to reduce the size of clusters and ease processing requirements when using the clustered model (Lavallee 0026).

Claim 33 contains similar limitations as claim 23 and therefore is rejected for the same reasons.

Claim 34 contains similar limitations as claim 24 and therefore is rejected for the same reasons.

Claim 37 contains similar limitations as claim 27 and therefore is rejected for the same reasons.

Claim 38 contains similar limitations as claim 28 and therefore is rejected for the same reasons.

Claim 39 contains similar limitations as claim 29 and therefore is rejected for the same reasons.

Claims 22 and 32 is/are rejected under 35 U.S.C. 103 as being unpatentable over Attwater and Lavallee as applied to claims 21 and 31 above, and further in view of Coker et al. (US Patent 8,3070,143).

Consider claim 22, Attwater and Lavallee teach the computer-implemented method of claim 21, but do not specifically teach wherein the training set of voice inputs include voice inputs collected from users at a plurality of computing devices.
In the same field of voice response systems, Coker teaches wherein the training set of voice inputs include voice inputs collected from users at a plurality of computing devices (col 11 line 67- col 12 line 5, training utterances collected from multiple devices).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to use utterances from multiple devices as taught by Coker in the system of Attwater and Lavallee in order to more accurately train the speech response system (Coker col 1 lines 25-33).

Claim 32 contains similar limitations as claim 22 and therefore is rejected for the same reasons.

Claims 25 and 35 is/are rejected under 35 U.S.C. 103 as being unpatentable over Attwater and Lavallee as applied to claims 24 above, and further in view of Venkatapathy et al. (US Patent 9,473,637).

Consider claim 25, Attwater and Lavallee teach the computer-implemented method of claim 23, but do not specifically teach further comprising obtaining context data associated with the voice inputs from the training set of voice inputs, wherein the transcriptions are clustered based on similarities among the contexts of the voice inputs from which the transcriptions were derived, as indicated by the context data.
In the same field of speech response systems Venkatapathy teaches further comprising obtaining context data associated with the voice inputs from the training set of voice inputs, wherein the transcriptions are clustered based on similarities among the contexts of the voice inputs from which the transcriptions were derived, as indicated by the context data (col 3 lines 10-25, clustering training utterances based on extracted context features).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to cluster training data based on context as taught by Venkatapathy in the system of Attwater and Lavallee in order to improve dialog act selection (Venkatapathy col 2 lines 30-55).

Claim 35 contains similar limitations as claim 25 and therefore is rejected for the same reasons.

Claims 26 and 36 is/are rejected under 35 U.S.C. 103 as being unpatentable over Attwater, Lavallee, and Venkatapathy as applied to claims 25 and 35 above, and further in view of Van Os et al. (US PAP 2015/0282047).

Consider claim 26, Attwater, Lavallee, and Venkatapathy teach the computer-implemented method of claim 25, but do not specifically teach wherein the context data for a given voice input characterizes a display of a user device at which the voice input was received at a time when the voice input was received.
In the same field of processing speech commands, Van Os teaches wherein the context data for a given voice input characterizes a display of a user device at which the voice input was received at a time when the voice input was received (0167, information about what is being displayed may be retrieved for context).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to use display context as taught by Van Os in the system of Attwater, Lavallee, and Venkatapathy in order to allow for better disambiguation of a user input (Van Os 0167).

Claim 36 contains similar limitations as claim 26 and therefore is rejected for the same reasons.

Claims 30 and 40 is/are rejected under 35 U.S.C. 103 as being unpatentable over Attwater and Lavallee as applied to claims 21 and 31 above, and further in view of Lee et al. (US PAP 2015/0356959).

Consider claim 30, Attwater and Lavallee teach the computer-implemented method of claim 21, but do not specifically teach further comprising determining likelihoods of transitions between dialog states of the plurality of dialog states based on historical records indicating frequencies of transitions between the dialog states.
In the same field of dialog systems, Lee teaches determining likelihoods of transitions between dialog states of the plurality of dialog states based on historical records indicating frequencies of transitions between the dialog states (0017, 0059-63, transitions between dialog states may be based on frequency of occurrence in training data).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to use frequency in training data to model transition probabilities in Attwater and Lavallee in order to improve modeling for the dialog system (Lee 0029).
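
For illustration only, determining transition likelihoods from historical frequencies, as recited in claim 30, can be sketched as a relative-frequency estimate over logged dialog-state sequences. This is a sketch under stated assumptions, not the model of Lee or of the present application: the function name transition_likelihoods, the state labels, and the example logs are hypothetical.

```python
from collections import Counter, defaultdict


def transition_likelihoods(histories):
    """Estimate P(next_state | state) as the relative frequency of observed transitions."""
    counts = defaultdict(Counter)
    for states in histories:
        # Count each adjacent (current, following) pair in the logged sequence.
        for current, following in zip(states, states[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for current, nexts in counts.items()
    }


# Assumed historical records of dialog-state sequences.
logs = [
    ["greet", "ask_time", "confirm"],
    ["greet", "ask_time", "ask_time", "confirm"],
    ["greet", "confirm"],
]
probs = transition_likelihoods(logs)
```

In these assumed logs, two of the three transitions out of "greet" lead to "ask_time", so that transition is estimated at two thirds.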

Claim 40 contains similar limitations as claim 30 and therefore is rejected for the same reasons.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and is listed on the Notice of References Cited. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451.  The examiner can normally be reached on 7:30-12 Monday and Friday, 7:30-6 Tuesday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached on (571) 272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


DOUGLAS GODBOLD
Examiner
Art Unit 2658



/DOUGLAS GODBOLD/Primary Examiner, Art Unit 2658