Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Acknowledgement  
Acknowledgement is made of applicant’s amendment filed on 08/29/2021. Applicant’s submission has been entered and made of record.
Status of the Claims
Claims 1-41 are pending.
Response to Applicant’s Argument
In response to “In response, claim 1 was amended to require that the segmentation preserves the order of the parts in the conversation and that the segmentation maximizes a likelihood that the parts of the given conversation match the corresponding parts in the defined order of the computed conversation structure model”.
In view of the amendment to the claims, the previous combination of prior art is withdrawn. Upon further search and consideration, please see the details of the new combination of prior art below.
In response to “Applicant respectfully traverses the rejection of claim 8, at least as amended. It is noted that the computing of the segmentation which maximizes a likelihood that the parts of the given conversation match the corresponding parts in the defined order of the ” and “Applicant could not find in Shafiei any hint to computing a coherence score after computing the segmentation of the given conversation. The quotations presented by the Examiner from Shafiei, such as: "it is very likely for a sentence" and "with high probability, the topic for sentence I is the same as for sentence i-1", do not relate to calculation of a score, but rather present the underlining assumptions of the authors. The quotation from Hakkani-Tur: "the objective is to compute the posterior probability of each class", are computations performed in classifying caller request to a specific type, and not a score indicating a quality of the selection, which in claim 8 is an extent of fit between a computed segmentation of the given conversation and the conversation structure model”. 
Shafiei teaches each topic is some distribution over words (p. 283, 1. Introduction, “…that a document is a mixture of several topics where each topic is some distribution over words”; p. 284, “where each topic is characterized by a distribution over words”). Further, “Each topic model is a generative model which specifies a simple probabilistic process by which the words in a document are being generated on the basis of a small number of latent variables” and “the goal of fitting this generative model is to find the optimal set of latent variables that can explain the observed data (i.e., observed words in documents). These latent variables capture the correlations between words and are referred to as topics” (p. 284). 
Specifically, Shafiei teaches, after computing a segmentation of a given conversation (p. 284, “splitting a text stream into coherent and meaningful segments is referred to as topic segmentation…we propose a generative model which is able to segment text data into topically coherent segments while discovering the topic distributions over words”; p. 285, section 3, “a model is proposed which is able to detect the boundaries of these segments. Each segment is assigned to a topic from a predefined number of topics…Then, each segment is modeled based on its word content similar to most probabilistic topic models”), computing a coherence score quantifying an extent of fit between the given conversation and a conversation structure model by estimating the coherence score through analyzing a likelihood of the segmentation of the conversation under the conversation structure model (p. 286, “To model the relation between topics of consecutive sentences or paragraphs, we assume a Markov structure on the distribution over document-topics. We assume that it is very likely for a sentence (or a paragraph) to have the same distribution over document topics as its previous sentence”; p. 286, 3.1 The Proposed Hierarchical Bayesian model “We order sentences of each document and assumes a Markov structure on the topic distributions of sentences: with high probability, the topic for sentence i is the same as for sentence i-1”; p. 287, equation (1) where if cs = 0, word distribution probability for sentence s has the same topic ys as the word distribution probability of previous sentence topic ys-1) 
wherein when the coherence score is below a given value, regard the given conversation as not matching the conversation structure model (p. 286, “with high probability, the topic for sentence i is the same as for sentence i-1; otherwise we sample a new topic for it”; p. 287, equation (1) where if cs = 1, word distribution probability for sentence s has a different topic as compared with the previous sentence ys-1; compare, Hakkani-Tur, ¶36, the objective is to compute the posterior probability of each semantic class, P(Ci|W) and retain those that are above a predetermined threshold; i.e., word distribution / posterior probability of one sentence or a sequence of words given semantic class topic C above a threshold means word distribution corresponds to the semantic class / topic). 
Here, the segmentation model according to equation (1) comprises at least one word distribution for at least one topic corresponding to a previous sentence ys-1 (p. 287, probability distribution in equation 1). The segmentation model determines whether word contents of a sentence s (ys) fits the word distribution of the at least one topic. 
In other words, after the segment boundaries are determined, each segment’s word contents are modeled to determine whether its word distribution has the same distribution over a topic as its previous segment (p. 286, “We assume that it is very likely for a sentence (or a paragraph) to have the same distribution over document-topics as its previous sentence”). If not, a new distribution or topic is sampled for the segment (p. 286, “Otherwise, we sample a new distribution for the document-topic of this sentence”). 
The result is the splitting of a text stream into coherent and meaningful segments (p. 284).
Here, if topic ys for sentence s has a different word distribution probability from topic ys-1 for sentence s-1 in the generative model, then sentence s should not be in the same segment as s-1 as such segment comprising s and s-1 would not be coherent and meaningful.  
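For illustration only, the boundary logic described above can be sketched as follows. This is a minimal sketch under stated assumptions, not Shafiei's model: it assumes the per-sentence topic labels have already been inferred, and it treats any change of label between consecutive sentences as a topic switch (cs = 1). The function names `switch_indicators` and `segments_from_topics` are hypothetical.

```python
from typing import List, Tuple


def switch_indicators(topics: List[int]) -> List[int]:
    """Return c_s for each sentence: 1 if sentence s starts a new topic, else 0.

    The first sentence is treated as a switch by convention (an assumption).
    """
    return [
        1 if s == 0 or topics[s] != topics[s - 1] else 0
        for s in range(len(topics))
    ]


def segments_from_topics(topics: List[int]) -> List[Tuple[int, int, int]]:
    """Group consecutive same-topic sentences into (start, end, topic) spans.

    `end` is exclusive. A boundary is placed wherever the topic label changes.
    """
    segments = []
    start = 0
    for s in range(1, len(topics) + 1):
        if s == len(topics) or topics[s] != topics[s - 1]:
            segments.append((start, s, topics[start]))
            start = s
    return segments


# Hypothetical per-sentence topic labels y_s for a six-sentence text.
labels = [0, 0, 1, 1, 1, 0]
print(switch_indicators(labels))
print(segments_from_topics(labels))
```

In this simplified view, sentences s and s-1 fall in the same segment exactly when their topic labels agree, which mirrors the reasoning in the preceding paragraph.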
Claim Rejections - 35 USC § 103
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 103 that form the basis for the rejections under this section made in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing 


Claims 1-2, 4-7, 13-14, 19-20, 22-25, 31-32, 37, and 40 are rejected under 35 USC 103(a) as being unpatentable over Hakkani-Tur et al. (US 2006/0190253 A1) in view of Peters et al. (US 2007/0260564 A1).
Regarding Claims 1 and 19, Hakkani-Tur discloses a system for information processing (¶6), comprising: 
an interface for accessing a corpus of recorded conversations (¶23, ASR module 104 receives utterances; ¶40, receiving human/human conversational data); and 
a processor (¶9 and ¶32, processor 320), configured to: 
compute, over a corpus of conversations, a conversation structure model comprising (i) a sequence of conversation parts having a defined order, and (ii) a probabilistic model defining each of the conversation parts (¶36, given a set of semantic call types (or semantic classes) C=[C1, . . . , Cn] and a sequence of input words W=[W1, . . . Wm], the objective is to compute the posterior probability of each class, P(Ci|W) and retain those that are above a predetermined threshold; ¶40, create spoken language models for call classification from human/human conversational data in unsupervised training process without any transcribed data).
Hakkani-Tur does not disclose computing, for a given conversation, a segmentation of the conversation based on the computed conversation structure model, and acting on the given conversation according to the segmentation. 
Peters teaches a system for segmenting unstructured text generated by a speech recognition system transcribing recorded speech into sections and assigning semantic topic to each section to generate structured text (¶1 and ¶3).
In particular, Peters teaches training a text segmentation model (¶11 and ¶56, training or generating a text segmentation model using a training corpus with annotated sections assigned to predefined topics in order to extract the text emission probability (text emission model), the topic sequence probability (topic sequence model), the topic position probability (topic position model) as well as the section length probability (topic dependent section length model) needed to perform segmentation of unstructured text and assign labels and topics to the resulting sections; ¶13, topic sequence probability indicates the likelihood that a first topic is followed by a second topic; ¶17, e.g., the topic sequence model keeps track of a probability that the section labelled as "theory" is often followed by a section labelled as "experiments");
compute, for a given unstructured text generated from speech recognition (¶29 and ¶68, unstructured text generated from speech recognition), a segmentation of the speech recognized text based on a computed text segmentation model, wherein the segmentation preserves the order of the parts in the speech recognized text (¶30, the method of text segmentation exploits the topic sequence probability in order to perform a text segmentation and topic assignment), and 
wherein the segmentation maximizes a likelihood that the parts of the speech recognized text match the corresponding parts in the defined order of the computed text segmentation model (¶33, application of text segmentation model is performed by means of a two dimensional simultaneous optimization over section boundaries and over assigned topics to find an optimal segmentation of a given word stream of N words into K sections that are labeled by topics with respect to text emission probability, topic sequence probability, topic position probability, and section length probability reduced to respective optimization criterion according to optimization equation 1; ¶34, p(tk | tk-1) in optimization argmax equation 1 corresponds to topic transition probability (per ¶23, the topic sequence model accounts for a plurality of successive topic transitions by making use of a topic transition M-gram model)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement the computed conversation structure model of Hakkani-Tur to compute segmentation of a given conversation by performing semantic label / call-type classification (Hakkani-Tur, ¶32) by applying the textual segmentation model as a conversation structure model on speech recognized conversation (Peters, ¶29 and ¶68; Hakkani-Tur, ¶40) in order to segment unstructured conversation text into sections and assign a semantic topic to each section (Peters, ¶1 and ¶43, note that topic refers to a semantic meaning of a section or segment; compare Hakkani-Tur, ¶32, compute posterior probability of each semantic class P (Ci|W) given a sequence of input words W).
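For illustration only, a likelihood-maximizing segmentation of the kind discussed above can be sketched as a Viterbi-style search over topic labels. This is a simplified sketch, not Peters' optimization equation 1: it uses only a text emission term and a topic transition term, omits the topic position and section length terms, and assumes hypothetical inputs (`topic_dists`, `trans`) and a hypothetical floor probability for unseen words and transitions.

```python
import math
from typing import Dict, List, Tuple


def emission_logprob(words: List[str], dist: Dict[str, float], floor: float = 1e-6) -> float:
    """Log-probability of a block of words under one topic's word distribution."""
    return sum(math.log(dist.get(w, floor)) for w in words)


def segment(
    blocks: List[List[str]],
    topic_dists: Dict[str, Dict[str, float]],
    trans: Dict[Tuple[str, str], float],
    floor: float = 1e-6,
) -> Tuple[List[str], float]:
    """Label each block with a topic so that the total log-likelihood
    (emission plus topic transition) is maximal. Returns (labels, score)."""
    if not blocks:
        return [], 0.0
    topics = list(topic_dists)
    # best[t]: best score of any labelling of the blocks so far that ends in topic t.
    best = {t: emission_logprob(blocks[0], topic_dists[t], floor) for t in topics}
    back = []
    for block in blocks[1:]:
        new_best, pointers = {}, {}
        for t in topics:
            prev = max(topics, key=lambda p: best[p] + math.log(trans.get((p, t), floor)))
            new_best[t] = (
                best[prev]
                + math.log(trans.get((prev, t), floor))
                + emission_logprob(block, topic_dists[t], floor)
            )
            pointers[t] = prev
        best = new_best
        back.append(pointers)
    # Trace back the best-scoring label sequence.
    last = max(topics, key=lambda t: best[t])
    labels = [last]
    for pointers in reversed(back):
        labels.append(pointers[labels[-1]])
    labels.reverse()
    return labels, best[last]


# Hypothetical two-part model; the missing ("pricing", "greeting") transition
# falls back to the floor, which discourages leaving the defined order.
topic_dists = {
    "greeting": {"hello": 0.5, "hi": 0.5},
    "pricing": {"price": 0.5, "cost": 0.5},
}
trans = {
    ("greeting", "greeting"): 0.5,
    ("greeting", "pricing"): 0.5,
    ("pricing", "pricing"): 1.0,
}
labels, score = segment([["hello"], ["hi"], ["price"], ["cost"]], topic_dists, trans)
print(labels, score)
```

Consecutive blocks that receive the same label form one section, so the section boundaries and the topic assignment are obtained from the same search.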
Regarding Claim 37, Hakkani-Tur discloses a computer software product, the product comprising a tangible non-transitory computer readable medium in which program instructions are stored, which instructions, when read by a processor, cause the processor to implement the method of claim 1 and system of claim 19 (¶32-33, processor 320 executing instructions stored in memory). 
Regarding Claims 2 and 20, Hakkani-Tur discloses wherein the processor is configured to compute the probabilistic model by assigning a probability to an occurrence of each word (¶36, the goal is to compute posterior probability P(Ci|w) of each class C and sequence of words w and retain those above a threshold; ¶37, obtaining hypothesized sequence of words for W in P(Ci|W) using n-gram language model; ¶47-48, training n-gram language model involves counting n-grams by counting the occurrence of n-tuples and compute confidence score of n-tuples C(wi)).
Regarding Claims 4 and 22, Hakkani-Tur discloses wherein the processor is configured to assign the probability by using a prior probability distribution for one or more of the conversation parts (¶38, if sufficient transcribed data / in-domain conversation data is available, apply MAP adaptation to generate the model using prior distribution set forth in equation 2, which is modeled using Dirichlet density).
Regarding Claims 5 and 23, Hakkani-Tur discloses wherein the processor is configured to compute the conversation structure model by pre-specifying a fixed number of the conversation parts (¶46, using a small set of transcribed data to train an initial model).
Regarding Claims 6 and 24, Hakkani-Tur discloses wherein the processor is configured to compute the conversation structure model by selecting a subset of the conversations based on one or more business rules (¶40, select conversation data source from (a) switchboard corpus, (b) various spoken dialog applications, or (c) relevant websites).
Regarding Claims 7 and 25, Hakkani-Tur discloses wherein the processor is configured to compute the segmentation of the conversation by finding the segmentation that best matches the conversation structure model (¶36, the goal is to compute posterior probability P(Ci|w) of each semantic class C and sequence of words w and retain those above a threshold; compare Peters, ¶33, application of text segmentation model is performed by means of a two dimensional simultaneous optimization over section boundaries and over assigned topics to find an optimal segmentation of a given word stream of N words into K sections that are labeled by topics; ¶43, each topic refers to a semantic meaning of a section or segment).
Regarding Claims 13 and 31, Hakkani-Tur discloses wherein the conversations are transcribed from human conversations (¶43, source data includes human/human conversational data; ¶46, training initial language model using a small set of manually transcribed data). 
Regarding Claims 14 and 32, Hakkani-Tur discloses wherein the conversations are recorded conversations (¶43, human/human conversational data from switchboard corpus), conducted over a telephone, a conference system, or in a meeting. 
Regarding Claim 40, Hakkani-Tur discloses wherein the probabilistic model defining each of the conversation parts defines a word distribution of the part and wherein the likelihood is a function of a match of a word distribution of each part of the given conversation to the word distribution of each corresponding part of the probabilistic model (¶36, the goal is to compute posterior probability P(Ci|w) of each class C and sequence of words w and retain those above a threshold; ¶46, using language model to calculate confidence score for utterances of set Su; and ¶48, occurrences of n-tuples may be counted to produce C(w1n) where w1n is the word n-tuple w1, w2, …, wn).  
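For illustration only, computing a posterior P(Ci|W) for each class and retaining the classes above a threshold can be sketched as follows. This is a minimal sketch assuming a unigram (bag-of-words) class model with Bayes' rule, which is a simplification and not necessarily the classifier used in Hakkani-Tur; the names `class_posteriors`, `retain_above`, the class labels, and the floor probability are hypothetical.

```python
import math
from typing import Dict, List


def class_posteriors(
    words: List[str],
    class_word_probs: Dict[str, Dict[str, float]],
    class_priors: Dict[str, float],
    floor: float = 1e-6,
) -> Dict[str, float]:
    """Return P(C_i | W) for every class under a unigram word model."""
    log_joint = {
        c: math.log(class_priors[c])
        + sum(math.log(class_word_probs[c].get(w, floor)) for w in words)
        for c in class_priors
    }
    # Normalize in log space to avoid underflow on long word sequences.
    m = max(log_joint.values())
    total = sum(math.exp(v - m) for v in log_joint.values())
    return {c: math.exp(v - m) / total for c, v in log_joint.items()}


def retain_above(posteriors: Dict[str, float], threshold: float) -> List[str]:
    """Keep only the classes whose posterior exceeds the threshold."""
    return [c for c, p in posteriors.items() if p > threshold]


# Hypothetical call types and word distributions.
word_probs = {
    "billing": {"bill": 0.6, "charge": 0.4},
    "support": {"broken": 0.5, "help": 0.5},
}
priors = {"billing": 0.5, "support": 0.5}
post = class_posteriors(["bill", "charge"], word_probs, priors)
print(retain_above(post, 0.5))
```

Here the match between the observed words and each class's word distribution drives the posterior, which is the relationship between word distribution and likelihood that the mapping above relies on.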
Claims 15-18, 33-36, and 39 are rejected under 35 USC 103(a) as being unpatentable over Hakkani-Tur et al. (US 2006/0190253 A1) in view of Peters et al. (US 2007/0260564 A1) as applied to claims 1, 19, and 37, in further view of Cromack et al. (US 2009/0306981 A1).
Regarding Claims 15 and 33, Hakkani-Tur does not disclose wherein the processor is further configured to act on the given conversation by presenting a timeline that graphically illustrates the respective order and durations of the conversation parts during the given conversation.
Cromack teaches a server-client system for classifying conversations between multiple parties to classify key concepts and take appropriate actions (Abstract) comprising using a computed conversation structure model to perform automatic speech recognition (¶38) and provide conversational keyword / phrase recognition and transcription (¶71). In particular, for a given conversation, using the computed conversation model to compute a segmentation of the conversation (¶89, detect keywords / key phrases; ¶90, analyze key phrase significance scores; ¶91-92, for particular portions of a conversation, analyze keywords / topics significance scores to recommend topics and titles) and acting on the given conversation according to the segmentation (Figs. 6 and 8 in view of Abstract, distill out and record core ideas of a conversation to enhance user experience);
wherein the processor is further configured to act on the given conversation by presenting a timeline that graphically illustrates the respective order and durations of the conversation parts during the given conversation (Cromack, ¶77 and Figs. 4-5, timeline display 5320).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement the computed conversation structure model of Hakkani-Tur to compute segmentation of a given conversation by performing semantic label / call-type classification (Hakkani-Tur, ¶40) and detect keywords / key phrases in a given portion / segment of a conversation (Cromack, ¶89) in order to distill out and record core ideas of a conversation (Cromack, Abstract and Fig. 6).
Regarding Claims 16 and 34, Cromack, as modifying Hakkani-Tur, discloses wherein the processor is further configured to act on the given conversation by displaying conversation part duration to computer users (Cromack, ¶77, a running count of call duration is displayed and current recording point in time is indicated on speech waveform display by a cursor 5440). 
Regarding Claims 17 and 35, Cromack, as modifying Hakkani-Tur, discloses wherein the processor is further configured to search for words within a conversation or within the corpus based on a conversation part to which the words are assigned (Cromack, ¶88, searching and sorting operations based on topic names and bookmarked information; ¶89, allowing users to search for keywords during or after live call). 
Regarding Claims 18 and 36, Cromack, as modifying Hakkani-Tur, discloses wherein the processor is further configured to correlate the conversation parts of a given participant with participant metadata to identify conversation differences between participants (Cromack, ¶88, automatically record metadata for synchronizing timestamp, speaker name, and transcribed data).
Regarding Claim 39, Cromack, as modifying Hakkani-Tur, discloses wherein computing the conversation structure model comprises allowing one or more of the parts to be empty (Cromack, ¶38, speech recognizer uses functional syntactical model to recognize words in the context of a sentence; the model is trained by initially relying on expert human transcribers and continually training for both speaker dependent and speaker independent models on the actual speech of each speaking participant; ¶69, this involves analyzing structure of speech waveform by analyzing pauses (i.e., empty parts); compare Hakkani-Tur, ¶46, training an initial language model using a small set of manually transcribed data from utterance data). 
Claims 3, 8-12, 21, 26-30, 38 and 41 are rejected under 35 USC 103(a) as being unpatentable over Hakkani-Tur et al. (US 2006/0190253 A1) and Peters et al. (US 2007/0260564 A1) as applied to claims 1, 19, and 37, in further view of Shafiei et al. (A statistical model for topic segmentation and clustering).
Regarding Claims 3 and 21, Hakkani-Tur does not disclose wherein the processor is configured to assign the probability by running a Gibbs sampling process. 
Shafiei teaches running a Gibbs sampling process to assign probabilities to an occurrence of each word when segmenting documents into topics (p. 288, “Gibbs sampling like other members of the Markov chain Monte Carlo algorithms family…each iteration of the algorithm gives a sample from the target distribution in the long run…the target distribution is the posterior distribution of word-topics, document-topics, and topic-switching variables given the collection of documents”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to assign probability to an occurrence of each word by running a Gibbs sampling process in order to draw samples from complex and usually high dimensional distributions (Shafiei, p. 288).
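For illustration only, a Gibbs sampling process over topic assignments can be sketched as follows. This is a deliberately simplified collapsed Gibbs sampler for a mixture-of-unigrams model (one topic per sentence), not Shafiei's hierarchical Bayesian model with topic-switching variables. The function name, the hyperparameters `alpha` and `beta`, and the example sentences are assumptions.

```python
import math
import random
from collections import Counter
from typing import List


def gibbs_sentence_topics(
    sentences: List[List[str]],
    num_topics: int,
    iters: int = 50,
    alpha: float = 1.0,
    beta: float = 0.1,
    seed: int = 0,
) -> List[int]:
    """Sample one topic per sentence with a collapsed Gibbs sampler."""
    rng = random.Random(seed)
    vocab_size = len({w for s in sentences for w in s})
    z = [rng.randrange(num_topics) for _ in sentences]
    sent_count = [0] * num_topics                       # sentences per topic
    word_count = [Counter() for _ in range(num_topics)]  # word counts per topic
    total = [0] * num_topics                            # total words per topic
    for s, k in zip(sentences, z):
        sent_count[k] += 1
        word_count[k].update(s)
        total[k] += len(s)
    for _ in range(iters):
        for i, s in enumerate(sentences):
            # Remove sentence i from its current topic before resampling it.
            k = z[i]
            sent_count[k] -= 1
            word_count[k].subtract(s)
            total[k] -= len(s)
            log_w = []
            for t in range(num_topics):
                lp = math.log(sent_count[t] + alpha)
                seen = Counter()
                for j, w in enumerate(s):
                    # Dirichlet-multinomial predictive probability of word w.
                    lp += math.log(word_count[t][w] + beta + seen[w])
                    lp -= math.log(total[t] + vocab_size * beta + j)
                    seen[w] += 1
                log_w.append(lp)
            m = max(log_w)
            weights = [math.exp(v - m) for v in log_w]
            k = rng.choices(range(num_topics), weights=weights)[0]
            z[i] = k
            sent_count[k] += 1
            word_count[k].update(s)
            total[k] += len(s)
    return z


# Hypothetical sentences; the sampled labels depend on the seed.
sentences = [["hello", "hi"], ["hi", "hello"], ["price", "cost"], ["cost", "price"]]
print(gibbs_sentence_topics(sentences, num_topics=2))
```

Each sweep resamples every sentence's topic from its conditional distribution given all other assignments, so that in the long run the samples approximate the posterior over topic assignments.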
Regarding Claims 8-11, 26-29, and 38, Hakkani-Tur does not disclose wherein the processor is further configured to after computing the segmentation of the given conversation, computing a coherence score, which quantifies an extent of fit between the computed segmentation of the given conversation and the conversation structure model.
Shafiei teaches, after computing a segmentation of a given conversation (p. 284, “splitting a text stream into coherent and meaningful segments is referred to as topic segmentation…we propose a generative model which is able to segment text data into topically coherent segments while discovering the topic distributions over words”; p. 285, section 3, “a model is proposed which is able to detect the boundaries of these segments. Each segment is assigned to a topic from a predefined number of topics…Then, each segment is modeled based on its word content similar to most probabilistic topic models”), computing a coherence score quantifying an extent of fit between the given conversation and a conversation structure model by estimating the coherence score through analyzing a likelihood of the segmentation of the conversation under the conversation structure model (p. 286, “To model the relation between topics of consecutive sentences or paragraphs, we assume a Markov structure on the distribution over document-topics. We assume that it is very likely for a sentence (or a paragraph) to have the same distribution over document topics as its previous sentence”; p. 286, 3.1 The Proposed Hierarchical Bayesian model “We order sentences of each document and assumes a Markov structure on the topic distributions of sentences: with high probability, the topic for sentence i is the same as for sentence i-1”; p. 287, equation (1) where if cs = 0, word distribution probability for sentence s has the same topic (word distribution probability) as the previous sentence ys-1; compare Peters, ¶17 and ¶23, topic sequence model keeps track of probability of a first topic followed by a second topic; ¶29-30, use topic sequence probability to segment unstructured text and assign topics) wherein when the coherence score is below a given value, regard the given conversation as not matching the conversation structure model (p. 286, “with high probability, the topic for sentence i is the same as for sentence i-1; otherwise we sample a new topic for it”; p. 287, equation (1) where if cs = 1, word distribution probability for sentence s has a different topic as compared with the previous sentence ys-1; compare, Hakkani-Tur, ¶36, the objective is to compute the posterior probability of each class, P(Ci|W) and retain those that are above a predetermined threshold). 
Further, Shafiei teaches determining, based on one or more coherence scores computed between one or more respective conversations in the corpus and the conversation structure model, that the conversation structure model does not capture a valid conversation structure (pp. 285-86 “a model is proposed which is able to detect the boundaries of these segments. Each segment is assigned to a topic from a predefined number of topics…Then, each segment is modeled based on its word content similar to most probabilistic topic models…We assume it is very likely for a sentence (or a paragraph) to have the same distribution over document-topics as its previous sentence. Otherwise, we sample a new distribution for the document-topic of this sentence”; p. 286, 3.1 The Proposed Hierarchical Bayesian model, “with high probability, the topic for sentence i is the same as for sentence i-1; otherwise we sample a new topic for it”). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to compute and analyze a coherence score using the generative model of Shafiei in order to split a text stream into coherent and meaningful segments (Shafiei, p. 284).
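For illustration only, a likelihood-based score of fit that is compared against a given value can be sketched as follows. This is a hypothetical sketch of the general idea discussed above, not a computation taken from Shafiei, Peters, Hakkani-Tur, or the application: it assumes the score is the average per-word log-likelihood of each segment under the word distribution of its assigned part, and the names `coherence_score`, `matches_model`, the floor probability, and the threshold are all assumptions.

```python
import math
from typing import Dict, List, Tuple


def coherence_score(
    segments: List[Tuple[str, List[str]]],
    part_dists: Dict[str, Dict[str, float]],
    floor: float = 1e-6,
) -> float:
    """Average per-word log-likelihood of the segments under their assigned parts."""
    log_lik, n = 0.0, 0
    for part, words in segments:
        for w in words:
            log_lik += math.log(part_dists[part].get(w, floor))
            n += 1
    return log_lik / n if n else float("-inf")


def matches_model(score: float, threshold: float) -> bool:
    """Regard the conversation as matching only if the score reaches the threshold."""
    return score >= threshold


# Hypothetical two-part model and two candidate segmentations.
part_dists = {
    "intro": {"hello": 0.5, "hi": 0.5},
    "pricing": {"price": 0.5, "cost": 0.5},
}
good = [("intro", ["hello", "hi"]), ("pricing", ["price"])]
bad = [("intro", ["price"]), ("pricing", ["hello"])]
print(matches_model(coherence_score(good, part_dists), threshold=-2.0))
print(matches_model(coherence_score(bad, part_dists), threshold=-2.0))
```

Under these assumptions, a segmentation whose words fit the word distributions of the assigned parts scores high, while one whose words do not fit scores below the threshold and is regarded as not matching.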
Regarding Claims 12 and 30, Shafiei discloses subsequent to computing the conversation structure model, merge one or more of the conversation parts into a single conversation part (Shafiei, p. 284, “splitting a text stream into coherent and meaningful segments is referred to as topic segmentation”; p. 286, 3.1 The Proposed Hierarchical Bayesian model, “with high probability, the topic for sentence i is the same as for sentence i-1; otherwise we sample a new topic for it”; see also Abstract, a statistical model for discovering topical clusters of words in unstructured text). 
Regarding Claim 41, Peters modifies Hakkani-Tur to teach wherein the probabilistic model defining each of the conversation parts defines a number of paragraphs in the part and wherein the likelihood is a function of a match of a number of paragraphs in each part of the given conversation to the number of paragraphs in each part of the probabilistic model (Peters, ¶6, the technique features segmenting a stream of text that is composed of a sequence of blocks of text (e.g. sentences, compare Shafiei, p. 286, “To model the relation between topics of consecutive sentences or paragraphs, we assume a Markov structure on the distribution over document-topics. We assume that it is very likely for a sentence (or a paragraph) to have the same distribution over document-topics as its previous sentence”) into segments using a plurality of language models. This segmentation is done in two steps: First, each block of text is assigned to one cluster language model. Thereafter, text sections (segments) are determined from sequential blocks of text which have been assigned to the same cluster language model. For the first step, each block of text is first scored against the language models to generate language model scores for this block of text. A language model score for a block of text indicates a correlation between the block of text and the language model. Second, language model sequence scores for different sequences of language models to which a sequence of blocks of text may correspond are generated. Combining all score information, a best-scoring sequence of language models is determined, thus resulting in an assignment of each sentence si to some cluster language model slmi).
Conclusion
Applicant's amendment necessitated the new grounds of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to examiner Richard Z. Zhu whose telephone number is 571-270-1587 or examiner’s supervisor King Y. Poon whose telephone number is 571-272-7440. Examiner Richard Zhu can normally be reached on M-Th, 0730-1700.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information 
/RICHARD Z ZHU/
Primary Examiner, Art Unit 2675
12/3/2021