DETAILED ACTION
1.	This communication is in response to the Amendments and Arguments filed on 9/20/2022. Claims 1-20 are pending and have been examined. 
Response to Amendments and Arguments
2.	 Applicant's arguments with respect to claim rejections under 35 USC 103 have been fully considered, but they are not persuasive. In particular, the applicant argues that the references do not teach most of the claimed limitations in the independent claims. In response, the examiner respectfully disagrees. For such comprehensive arguments, the applicant is referred to the further elaborated rejection rationale presented in the next section. 
Note that the key term of the invention is “alignment/misalignment,” which can be broadly interpreted based on its definition in the Specification, [0010]: “The success of the conversation hinges on how well the interactions in the conversation are 'aligned.' This alignment is defined generally by agreement and mutual understanding, both of the content of the conversation and its context. Alignment may apply either to an individual turn in the conversation, or to the entire conversation. Alignment may fail in situations such as where one party mis-hears or misunderstands the other and/or where a party fails to perform as expected, for example, fails to answer a question. Alignment may fail where a party performs an unexpected act, for example, asks a non-sequitur question or tries to end the conversation when the other party still has unresolved questions. Another example of where alignment may fail is if parties disagree about outcomes or goals. For example, if one party feels an issue in question has been resolved, while the other party feels it hasn't, this could be a sign of misalignment applied to the entirety of the conversation, and not just to individual interactions or turns.”
Claim Rejections - 35 USC § 103
3.	Claims 1, 5-14, 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Bennett (US 20060122834; hereinafter BENNETT) in view of Gates, et al. (US 20100332287; hereinafter GATES).
 	As per claim 1, BENNETT (Title: Emotion detection device & method for use in distributed systems) discloses “A method comprising: receiving digitized media that represents a conversation between individuals; extracting cues from the digitized media that indicate properties of the conversation (BENNETT, [0091], extract conversational cues from a student's dialog using prosodic information, and apply data derived from these cues to the dialog manager to recognize misconceptions and clarify issues that the student has with the lesson; [0099], a dialog manager that enables smooth and robust conversational dialogs <read on digitized media> between the student and tutor, while allowing for better understanding of the student during the student-system dialog); 
entering the cues as training data into a machine learning module to create a trained machine learning model that [ detects misalignments in the conversations ]; and using the trained machine learning model in a processor to detect other misalignments in subsequent digitized conversations (BENNETT, [0022], an efficient mechanism for training a prosody analyzer <read on misalignment detector based on prosody cues – see Claim 6>; [0044], the dataset corresponding to the features extracted from the speech samples are fed to a decision tree-based machine learning algorithm; [0057], An emotion modeler and classifier system <where ‘classifier’ also reads on machine learning model> .. is trained with actual examples from test subjects to improve performance. This training data is generated based on Prosodic Feature Vectors calculated <read on cues as training data>; [0097-0098], the interactive training system, like its human counterpart, must detect and understand cues contained in the student's dialogue and be able to alter or tailor its response and its tutoring strategies <read on ‘to detect other misalignments in subsequent digitized conversations’> … The detection of emotion in the student's utterances is important for the tutorial domain because the detection of any negative emotion - such as confusion, boredom, irritation, intimidation, or conversely positive state such as confidence, enthusiasm in the student can allow the system to provide a more appropriate response; [0099], a dialog manager that enables smooth and robust conversational dialogs between the student and tutor).”
BENNETT does not expressly disclose the detailed implementation for “detects misalignments in the conversations ..” However, the feature is taught by GATES (Title: System and method for real-time prediction of customer satisfaction).
In the same field of endeavor, GATES teaches: [0012] “predicting customer satisfaction .. capturing a conversation between a customer and a customer service agent and converting the captured conversation into transcribed text if the conversation was carried out by phone; analyzing the interaction transcript to extract a plurality of unstructured features most closely related to customer satisfaction; combining the extracted features with a plurality of structured features obtained from other contact center data; generating a customer satisfaction score from the combination of extracted unstructured features and structured features, and presenting the customer satisfaction score to the customer service agent or other contact center personnel” and [0025] “ In an example embodiment, the number of prior interactions the customer has initiated before and the type of the goodwill offered by the contact center are obtained from database 106 (as structured features), and prosodic, lexical and contextual features are obtained from the interaction text 112 (as unstructured features). In an example embodiment, the information of call dominance and the number of negative sentiment words spoken by the customer, the information about if a follow-up call was scheduled, etc. are extracted from the call transcript <read on detecting misalignments in the conversations, where misalignment can be broadly interpreted>. The prediction component 200 combines the structured and unstructured features on the basis of C-SAT Model 150 to arrive at a customer satisfaction score 114.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of GATES in the system (as taught by BENNETT) for the processing steps to detect misalignments in the conversation by extracting features that are most closely related to customer satisfaction through the conversation.
As per claim 5 (dependent on claim 1), BENNETT in view of GATES further discloses “wherein the machine learning module comprises a support vector machine (BENNETT, [0147], these data-driven algorithms such as .. artificial neural networks (ANN), support vector machines (SVM) and nearest neighbor methods).”
As per claim 6 (dependent on claim 1), BENNETT in view of GATES further discloses “wherein the cues comprise prosodic cues (BENNETT, [0091], extract conversational cues from a student's dialog using prosodic information; [0057], An emotion modeler and classifier system .. is trained .. This training data is generated based on Prosodic Feature Vectors <read on cues>).”
As per claim 7 (dependent on claim 1), BENNETT in view of GATES further discloses “wherein extracting the cues is performed by a deep neural network (BENNETT, [0147], these data-driven algorithms such as .. artificial neural networks (ANN) <read on DNN which is the most widely-used ANN>; [0165], deep language understanding framework).”
As per claim 8 (dependent on claim 1), BENNETT in view of GATES further discloses “wherein the conversation is a customer support session, and wherein using the trained machine learning model to detect the other misalignments comprises using the trained machine learning model to evaluate call performance of customer service agents based on the other misalignments (GATES, [0012], capturing a conversation between a customer and a customer service agent; [0005], The main goal of conducting customer satisfaction survey is identifying satisfied customers and dissatisfied customers to evaluate the performance of their contact center and to improve their service quality; BENNETT, [0091], extract conversational cues from a student's dialog using prosodic information, and apply data derived from these cues to the dialog manager to recognize misconceptions and clarify issues).”
As per claim 9 (dependent on claim 8), BENNETT in view of GATES further discloses “wherein evaluating the call performance comprises displaying a real time dashboard that displays call performance of an agent relative to a baseline (GATES, [0025], The generated customer satisfaction score may be presented at the customer service agent's computer, displayed at another contact center display, sent via e-mail to one or more parties, or otherwise communicated to the relevant parties <read on a ready mechanism to present data in real-time>).”
As per claim 10 (dependent on claim 8), BENNETT in view of GATES further discloses “wherein evaluating the call performance comprises providing statistics of a plurality of agents in a call center, the statistics used to evaluate a call center policy (GATES, [0025], The generated customer satisfaction score may be .. communicated to the relevant parties <read on data presented to all relevant parties as performance statistics>).”
As per claim 11 (dependent on claim 1), BENNETT in view of GATES further discloses “wherein the digitized media includes one or more of audio and video streams (BENNETT, [0023], The SR process typically transfers speech data from an utterance to be recognized using a packet stream of extracted acoustic feature data including at least some cepstral coefficients).”
As per claim 12 (dependent on claim 1), BENNETT in view of GATES further discloses “wherein the training data further comprises any combination of the digitized media and an external source of training data (BENNETT, [0057], An emotion modeler and classifier system .. is trained with actual examples from test subjects to improve performance. This training data is generated based on Prosodic Feature Vectors <read on a ready mechanism to collect any data sets for training>).”
As per claim 13 (dependent on claim 1), BENNETT in view of GATES further discloses “fine-tuning the machine learning module using any combination of the subsequent digitized conversations and human-supplied ground truth labels (BENNETT, [0057], An emotion modeler and classifier system .. is trained with actual examples from test subjects to improve performance. This training data is generated based on Prosodic Feature Vectors <read on a ready mechanism to collect any data for further training/fine-tuning, including human-labeled ground truth>).”
Claims 14, 18-19, 20 (similar in scope to claims 1, 12-13, 8) are rejected under the same rationale as applied above for claims 1, 12-13, 8. 
4.	Claims 2-4, 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over BENNETT in view of GATES, and further in view of Lightner, et al. (US 20160042276; hereinafter LIGHTNER).
As per claim 2 (dependent on claim 1), BENNETT in view of GATES further discloses “wherein extracting the cues comprises: extracting lower-level cues from the digitized media; and constructing higher-level cues based on the lower level cues, wherein [ the higher-level cues comprise latent topics ] associated with the conversation (BENNETT, [0069], Data values that are extracted .. incorporated in higher level logic of the dialog manager; [0135], a dialog level prosodic model).”
BENNETT in view of GATES does not expressly disclose “the higher-level cues comprise latent topics. ..” However, the feature is taught by LIGHTNER (Title: Method of automated discovery of new topics).
In the same field of endeavor, LIGHTNER teaches: [Abstract] “performing automated discovery of new topics from unlimited documents related to any subject domain, employing a multi-component extension of Latent Dirichlet Allocation (MC-LDA) topic models, to discover related topics in a corpus.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of LIGHTNER in the system (as taught by BENNETT and GATES) for constructing latent topics from lower-level conversational cues.
As per claim 3 (dependent on claim 2), BENNETT in view of GATES and further in view of LIGHTNER further discloses “wherein the digitized media comprises a textual transcript of the conversation, the lower level cues further comprising textual cues obtained from the textual transcript (BENNETT, [0042], outputs recognized speech text <read on transcript> corresponding to the user's question; [0024], A parts-of-speech analyzer is also preferably included for identifying a first set of emotion cues based on evaluating a syntax structure of the utterance).”
As per claim 4 (dependent on claim 2), BENNETT in view of GATES and further in view of LIGHTNER further discloses “wherein constructing the higher-level cues comprises processing the lower-level cues using at least one of a latent Dirichlet allocation and a hidden Markov model (LIGHTNER, [Abstract], performing automated discovery of new topics from unlimited documents related to any subject domain, employing a multi-component extension of Latent Dirichlet Allocation (MC-LDA) topic models, to discover related topics in a corpus).”
Claims 15-17 (similar in scope to claims 2-4) are rejected under the same rationale as applied above for claims 2-4. 
Conclusion
5.	THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).   
	A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 		
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FENG-TZER TZENG whose telephone number is (571)272-4609. The examiner can normally be reached on M-F (8:30-5:00). The fax phone number where this application or proceeding is assigned is 571-273-4609.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir (SPE) can be reached on 571-272-7799. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/FENG-TZER TZENG/		12/1/2022
Primary Examiner, Art Unit 2659