Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 2019-05-14 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1, 3-10, 12, and 14-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.  
Step 1 Analysis:
In the instant case, Claims 1-11 are directed to a system, and Claims 12-22 are directed to a method.  Thus, each of the claims falls within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Step 2 Analysis:
Based on the claims being determined to be within one of the four statutory categories (Step 1), it must be determined if the claims are directed to a judicial exception of an abstract idea.
Step 2A: Prong 1 Analysis:
Claims 1 and 12 recite:
“determining a prediction cost for each of the plurality of artificial intelligence classification models based upon the training”; “determining” is an evaluation, which falls into the grouping of “observations, evaluations, judgments, and opinions”, and is thus a mental process (see MPEP 2106.04(a)(2)(III))
“executing a plurality of trained artificial intelligence classification models”; an artificial intelligence model may simply be a linear regression model, which can be performed by a human with pen and paper, and is thus a mental process (see MPEP 2106.04(a)(2)(III))
“selecting one of the plurality of trained artificial intelligence classification models based upon the prediction vectors”; “selecting” is an evaluation and/or judgment, which falls into the grouping of “observations, evaluations, judgments, and opinions”, and is thus a mental process (see MPEP 2106.04(a)(2)(III))
“identifying the communication feature upon which the selected artificial intelligence classification model was trained”; “identifying” is an observation, which falls into the grouping of “observations, evaluations, judgments, and opinions”, and is thus a mental process (see MPEP 2106.04(a)(2)(III))
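“generating a user interaction modification based upon the identified communication feature”; “generating…based upon” here describes an evaluation, which falls into the grouping of “observations, evaluations, judgments, and opinions”, and is thus a mental process (see MPEP 2106.04(a)(2)(III))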
Step 2A: Prong 2 Analysis:
The judicial exception is not integrated into a practical application because the following additional limitations in Claims 1 and 12 amount to insignificant extra-solution activity:
“receiving first encoded text” and “receiving second encoded text”; (necessary data gathering; see MPEP 2106.05(g)(3))
“training a plurality of artificial intelligence classification models”; “training” a model, in general, via (non-functional descriptive) data obtained, as broadly recited, without specifics as to what the “training” is used for or provides, is considered to be insignificant extra-solution activity (see MPEP 2106.05(g)(2))
“transmit the generated user interaction modification”; (necessary data outputting; see MPEP 2106.05(g)(3))
The remaining additional limitations “first client computing device”, “second client computing device”, “server computing device”, “network connections”, “memory”, and “processor” amount to merely using computers as tools to perform a mental process (see MPEP 2106.04(a)(2)(III)(C)(3)).  The claims are directed to an abstract idea.
Step 2B Analysis:

“receiving first encoded text” and “receiving second encoded text”; (receiving data; see MPEP 2106.05(d)(II)(i))
“performing test pattern generation to determine test coverage and volume”; (necessary data gathering and outputting – testing a system for a response; see MPEP 2106.05(g)(1))
“training a plurality of artificial intelligence classification models”; training a machine learning model is well-known in the art (insignificant extra-solution activity; see MPEP 2106.05(g)(1))
“transmit the generated user interaction modification”; (transmitting data; see MPEP 2106.05(d)(II)(i))
As discussed above, the remaining additional limitations “first client computing device”, “second client computing device”, “server computing device”, “network connections”, “memory”, and “processor” amount to merely using computers as tools to perform a mental process (see MPEP 2106.04(a)(2)(III)(C)(3)).  The limitations do not place any meaningful constraints on practicing the judicial exception. The claims are directed to a judicial exception.
Dependent claims 3-10 and 14-21 when analyzed as a whole are held to be patent ineligible under 35 U.S.C. 101 because the additional recited limitations fail to establish that the claims are not directed to an abstract idea, as they recite further embellishment of the judicial exception.

Claims 3 and 14 recite the same limitations as Claims 1 and 12, further specifying that the user interactions may be chat messages or digital speech segments.  The claims are still directed to a mental process.
Claims 4 and 15 recite the same limitations as Claims 1 and 12, further specifying the types of communication features.  The claims are still directed to a mental process.
Claims 6 and 17 recite the same limitations as Claims 1 and 12, further reciting instructing a communication participant to change the identified communication feature in subsequent user interactions.  The claims are still directed to an abstract idea, in this case “Certain Methods of Organizing Human Activity”, specifically “managing personal behavior or relationships or interactions between people (including social activities, teaching, and following rules or instructions)”.  See MPEP 2106.04(a)(2)(II).
Claims 7 and 18 recite the same limitations as Claims 1 and 12, further specifying that each artificial intelligence model executes on a different processor.  This limitation is directed to using computers as tools to implement a mental process.  The claims are still directed to a mental process.
Claims 8 and 19 recite the same limitations as Claims 7 and 18, further specifying the type of processor being a GPU.  The claims are still directed to a mental process.

Claims 10 and 21 recite the same limitations as Claims 1 and 12, further specifying that the generated modification is displayed on a device.  This limitation does not integrate the judicial exception into a practical application because it is insignificant extra-solution activity (necessary data outputting; see MPEP 2106.05(g)(3)).
Claims 11 and 22 recite the same limitations as Claims 1 and 12, further specifying that the computing device adapts a communication stream based on the modification.  Here, the judicial exception is integrated into a practical application.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 9-17, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Raanani et al. (US 2017/0187880 A1; hereinafter Raanani) in view of Endo et al. (US 2005/0060158 A1; hereinafter Endo) and Chaudhari et al. (US 2020/0202256 A1; hereinafter Chaudhari).
As per Claim 1, Raanani teaches a system for automated, predictive analysis of user interactions to determine a modification to one or more features of the user interactions based (Raanani, Para [0023], discloses:  “The call co-ordination system 100 includes a real-time analysis component 130 that uses the classifiers 120 to generate a mapping 145 for coordinating calls between the representatives and the customers, e.g., for both inbound and outbound calls. The real-time analysis component 130 receives real-time call data 150 of an ongoing conversation between a customer and a first representative and analyzes the real-time call data 150 to generate a set of features, e.g., call features 135, for the ongoing conversation using a feature generation component 113. In some embodiments, the feature generation component 113 is similar to or same as the feature generation component 111. The feature generation component 113 generates the call features 135 based on the real-time call data 150, e.g., as described above with respect to the feature generation component 111. The real-time call data 150 can be an early-stage or initial conversation between the customer and the first representative. After the call features 135 are generated, a classifier component 114 determines a set of classifiers 140 that includes features matching the call features 135 and for a specified outcome of the call. For example, if the desired outcome of the call is to close the sales, then the classifier component 114 searches for classifiers that are generated for an outcome of “sales closed” and having features that match with one or more of the call features 135. The classifier component 114 can then choose one of the set of classifiers 140, e.g., using the classifier's prediction power, which can be indicated using a probability value for the specified outcome, as a specified classifier”.  
Here, Raanani discloses a plurality of competing artificial intelligence classification models (“The classifier component 114 can then choose one of the set of classifiers 140, e.g., using the classifier's prediction power”) for automated, predictive analysis (“The real-time analysis component 130 receives real-time call data 150 of an ongoing conversation between a customer and a first representative and analyzes the real-time call data 150 to generate a set of features”).  Raanani, Para [0040], discloses:  “Additionally, the mapping 145 may be used for guiding representatives with respective customers. For example, if the classifier has learned that a specific representative and customer would lead, e.g., with high probability, to success if the representative kept their speech rate below five words/second, asked open-ended questions, resolved objections raised by the customer, it may be used to inform the representative and/or their managers to adjust their speech rate, ask open-ended questions, and/or address the objections accordingly. On the other hand, if, for example, the classifier has learned that a conversation between an introverted man over age 50 and an extroverted 20 year old woman leads to low closing rates, it may signal the representative and/or their managers to expedite wrap-up of such conversations and/or later remap them, to avoid losing time on a call that will not yield desired results.”  Here, Raanani discloses determine a modification to one or more features of the user interactions (“inform the representative and/or their managers to adjust their speech rate”)).
the system comprising:
a first client computing device of a first communication participant and a second client computing device of a second communication participant (Raanani, Para [0017], discloses:  “Note that although the application describes the call-coordination system with respect to voice conversations/calls, the call-coordination system is not restricted to analyzing and coordinating voice calls; the call-coordination system can be used to analyze and coordinate various types of interactions between the participants, e.g., ranging from standard phone calls, through Voice over Internet Protocol (VoIP) calls, video conference, text chat, any online meetings, collaborations or interactions, and even in Virtual Reality (VR) or Augmented Reality (AR) based interactions”.  Here, Raanani discloses a system to coordinate interactions between participants, wherein some of the types of interactions specified, such as VR and AR, necessarily require computing devices.  An interaction between participants must have at least two participants.  Thus, Raanani discloses a first client computing device of a first communication participant and a second client computing device of a second communication participant.)
and a server computing device coupled to the first client computing device and the second client computing device via one or more network connections, the server computing device comprising a memory for storing programmatic instructions and a processor (Raanani, Para [0011], discloses “This mapping (or pairing) may be fed into either an automatic or manual coordination system that connects or bridges sales representatives with customers.”  Here, Raanani discloses that the system is coupled to the first client computing device and the second client computing device (“bridges sales representatives with customers”).  Raanani, Para [0051], discloses:  “FIG. 7 is a block diagram of a computer system as may be used to implement features of the disclosed embodiments. The computing system 700 may be used to implement any of the entities, components or services depicted in the examples of the foregoing figures (and any other components described in this specification). The computing system 700 may include one or more central processing units (“processors”) 705, memory 710, input/output devices 725 (e.g., keyboard and pointing devices, display devices), storage devices 720 (e.g., disk drives), and network adapters 730 (e.g., network interfaces) that are connected to an interconnect 715.”  Here, Raanani discloses a server computing device (“computer system as may be used to implement features of the disclosed embodiments”) with memory and a processor (“one or more central processing units (“processors”) 705, memory 710”) and one or more network connections (“network adapters 730 (e.g., network interfaces)”)).
that executes the programmatic instructions to
receive first [encoded] text corresponding to prior user interactions, each segment of the first [encoded] text [comprising one or more multidimensional vectors] representing a prior user interaction, wherein each [multidimensional vector] comprises one or more communication features of the prior user interaction and a user engagement level associated with the prior user interaction (Raanani, Para [0028-0030], discloses:  “The ASR component 210 may be tuned for specific applications, e.g., for sales calls. The features produced by the ASR component 210 may include full transcripts, vocabularies, statistical language models (e.g., transition probabilities), histograms of word occurrences (“bag of words”), weighted histograms (where words are weighted according to their contextual salience, using e.g., a Term Frequency—Inverse Document Frequency (TF-IDF) scheme), n-best results, or any other data available from the component's lattice, such as phoneme time-stamps, etc.
The NLP component 225 processes the text to produce various semantic features, e.g., identification of topics, identification of open-ended questions, identification of objections and their correlation with specific questions, named entity recognition (NER), identification of relations between entities, identification of competitors and/or products, identification of key phrases and keywords (either predetermined, or identified using a salience heuristics such as TF-IDF), etc.
The affect component 215 can extract low-level features and high-level features. The low-level features can refer to the voice signal itself and can include features such as a speech rate, a speech volume, a tone, a timber, a range of pitch, as well as any statistical data over such features (e.g., a maximal speech rate, a mean volume, a duration of speech over given pitch, a standard deviation of pitch range, etc.). The high-level features can refer to learned abstractions and can include identified emotions (e.g., fear, anger, happiness, timidity, fatigue, etc.) as well as perceived personality traits (e.g., trustworthiness, engagement, likeability, dominance, etc.) and perceived or absolute personal attributes such as an age, an accent, and a gender. Emotion identification, personality trait identification, and personal attributes, may be trained independently to produce models incorporated by the affect component, or trained using the human judgment tags optionally provided to the offline analysis component. In some embodiments, the affect component 215 can also extract features, such as a speaker engagement metric (“wow” metric), which measures how engaged a participant was in the conversation, e.g., based on the usage of vocabulary, rate of speech, pitch change. For example, the usage of phrase “Oh! cool” can indicate a higher degree of engagement than the phrase “cool!” In another example, the same phrase but said in different pitches or pitch ranges can indicate different degrees of engagement. All features extracted by the affect component 215 may or may not include a corresponding confidence level which can be used in modeling outcomes. 
The affect features can be extracted separately for the representative and the customer, and may be recorded separately for multiple speakers on each side of the conversation.”  Here, Raanani discloses text (“ASR Component…the NLP component 225 processes the text”), and that this is corresponding to prior user interactions (“sales calls”).  Raanani discloses that the information comprises one or more communication features of the prior user interaction (“The affect component 215 can extract low-level features and high-level features. The low-level features can refer to the voice signal itself and can include features such as a speech rate, a speech volume, a tone, a timber, a range of pitch, as well as any statistical data over such features (e.g., a maximal speech rate, a mean volume, a duration of speech over given pitch, a standard deviation of pitch range, etc.).”) and a user engagement level associated with the prior user interaction (“In some embodiments, the affect component 215 can also extract features, such as a speaker engagement metric (“wow” metric), which measures how engaged a participant was in the conversation”).
train, using the first [encoded] text, a plurality of artificial intelligence classification models executing on the server computing device, wherein each artificial intelligence classification model is trained according to a different one of the one or more communication features (Raanani, Para [0034], discloses:  “FIG. 3 is a block diagram of the classifier component for generating classifiers, consistent with various embodiments. The example 300 illustrates the classifier component 112 using the features 115 extracted from the feature generation component 111 to build a number of classifiers, “C1”-“CN.” In some embodiments, the classifier component 112 is run on a dedicated portion of the collected recordings, e.g., a training set, which is a subset of the entire recordings available for analysis, to model the conversation outcomes. The conversation outcome can be any configurable outcome, e.g., “sales closed”, “sales failed”, “demo scheduled”, “follow up requested.” In some embodiments, the features 115 extracted from the feature generation component 111 can be fed into a machine learning algorithm (e.g., a linear classifier, such as a SVM, or a non-linear algorithm, such as a DNN or one of its variants) to produce the classifiers 120. The classifiers may be further analyzed to determine what features carry the largest predictive powers (e.g., similarity of speech rate, occurrence of first interrupt by customer, extrovert/introvert matching, or gender or age agreement.) The classifier component 112 can generate multiple classifiers for the same outcome. However, for a given outcome, different classifiers can have different features. 
For example, a first classifier 305, “C1,” for a specified outcome, “o1,” has a first set of features, e.g., features “f1”-“f3,” and a second classifier 310, “C2” for the same outcome “o1” has a second set of features, e.g., features “f5”-“f8.” The features in different classifiers can have different weight and contribute to the specified outcome in different degrees. Each of the classifiers can have a value, e.g., a probability value, that indicates a predictive power of the classifier for the specified outcome. Higher the predictive power, the higher the probability of achieving the specified outcome of the classifier. Different classifiers may be built for different number of participants, and may consider multiple participants as a single interlocutor, or as distinct entities.”  Here, Raanani discloses using the first text (“using the features 115 extracted from the feature generation component 111”) with a plurality of artificial intelligence classification models (“to build a number of classifiers, “C1”-“CN.”) to train the classifiers (“In some embodiments, the classifier component 112 is run on a dedicated portion of the collected recordings, e.g., a training set”).  Raanani also discloses wherein each artificial intelligence classification model is trained according to a different one of the one or more communication features (“However, for a given outcome, different classifiers can have different features”)).
receive second [encoded] text corresponding to a current user interaction between the first client computing device and the second client computing device, each segment of the second [encoded] text [comprising one or more multidimensional vectors] representing the current user interaction, wherein each [multidimensional vector] comprises one or more communication features of the current user interaction (Raanani, Para [0023], discloses:  “The call co-ordination system 100 includes a real-time analysis component 130 that uses the classifiers 120 to generate a mapping 145 for coordinating calls between the representatives and the customers, e.g., for both inbound and outbound calls. The real-time analysis component 130 receives real-time call data 150 of an ongoing conversation between a customer and a first representative and analyzes the real-time call data 150 to generate a set of features, e.g., call features 135, for the ongoing conversation using a feature generation component 113.”  Here, Raanani discloses receive second text (“real-time call data”) corresponding to a current user interaction (“for the ongoing conversation”).  This is the second text, as the first text was used above to train the models.  Raanani also discloses one or more communication features of the current user interaction, as Raanani discloses “analyzes the real-time call data 150 to generate a set of features, e.g., call features 135, for the ongoing conversation.”)
execute, using the second [encoded] text, the plurality of trained artificial intelligence classification models to generate a prediction [vector] for each trained artificial intelligence classification model, wherein each prediction [vector] comprises a predicted value for the one or more communication features of the current user interaction that maximizes user engagement (Raanani, Para [0023], discloses:  “The call co-ordination system 100 includes a real-time analysis component 130 that uses the classifiers 120 to generate a mapping 145 for coordinating calls between the representatives and the customers, e.g., for both inbound and outbound calls. The real-time analysis component 130 receives real-time call data 150 of an ongoing conversation between a customer and a first representative and analyzes the real-time call data 150 to generate a set of features, e.g., call features 135, for the ongoing conversation using a feature generation component 113. In some embodiments, the feature generation component 113 is similar to or same as the feature generation component 111. The feature generation component 113 generates the call features 135 based on the real-time call data 150, e.g., as described above with respect to the feature generation component 111. The real-time call data 150 can be an early-stage or initial conversation between the customer and the first representative. After the call features 135 are generated, a classifier component 114 determines a set of classifiers 140 that includes features matching the call features 135 and for a specified outcome of the call. 
For example, if the desired outcome of the call is to close the sales, then the classifier component 114 searches for classifiers that are generated for an outcome of “sales closed” and having features that match with one or more of the call features 135.”  Here, Raanani discloses using the second text (“real-time call data 150 of an ongoing conversation”) and the plurality of trained artificial intelligence classification models (“set of classifiers 140”) wherein each prediction comprises a predicted value for the one or more communication features (“includes features matching the call features”) of the current user interaction that maximizes user engagement (“searches for classifiers that are generated for an outcome of ‘sales closed’”).  Note that closing of a sale with a user may be considered a measure of maximizing user engagement.  Raanani does disclose that user engagement itself is evaluated, as shown above in Raanani [0028-0030]:  “the affect component 215 can also extract features, such as a speaker engagement metric”).
select one of the plurality of trained artificial intelligence classification models based upon the prediction [vectors] generated from the plurality of trained artificial intelligence classification models [and the prediction costs associated with] the plurality of trained artificial intelligence classification models (Raanani, Para [0023], discloses:  “The call co-ordination system 100 includes a real-time analysis component 130 that uses the classifiers 120 to generate a mapping 145 for coordinating calls between the representatives and the customers, e.g., for both inbound and outbound calls. The real-time analysis component 130 receives real-time call data 150 of an ongoing conversation between a customer and a first representative and analyzes the real-time call data 150 to generate a set of features, e.g., call features 135, for the ongoing conversation using a feature generation component 113. In some embodiments, the feature generation component 113 is similar to or same as the feature generation component 111. The feature generation component 113 generates the call features 135 based on the real-time call data 150, e.g., as described above with respect to the feature generation component 111. The real-time call data 150 can be an early-stage or initial conversation between the customer and the first representative. After the call features 135 are generated, a classifier component 114 determines a set of classifiers 140 that includes features matching the call features 135 and for a specified outcome of the call. For example, if the desired outcome of the call is to close the sales, then the classifier component 114 searches for classifiers that are generated for an outcome of “sales closed” and having features that match with one or more of the call features 135. 
The classifier component 114 can then choose one of the set of classifiers 140, e.g., using the classifier's prediction power, which can be indicated using a probability value for the specified outcome, as a specified classifier”  Here, Raanani discloses select one of the plurality of trained artificial intelligence classification models (“The classifier component 114 can then choose one of the set of classifiers”) based upon the prediction generated from the plurality of trained artificial intelligence classification models (“using the classifier's prediction power, which can be indicated using a probability value for the specified outcome, as a specified classifier”)).
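For illustration only, the selection described in Raanani [0023] can be sketched as follows. This is a minimal sketch, not Raanani's implementation: the `Classifier` record, the `select_classifier` function, and the rule that a single shared feature counts as a "match" are assumptions introduced here; Raanani states only that classifiers are searched by outcome and matching features and that one is chosen, e.g., using its prediction power.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Classifier:
    """Hypothetical record for one trained classifier (names are assumptions)."""
    name: str
    outcome: str                 # e.g., "sales closed"
    features: frozenset          # features the classifier was built on
    predictive_power: float      # probability value for the specified outcome


def select_classifier(
    classifiers: Iterable[Classifier],
    call_features: Iterable[str],
    desired_outcome: str,
) -> Optional[Classifier]:
    """Pick the candidate with the highest predictive power.

    A candidate is a classifier generated for the desired outcome that shares
    at least one feature with the features of the ongoing call (assumed rule).
    Returns None when no classifier qualifies.
    """
    observed = set(call_features)
    candidates = [
        c for c in classifiers
        if c.outcome == desired_outcome and c.features & observed
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.predictive_power)
```

Under these assumptions, the features of the returned classifier are the ones that would be reported back as the basis for any suggested adjustment.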
identify the communication feature upon which the selected artificial intelligence classification model was trained; (Raanani, Para [0023], discloses:  “For example, if the desired outcome of the call is to close the sales, then the classifier component 114 searches for classifiers that are generated for an outcome of “sales closed” and having features that match with one or more of the call features 135.”  Here, Raanani discloses identifying which communication feature was used to train each classification model.)
generate a user interaction modification based upon the identified communication feature;  (Raanani, Para [0040], discloses:  “Additionally, the mapping 145 may be used for guiding representatives with respective customers. For example, if the classifier has learned that a specific representative and customer would lead, e.g., with high probability, to success if the representative kept their speech rate below five words/second, asked open-ended questions, resolved objections raised by the customer, it may be used to inform the representative and/or their managers to adjust their speech rate, ask open-ended questions, and/or address the objections accordingly. On the other hand, if, for example, the classifier has learned that a conversation between an introverted man over age 50 and an extroverted 20 year old woman leads to low closing rates, it may signal the representative and/or their managers to expedite wrap-up of such conversations and/or later remap them, to avoid losing time on a call that will not yield desired results.”  Here, Raanani discloses generate a user interaction modification (“adjust”) based upon the identified communication feature (“speech rate”)).
transmit the generated user interaction modification to at least one of the first client computing device or the second client computing device (Raanani, Para [0040], discloses:  “Additionally, the mapping 145 may be used for guiding representatives with respective customers. For example, if the classifier has learned that a specific representative and customer would lead, e.g., with high probability, to success if the representative kept their speech rate below five words/second, asked open-ended questions, resolved objections raised by the customer, it may be used to inform the representative and/or their managers to adjust their speech rate, ask open-ended questions, and/or address the objections accordingly. On the other hand, if, for example, the classifier has learned that a conversation between an introverted man over age 50 and an extroverted 20 year old woman leads to low closing rates, it may signal the representative and/or their managers to expedite wrap-up of such conversations and/or later remap them, to avoid losing time on a call that will not yield desired results.” Here, Raanani discloses transmit the generated user interaction modification, as they state “it may be used to inform the representative and/or their managers to adjust their speech rate”.  As this is while the call is “ongoing”, this must be transmitted to the representative via their client computing device on which the call is being conducted.)
However, Raanani does not explicitly teach encoded text corresponding to user interactions, each segment of the encoded text comprising one or more multidimensional vectors representing a user interaction, wherein each multidimensional vector comprises one or more communication features of the prior user interaction and a user engagement level associated with the prior user interaction; determine a prediction cost for each of the plurality of artificial intelligence classification models based upon the training.
Endo teaches encoded text corresponding to user interactions, each segment of the encoded text comprising one or more multidimensional vectors representing a user interaction, wherein each multidimensional vector comprises one or more communication features of the user interaction and a user engagement level associated with the user interaction (Endo, Para [0006], discloses:  “To determine the state of the user, the method generates an utterance parameter vector based upon the utterance parameters, converts the utterance parameter vector to an indication representing the state of the user, and determines the state of the user based upon the indication. To generate the utterance parameter vector, the method determines the number of segments for each classification, and divides the number of segments for each classification by the total number of segments in the utterance. The utterance parameter vector is converted to the indication by applying a linear function to the utterance parameter vector to generate one of a scalar, a vector of fuzzy classes, and an index representing the state of the user. In case the indication is a scalar, it is determined that the user is in a first state if the scalar is greater than a predetermined threshold and that the user is in a second state if the scalar is not greater than the predetermined threshold.”  Here, Endo discloses encoded text (“utterance parameter vector”) comprising one or more multidimensional vectors (“linear function…to generate…a vector of fuzzy classes”), wherein the “vector of fuzzy classes” comprises one or more communication features of the user interaction (“representing the state of the user”).  Endo, Para [0022], discloses that the state of the user comprises communication features:  “In one embodiment, each segment is assigned 204 a classification indicating one of a plurality of states of a user. 
For example, the classifications may include P1 (truth), P2 (stress), P3 (excitement), P4 (unsure), P5 (very stressed), P6 (voice control), P7 (tense), P8 (very tense), P9 (inaccurate), PA (implausible), PB (deceiving), PC (speech speed), PD (pause ratio), PE (clearness), PF (drowsy), PG (tired), PH (hesitation), PI (variance of the pitch during a segment), PJ (difference in pitch from one segment to the next segment), and PK (shape of the frequency spectrum in the segment). The assignment of these classifications to the segments of the utterance may be carried out by various lie detection software that is commercially available.”  Also, “excitement” may be considered to be a user engagement level associated with the user interaction, and Raanani has already established monitoring the user engagement level as shown above in Raanani [0030]).
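For illustration only, the computation quoted from Endo, Para [0006] (count the segments per classification, divide by the total number of segments, apply a linear function, and compare the resulting scalar to a threshold) can be sketched as follows.  The function names, the three-classification subset, the weights, and the threshold are hypothetical and do not appear in Endo.

```python
from collections import Counter


def utterance_parameter_vector(segment_labels, classes):
    """Return, for each classification, the fraction of segments assigned to it."""
    total = len(segment_labels)
    if total == 0:
        raise ValueError("utterance has no segments")
    counts = Counter(segment_labels)
    return [counts[c] / total for c in classes]


def user_state(vector, weights, threshold):
    """Apply a linear function to the vector and compare the scalar to a threshold."""
    scalar = sum(w * x for w, x in zip(weights, vector))
    return "first state" if scalar > threshold else "second state"


# Hypothetical data: four segments labeled with three of Endo's classifications.
classes = ["P2", "P3", "PF"]          # stress, excitement, drowsy
labels = ["P3", "P3", "P2", "PF"]     # one classification per segment
vec = utterance_parameter_vector(labels, classes)

# Weights and threshold are assumptions chosen only to make the example concrete.
state = user_state(vec, weights=[-1.0, 2.0, -1.0], threshold=0.25)
```

In this sketch the vector entries are per-classification fractions, so they sum to 1 whenever every segment label is among the listed classifications.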
Raanani and Endo are analogous art because they are both in the field of endeavor of customized electronic communications.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the voice call analysis of Raanani, with the emotional state analysis of Endo. The modification would have been obvious because one of ordinary skill in the art would be 
The combination of Raanani and Endo thus far fails to teach determine a prediction cost for each of the plurality of artificial intelligence classification models based upon the training.
Chaudhari teaches determine a prediction cost for each of the plurality of artificial intelligence classification models based upon the training (Chaudhari, Abstract, discloses:  “Embodiments are directed to a machine learning engine that determines training documents and validation documents from a plurality of documents. The machine learning engine may determine attributes associated with the documents. In response to receiving a request to predict attribute values of a selected document the machine learning engine may train a plurality of ML models to predict the attribute values based on the training documents and the attributes and associate the trained ML models with an accuracy score. The machine learning engine may determine candidate ML models from the trained ML models based on the training accuracy scores. The machine learning engine may evaluate and rank the candidate ML models based on the request and the validation documents. The machine learning engine may generate confirmed ML models based on the ranked candidate ML models such that the confirmed ML models may answer the request.”  Here, Chaudhari discloses for each of the plurality of artificial intelligence classification models (“plurality of ML models”), determine a prediction cost (“associate the trained ML models with an accuracy score”) based upon the training (“training accuracy scores”)).
Raanani and Chaudhari are analogous art because they are both in the field of endeavor of artificial intelligence.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the voice call AI models of Raanani, with the training accuracy scores for each model of Chaudhari. The modification would have been obvious because one of ordinary skill in the art would be motivated to close more sales by achieving improved accuracy (Chaudhari, Para [0027]: “In one or more of the various embodiments, the machine learning engine may be arranged to generate one or more confirmed ML models based on the one or more ranked candidate ML models such that the one or more confirmed ML models may be employed to answer the request and predict the one or more attribute values of the selected document, and such that employing the one or more confirmed ML models improves both efficiency of employed computing resources and accuracy of the answer to the request”).
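For illustration only, the sequence described in the Chaudhari Abstract (associate each trained model with a training accuracy score, determine candidate models from those scores, then rank the candidates against validation data) can be sketched as follows.  The toy models, the data, and the minimum-accuracy cutoff are hypothetical and do not appear in Chaudhari; the training step itself is omitted and the models are assumed to be already trained.

```python
def accuracy(model, examples):
    """Fraction of (input, label) examples the model classifies correctly."""
    correct = sum(1 for x, y in examples if model(x) == y)
    return correct / len(examples)


def rank_candidates(models, training, validation, min_training_accuracy):
    """Score every model on training data, keep candidates, rank them on validation data."""
    scored = {name: accuracy(m, training) for name, m in models.items()}
    candidates = [n for n, s in scored.items() if s >= min_training_accuracy]
    ranked = sorted(
        candidates,
        key=lambda n: accuracy(models[n], validation),
        reverse=True,
    )
    return scored, ranked


# Hypothetical already-trained classifiers, represented as simple callables.
models = {
    "threshold_3": lambda x: x > 3,
    "threshold_5": lambda x: x > 5,
    "always_true": lambda x: True,
}
training = [(1, False), (2, False), (4, True), (6, True)]
validation = [(3, False), (4, True), (5, True), (7, True)]

scored, ranked = rank_candidates(models, training, validation, min_training_accuracy=0.7)
```

Here the per-model training accuracy stands in for the claimed "prediction cost" only in the sense mapped by the rejection above; the sketch takes no position on whether that mapping is correct.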

As per Claim 2, the combination of Raanani, Endo, and Chaudhari teaches the system of claim 1 as shown above, as well as wherein one or more of the plurality of artificial intelligence classification models comprises a neural network. (Raanani, Para [0015], discloses:  “In some embodiments, the offline analysis component can analyze the features using a machine learning algorithm (e.g., a linear classifier, such as a support vector machine (SVM), or a non-linear algorithm, such as a deep neural network (DNN) or one of its variants) to generate the classifiers.”  Here, Raanani discloses a neural network (“deep neural network (DNN)”)).

As per Claim 3, the combination of Raanani, Endo, and Chaudhari teaches the system of claim 1 as shown above, as well as wherein the prior user interactions comprise online chat messages or digital speech segments. (Raanani, Para [0025], discloses:  “The call data 105 can be in various formats, e.g., audio recordings, transcripts of audio recordings, online chat conversations”).  

As per Claim 4, the combination of Raanani, Endo, and Chaudhari teaches the system of claim 1 as shown above, as well as wherein the one or more communication features comprise tone, speed, volume, or word choice. (Raanani, Para [0030], discloses:  “The affect component 215 can extract low-level features and high-level features. The low-level features can refer to the voice signal itself and can include features such as a speech rate, a speech volume, a tone, a timber, a range of pitch, as well as any statistical data over such features (e.g., a maximal speech rate, a mean volume, a duration of speech over given pitch, a standard deviation of pitch range, etc.)”  Here, Raanani discloses tone, speed, and volume (“speech rate, a speech volume, a tone”)).

As per Claim 5, the combination of Raanani, Endo, and Chaudhari teaches the system of claim 1 as shown above.  Raanani teaches wherein selecting one of the plurality of trained 
determining, for each communication feature of the prediction vectors, an accuracy value for each prediction (Raanani, Para [0023], discloses:  “The call co-ordination system 100 includes a real-time analysis component 130 that uses the classifiers 120 to generate a mapping 145 for coordinating calls between the representatives and the customers, e.g., for both inbound and outbound calls. The real-time analysis component 130 receives real-time call data 150 of an ongoing conversation between a customer and a first representative and analyzes the real-time call data 150 to generate a set of features, e.g., call features 135, for the ongoing conversation using a feature generation component 113. In some embodiments, the feature generation component 113 is similar to or same as the feature generation component 111. The feature generation component 113 generates the call features 135 based on the real-time call data 150, e.g., as described above with respect to the feature generation component 111. The real-time call data 150 can be an early-stage or initial conversation between the customer and the first representative. After the call features 135 are generated, a classifier component 114 determines a set of classifiers 140 that includes features matching the call features 135 and for a specified outcome of the call. For example, if the desired outcome of the call is to close the sales, then the classifier component 114 searches for classifiers that are generated for an outcome of “sales closed” and having features that match with one or more of the call features 135. The classifier component 114 can then choose one of the set of classifiers 140, e.g., using the classifier's prediction power, which can be indicated using a probability value for the specified outcome, as a specified classifier”.  
Here, Raanani discloses for each communication feature of the prediction vectors (“searches for classifiers…having features that match with one or more of the call features 135”), determining an accuracy value for each prediction vector (“using the classifier's prediction power, which can be indicated using a probability value for the specified outcome”).  Examiner’s note:  A classifier produces a probability that discriminates between two classes by being above or below a given threshold (e.g., 0.5).  Here, the “prediction power” is how close the classifier’s output probability is to 0 or 1, i.e., how “confident” the classifier is, and is thus a measure of accuracy.)
using the accuracy value for each prediction vector to determine an optimal prediction; selecting the trained artificial intelligence classification model associated with the optimal prediction (Raanani, Para [0023], discloses:  “The classifier component 114 can then choose one of the set of classifiers 140, e.g., using the classifier's prediction power, which can be indicated using a probability value for the specified outcome, as a specified classifier”.  Here, Raanani is selecting the trained artificial intelligence classification model (“choose one of the set of classifiers”) based on the prediction vector’s prediction power (an accuracy value, as shown above), which would produce the optimal prediction).
However, Raanani fails to teach a prediction vector; aggregating the accuracy value for each prediction vector and the prediction cost associated with the trained artificial intelligence classification model that generated the prediction vector to determine an optimal prediction vector.
Endo teaches a prediction vector (Endo, Para [0006], discloses:  “To determine the state of the user, the method generates an utterance parameter vector based upon the utterance parameters, converts the utterance parameter vector to an indication representing the state of the user, and determines the state of the user based upon the indication. To generate the utterance parameter vector, the method determines the number of segments for each classification, and divides the number of segments for each classification by the total number of segments in the utterance. The utterance parameter vector is converted to the indication by applying a linear function to the utterance parameter vector to generate one of a scalar, a vector of fuzzy classes, and an index representing the state of the user. In case the indication is a scalar, it is determined that the user is in a first state if the scalar is greater than a predetermined threshold and that the user is in a second state if the scalar is not greater than the predetermined threshold.”  Here, Endo discloses a prediction vector (“a vector of fuzzy classes”), wherein the “vector of fuzzy classes” predicts the “state of the user”).
The combination of Raanani and Endo thus far fails to teach aggregating the accuracy value for each prediction vector and the prediction cost associated with the trained artificial intelligence classification model that generated the prediction vector to determine an optimal prediction vector.
Chaudhari teaches using the prediction cost associated with the trained artificial intelligence classification model that generated the prediction to determine an optimal prediction (Chaudhari, Abstract, discloses:  “Embodiments are directed to a machine learning engine that determines training documents and validation documents from a plurality of documents. The machine learning engine may determine attributes associated with the documents. In response to receiving a request to predict attribute values of a selected document the machine learning engine may train a plurality of ML models to predict the attribute values based on the training documents and the attributes and associate the trained ML models with an accuracy score. The machine learning engine may determine candidate ML models from the trained ML models based on the training accuracy scores. The machine learning engine may evaluate and rank the candidate ML models based on the request and the validation documents. The machine learning engine may generate confirmed ML models based on the ranked candidate ML models such that the confirmed ML models may answer the request.”  Here, Chaudhari discloses using the prediction cost associated with the trained artificial intelligence classification model that generated the prediction (“associate the trained ML models with an accuracy score”) to determine an optimal prediction (“The machine learning engine may determine candidate ML models from the trained ML models based on the training accuracy scores”).  In combination with Raanani’s using the accuracy value to determine an optimal prediction, one would consider both measures in making the decision, and this results in aggregating the accuracy value (Raanani) for each prediction vector and the prediction cost (Chaudhari) associated with the trained artificial intelligence classification model that generated the prediction vector to determine an optimal prediction vector.)

As per Claim 6, the combination of Raanani, Endo, and Chaudhari teaches the system of claim 1 as shown above, as well as wherein generating a user interaction modification based upon the identified communication feature comprises creating a recommendation message instructing a communication participant to change the identified communication feature in subsequent user interactions. (Raanani, Para [0040], discloses:  “Additionally, the mapping 145 may be used for guiding representatives with respective customers. For example, if the classifier has learned that a specific representative and customer would lead, e.g., with high probability, to success if the representative kept their speech rate below five words/second, asked open-ended questions, resolved objections raised by the customer, it may be used to inform the representative and/or their managers to adjust their speech rate, ask open-ended questions, and/or address the objections accordingly. On the other hand, if, for example, the classifier has learned that a conversation between an introverted man over age 50 and an extroverted 20 year old woman leads to low closing rates, it may signal the representative and/or their managers to expedite wrap-up of such conversations and/or later remap them, to avoid losing time on a call that will not yield desired results.”  Here, Raanani discloses instructing a communication participant (“inform the representative”) to change the identified communication feature in subsequent user interactions (“to adjust their speech rate”)).

As per Claim 9, the combination of Raanani, Endo, and Chaudhari teaches the system of claim 1.  Raanani teaches wherein the server computing device uses the prediction [vectors] generated by executing the plurality of trained artificial intelligence classification models to train the artificial intelligence classification models prior to subsequent user interactions. (Raanani, Para [0034], discloses:  “FIG. 3 is a block diagram of the classifier component for generating classifiers, consistent with various embodiments. The example 300 illustrates the classifier component 112 using the features 115 extracted from the feature generation component 111 to build a number of classifiers, “C1”-“CN.” In some embodiments, the classifier component 112 is run on a dedicated portion of the collected recordings, e.g., a training set, which is a subset of the entire recordings available for analysis, to model the conversation outcomes.” Here, Raanani discloses by executing the plurality of trained artificial intelligence classification models (“a number of classifiers”) to train the artificial intelligence classification models prior to subsequent user interactions (“run on a dedicated portion of the collected recordings, e.g., a training set”)).
However, Raanani does not teach prediction vectors.
Endo teaches prediction vectors (Endo, Para [0006], discloses:  “To determine the state of the user, the method generates an utterance parameter vector based upon the utterance parameters, converts the utterance parameter vector to an indication representing the state of the user, and determines the state of the user based upon the indication. To generate the utterance parameter vector, the method determines the number of segments for each classification, and divides the number of segments for each classification by the total number of segments in the utterance. The utterance parameter vector is converted to the indication by applying a linear function to the utterance parameter vector to generate one of a scalar, a vector of fuzzy classes, and an index representing the state of the user. In case the indication is a scalar, it is determined that the user is in a first state if the scalar is greater than a predetermined threshold and that the user is in a second state if the scalar is not greater than the predetermined threshold.”  Here, Endo discloses a prediction vector (“a vector of fuzzy classes”), wherein the “vector of fuzzy classes” predicts the “state of the user”).

As per Claim 10, the combination of Raanani, Endo, and Chaudhari teaches the system of claim 1 as shown above, as well as wherein at least one of the first client computing device and the second client computing device displays the generated user interaction modification to the corresponding communication participant. (Raanani, Para [0040], discloses “Additionally, the mapping 145 may be used for guiding representatives with respective customers. For example, if the classifier has learned that a specific representative and customer would lead, e.g., with high probability, to success if the representative kept their speech rate below five words/second, asked open-ended questions, resolved objections raised by the customer, it may be used to inform the representative and/or their managers to adjust their speech rate, ask open-ended questions, and/or address the objections accordingly. On the other hand, if, for example, the classifier has learned that a conversation between an introverted man over age 50 and an extroverted 20 year old woman leads to low closing rates, it may signal the representative and/or their managers to expedite wrap-up of such conversations and/or later remap them, to avoid losing time on a call that will not yield desired results.”  Here, Raanani discloses displays (“inform the representative”, “signal the representative”) the generated user interaction modification (“adjust their speech rate”) to the corresponding communication participant (“the representative”).  Examiner’s note:  The form of “inform” or “signal” is not explicitly stated to be visual in Raanani, but neither is it restricted, and an audio signal would not work in this case as the representative is currently engaged in a voice call).

As per Claim 11, the combination of Raanani, Endo, and Chaudhari teaches the system of claim 1 as shown above, as well as wherein at least one of the first client computing device (Endo, Para [0010], discloses:  “The speech waveform storage module selects the audio waveform of the voice prompt to have a tone that is consistent with the determined state of the user”.  Here, Endo discloses a generated user interaction modification (change the “tone” to be “consistent with the determined state of the user”), and adapts a communication stream based upon it (“selects the audio waveform of the voice prompt”)).

As per Claim 12, Claim 12 is a method claim corresponding to system Claim 1.  Claim 12 is rejected for the same reasons as Claim 1.

As per Claim 13, Claim 13 is a method claim corresponding to system Claim 2.  Claim 13 is rejected for the same reasons as Claim 2.

As per Claim 14, Claim 14 is a method claim corresponding to system Claim 3.  Claim 14 is rejected for the same reasons as Claim 3.

As per Claim 15, Claim 15 is a method claim corresponding to system Claim 4.  Claim 15 is rejected for the same reasons as Claim 4.

As per Claim 16, Claim 16 is a method claim corresponding to system Claim 5.  Claim 16 is rejected for the same reasons as Claim 5.

As per Claim 17, Claim 17 is a method claim corresponding to system Claim 6.  Claim 17 is rejected for the same reasons as Claim 6.

As per Claim 20, Claim 20 is a method claim corresponding to system Claim 9.  Claim 20 is rejected for the same reasons as Claim 9.

As per Claim 21, Claim 21 is a method claim corresponding to system Claim 10.  Claim 21 is rejected for the same reasons as Claim 10.

As per Claim 22, Claim 22 is a method claim corresponding to system Claim 11.  Claim 22 is rejected for the same reasons as Claim 11.

Claims 7, 8, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Raanani, Endo, and Chaudhari in view of Qin et al. (US 2016/0162800 A1; hereinafter Qin).
As per Claim 7, the combination of Raanani, Endo, and Chaudhari teaches the system of claim 1 as shown above.  Raanani teaches wherein the server computing device comprises a plurality of processors (Raanani, Para [0051], discloses:  “The computing system 700 may include one or more central processing units (“processors”) 705, memory 710, input/output devices 725 (e.g., keyboard and pointing devices, display devices), storage devices 720 (e.g., disk drives), and network adapters 730 (e.g., network interfaces) that are connected to an interconnect 715”).
each artificial intelligence classification model (Raanani, Para [0023], as shown above, discloses a plurality of classifiers:  “The classifier component 114 can then choose one of the set of classifiers”).
However, Raanani does not teach each artificial intelligence model executes on a different processor of the server computing device.
Qin teaches each artificial intelligence model executes on a different processor of the computing device (Qin, Para [0062], discloses:  “In at least some of the embodiments described above, the modeling system 100 may decompose a first or original learning model or algorithm into smaller or simpler multiple learning algorithms to be trained and subsequently operated concurrently or simultaneously on separate processors or processing units of a computing system. Accordingly, the overall execution time of the learning model may be greatly accelerated, and thus may be capable of handling larger operational data sets than previously contemplated. Further, since each of the generated multiple learning algorithms is configured to be trained on separate partitions (501-504) of a sample data set, as well as on different operational data units, the amount of communication or coordination between the multiple learning algorithms is minimized, thus potentially maximizing any speed-up provided by the multiple learning algorithms being trained and operated on the separate processors”.  Here, Qin discloses each artificial intelligence model executes on a different processor of the computing device (“multiple learning algorithms to be trained and subsequently operated concurrently or simultaneously on separate processors or processing units of a computing system”)).
Raanani and Qin are analogous art because they are both in the field of endeavor of artificial intelligence.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the multiple classifiers of Raanani, with the models on different processors of Qin. The modification would have been obvious because one of ordinary skill in the art would be motivated to get results faster (Qin, Para [0062]: “Further, since each of the generated multiple learning algorithms is configured to be trained on separate partitions (501-504) of a sample data set, as well as on different operational data units, the amount of communication or coordination between the multiple learning algorithms is minimized, thus potentially maximizing any speed-up provided by the multiple learning algorithms being trained and operated on the separate processors”).

As per Claim 8, the combination of Raanani, Endo, Chaudhari, and Qin teaches the system of claim 1 as shown above, as well as wherein each of the plurality of processors comprises a GPU. (Qin, Para [0065], discloses:  “The example of the processing system 1300 includes a processor 1302 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 1304 (e.g., random access memory), and static memory 1306 (e.g., static random-access memory), which communicate with each other via bus 1308.”).

As per Claim 18, Claim 18 is a method claim corresponding to system Claim 7.  Claim 18 is rejected for the same reasons as Claim 7.

As per Claim 19, Claim 19 is a method claim corresponding to system Claim 8.  Claim 19 is rejected for the same reasons as Claim 8.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Fay et al. (WO 2012/155079 A2) discloses automatically adjusting settings of a voice recognition system based on user expertise, which is evaluated based on features such as speed, volume, and tone of speech.
Bouzid et al. (US 2013/0272511 A1) discloses choosing a specific speech resource to increase the quality of a user’s experience, also disclosing a cost per transaction and recognition accuracy for an ASR engine.
Subramaniam et al. (“Business Intelligence from Voice of Customer”) discloses gaining insight based on voice interactions with customers, for example in Section V B, detecting phrases that resulted in a positive outcome.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEONARD A SIEGER whose telephone number is (571)272-9710.  The examiner can normally be reached on M-F 8:00 am - 5:00 pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached on (571) 272-9767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/L.A.S./Examiner, Art Unit 2126     
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126