Detailed Action
This action is in response to Applicant's communications filed 7 April 2021.
Claims 1, 4, 6-8, and 13-15 were amended.  No claims were cancelled. No claims were withdrawn.  No claims were added.  Therefore, claims 1-20 are pending in this Application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments/Arguments
Applicant's arguments, filed 18 December 2020, regarding the rejections of claims 1, 8, and 15 under 35 USC 103 have been fully considered but are not persuasive.
Applicant argues (Remarks, p. 16) that the cited references do not disclose "training predictive models to predict hidden states of Hidden Markov Models (HMMs)."  More specifically, Applicant argues (Remarks, p. 16) that Mehanian does not teach "training... a first predictive model to predict the first hidden state of the Markov model... and ... a second predictive model to predict the second hidden state of the Markov model."  Applicant argues (Remarks, p. 17-18) that Mehanian instead teaches using HMMs to make predictions directly (Figures 5 and 6) or provides outputs to a classifier (Figure 10) but does not train a classifier to predict hidden states of the HMM.

Mehanian teaches that a first model is created for purchase clickstreams and a second model is created for non-purchase clickstreams for predicting the hidden states in Figure 5 and paragraph [0081]: "The purchase clickstreams 530 are used to train 570 an HMM that models purchase clickstreams (e.g., a purchase HMM). The purchase HMM parameters 550 (also referred to herein simply as purchase parameters) may also be stored, for example, in the data store for access by the production engine 350. The non-purchase clickstreams 540 are used to train 580 an HMM (e.g., a non-purchase HMM) that models non-purchase clickstreams. The non-purchase HMM parameters 560 (also referred to herein simply as purchase parameters) may also be stored, for example, in the data store for access by the production engine 350."  

[Image: media_image1.png]

[Image: media_image2.png]

Mehanian then teaches that the first and second hidden states are predicted in Figure 6 and paragraph [0086]: "The likelihood 650 of the clickstream path 630 is computed under the purchase HMM parameters 550. In addition, the likelihood 660 of the clickstream path 630 is computed under the non-purchase HMM parameters 560. Both of the computations are carried out using the HMM scoring algorithm 695 also known as the Forward Recursion."  Examiner notes that the steps 650 and 660 predict the hidden states of the hidden Markov model.  Therefore, Applicant's assertion that Mehanian only teaches an HMM for directly predicting the purchase probability without predicting the hidden states is incorrect. Thus, Mehanian teaches the limitations of the claims.
The rejection of the dependent claims for depending from rejected claims is maintained.
For the aforementioned reasons, claims 1-20 are rejected under 35 USC 103.
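For illustration only, the scoring step quoted from Mehanian paragraph [0086] can be sketched as follows: the same encoded clickstream is scored by the forward recursion under two separately trained parameter sets. This is a minimal sketch, not Mehanian's implementation. The function names, the parameter values, the three-symbol alphabet, and the Bayes'-rule combination with an equal prior are all assumptions introduced here.

```python
def forward_likelihood(obs, start, trans, emit):
    """Likelihood of an observed symbol sequence under one HMM (forward recursion)."""
    states = range(len(start))
    # Initialisation: forward variable for the first event.
    alpha = [start[s] * emit[s][obs[0]] for s in states]
    # Recursion: one forward variable per subsequent event.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[prev] * trans[prev][s] for prev in states) * emit[s][symbol]
            for s in states
        ]
    # Termination: sum the last forward variable over all states.
    return sum(alpha)


def purchase_probability(obs, purchase_hmm, non_purchase_hmm, prior=0.5):
    """Combine the two likelihoods with Bayes' rule (assumed combination step)."""
    l_purchase = forward_likelihood(obs, *purchase_hmm)
    l_non_purchase = forward_likelihood(obs, *non_purchase_hmm)
    weighted = prior * l_purchase
    return weighted / (weighted + (1.0 - prior) * l_non_purchase)


# Hypothetical 2-state parameter sets: (start, transition, emission).
PURCHASE_HMM = (
    [0.6, 0.4],
    [[0.7, 0.3], [0.2, 0.8]],
    [{"H": 0.5, "S": 0.4, "P": 0.1}, {"H": 0.1, "S": 0.3, "P": 0.6}],
)
NON_PURCHASE_HMM = (
    [0.8, 0.2],
    [[0.9, 0.1], [0.5, 0.5]],
    [{"H": 0.6, "S": 0.35, "P": 0.05}, {"H": 0.3, "S": 0.5, "P": 0.2}],
)

if __name__ == "__main__":
    print(purchase_probability("HSSSP", PURCHASE_HMM, NON_PURCHASE_HMM))
```

In this sketch the forward recursion returns a single likelihood per parameter set by summing over both states at the final step, and the two likelihoods are then compared.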

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to 

Claims 1, 2, 5, 15, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Mehanian et al. (US 2015/0269609, hereinafter "Mehanian") in view of Dilip et al. (US 2011/0179114, hereinafter "Dilip").

Regarding Claim 1,
Mehanian teaches a method comprising:
accessing, from a non-transitory computer-readable medium (computer-readable medium [0110]), training non-conversational data (Figure 5, clickstreams 530, 540) having observations (Figure 5, HMM params 550, 560; "In some embodiments, the HMM training module 344 (described in detail in reference to FIG. 3B) learns the model parameters from historical data using a certain algorithm, such as the Baum-Welch algorithm, to model clickstream paths with HMMs." [0049]);
fitting, by the processing device (data processing system [0111]), a hidden Markov model to the training non-conversational data ("The HMM training module 344 uses the clickstreams to train one or more HMMs." [0069]), wherein the hidden Markov model includes a first hidden state and a second hidden state (Figure 1A, State 0, State 1);
(Figure 1C, State 0 Emission Distribution 170) associated with the first hidden state of the hidden Markov Model (Figure 1C, State 0) and (ii) a second data segment that includes a second subset of the observations (Figure 1D, State 1 Emission Distribution 180) associated with the second hidden state of the hidden Markov Model (Figure 1D, State 1) ("A forward variable is computed for each event that occurs. For example, a forward variable is computed for a first page visit, and then for every page visit subsequent to that during a browsing session of a particular user. At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]);
training, by the processing device (data processing system [0111]), (a) a first predictive model to predict the first hidden state of the hidden Markov model (Figure 5, Train HMM 570, Purchase HMM params 550; Figure 6, Purchase likelihood 650) based on the first data segment (Figure 5, Purchase clickstreams 530) rather than the second data segment (Figure 5, Non-purchase clickstreams 540) and (b) a second predictive model to predict the second hidden state of the hidden Markov Model (Figure 5, Train HMM 580, Non-purchase HMM params 560; Figure 6, non-purchase likelihood 660) based on the second data segment (Non-purchase clickstreams 540) rather than the first data segment (Figure 5, Purchase clickstreams 530);
determining that input non-conversational data (Figure 6, Clickstream 630) for an entity is more likely to correspond to the first hidden state as compared to the second hidden state ("As described at least in reference to FIG. 6, the classifier module 354 scores (e.g., using the Forward Recursion algorithm) the current clickstream using the HMM parameters received from the training engine 340 to determine likelihoods that the current clickstream (e.g., encoded in the form of a sequence of symbols as in FIGS. 7A and 7B) corresponds to an output of the HMM." [0073]); and
generating a predicted behavior (Figure 10, Purchase probability 1090; alternatively, Score Purchase HMM output 1030, 1040) by applying the first predictive model (Figure 10, Score Purchase HMM 1010) to input non-conversational data (Figure 10, Clickstream 630) ("Other attributes (e.g., user type (e.g., consumer or business), demographic information, product information, etc.) 1080, from other sources, may also be used as input to the neural network classifier 1070. Based on all of its inputs, the neural network classifier 1070 outputs a purchase probability 1090. This computed probability is informed by both clickstream and non-clickstream data." [0098]).

Mehanian does not explicitly teach accessing training conversational data, 
determining, from the training conversational data, frequency statistics for textual content within the training conversational data;
generating decision points by providing the frequency statistics to a trained classification model and receiving, from the trained classification model, the decision points, wherein each decision point identifies a respective action or a sentiment,
grouping decision points and observations into data segments, and 
using the data segments in predicting hidden states and predicted behavior.

Dilip teaches accessing training conversational data ("By extracting data from multiple sources (e.g. social media conversations...)" [0051]; "In the following example, a conversation includes "Deciding which camera to buy between a Canon Powershot SD1000 or a Nikon Coolpix S230."" [0026]), 
determining, from the training conversational data, frequency statistics for textual content within the training conversational data ("The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions. These algorithms and procedures identify various statistical correlations between topics, phrases, and other data." [0069]; "The “Birthday” topic cluster includes topics: balloons and cake. These topic clusters are regularly updated by adding new topics with high weightings and by reducing the weighting associated with older, less frequently used comments." [0098]);
generating decision points ("Procedure 800 is useful in classifying words and/or phrases contained in various social media communications, catalogs, product listings, online conversations and any other data source." [0083]) by providing the frequency statistics to a trained classification model and receiving, from the trained classification model, the decision points, wherein each decision point identifies a respective action or a sentiment ("Intent analyzer 120 determines an intent associated with the various user communications and response generator 118 generates a response to particular communications based on the intent and other data associated with similar communications. A user intent may include, for example, an intent to purchase a product or service, an intent to obtain information about a product or service, an intent to seek comments from other users of a product or service, and the like." [0024]; "The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions. These algorithms and procedures identify various statistical correlations between topics, phrases, and other data." [0069]), 
grouping decision points and observations into data segments ("Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]), and 
using the data segments in predicting a first or second hidden state ("The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions." [0069]; "The second type of information is associated with a user's intent level (e.g., whether they are gathering information or ready to buy a particular product or service)." [0081]; this teaches analyzing words to determine if the user is in a first state of gathering information or a second state of ready to buy) and predicted behavior (Fig. 9, generate response 906, 910 based on user intent 904, 908; "A Maximum entropy classifier is a model used to predict the probabilities of different possible outcomes." [0064]).
Mehanian and Dilip are analogous art because both are directed towards determining through user interaction a user's predicted intent to purchase. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the hidden Markov model based on website interactions of Mehanian with the user intent conversation analysis of Dilip.  The modification would have been obvious because one of ordinary skill would be motivated to support users of online systems and services by analyzing intent associated with user communications, as suggested by Dilip (Dilip: [0003]).

Regarding Claim 2,
The Mehanian/Dilip combination teaches the method of claim 1.  Mehanian further teaches wherein fitting the hidden Markov model comprises:
selecting training sequences of observations from the training non-conversational data ("During a particular user's session, the training encoder module 342 (discussed in detail in reference to FIG. 3B) may encode the user's clickstream path as HSSSP. Thus, the user started out on the website's home page (H). The user then made 3 searches, landing on 3 different search results pages (SSS). The user then selected (e.g., touched, clicked, etc.) on one of the listed products on the last search results page, which landed him or her on a product description page (P)" [0047]), wherein each observation  includes one or more respective task features ("FIG. 2 is a table illustrating an example embodiment of a coding scheme for clickstream analysis. In this example, the resulting emission alphabet 202 has 12 symbols and the table includes a column showing an example emission alphabet 202, a list of webpages or actions 204 corresponding to each symbol in the emission alphabet 202, and set of descriptions 206 corresponding to each symbol in the emissions alphabet 202." [0046]) related to respective interactions ("user's clickstream path as HSSSP" [0047]) with a respective consumer entity  ("particular user's session" [0047]) via a relationship management tool (training encoder module 342);
identifying a number of states for the hidden Markov model ("FIGS. 1A-1E illustrate an example embodiment for a 2-state HMM including a state diagram, a transition matrix, emission distributions and a sample output. FIG. 1A in particular is an illustration 100 of an example 2-state HMM, where the two states are represented as disks labeled State 0 (disk 104) and State 1 (disk 108)." [0040]); and
fitting the training sequences of observations to a corresponding Markov chain having the identified number of states ("The HMM accounts for this clickstream path as follows. The user started in State 0 and made transitions back to State 0 for the next 4 page visits. For instance, the user remained in State 0 for 5 sequential page visits). While in State 0, the user visited different pages with probabilities given by the emission distribution for State 0 shown in FIG. 1C. After the fifth page, the user transitioned to State 1 and made transitions back to State 1 for the next 3 page visits. While in State 1, the user visited different pages with probabilities most closely corresponding to the distribution given by the State 1 emission distribution shown in FIG. 1D." [0048]).
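For illustration only, the clickstream encoding quoted from Mehanian paragraph [0047] can be sketched as a lookup from page type to emission symbol. This is a minimal sketch under stated assumptions: only the symbols H, S, and P are taken from the quoted passage, while the page-type keys and the function name are hypothetical and the remaining symbols of the 12-symbol alphabet of FIG. 2 are not reproduced.

```python
# Symbols H, S, and P come from the quoted passage; the dictionary keys are
# hypothetical labels chosen for this sketch.
SYMBOLS = {
    "home": "H",
    "search_results": "S",
    "product_description": "P",
}


def encode_clickstream(page_types):
    """Encode a session's ordered page visits as a string of emission symbols."""
    return "".join(SYMBOLS[page_type] for page_type in page_types)


if __name__ == "__main__":
    session = [
        "home",
        "search_results",
        "search_results",
        "search_results",
        "product_description",
    ]
    print(encode_clickstream(session))
```

The session in the usage example mirrors the visit order described in the quoted passage: a home page, three search results pages, and a product description page.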

Regarding Claim 5,
The Mehanian/Dilip combination teaches the method of claim 1.  Mehanian and Dilip further teach wherein grouping the observations and the decision points into the data segments comprises:
determining that the first hidden state is associated with a first subset of the observations associated with a first time period (Mehanian: "the historical data may include session data pertaining to a multiplicity of users over a period of time. The session data includes a record of each page (URL) that a given user visited in his or her session." [0080]; "At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]);
identifying a first subset of the decision points associated with the first time period (Dilip: " In operation, response generator 118 retrieves social media interactions and similar communications (e.g., “tweets” on Twitter, blog posts and social media posts) during a particular time period" [0059]);
assigning the first subset of the observations and the first subset of the decision points to the first data segment (Dilip: "Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]) based on the first time period being associated with the first hidden state, the first subset of the observations (Mehanian: "the historical data may include session data pertaining to a multiplicity of users over a period of time. The session data includes a record of each page (URL) that a given user visited in his or her session." [0080]; "At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]), and the first subset of the decision points (Dilip: " In operation, response generator 118 retrieves social media interactions and similar communications (e.g., “tweets” on Twitter, blog posts and social media posts) during a particular time period" [0059]);
determining that the second hidden state is associated with a second subset of the observations associated with a second time period (Mehanian: "the historical data may include session data pertaining to a multiplicity of users over a period of time. The session data includes a record of each page (URL) that a given user visited in his or her session." [0080]; "At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]);
identifying a second subset of the decision points associated with the second time period (Dilip: "In operation, response generator 118 retrieves social media interactions and similar communications (e.g., “tweets” on Twitter, blog posts and social media posts) during a particular time period" [0059]); and
assigning the second subset of the observations and the second subset of the decision points to the second data segment (Dilip: "Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]) based on the second time period being associated with the second hidden state, the second subset of the observations (Mehanian: "the historical data may include session data pertaining to a multiplicity of users over a period of time. The session data includes a record of each page (URL) that a given user visited in his or her session." [0080]; "At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]), and the second subset of the decision points (Dilip: " In operation, response generator 118 retrieves social media interactions and similar communications (e.g., “tweets” on Twitter, blog posts and social media posts) during a particular time period" [0059]).
Mehanian and Dilip are analogous art because both are directed towards determining through user interaction a user's predicted intent to purchase. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the hidden Markov model based on website interactions of Mehanian with the user intent conversation analysis of Dilip.  The modification would have been obvious because one of ordinary skill would be motivated to support users of online systems and services by analyzing intent associated with user communications, as suggested by Dilip (Dilip: [0003]).

Regarding Claims 15, 16, and 18,
Claims 15, 16, and 18 recite a non-transitory medium having instructions executable by a processing device for performing functions corresponding to the method steps performed by a processing device recited in claims 1, 2, and 5, respectively.  The Mehanian/Dilip combination teaches the limitations of claims 15, 16, and 18 as set forth above in connection with claims 1, 2, and 5.  Therefore, claims 15, 16, and 18 are rejected under the same rationale as respective claims 1, 2, and 5.

Claims 3-4 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Mehanian et al. (US 2015/0269609, hereinafter "Mehanian") in view of Dilip et al. (US 2011/0179114, hereinafter "Dilip") and Yue et al. (Modeling Search Processes using Hidden States in Collaborative Exploratory Web Search, hereinafter "Yue").

Regarding Claim 3,
The Mehanian/Dilip combination teaches the method of claim 2.  The Mehanian/Dilip combination does not teach 
evaluating, based on a first output of a model-selection function, the hidden Markov model having the identified number of states, wherein the model-selection function includes a first term rewarding an increased log-likelihood for the hidden Markov model and a second term penalizing an increased number of states in the hidden Markov model;
identifying a different number of states for the hidden Markov model; and
fitting the training sequences of observations to an additional Markov chain having the different number of states;
evaluating, based on a second output of the model-selection function, the hidden Markov model having the different number of states; and
selecting the hidden Markov model having the different number of states based on the second output of the model-selection function being less than the first output of the model-selection function.

Yue teaches evaluating, based on a first output of a model-selection function (model selection functions "Akaike information criterion (AIC)" and "Bayesian information criterion (BIC)" in sec. Model Selection, p. 824-825; the lower the AIC or BIC score the better), the hidden Markov model having the identified number of states, wherein the model-selection function ("BIC = -2 x log(L) + log(s) x p   Eq. (1)" sec. Model Selection, p. 824-825) includes a first term rewarding an increased log-likelihood for the hidden Markov model ("L denotes the log-likelihood of all samples" sec. Model Selection, p. 824-825) and a second term penalizing an increased number of states in the hidden Markov model ("p can be computed using (N-1) + (N-1)x(N-1) + Nx(M-1)" sec. Model Selection, p. 824-825; "number of hidden states N" sec. Model Selection, p. 824-825);
identifying a different number of states for the hidden Markov model and fitting the training sequences of observations to an additional Markov chain having the different number of states; evaluating, based on a second output of the model-selection function, the hidden Markov model having the different number of states ("Figure 5 plots the BIC values against the number of hidden states in the IND condition. We can see that BIC has the optimal value when the number of hidden states is set to 4." sec. Validation of HMM, p. 825; Figure 5 teaches testing models with hidden states numbering 2, 3, 4, 5, 6, and 7 and evaluating each using BIC); and
selecting the hidden Markov model having the different number of states based on the second output of the model-selection function being less than the first output of the model-selection function ("Figure 5 plots the BIC values against the number of hidden states in the IND condition. We can see that BIC has the optimal value when the number of hidden states is set to 4." sec. Validation of HMM, p. 825;  Table 2: Hidden States and Emission Probability in IND; "In HMM, each observed action corresponds to a hidden state in Table 2." sec. Validating HMM in Individual Search; Table 2 uses four hidden states as found in Figure 5 to be the optimal value).
(Yue: sec. Model Selection, p. 824).
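For illustration only, the model-selection comparison mapped to Yue above can be sketched as follows: a BIC score is computed for each candidate number of hidden states and the candidate with the lowest score is selected. This is a minimal sketch under stated assumptions. The parameter count follows the expression quoted from Yue, the first term is taken as -2 times the log-likelihood, s is assumed to be the number of samples, M is assumed to be the number of emission symbols, and the log-likelihood values are made-up placeholders rather than results from any reference.

```python
import math


def free_parameters(n_states, n_symbols):
    """Parameter count p as quoted from Yue: (N-1) + (N-1)x(N-1) + Nx(M-1)."""
    return (n_states - 1) + (n_states - 1) * (n_states - 1) + n_states * (n_symbols - 1)


def bic(log_likelihood, n_samples, n_states, n_symbols):
    """BIC score: the fit term falls as log-likelihood rises, the penalty grows with N."""
    fit_term = -2.0 * log_likelihood
    penalty_term = math.log(n_samples) * free_parameters(n_states, n_symbols)
    return fit_term + penalty_term


def select_state_count(log_likelihoods, n_samples, n_symbols):
    """Return the state count with the lowest BIC, plus every score."""
    scores = {
        n_states: bic(ll, n_samples, n_states, n_symbols)
        for n_states, ll in log_likelihoods.items()
    }
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Placeholder log-likelihoods keyed by number of hidden states (not real data).
    candidates = {2: -500.0, 4: -300.0}
    best, scores = select_state_count(candidates, n_samples=1000, n_symbols=12)
    print(best, scores)
```

Because a lower score is better, the candidate whose second output of the function is less than the first output is the one retained.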

Regarding Claim 4,
The Mehanian/Dilip/Yue combination teaches the method of claim 3.  Mehanian and Dilip further teach wherein fitting further comprises identifying, from the hidden Markov model, (i) the first subset of observations (Mehanian: Figure 1C, State 0 Emission Distribution 170) and decision points (Dilip: "Intent analyzer 120 determines an intent associated with the various user communications and response generator 118 generates a response to particular communications based on the intent and other data associated with similar communications. A user intent may include, for example, an intent to purchase a product or service, an intent to obtain information about a product or service, an intent to seek comments from other users of a product or service, and the like." [0024]; "The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions. These algorithms and procedures identify various statistical correlations between topics, phrases, and other data." [0069]), wherein the first subset of observations and decision points have a first probability of being associated with the first hidden state of the hidden Markov model (Mehanian: Figure 6, Purchase likelihood 650), wherein the first probability is greater than a threshold (Mehanian: Figure 6, Compute probability 670, non-purchase likelihood 660; the non-purchase likelihood is a threshold for the purchase likelihood for determining which is more likely in computing the purchase probability) and (ii) the second subset of observations (Mehanian: Figure 1D, State 1 Emission Distribution 180) and decision points (Dilip: "Intent analyzer 120 determines an intent associated with the various user communications and response generator 118 generates a response to particular communications based on the intent and other data associated with similar communications. A user intent may include, for example, an intent to purchase a product or service, an intent to obtain information about a product or service, an intent to seek comments from other users of a product or service, and the like." [0024]; "The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions. These algorithms and procedures identify various statistical correlations between topics, phrases, and other data." [0069]), wherein the second subset of observations and decision points have a second probability of being associated with the second hidden state of the hidden Markov model (Mehanian: Figure 6, non-purchase likelihood 660), wherein the second probability is greater than the threshold (Mehanian: Figure 6, Compute probability 670, non-purchase likelihood 660; the purchase likelihood is a threshold for the non-purchase likelihood for determining which is more likely in computing the purchase probability).

Regarding Claim 17,
Claim 17 recites a non-transitory medium having instructions executable by a processing device for performing functions corresponding to the method steps performed by a processing device recited in claim 3.  The Mehanian/Dilip/Yue combination teaches the limitations of claim 17 as set forth above in connection with claim 3.  Therefore, claim 17 is rejected under the same rationale as respective claim 3.

Claims 6 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Mehanian et al. (US 2015/0269609, hereinafter "Mehanian") in view of Dilip et al. (US 2011/0179114, hereinafter "Dilip") and Yeung et al. (Logistic Regression: An advancement of predicting consumer purchase propensity, hereinafter "Yeung").

Regarding Claim 6,
The Mehanian/Dilip combination teaches the method of claim 1.  Mehanian and Dilip further teach wherein training the first predictive model for the first hidden state and the second predictive model for the second hidden state comprises, for each hidden state of the first and second hidden states:
selecting a respective data segment that is associated with the hidden state (Dilip: "The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions." [0069]; "The second type of information is associated with a user's intent level (e.g., whether they are gathering information or ready to buy a particular product or service)." [0081]), wherein the respective data segment includes decision point values (Dilip: "Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]), observation values (Mehanian: Figure 5, HMM params 550, 560; "In some embodiments, the HMM training module 344 (described in detail in reference to FIG. 3B) learns the model parameters from historical data using a certain algorithm, such as the Baum-Welch algorithm, to model clickstream paths with HMMs." [0049]; "A forward variable is computed for each event that occurs. For example, a forward variable is computed for a first page visit, and then for every page visit subsequent to that during a browsing session of a particular user. At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]), and training predictive behavior values (Dilip: Fig. 9, generate response 906, 910 based on user intent 904, 908; "A Maximum entropy classifier is a model used to predict the probabilities of different possible outcomes." [0064]);
Mehanian and Dilip are analogous art because both are directed towards determining through user interaction a user's predicted intent to purchase. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the hidden markov model based on website interactions of Mehanian with the user intent conversation analysis of Dilip.  The modification would have been obvious because one of ordinary skill would be motivated to support users of online systems and services by analyzing intent associated with user communications, as suggested by Dilip (Dilip: [0003]).

The Mehanian/Dilip combination does not teach accessing a logistic regression model having (i) predictor variables corresponding to the decision point values and the observation values and (ii) an output variable corresponding to the training predictive behavior values;
determining a respective set of regression coefficients that combine the predictor variables having the decision point values and the observation values from the respective data segment into the training predictive behavior values; and
outputting the logistic regression model with the respective set of regression coefficients as a respective predictive model for the hidden state.

Yeung teaches accessing a logistic regression model ("logistic regression model" sec. Development of logistic regression model, p. 76) having (i) predictor variables corresponding to the decision point values and the observation values ("explanatory variables (EVs)" sec. Questionnaire design – dependent and explanatory variables, p. 73) and (ii) an output variable corresponding to the training predictive behavior values ("dependent variable (DV)" sec. Questionnaire design – dependent and explanatory variables, p. 73);
determining a respective set of regression coefficients ("coefficients Bj" sec. Development of logistic regression model, p. 76) that combine the predictor variables having the decision point values and the observation values from the respective data segment into the training predictive behavior values (Table 3; "From these results, the model was as follows: [equation image: media_image3.png]"); and
outputting the logistic regression model with the respective set of regression coefficients as a respective predictive model for the hidden state ("As shown in Table 4, the overall accuracy of the logistic regression model to predict consumer purchase propensity is acceptable (overall 66.5% predicted correctly)" sec. Evaluation of predictive power using classification table, pp. 78-79).

Mehanian and Yeung are analogous art because they are both directed towards predicting a user's purchase probability. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the purchase prediction model of the Mehanian/Dilip combination with the logistic regression model of Yeung.  The modification would have been obvious because Mehanian suggests substituting its neural network classifier with a logistic regression model (Mehanian: [0098]).  One of ordinary skill in the art would be motivated to perform this substitution because logistic regression is an effective analytical method that is well suited for predicting consumer purchase propensity, as suggested by Yeung (Yeung: Abstract, p. 71; Conclusion, p. 79).
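
For illustration only, the following minimal Python sketch shows one way a separate set of logistic regression coefficients could be determined for each hidden state from its own data segment. It is not taken from Mehanian, Dilip, Yeung, or Applicant's specification. The segment contents, the names `fit_logistic`, `predict_proba`, `segments`, and `models`, and the use of batch gradient descent as the fitting method are all assumptions made for the example.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(coef, x):
    """coef = [intercept, b1, ..., bk]; x = predictor values for one record."""
    return sigmoid(coef[0] + sum(b * v for b, v in zip(coef[1:], x)))


def fit_logistic(rows, labels, lr=0.5, epochs=5000):
    """Determine regression coefficients by batch gradient descent (assumed method)."""
    coef = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(coef)
        for x, y in zip(rows, labels):
            err = predict_proba(coef, x) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        coef = [b - lr * g / len(rows) for b, g in zip(coef, grad)]
    return coef


# Hypothetical data segments, one per hidden state. Each record holds
# [decision point value, observation value]; each label is the training
# predictive behavior value (1 = behavior occurred, 0 = it did not).
segments = {
    "state_0": ([[0, 1], [1, 0], [0, 0], [1, 1]], [0, 0, 0, 1]),
    "state_1": ([[0, 1], [1, 0], [0, 0], [1, 1]], [1, 1, 0, 1]),
}

# One logistic regression model (one coefficient set) per hidden state.
models = {state: fit_logistic(X, y) for state, (X, y) in segments.items()}

for state, coef in models.items():
    print(state, [round(b, 2) for b in coef])
```

In this sketch the same predictor values lead to different coefficient sets because each hidden state's segment carries different behavior labels.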

Regarding Claim 19,
Claim 19 recites a non-transitory medium having instructions executable by a processing device for performing functions corresponding to the method steps performed by a processing device recited in claim 6.  The Mehanian/Dilip/Yeung combination teaches the limitations of claim 19 as set forth above in connection with claim 6.  Therefore, claim 19 is rejected under the same rationale as respective claim 6.

Claims 7 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Mehanian et al. (US 2015/0269609, hereinafter "Mehanian") in view of Dilip et al. (US 2011/0179114, hereinafter "Dilip"), Yue et al. (Modeling Search Processes using Hidden States in Collaborative Exploratory Web Search, hereinafter "Yue"), and Yeung et al. (Logistic Regression: An advancement of predicting consumer purchase propensity, hereinafter "Yeung").

Regarding Claim 7,
The Mehanian/Dilip combination teaches the method of claim 1.  Mehanian and Dilip further teach wherein training the hidden Markov model comprises (i) selecting training sequences of observations from the training non-conversational data (Mehanian: "During a particular user's session, the training encoder module 342 (discussed in detail in reference to FIG. 3B) may encode the user's clickstream path as HSSSP. Thus, the user started out on the website's home page (H). The user then made 3 searches, landing on 3 different search results pages (SSS). The user then selected (e.g., touched, clicked, etc.) on one of the listed products on the last search results page, which landed him or her on a product description page (P)" [0047]) and (ii) fitting the training sequences of observations to a corresponding Markov chain (Mehanian: "The HMM accounts for this clickstream path as follows. The user started in State 0 and made transitions back to State 0 for the next 4 page visits. For instance, the user remained in State 0 for 5 sequential page visits). While in State 0, the user visited different pages with probabilities given by the emission distribution for State 0 shown in FIG. 1C. After the fifth page, the user transitioned to State 1 and made transitions back to State 1 for the next 3 page visits. While in State 1, the user visited different pages with probabilities most closely corresponding to the distribution given by the State 1 emission distribution shown in FIG. 1D." [0048]);
wherein grouping the observations and the decision points into the data segments comprises, for each hidden state:
determining that the hidden state is associated with a respective subset of the observations associated with a respective time period (Mehanian: "the historical data may include session data pertaining to a multiplicity of users over a period of time. The session data includes a record of each page (URL) that a given user visited in his or her session." [0080]; "At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]),
identifying a respective subset of the decision points associated with the respective time period (Dilip: " In operation, response generator 118 retrieves social media interactions and similar communications (e.g., “tweets” on Twitter, blog posts and social media posts) during a particular time period" [0059]), and
grouping the respective subset of the observations and the respective subset of the decision points into a respective one of the data segments (Dilip: "Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]);
wherein generating the first predictive model for the first hidden state and the second predictive model for the second hidden state comprises, for each hidden state of the first and second hidden states:
selecting a respective data segment that is associated with the hidden state (Dilip: "The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions." [0069]; "The second type of information is associated with a user's intent level (e.g., whether they are gathering information or ready to buy a particular product or service)." [0081]), the data segment including decision point values (Dilip: "Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]), observation values (Mehanian: Figure 5, HMM params 550, 560; "In some embodiments, the HMM training module 344 (described in detail in reference to FIG. 3B) learns the model parameters from historical data using a certain algorithm, such as the Baum-Welch algorithm, to model clickstream paths with HMMs." [0049]; "A forward variable is computed for each event that occurs. For example, a forward variable is computed for a first page visit, and then for every page visit subsequent to that during a browsing session of a particular user. At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]), and training predictive behavior values (Dilip: Fig. 9, generate response 906, 910 based on user intent 904, 908; "A Maximum entropy classifier is a model used to predict the probabilities of different possible outcomes." [0064]),
Mehanian and Dilip are analogous art because both are directed towards determining through user interaction a user's predicted intent to purchase. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the hidden markov model based on website interactions of Mehanian with the user intent conversation analysis of Dilip.  The modification would have been obvious because one of ordinary skill would be motivated to support users of online systems and services by analyzing intent associated with user communications, as suggested by Dilip (Dilip: [0003]).

The Mehanian/Dilip combination does not teach:
wherein the hidden Markov model generated from the training sequences of observations has a number of states that minimizes one or more of an Akaike Information Criterion function and a Bayesian Information Criterion function;
accessing a logistic regression model having (i) predictor variables corresponding to the decision point values and the observation values and (ii) an output variable corresponding to the training predictive behavior values;
determining a respective set of regression coefficients that combine the predictor variables having the decision point values and the observation values from the respective data segment into the training predictive behavior values; and
outputting the logistic regression model with the respective set of regression coefficients as a respective predictive model for the hidden state.

Yue teaches wherein the hidden Markov model generated from the training sequences of observations has a number of states that minimizes one or more of an Akaike Information Criterion function and a Bayesian Information Criterion function ("Figure 5 plots the BIC values against the number of hidden states in the IND condition. We can see that BIC has the optimal value when the number of hidden states is set to 4." sec. Validation of HMM, p. 825; Table 2: Hidden States and Emission Probability in IND; "In HMM, each observed action corresponds to a hidden state in Table 2." sec. Validating HMM in Individual Search; Table 2 uses four hidden states as found in Figure 5 to be the optimal value).
Mehanian and Yue are analogous art because both are directed towards modeling user activity on the Web using Hidden Markov Models. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the Hidden Markov Model of the Mehanian/Dilip combination with the model selection process of Yue.  The modification would have been obvious because one of ordinary skill would be motivated to choose the optimal number of states that describes a model precisely without overfitting, as suggested by Yue (Yue: sec. Model Selection, p. 824).
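
For illustration only, the following minimal Python sketch shows how a number of hidden states could be chosen by minimizing a Bayesian Information Criterion of the general form BIC = -2 log(L) + log(s) p, where log(L) is the fitted model's log-likelihood, s is the number of samples, and p is the number of free parameters. The candidate log-likelihoods, parameter counts, and sample size below are invented for the example and are not Yue's data. The function names `bic` and `select_n_states` are likewise hypothetical.

```python
import math


def bic(log_likelihood, n_samples, n_params):
    """BIC = -2 * log-likelihood + log(number of samples) * number of parameters."""
    return -2.0 * log_likelihood + math.log(n_samples) * n_params


def select_n_states(candidates, n_samples):
    """candidates maps a state count to (log_likelihood, n_params) for the
    HMM fitted with that many states. Returns the state count whose BIC is
    lowest, together with every score."""
    scores = {n: bic(ll, n_samples, p) for n, (ll, p) in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fitting results: adding states improves the log-likelihood
# but also increases the parameter count that the second term penalizes.
candidates = {
    2: (-2500.0, 10),
    3: (-2380.0, 20),
    4: (-2370.0, 32),
    5: (-2365.0, 46),
}

best, scores = select_n_states(candidates, n_samples=1000)
for n in sorted(scores):
    print(n, round(scores[n], 2))
print("selected number of states:", best)
```

With these made-up inputs the gain in log-likelihood beyond three states is too small to offset the parameter penalty, so the three-state model has the lowest score.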

Yeung teaches accessing a logistic regression model ("logistic regression model" sec. Development of logistic regression model, p. 76) having (i) predictor variables corresponding to the decision point values and the observation values ("explanatory variables (EVs)" sec. Questionnaire design – dependent and explanatory variables, p. 73) and (ii) an output variable corresponding to the training predictive behavior values ("dependent variable (DV)" sec. Questionnaire design – dependent and explanatory variables, p. 73);
determining a respective set of regression coefficients ("coefficients Bj" sec. Development of logistic regression model, p. 76) that combine the predictor variables having the decision point values and the observation values from the respective data segment into the training predictive behavior values (Table 3; "From these results, the model was as follows: [equation image: media_image3.png]"); and
outputting the logistic regression model with the respective set of regression coefficients as a respective predictive model for the hidden state ("As shown in Table 4, the overall accuracy of the logistic regression model to predict consumer purchase propensity is acceptable (overall 66.5% predicted correctly)" sec. Evaluation of predictive power using classification table, pp. 78-79).
Mehanian and Yeung are analogous art because they are both directed towards predicting a user's purchase probability. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the purchase prediction model of the Mehanian/Dilip/Yue combination with the logistic regression model of Yeung.  The modification would have been obvious because Mehanian suggests substituting its neural network classifier with a logistic regression model (Mehanian: [0098]).  One of ordinary skill in the art would be motivated to perform this substitution because logistic regression is an effective analytical method that is well suited for predicting consumer purchase propensity, as suggested by Yeung (Yeung: Abstract, p. 71; Conclusion, p. 79).


Regarding Claim 20,
Claim 20 recites a non-transitory medium having instructions executable by a processing device for performing functions corresponding to the method steps performed by a processing device recited in claim 7.  The Mehanian/Dilip/Yue/Yeung combination teaches the limitations of claim 20 as set forth above in connection with claim 7.  Therefore, claim 20 is rejected under the same rationale as respective claim 7.

Claims 8, 9, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Mehanian et al. (US 2015/0269609, hereinafter "Mehanian") in view of Dilip et al. (US 2011/0179114, hereinafter "Dilip") and Vairavan et al. (Prediction of Mortality in an Intensive care Unit using Logistic Regression and a Hidden Markov Model, hereinafter "Vairavan").

Regarding Claim 8,
Mehanian teaches a computing system comprising: a processor (data processing system [0111]); and non-transitory computer-readable medium (computer-readable medium [0110]) having instructions stored thereon, wherein when executed by the processor, the instructions perform operations comprising: 
accessing training non-conversational data (Figure 5, clickstreams 530, 540) having observations (Figure 5, HMM params 550, 560; "In some embodiments, the HMM training module 344 (described in detail in reference to FIG. 3B) learns the model parameters from historical data using a certain algorithm, such as the Baum-Welch algorithm, to model clickstream paths with HMMs." [0049]); 
fitting a hidden Markov model to the training non-conversational data ("The HMM training module 344 uses the clickstreams to train one or more HMMs." [0069]), wherein the hidden Markov model includes a first hidden state and a second hidden state (Figure 1A, State 0, State 1);
grouping the observations and decision points into data segments, wherein (i) a first data segment includes a first subset of the observations (Figure 1C, State 0 Emission Distribution 170) and the decision points associated with the first hidden state of the Hidden Markov Model (Figure 1C, State 0) and (ii) a second data segment includes a second subset of the observations (Figure 1D, State 1 Emission Distribution 180) and the decision points associated with the second hidden state of the Hidden Markov Model (Figure 1D, State 1) ("A forward variable is computed for each event that occurs. For example, a forward variable is computed for a first page visit, and then for every page visit subsequent to that during a browsing session of a particular user. At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]);
training (a) a first model to predict the first hidden state of the Hidden Markov Model (Figure 5, Train HMM 570, Purchase HMM params 550; Figure 6, Purchase likelihood 650) by mapping the first data segment to a first set of predictor variables of the first model (Figure 5, Purchase clickstreams 530) and (b) a second model to predict the second hidden state of the Hidden Markov Model (Figure 5, Train HMM 580, Non-purchase HMM params 560; Figure 6, non-purchase likelihood 660)  by mapping the second data segment (Non-purchase clickstreams 540) to a second set of predictor variables of the second model (Non-purchase clickstreams 540); and 
providing the trained first model and the trained second model to an additional computing system (Figure 10) that is configured for: 
determining that input non-conversational data (Figure 6, Clickstream 630) for an entity is more likely to correspond to the first hidden state as compared to the second hidden state ("As described at least in reference to FIG. 6, the classifier module 354 scores (e.g., using the Forward Recursion algorithm) the current clickstream using the HMM parameters received from the training engine 340 to determine likelihoods that the current clickstream (e.g., encoded in the form of a sequence of symbols as in FIGS. 7A and 7B) corresponds to an output of the HMM." [0073]); and 
generating a predicted behavior (Figure 10, Purchase probability 1090; alternatively, Score Purchase HMM output 1030, 1040) by applying the first model (Figure 10, Score Purchase HMM 1010) to input non-conversational data (Figure 10, Clickstream 630) ("Other attributes (e.g., user type (e.g., consumer or business), demographic information, product information, etc.) 1080, from other sources, may also be used as input to the neural network classifier 1070. Based on all of its inputs, the neural network classifier 1070 outputs a purchase probability 1090. This computed probability is informed by both clickstream and non-clickstream data." [0098]).  
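
For illustration only, the following minimal Python sketch shows the general technique of scoring a clickstream against two separately trained HMMs with the forward recursion and comparing the resulting likelihoods, as in the purchase and non-purchase scoring cited above. All parameter values, the three-symbol alphabet (H, S, P), and the names `forward_likelihood`, `purchase_hmm`, and `non_purchase_hmm` are assumptions made for the example and do not come from Mehanian.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | HMM) using the forward recursion.

    start[i]     : probability of starting in state i
    trans[i][j]  : probability of moving from state i to state j
    emit[i][sym] : probability that state i emits symbol sym
    """
    n = len(start)
    # Forward variable for the first event.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    # Update the forward variable for every subsequent event.
    for sym in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][sym]
            for j in range(n)
        ]
    # Long sequences would normally be handled in log space to avoid underflow.
    return sum(alpha)


# Hypothetical two-state HMMs over a three-symbol alphabet:
# H = home page, S = search results page, P = product page.
purchase_hmm = dict(
    start=[0.6, 0.4],
    trans=[[0.7, 0.3], [0.2, 0.8]],
    emit=[{"H": 0.5, "S": 0.3, "P": 0.2}, {"H": 0.1, "S": 0.2, "P": 0.7}],
)
non_purchase_hmm = dict(
    start=[0.8, 0.2],
    trans=[[0.9, 0.1], [0.5, 0.5]],
    emit=[{"H": 0.4, "S": 0.5, "P": 0.1}, {"H": 0.2, "S": 0.6, "P": 0.2}],
)

clickstream = list("HSPPP")
purchase_score = forward_likelihood(clickstream, **purchase_hmm)
non_purchase_score = forward_likelihood(clickstream, **non_purchase_hmm)
label = "purchase" if purchase_score > non_purchase_score else "non-purchase"
print(label)
```

The clickstream is attributed to whichever model assigns it the higher likelihood. The two scores could equally be passed to a downstream classifier rather than compared directly.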


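Mehanian does not explicitly teach accessing training conversational data or identifying decision points based on a textual analysis of the training conversational data.  Mehanian also does not explicitly teach using logistic regression models to predict the hidden states of the Hidden Markov Model.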

Dilip teaches accessing training conversational data ("By extracting data from multiple sources (e.g. social media conversations...)" [0051]; "In the following example, a conversation includes "Deciding which camera to buy between a Canon Powershot SD1000 or a Nikon Coolpix S230."" [0026]),
identifying decision points ("Procedure 800 is useful in classifying words and/or phrases contained in various social media communications, catalogs, product listings, online conversations and any other data source." [0083]) based on a textual analysis of the training conversational data wherein each decision point identifies a respective action or a sentiment ("Intent analyzer 120 determines an intent associated with the various user communications and response generator 118 generates a response to particular communications based on the intent and other data associated with similar communications. A user intent may include, for example, an intent to purchase a product or service, an intent to obtain information about a product or service, an intent to seek comments from other users of a product or service, and the like." [0024]; "The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions. These algorithms and procedures identify various statistical correlations between topics, phrases, and other data." [0069]);
grouping decision points and observations into data segments ("Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]), and
using the data segments in predicting hidden states ("The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions." [0069]; "The second type of information is associated with a user's intent level (e.g., whether they are gathering information or ready to buy a particular product or service)." [0081]; this teaches analyzing words to determine if the user is in a first state of gathering information or a second state of ready to buy) and predicted behavior (Fig. 9, generate response 906, 910 based on user intent 904, 908; "A Maximum entropy classifier is a model used to predict the probabilities of different possible outcomes." [0064]).
Mehanian and Dilip are analogous art because both are directed towards determining through user interaction a user's predicted intent to purchase. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the hidden markov model based on website interactions of Mehanian with the user intent conversation analysis of Dilip.  The modification would have been obvious because one of ordinary skill would be motivated to support users of online systems and services by analyzing intent associated with user communications, as suggested by Dilip (Dilip: [0003]).

Vairavan teaches using logistic regression models (Figure 1, p. 394) to predict the hidden states of the Hidden Markov Model ("we propose an algorithm based on logistic regression and Hidden-Markov model" sec. Abstract, p. 393; "The Event 1 specific algorithm uses logistic regression to combine different features... and the output of HMM." sec. 2.3, p. 394; "The HMM requires three probabilities for each variable 'V'... All the probabilities above are calculated from the data." sec. 2.3, p. 394; Figure 3; this teaches that logistic regression is used to predict the probabilities of the Hidden Markov Model states).
Mehanian and Vairavan both are directed to the problem of determining hidden states using Hidden Markov Models.  It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the hidden markov model of Mehanian 

Regarding Claim 9,
The Mehanian/Dilip/Vairavan combination teaches the computing system of claim 8.  Mehanian further teaches wherein generating the hidden Markov model comprises:
selecting training sequences of observations from the training non-conversational data ("During a particular user's session, the training encoder module 342 (discussed in detail in reference to FIG. 3B) may encode the user's clickstream path as HSSSP. Thus, the user started out on the website's home page (H). The user then made 3 searches, landing on 3 different search results pages (SSS). The user then selected (e.g., touched, clicked, etc.) on one of the listed products on the last search results page, which landed him or her on a product description page (P)" [0047]), wherein each observation includes one or more respective task features ("FIG. 2 is a table illustrating an example embodiment of a coding scheme for clickstream analysis. In this example, the resulting emission alphabet 202 has 12 symbols and the table includes a column showing an example emission alphabet 202, a list of webpages or actions 204 corresponding to each symbol in the emission alphabet 202, and set of descriptions 206 corresponding to each symbol in the emissions alphabet 202." [0046]) related to respective interactions ("user's clickstream path as HSSSP" [0047]) with a respective consumer entity ("particular user's session" [0047]) via a relationship management tool (training encoder module 342);
identifying a number of states for the hidden Markov model ("FIGS. 1A-1E illustrate an example embodiment for a 2-state HMM including a state diagram, a transition matrix, emission distributions and a sample output. FIG. 1A in particular is an illustration 100 of an example 2-state HMM, where the two states are represented as disks labeled State 0 (disk 104) and State 1 (disk 108)." [0040]); and
fitting the training sequences of observations to a corresponding Markov chain having the identified number of states ("The HMM accounts for this clickstream path as follows. The user started in State 0 and made transitions back to State 0 for the next 4 page visits. For instance, the user remained in State 0 for 5 sequential page visits). While in State 0, the user visited different pages with probabilities given by the emission distribution for State 0 shown in FIG. 1C. After the fifth page, the user transitioned to State 1 and made transitions back to State 1 for the next 3 page visits. While in State 1, the user visited different pages with probabilities most closely corresponding to the distribution given by the State 1 emission distribution shown in FIG. 1D." [0048]).


Regarding Claim 12,
The Mehanian/Dilip/Vairavan combination teaches the computing system of claim 8.  Mehanian and Dilip further teach wherein grouping the observations and the decision points into the data segments comprises:
determining that the first hidden state is associated with a first subset of the observations associated with a first time period (Mehanian: "the historical data may include session data pertaining to a multiplicity of users over a period of time. The session data includes a record of each page (URL) that a given user visited in his or her session." [0080]; "At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]);
identifying a first subset of the decision points associated with the first time period (Dilip: " In operation, response generator 118 retrieves social media interactions and similar communications (e.g., “tweets” on Twitter, blog posts and social media posts) during a particular time period" [0059]);
assigning the first subset of the observations and the first subset of the decision points to the first data segment (Dilip: "Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063])  based on the first time period being associated with the first hidden state, the first subset of the observations (Mehanian: "the historical data may include session data pertaining to a multiplicity of users over a period of time. The session data includes a record of each page (URL) that a given user visited in his or her session." [0080]; "At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]), and the first subset of the decision points (Dilip: " In operation, response generator 118 retrieves social media interactions and similar communications (e.g., “tweets” on Twitter, blog posts and social media posts) during a particular time period" [0059]);
determining that the second hidden state is associated with a second subset of the observations associated with a second time period (Mehanian: "the historical data may include session data pertaining to a multiplicity of users over a period of time. The session data includes a record of each page (URL) that a given user visited in his or her session." [0080]; "At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]);
identifying a second subset of the decision points associated with the second time period (Dilip: " In operation, response generator 118 retrieves social media interactions and similar communications (e.g., “tweets” on Twitter, blog posts and social media posts) during a particular time period" [0059]); and
assigning the second subset of the observations and the second subset of the decision points to the second data segment (Dilip: "Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]) based on the second time period being associated with the second hidden state, the second subset of the observations (Mehanian: "the historical data may include session data pertaining to a multiplicity of users over a period of time. The session data includes a record of each page (URL) that a given user visited in his or her session." [0080]; "At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]), and the second subset of the decision points (Dilip: " In operation, response generator 118 retrieves social media interactions and similar communications (e.g., “tweets” on Twitter, blog posts and social media posts) during a particular time period" [0059]).
Mehanian and Dilip are analogous art because both are directed towards determining through user interaction a user's predicted intent to purchase. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the hidden markov model based on website interactions of the Mehanian/Vairavan combination with the user intent conversation analysis of Dilip.  The modification would have been obvious because one of ordinary skill would be motivated to support users of (Dilip: [0003]).
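For illustration only, the grouping limitations mapped above (associating a time period with a hidden state, then assigning the observations and decision points from that period to one data segment) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the application or of any cited reference: the function name `group_by_hidden_state`, the integer period labels, and the sample values are all hypothetical.

```python
from collections import defaultdict


def group_by_hidden_state(period_states, observations, decision_points):
    """Group observations and decision points into one segment per hidden state.

    period_states: dict mapping a time period to its hidden state (hypothetical labels).
    observations, decision_points: lists of (time_period, value) pairs.
    """
    segments = defaultdict(lambda: {"observations": [], "decision_points": []})
    for period, value in observations:
        # The period's hidden state decides which segment receives the value.
        segments[period_states[period]]["observations"].append(value)
    for period, value in decision_points:
        segments[period_states[period]]["decision_points"].append(value)
    return dict(segments)


# Hypothetical data: periods 0 and 1 fall in state "A", period 2 in state "B".
period_states = {0: "A", 1: "A", 2: "B"}
observations = [(0, "H"), (1, "S"), (2, "P")]
decision_points = [(1, "email"), (2, "call")]
segments = group_by_hidden_state(period_states, observations, decision_points)
```

In this sketch the time period is the only link between an observation, a decision point, and a hidden state, which mirrors the "based on the second time period being associated with the second hidden state" language of the limitation.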

Claims 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Mehanian et al. (US 2015/0269609, hereinafter "Mehanian") in view of Dilip et al. (US 2011/0179114, hereinafter "Dilip"), Vairavan et al. (Prediction of Mortality in an Intensive care Unit using Logistic Regression and a Hidden Markov Model, hereinafter "Vairavan"), and Yue et al. (Modeling Search Processes using Hidden States in Collaborative Exploratory Web Search, hereinafter "Yue").

Regarding Claim 10,
The Mehanian/Dilip/Vairavan combination teaches the computing system of claim 9.  The Mehanian/Dilip/Vairavan combination does not explicitly teach: 
means for evaluating, based on a first output of a model-selection function, the hidden Markov model having the identified number of states, wherein the model-selection function includes a first term rewarding an increased log-likelihood for the hidden Markov model and a second term penalizing an increased number of states in the hidden Markov model;
means for identifying a different number of states for the hidden Markov model; and
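means for fitting the training sequences of observations to an additional Markov chain having the different number of states;
means for evaluating, based on a second output of the model-selection function, the hidden Markov model having the different number of states; and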
means for selecting the hidden Markov model having the different number of states based on the second output of the model-selection function being less than the first output of the model-selection function.

Yue teaches means for evaluating, based on a first output of a model-selection function (model selection functions "Akaike information criterion (AIC)" and "Bayesian information criterion (BIC)" in sec. Model Selection, p. 824-825; the lower the AIC or BIC score the better), the hidden Markov model having the identified number of states, wherein the model-selection function ("BIC = -2 x log(L) + log(s) x p   Eq. (1)" sec. Model Selection, p. 824-825) includes a first term rewarding an increased log-likelihood for the hidden Markov model ("L denotes the log-likelihood of all samples" sec. Model Selection, p. 824-825) and a second term penalizing an increased number of states in the hidden Markov model ("p can be computed using (N-1) + (N-1)x(N-1) + Nx(M-1)" sec. Model Selection, p. 824-825; "number of hidden states N" sec. Model Selection, p. 824-825);
means for identifying a different number of states for the hidden Markov model; means for fitting the training sequences of observations to an additional Markov chain having the different number of states; means for evaluating, based on a second output of the model-selection function, the hidden Markov model having the different number of states ("Figure 5 plots the BIC values against the number of hidden states in the IND condition. We can see that BIC has the optimal value when the number of hidden states is set to 4." sec. Validation of HMM, p. 825; Figure 5 teaches testing models with hidden states numbering 2, 3, 4, 5, 6, and 7 and evaluating each using BIC); and
means for selecting the hidden Markov model having the different number of states based on the second output of the model-selection function being less than the first output of the model-selection function ("Figure 5 plots the BIC values against the number of hidden states in the IND condition. We can see that BIC has the optimal value when the number of hidden states is set to 4." sec. Validation of HMM, p. 825;  Table 2: Hidden States and Emission Probability in IND; "In HMM, each observed action corresponds to a hidden state in Table 2." sec. Validating HMM in Individual Search; Table 2 uses four hidden states as found in Figure 5 to be the optimal value).
Mehanian and Yue are analogous art because both are directed towards modeling user activity on the Web using Hidden Markov Models. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the Hidden Markov Model of the Mehanian/Dilip/Vairavan combination with the model selection process of Yue.  The modification would have been obvious because one of ordinary skill would be motivated to choose the optimal number of states that describes a model precisely without overfitting, as suggested by Yue (Yue: sec. Model Selection, p. 824).
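For illustration only, the model-selection process relied upon from Yue can be sketched as follows: compute a BIC score for each candidate number of hidden states and keep the candidate with the lowest score. This is a minimal sketch under stated assumptions. The parameter count follows the expression as quoted above from Yue, the score follows the quoted Eq. (1) with log(L) taken to be the log-likelihood of the fitted model, and the log-likelihood values, sample count, and symbol count are hypothetical numbers chosen for the example rather than data from any reference.

```python
import math


def num_free_params(n_states, n_symbols):
    # Parameter count p as quoted from Yue: (N-1) + (N-1)x(N-1) + Nx(M-1).
    return (n_states - 1) + (n_states - 1) * (n_states - 1) + n_states * (n_symbols - 1)


def bic(log_likelihood, n_samples, n_params):
    # First term rewards a higher log-likelihood; second term penalizes more parameters.
    return -2.0 * log_likelihood + math.log(n_samples) * n_params


def select_num_states(log_likelihoods, n_samples, n_symbols):
    """Return the state count with the lowest BIC, plus all scores."""
    scores = {
        n: bic(ll, n_samples, num_free_params(n, n_symbols))
        for n, ll in log_likelihoods.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical log-likelihoods for fitted models with 2 through 7 hidden states.
hypothetical_ll = {2: -1500.0, 3: -1400.0, 4: -1330.0, 5: -1320.0, 6: -1315.0, 7: -1312.0}
best_n, scores = select_num_states(hypothetical_ll, n_samples=1000, n_symbols=6)
```

With these hypothetical inputs the log-likelihood keeps improving as states are added, but the penalty term grows faster beyond four states, so the lowest score is at four states.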

Regarding Claim 11,
The Mehanian/Dilip/Vairavan/Yue combination teaches the computing system of claim 10.  Yue further teaches wherein the model-selection function comprises one or more of an Akaike Information Criterion function and a Bayesian Information Criterion function ("Akaike information criterion (AIC)" and "Bayesian information criterion (BIC)" in sec. Model Selection, p. 824-825).

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Mehanian et al. (US 2015/0269609, hereinafter "Mehanian") in view of Dilip et al. (US 2011/0179114, hereinafter "Dilip"), Vairavan et al. (Prediction of Mortality in an Intensive care Unit using Logistic Regression and a Hidden Markov Model, hereinafter "Vairavan"), and Yeung et al. (Logistic Regression: An advancement of predicting consumer purchase propensity, hereinafter "Yeung").

Regarding Claim 13,
The Mehanian/Dilip/Vairavan combination teaches the computing system of claim 8.  Mehanian and Dilip further teach wherein generating the first logistic regression model for the first hidden state and the second logistic regression model for the second hidden state comprises, for each hidden state of the first and second hidden states:
selecting a respective data segment is associated with the hidden state (Dilip: "The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions." [0069]; "The second type of information is associated with a user's intent level (e.g., whether they are gathering information or ready to buy a particular product or service)." [0081]), wherein the respective data segment includes decision point values (Dilip: "Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]), observation values (Mehanian: Figure 5, HMM params 550, 560; "In some embodiments, the HMM training module 344 (described in detail in reference to FIG. 3B) learns the model parameters from historical data using a certain algorithm, such as the Baum-Welch algorithm, to model clickstream paths with HMMs." [0049]; "A forward variable is computed for each event that occurs. For example, a forward variable is computed for a first page visit, and then for every page visit subsequent to that during a browsing session of a particular user. At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]), and training predictive behavior values (Dilip: Fig. 9, generate response 906, 910 based on user intent 904, 908; "A Maximum entropy classifier is a model used to predict the probabilities of different possible outcomes." [0064]);
Mehanian and Dilip are analogous art because both are directed towards determining through user interaction a user's predicted intent to purchase. It would have (Dilip: [0003]).

The Mehanian/Dilip/Vairavan combination does not explicitly teach accessing, for the respective logistic regression model, (i) predictor variables corresponding to the decision point values and the observation values and (ii) an output variable corresponding to the training predictive behavior values;
determining a respective set of regression coefficients that combine the predictor variables having the decision point values and the observation values from the respective data segment into the training predictive behavior values; and
outputting the logistic regression model with the respective set of regression coefficients as a respective predictive model for the hidden state.

Yeung teaches accessing, for the respective logistic regression model ("logistic regression model" sec. Development of logistic regression model, p. 76), (i) predictor variables corresponding to the decision point values and the observation values ("explanatory variables (EVs)" sec. Questionnaire design – dependent and explanatory variables, p. 73) and (ii) an output variable corresponding to the training predictive behavior values ("dependent variable (DV)" sec. Questionnaire design – dependent and explanatory variables, p. 73);
determining a respective set of regression coefficients ("coefficients Bj" sec. Development of logistic regression model, p. 76) that combine the predictor variables having the decision point values and the observation values from the respective data segment into the training predictive behavior values (Table 3; "From these results, the model was as follows: [equation image: media_image3.png]"); and
outputting the respective logistic regression model with the respective set of regression coefficients for the hidden state ("As shown in Table 4, the overall accuracy of the logistic regression model to predict consumer purchase propensity is acceptable (overall 66.5% predicted correctly)" sec. Evaluation of predictive power using classification table, pp. 78-79).
Mehanian and Yeung are analogous art because they are both directed towards predicting a user's purchase probability. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the purchase prediction model of the Mehanian/Dilip/Vairavan combination with the logistic regression model of Yeung.  The modification would have been obvious because Mehanian suggests substituting its neural network classifier with a logistic regression model (Mehanian: [0098]).  One of ordinary skill in the art would be motivated to perform this substitution because logistic regression is an effective analytical method that is well suited for predicting consumer purchase propensity, as suggested by Yeung (Yeung: Abstract, p. 71; Conclusion, p. 79).
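For illustration only, the limitation of determining a respective set of regression coefficients for each hidden state can be sketched as fitting one logistic regression model per data segment. This is a minimal sketch under stated assumptions and is not the procedure of Yeung or of the application: it uses plain batch gradient descent rather than any particular statistics package, and the segment names, feature rows, labels, learning rate, and epoch count are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, x):
    # weights[0] is the intercept; the remaining entries pair with the predictors.
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit regression coefficients by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict_proba(weights, x) - y
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Hypothetical data segments: (predictor rows, binary outcome labels) per hidden state.
segments = {
    "state_0": (
        [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [3.0, 1.0], [0.0, 0.0]],
        [0, 0, 1, 1, 1, 0],
    ),
    "state_1": (
        [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [3.0, 1.0]],
        [1, 1, 0, 0],
    ),
}
# One coefficient set per hidden state, each trained only on that state's segment.
models = {state: fit_logistic(rows, labels) for state, (rows, labels) in segments.items()}
```

Each entry in `models` is a separate coefficient vector, so the same predictor values can yield different predicted probabilities depending on which hidden state's model is applied.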

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Mehanian et al. (US 2015/0269609, hereinafter "Mehanian") in view of Dilip et al. (US 2011/0179114, hereinafter "Dilip"), Vairavan et al. (Prediction of Mortality in an Intensive care Unit using Logistic Regression and a Hidden Markov Model, hereinafter "Vairavan"), Yue et al. (Modeling Search Processes using Hidden States in Collaborative Exploratory Web Search, hereinafter "Yue"), and Yeung et al. (Logistic Regression: An advancement of predicting consumer purchase propensity, hereinafter "Yeung").

Regarding Claim 14,
The Mehanian/Dilip/Vairavan combination teaches the computing system of claim 8.  Mehanian and Dilip further teach wherein generating the hidden Markov model comprises (i) selecting training sequences of observations from the training non-conversational data (Mehanian: "During a particular user's session, the training encoder module 342 (discussed in detail in reference to FIG. 3B) may encode the user's clickstream path as HSSSP. Thus, the user started out on the website's home page (H). The user then made 3 searches, landing on 3 different search results pages (SSS). The user then selected (e.g., touched, clicked, etc.) on one of the listed products on the last search results page, which landed him or her on a product description page (P)" [0047]) and (ii) fitting the training sequences of (Mehanian: "The HMM accounts for this clickstream path as follows. The user started in State 0 and made transitions back to State 0 for the next 4 page visits. For instance, the user remained in State 0 for 5 sequential page visits). While in State 0, the user visited different pages with probabilities given by the emission distribution for State 0 shown in FIG. 1C. After the fifth page, the user transitioned to State 1 and made transitions back to State 1 for the next 3 page visits. While in State 1, the user visited different pages with probabilities most closely corresponding to the distribution given by the State 1 emission distribution shown in FIG. 1D." [0048]),
wherein grouping the observations and the decision points into the data segments comprises, for each hidden state:
determining that the hidden state is associated with a respective subset of the observations that is associated with a respective time period (Mehanian: "the historical data may include session data pertaining to a multiplicity of users over a period of time. The session data includes a record of each page (URL) that a given user visited in his or her session." [0080]; "At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]),
identifying a respective subset of the decision points associated with the respective time period (Dilip: " In operation, response generator 118 retrieves social media interactions and similar communications (e.g., “tweets” on Twitter, blog posts and social media posts) during a particular time period" [0059]), and
(Dilip: "Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]);
wherein generating the first logistic regression model for the first hidden state and the second logistic regression model for the second hidden state comprises, for each hidden state of the first and second hidden states:
selecting a respective data segment is associated with the hidden state (Dilip: "The intent analysis procedures discussed herein use various machine learning algorithms, machine learning processes, and classification algorithms to determine a user intent associated with one or more user communications and/or user interactions." [0069]; "The second type of information is associated with a user's intent level (e.g., whether they are gathering information or ready to buy a particular product or service)." [0081]), the data segment including decision point values (Dilip: "Topic clustering module 218 identifies related topics and clusters those topics together to support the intent analysis described herein." [0038]; "The message components are then correlated with other message components from multiple social media interactions and communications to generate topic clusters (block 708). The message components may also be correlated with information from other information sources, such as product information sources, product review sources, and the like. The correlated message components are formed into one or more topic clusters associated with a particular topic (e.g., a product, service, or product category)" [0063]), observation values (Mehanian: Figure 5, HMM params 550, 560; "In some embodiments, the HMM training module 344 (described in detail in reference to FIG. 3B) learns the model parameters from historical data using a certain algorithm, such as the Baum-Welch algorithm, to model clickstream paths with HMMs." [0049]; "A forward variable is computed for each event that occurs. For example, a forward variable is computed for a first page visit, and then for every page visit subsequent to that during a browsing session of a particular user. At the end of the session, or when the user lands on a particular page type, the forward variable for the last event that occurred is the one that may be used by the classifier." [0083]), and training predictive behavior values (Dilip: Fig. 9, generate response 906, 910 based on user intent 904, 908; "A Maximum entropy classifier is a model used to predict the probabilities of different possible outcomes." [0064]).
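For illustration only, the forward variable described in the quoted passage of Mehanian ([0083]) can be sketched with the standard HMM forward recursion, computed once per page visit and with the last value retained. This is a minimal sketch under stated assumptions: the two-state model, its initial, transition, and emission probabilities are hypothetical values, and only the page-type codes H, S, and P and the example path HSSSP come from the quoted text. No scaling is applied, so long sessions would underflow in practice.

```python
def forward_variables(initial, transition, emission, observations):
    """Return the forward variables after each observed event.

    initial[i]: probability of starting in state i.
    transition[i][j]: probability of moving from state i to state j.
    emission[i][symbol]: probability that state i emits the symbol.
    """
    n_states = len(initial)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    history = [alpha]
    for symbol in observations[1:]:
        previous = history[-1]
        alpha = [
            sum(previous[i] * transition[i][j] for i in range(n_states)) * emission[j][symbol]
            for j in range(n_states)
        ]
        history.append(alpha)
    return history


# Hypothetical two-state model over page types H (home), S (search), P (product).
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.2, 0.8]]
emission = [{"H": 0.5, "S": 0.4, "P": 0.1}, {"H": 0.1, "S": 0.3, "P": 0.6}]
history = forward_variables(initial, transition, emission, "HSSSP")
last_forward_variable = history[-1]  # the value for the last event in the session
```

Summing the entries of `last_forward_variable` gives the likelihood of the whole clickstream under the hypothetical model, which is one way such a value could be passed on to a downstream classifier.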

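The Mehanian/Dilip/Vairavan combination does not explicitly teach wherein the hidden Markov model generated from the training sequences of observations has a number of states that minimizes one or more of an Akaike Information Criterion function and a Bayesian Information Criterion function;
accessing a logistic regression model having (i) predictor variables corresponding to the decision point values and the observation values and (ii) an output variable corresponding to the training predictive behavior values;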
determining a respective set of regression coefficients that combine the predictor variables having the decision point values and the observation values from the respective data segment into the training predictive behavior values, and
outputting the respective logistic regression model with the respective set of regression coefficients for the hidden state.

Yue teaches wherein the hidden Markov model generated from the training sequences of observations has a number of states that minimizes one or more of an Akaike Information Criterion function and a Bayesian Information Criterion function ("Figure 5 plots the BIC values against the number of hidden states in the IND condition. We can see that BIC has the optimal value when the number of hidden states is set to 4." sec. Validation of HMM, p. 825;  Table 2: Hidden States and Emission Probability in IND; "In HMM, each observed action corresponds to a hidden state in Table 2." sec. Validating HMM in Individual Search; Table 2 uses four hidden states as found in Figure 5 to be the optimal value)
Mehanian and Yue are analogous art because both are directed towards modeling user activity on the Web using Hidden Markov Models. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the Hidden Markov Model of the Mehanian/Dilip/Vairavan combination with the model selection process of Yue.  The modification would have been obvious because one of ordinary skill would be motivated to choose the optimal number of states that describes a model precisely without overfitting, as suggested by Yue (Yue: sec. Model Selection, p. 824).

Yeung teaches accessing a logistic regression model ("logistic regression model" sec. Development of logistic regression model, p. 76) having (i) predictor variables corresponding to the decision point values and the observation values ("explanatory variables (EVs)" sec. Questionnaire design – dependent and explanatory variables, p. 73) and (ii) an output variable corresponding to the training predictive behavior values ("dependent variable (DV)" sec. Questionnaire design – dependent and explanatory variables, p. 73);
determining a respective set of regression coefficients ("coefficients Bj" sec. Development of logistic regression model, p. 76) that combine the predictor variables having the decision point values and the observation values from the respective data segment into the training predictive behavior values (Table 3; "From these results, the model was as follows: [equation image: media_image3.png]"); and
outputting the respective logistic regression model with the respective set of regression coefficients for the hidden state ("As shown in Table 4, the overall accuracy of the logistic regression model to predict consumer purchase propensity is acceptable (overall 66.5% predicted correctly)" sec. Evaluation of predictive power using classification table, pp. 78-79).
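Mehanian and Yeung are analogous art because they are both directed towards predicting a user's purchase probability. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the purchase prediction model of the Mehanian/Dilip/Vairavan combination with the logistic regression model of Yeung.  The modification would have been obvious because Mehanian suggests substituting its neural network classifier with a logistic regression model (Mehanian: [0098]).  One of ordinary skill in the art would be motivated to perform this substitution because logistic regression is an effective analytical method that is well suited for predicting consumer purchase propensity, as suggested by Yeung (Yeung: Abstract, p. 71; Conclusion, p. 79).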

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES C KUO whose telephone number is (571)270-7477.  The examiner can normally be reached on M-F: 9:00 a.m. - 6:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached on (571) 272-9767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CHARLES C KUO/Examiner, Art Unit 2126  
/MICHAEL J HUNTLEY/Primary Examiner, Art Unit 2116