DETAILED ACTION
Response to Arguments
Applicant’s arguments with respect to claims 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1, 10 and 16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being incomplete for omitting essential steps, such omission amounting to a gap between the steps.  See MPEP § 2172.01.  The omitted steps relate to the third condition and the relevance of the second condition. 
In the former case, claim 1 details that the third condition is associated with the user segment (i.e. wherein the third condition is associated with the user segment), but then goes on to detail that the third condition is associated with a different user segment (i.e. generating, by the computing device, a user segment associated with the third condition). Because the 
In the latter case, claim 1 details that the third condition is associated with the user segment, based on the first relevance of the first condition and a second relevance of a second condition related to the second attribute. The first relevance of the first condition has been previously detailed with the predictive model limitations (i.e. wherein a first relevance of the first condition is based on a comparison of the first predicted outcomes to the outcome of interest), but a second relevance of a second condition related to the second attribute has been stated without detailing the steps taken to arrive at a second relevance of a second condition related to the second attribute. Because essential steps have been omitted, the claim is indefinite.   
 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-7 and 9-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hueter et al. US 2016/0019587 A1 (“Hueter”) in view of Ma, Haiying. "A Study on Customer Segmentation for E-Commerce Using the Generalized Association Rules and Decision Tree." American Journal of Industrial and Business Management 5.12 (2015) (“Ma”). 
Regarding claim 1, Hueter teaches a computer-implemented method for determining user segments created by a predictive model based on user behavioral data, the method comprising: 
receiving, by a computing device, training data and user input, the training data associated with training a predictive model and comprising a plurality of instances, each instance associated with a user interaction within a computer network and comprising a plurality of attributes and an outcome associated with the user interaction, and an input defining an outcome of interest (Hueter paras. 0047-0051, “FIG. 6 shows the use of the invention in a system that selects subjects to whom to recommend a specific item. The application using the recommendation service makes a Service Customer Request to the system. The request includes the attributes that are available and relevant to the request, which include but are not limited to information about the page being viewed, including category, search result, or specific item being viewed; information about the visitor, including age, gender, income, number of children, marital status, income, lifetime value, or other attributes; and information about the nature of the subject's visit to the site, including location (latitude, longitude, altitude, …” Hueter teaches The application using the recommendation service makes a Service Customer Request to the system. The request includes the attributes that are available and relevant to the request (i.e. receiving, by a computing device, training data and user input, the training data associated with training a predictive model) which include but are not limited to information about the page being viewed, including category, search result, or specific item being viewed; information about the visitor, including age, gender, income, number of children, marital status, income, lifetime value, or other attributes; and information about the nature of the subject's visit to the site, including location (latitude, longitude, altitude, state, country, city, postal code, or other location information), time-of-day (adjusted for location), type of device, type of browser, connection speed, referring URL, search engine keyword or other attributes of the visit (i.e. and comprising a plurality of instances, each instance associated with a user interaction within a computer network and comprising a plurality of attributes) returns a score for each possible available subject, whereby the scores indicate the relative probabilities of the subjects transacting the item (i.e. and an outcome associated with the user interaction) selects subjects to whom to recommend a specific item (i.e. and an input defining an outcome of interest)); 
generating, by the computing device, a set of conditions from the training data, each condition comprising an attribute and a range of values for the attribute(Hueter para. 0051-  “The subject are ranked by their combined scores and then filtered according to any specified business rules, which may include rules for pricing, category matching, inventory, or other merchandising goals. Business rules may be based on any attributes of the context, including subject attributes and content metadata.” Hueter teaches The subject are ranked by their combined scores and then filtered according to any specified business rules (i.e. generating, by the computing device, a set of conditions from the training data) Business rules may be based on any attributes of the context, including subject attributes and content metadata (i.e. each condition comprising an attribute and a range of values for the attribute)); and presenting, by the computing device, the user segment to the operator at an interface(Hueter para. 0098, “FIG. 10 shows the parameter selection process based on the first level of candidate segments. This user interface would allow an operator, for example a merchandiser or marketing manager, to get an idea of which variables are predictive of subjects' intents to transact. The operator would then select which variables to include in the segmentation model.” Hueter teaches FIG. 10 shows the parameter selection process based on the first level of candidate segments. This user interface (i.e. and presenting, by the computing device, the user segment to the operator at an interface)).  
Hueter does not teach: determining, by the computing device, a set of relevant conditions according to the predictive model from the set of conditions to the outcome of interest, wherein a relevance of the set of conditions is determined by:  generating a first condition related to a first attribute, wherein the first condition is associated with a first user segment in the training data, applying the predictive model to the first user segment to compute first predicted outcomes for first users in the first user segment, wherein a first relevance of the first condition is based on a comparison of the first predicted outcomes to the outcome of interest; and generating a third 
However Ma teaches: determining, by the computing device, a set of relevant conditions according to the predictive model from the set of conditions to the outcome of interest(Ma, pgs., 815-816, “Step 1: The first stage of the model is to select the variety variable of the purchased commodities from all variables, forming a data item set. Each data within the set corresponds to one type of commodity, constituting a set of objects… Step 6: Extract rules from the pruned decision tree.” Ma teaches Extract rules from the pruned decision tree (i.e. determining, by the computing device, a set of relevant conditions according to the predictive model) Step 1: The first stage of the model is to select the variety variable of the purchased commodities from all variables, forming a data item set. Each data within the set corresponds to one type of commodity, constituting a set of objects (i.e. from the set of conditions to the outcome of interest)), 
wherein a relevance of the set of conditions is determined by:  generating a first condition related to a first attribute, wherein the first condition is associated with a first user segment in the training data(Ma, pg. 815, “Step 1: The first stage of the model is to select the variety variable of the purchased commodities from all variables, forming a data item set. Each data within the set corresponds to one type of commodity, constituting a set of objects. Calculate the degree of support for all possible rules: The support of rule X => Y in the data set is the ratio of numbers between data sets with X, Y and all arrangements.” Ma teaches The support of rule X => Y in the data set (i.e. generating a first condition related to a first attribute) Each data within the set corresponds to one type of commodity (i.e. wherein the first condition is associated with a first user segment in the training data)), 
applying the predictive model to the first user segment to compute first predicted outcomes for first users in the first user segment, wherein a first relevance of the first condition is based on a comparison of the first predicted outcomes to the outcome of interest (Ma, pgs., 815-817, “Step 4: On the second stage of the model, decision tree C5.0 can be used to add up and induce the features obtained out of the association rules… Results shows the outputs of analysis based on the above three outputs generated out of the integrated model. The accuracy ratios of the three rule sets are…85.38%, 93.57%, 82.46% with the decision tree C5.0 model, shown in Figure 3.” Ma teaches On the second stage of the model, decision tree C5.0 can be used to add up and induce the features obtained out of the association rules (i.e. applying the predictive model to the first user segment to compute first predicted outcomes for first users in the first user segment) The accuracy ratio of the first rule set is 85.38%, with the decision tree C5.0 model, shown in Figure 3 (i.e. wherein a first relevance of the first condition is based on a comparison of the first predicted outcomes to the outcome of interest)); 
and generating a third condition related to the first attribute and a second attribute, wherein the third condition is associated with the user segment(Ma, pg. 815, “Step 2: On the base of mining, we specify a minimum support, find out all the specified item sets with the minimum support from the database of commodity variety, known as the frequent item sets. Generate the required generalized association rules by applying the frequent item sets. For example, for frequent item set ABCD and AB, if the ratio conf = support (ABCD)/support (AB) Ma teaches the generalized association rule AB => CD will be generated  (i.e. and generating a third condition related to the first attribute and a second attribute) the specified item sets with the minimum support from the database of commodity variety (i.e. wherein the third condition is associated with the user segment)), based on the first relevance of the first condition and a second relevance of a second condition related to the second attribute (Ma, pgs., 815-818, “Step 4: On the second stage of the model, decision tree C5.0 can be used to add up and induce the features obtained out of the association rules… Results shows the outputs of analysis based on the above three outputs generated out of the integrated model. The accuracy ratios of the three rule sets are…85.38%, 93.57%, 82.46% with the decision tree C5.0 model, shown in Figure 3.” Ma teaches The accuracy ratio of the first rule set is 85.38%, and the accuracy of the second rule set is 93.57%  with the decision tree C5.0 model, shown in Figure 3 (i.e. based on the first relevance of the first condition and a second relevance of a second condition related to the second attribute));  
generating, by the computing device, a user segment associated with the third condition from the set of relevant conditions based on the relevance of the third condition(Ma, pgs., 815-818, “Step 4: On the second stage of the model, decision tree C5.0 can be used to add up and induce the features obtained out of the association rules… Results shows the outputs of analysis based on the above three outputs generated out of the integrated model. The accuracy ratios of the three rule sets are…85.38%, 93.57%, 82.46% with the decision tree C5.0 model, shown in Figure 3.” Ma teaches Figure 3 (i.e. generating, by the computing device, a user segment associated with the third condition from the set of relevant conditions) The accuracy ratio of the third rule set is 82.46%  with the decision tree C5.0 model, shown in Figure 3 (i.e. based on the relevance of the third condition)).
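The support and confidence quantities quoted from Ma (Steps 1 and 2) can be illustrated with a minimal Python sketch. The transaction data, the function names `support` and `confidence`, and the `min_conf` threshold are hypothetical and chosen only for illustration; they do not appear in Ma or in the claims, and comparing the confidence against a minimum value is an assumption about how the rule AB => CD is accepted.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so no rule can be formed
    combined = set(antecedent) | set(consequent)
    return support(transactions, combined) / base


# Hypothetical commodity-variety transactions (illustrative data only).
transactions = [
    {"A", "B", "C", "D"},
    {"A", "B"},
    {"A", "B", "C", "D"},
    {"C"},
]

min_conf = 0.6  # hypothetical threshold, not taken from Ma

# conf = support(ABCD) / support(AB), as in the quoted Step 2 example.
conf = confidence(transactions, {"A", "B"}, {"C", "D"})
rule_generated = conf >= min_conf  # assumed acceptance test for AB => CD
```

In this toy data set, AB appears in three of four transactions and ABCD in two of four, so the sketch divides 0.5 by 0.75 to obtain the confidence of AB => CD.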
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hueter in view of Ma to teach: determining, by the computing device, a set of relevant conditions according to the predictive model from the set of conditions to the outcome of interest, wherein a relevance of the set of conditions is determined by:  generating a first condition related to a first attribute, wherein the first condition is associated with a first user segment in the training data, applying the predictive model to the first user segment to compute first predicted outcomes for first users in the first user segment, wherein a first relevance of the first condition is based on a comparison of the first predicted outcomes to the outcome of interest; and generating a third condition related to the first attribute and a second attribute, wherein the third condition is associated with the user segment, based on the first relevance of the first condition and a second relevance of a second condition related to the second attribute; generating, by the computing device, a user segment associated with the third condition from the set of relevant conditions based on the relevance of the third condition. The motivation to do so would be to combine association rules from big data along with rule-based machine learning to produce better customer segmentation groups (Ma, pgs. 813-814, “With the continuous development of e-commerce, the traditional technique of customer segmentation has been unable to cope with the massive and complex customer data. Based on the data mining technique, the new analyzing technique provides new solutions to the massive data of complex customer segmentation. 
Through collecting and classifying customer information, the new technique intends to find out customer groups with different attribute features: the demand characteristics of the overall customer internal, the buying behavior, the browsing 
Regarding claim 2, Hueter in view of Ma teaches the method of claim 1, wherein generating the set of conditions further comprises: extracting each condition present in each instance of the training data, and aggregating each condition into the set of conditions (Hueter, para. 0057, “FIG. 7B shows an example of a transformation of attributes…The attributes employed in the invention may be considered as m-dimensional tuples (or m-tuples) that are members of a set constructed from the Cartesian product of the sets of attributes of interest.” Hueter teaches FIG. 7B shows an example of a transformation of attributes (i.e. extracting each condition present in each instance of the training data) The attributes employed in the invention may be considered as m-dimensional tuples (or m-tuples) that are members of a set (i.e. and aggregating each condition into the set of conditions)).  
Regarding claim 3, Hueter in view of Ma teaches the method of claim 1, further comprising determining a relevance of the user segment based on a predicted outcome from the predictive model given the conditions that are included in the user segment (Hueter, paras. 0073-0076, “Calculate the density factor d=r/s, whereby r=(number of items of interest in peak sequence) and s=(number of all items in peak sequence). Note that d is number between 0 and 1…A better significance calculation is attained by replacing the formula in step 7 above with the following: $R = \frac{r - r_{avg}}{r + r_{avg}} > T$ where $r_{avg} = s \cdot N_{p} / N_{total}$ and for example T=2. The above process is repeated for all dimensions and cells.” Hueter teaches the following: $R = \frac{r - r_{avg}}{r + r_{avg}} > T$ where $r_{avg} = s \cdot N_{p} / N_{total}$ and for example T=2 (i.e. determining a relevance of the user segment) Calculate the density factor d=r/s, whereby r=(number of items of interest in peak sequence) and s=(number of all items in peak sequence). Note that d is number between 0 and 1 (i.e. based on a predicted outcome from the predictive model given the conditions that are included in the user segment)).  
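The two quantities that feed the significance test quoted from Hueter, the density factor d = r/s and the expected count r_avg = s · N_p / N_total, can be worked through in a short Python sketch. The function names and the sample counts are hypothetical. Reading N_p as the overall number of items of interest and N_total as the overall number of items is an assumption, since the quoted passage does not define them.

```python
def density_factor(r, s):
    """d = r / s: share of items of interest within the peak sequence."""
    if s <= 0:
        raise ValueError("s must be positive")
    return r / s


def expected_count(s, n_p, n_total):
    """r_avg = s * N_p / N_total: count expected if the peak were average."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return s * n_p / n_total


# Hypothetical counts, for illustration only.
r, s = 30, 100            # items of interest / all items in the peak sequence
n_p, n_total = 100, 1000  # assumed meaning: items of interest / all items overall

d = density_factor(r, s)                 # lies between 0 and 1 when r <= s
r_avg = expected_count(s, n_p, n_total)  # baseline that r is compared against
```

With these sample counts the peak holds three times as many items of interest as the baseline would predict, which is the kind of gap the quoted significance ratio R is meant to flag.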
Regarding claim 4, Hueter in view of Ma teaches the method of claim 3, further comprising: generating a second user segment; and determining, based on the relevance of the user segment and a relevance of the second user segment, an optimal set of user segments, the optimal set of user segments comprising at least one of the user segment and second user segment, wherein the optimal set of user segments include most relevant user segments used by the predictive model in predicting the outcome of interest (Hueter paras. 0083-0098, “The system can compose the sequences of several items created with step 2 of the attribute analysis in paragraph…into a single sequence Dpa…and subsequently analyze the resulting sequence Dpa… FIG. 9B shows an example of the composition of presence-absence sequences from several items into a composite sequence…we may wish to exploit the advantages of considering the presence or absence of events pertaining to a set of items of interest, instead of single items… a composition of the sequences for each individual item (or a collection of subsets of items) that aims to increase the significance of the resulting sequence instead of possibly decreasing it. We can accomplish this by constructing the composition as follows: $D_{pa} = w_{1} D_{pa,1} + w_{2} D_{pa,2} + w_{3} D_{pa,3} + \dots$ …The above process of choosing the signs for $w_{j}$ successively one term at a time is intended to avoid the computational cost of a global optimization algorithm (such as simulated annealing or genetic programming) that would explore the choice of sign for each term independently, to arrive at the signs that maximize significance or variance in the sequence Dpa….” Hueter teaches The system can compose the sequences of several items created with step 2 of the attribute analysis in paragraph…into a single sequence Dpa…and subsequently analyze the resulting sequence Dpa… FIG. 9B shows an example of the composition of presence-absence sequences from several items into a composite sequence (i.e. generating a second user segment) a composition of the sequences for each individual item (or a collection of subsets of items) that aims to increase the significance of the resulting sequence instead of possibly decreasing it. We can accomplish this by constructing the composition as follows: $D_{pa} = w_{1} D_{pa,1} + w_{2} D_{pa,2} + w_{3} D_{pa,3} + \dots$ (i.e. and determining, based on the relevance of the user segment and a relevance of the second user segment) The above process of choosing the signs for $w_{j}$ successively one term at a time is intended to avoid the computational cost of a global optimization algorithm (such as simulated annealing or genetic programming) that would explore the choice of sign for each term independently, to arrive at the signs that maximize significance or variance in the sequence (i.e. an optimal set of user segments, the optimal set of user segments comprising at least one of the user segment and second user segment, wherein the optimal set of user segments include most relevant user segments used by the predictive model in predicting the outcome of interest)).  
Regarding claim 5, Hueter in view of Ma teaches the method of claim 4, further comprising: determining, by the computing device, that the second user segment is redundant compared to the user segment; and removing, by the computing device, the second user segment from the optimal set of user segments (Hueter paras. 0083-0098, “By summing until the resulting sequence changes by less than a chosen amount when a term is added, with the change measured using vector-lengths of the sequences with an appropriate norm, such as a Cartesian norm $L_p$ where p=2 (for example, stop when Dpa and the ith term $w_i D_{pa}$ satisfy $|w_i D_{pa}| < \epsilon |D_{pa}|$, for a previously chosen value of $\epsilon$)… but there may be reasons not clear from the training data set to exclude certain variables from the model, such as because it is known to the operator that a particular variable may not be readily available in the operational system or that one variable is redundant to another.” Hueter teaches By summing until the resulting sequence changes by less than a chosen amount when a term is added (i.e. determining, by the computing device, that the second user segment is redundant compared to the user segment) exclude certain variables from the model, such as because it is known to the operator that a particular variable may not be readily available in the operational system or that one variable is redundant to another (i.e. and removing, by the computing device, the second user segment from the optimal set of user segments)).
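For illustration only, the stopping rule quoted above can be sketched in Python. This sketch is not taken from Hueter: the function names `l2_norm` and `accumulate_until_small_change`, the list-based representation of the sequences, and the default value of `eps` are all assumptions. It adds weighted terms one at a time and stops once the L2 length of the next term is less than `eps` times the L2 length of the running sum.

```python
import math

def l2_norm(seq):
    """Cartesian (L_p with p = 2) length of a sequence."""
    return math.sqrt(sum(x * x for x in seq))

def accumulate_until_small_change(terms, weights, eps=0.05):
    """Sum weighted terms until a new term changes the sum by less than eps (relative).

    Hypothetical helper: returns the composite sequence and the number of terms used.
    """
    composite = [0.0] * len(terms[0])
    used = 0
    for w, term in zip(weights, terms):
        contribution = [w * x for x in term]
        # Stop when the new term's length is below eps times the running sum's length
        # (checked only after at least one term has been added).
        if used > 0 and l2_norm(contribution) < eps * l2_norm(composite):
            break
        composite = [c + d for c, d in zip(composite, contribution)]
        used += 1
    return composite, used

# Example data (made up): the third term is too small to matter and is skipped.
example_terms = [[1, 0, 1, 0], [0, 1, 0, 1], [0.01, 0, 0, 0]]
example_sum, example_used = accumulate_until_small_change(example_terms, [1, 1, 1])
```

Under these assumptions, a term that is dropped by the rule contributes almost nothing beyond the terms already summed, which is the sense in which the passage is read on redundancy.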
Regarding claim 6, Hueter in view of Ma teaches the method of claim 4, further comprising processing the user segment to remove redundant or overlapping conditions (Hueter paras. 0083-0098, “[B]ut there may be reasons not clear from the training data set to exclude certain variables from the model, such as because it is known to the operator that a particular variable may not be readily available in the operational system or that one variable is redundant to another.” Hueter teaches exclude certain variables from the model, such as because it is known to the operator that a particular variable may not be readily available in the operational system or that one variable is redundant to another (i.e. processing the user segment to remove redundant or overlapping conditions)).1
Regarding claim 7, Hueter in view of Ma teaches the method of claim 4, wherein determining an optimal set of user segments further comprises: creating a set of user segments, the set of user segments including the user segment and the second user segment; determining a first metric for the user segment and a second metric for the second user segment, the first metric and the second metric based on user segment precision and coverage; based on the first metric being higher than the second metric, retaining the user segment in the optimal set of user segments and removing the second metric from the set of user segments; and providing the set of user segments as the optimal set of segments (Hueter paras. 0083-0098, “[A] composition of the sequences for each individual item (or a collection of subsets of items) that aims to increase the significance of the resulting sequence instead of possibly decreasing it. We can accomplish this by constructing the composition as follows: $D_{pa} = w_1 D_{pa,1} + w_2 D_{pa,2} + w_3 D_{pa,3} + \dots$ …The arithmetic signs of the weights $w_j$ are chosen so that the contribution of Dpa increases the significance of the composite sequence. Several methods may be used to select these signs: We may evaluate the cumulative sum Dpa one term at a time in the order j = 1, 2, 3, …, choosing the sign for $w_i$, at each step that results in the larger significance for Dpa after the ith term is included…By summing until the resulting sequence changes by less than a chosen amount when a term is added, with the change measured using vector-lengths of the sequences with an appropriate norm, such as a Cartesian norm $L_p$ where p=2 (for example, stop when Dpa and the ith term $w_i D_{pa}$ satisfy $|w_i D_{pa}| < \epsilon |D_{pa}|$, for a previously chosen value of $\epsilon$)… but there may be reasons not clear from the training data set to exclude certain variables from the model, such as because it is known to the operator that a particular variable may not be readily available in the operational system or that one variable is redundant to another.” Hueter teaches A composition of the sequences for each individual item (or a collection of subsets of items) that aims to increase the significance of the resulting sequence instead of possibly decreasing it. We can accomplish this by constructing the composition as follows: $D_{pa} = w_1 D_{pa,1} + w_2 D_{pa,2} + w_3 D_{pa,3} + \dots$ (i.e. creating a set of user segments, the set of user segments including the user segment and the second user segment) The arithmetic signs of the weights $w_j$ are chosen so that the contribution of Dpa increases the significance of the composite sequence. Several methods may be used to select these signs: We may evaluate the cumulative sum Dpa one term at a time in the order j = 1, 2, 3, …, choosing the sign for $w_i$, at each step that results in the larger significance for Dpa after the ith term is included (i.e. determining a first metric for the user segment and a second metric for the second user segment, the first metric and the second metric based on user segment precision and coverage) By summing until the resulting sequence changes by less than a chosen amount when a term is added, with the change measured using vector-lengths of the sequences with an appropriate norm, such as a Cartesian norm $L_p$ where p=2 (for example, stop when Dpa and the ith term $w_i D_{pa}$ satisfy $|w_i D_{pa}| < \epsilon |D_{pa}|$, for a previously chosen value of $\epsilon$) (i.e. based on the first metric being higher than the second metric, retaining the user segment in the optimal set of user segments) but there may be reasons not clear from the training data set to exclude certain variables from the model, such as because it is known to the operator that a particular variable may not be readily available in the operational system or that one variable is redundant to another (i.e. and removing the second metric from the set of user segments; and providing the set of user segments as the optimal set of segments)).
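As a rough illustration of the term-by-term sign selection described in the quoted passage, the following Python sketch chooses the sign of each weight so that the running composite has the larger variance. It is a sketch under stated assumptions, not Hueter's implementation: the function name `greedy_signed_composition` is hypothetical, the weight magnitudes are supplied by the caller, and population variance is used as a stand-in for "significance" because the passage mentions maximizing "significance or variance".

```python
from statistics import pvariance

def greedy_signed_composition(terms, magnitudes):
    """Build D = sum_j w_j * D_j, choosing the sign of each w_j one term at a time.

    At each step the sign that gives the larger variance of the running sum is kept.
    Variance stands in for "significance" here (an assumption of this sketch).
    """
    composite = [0.0] * len(terms[0])
    weights = []
    for mag, term in zip(magnitudes, terms):
        plus = [c + mag * x for c, x in zip(composite, term)]
        minus = [c - mag * x for c, x in zip(composite, term)]
        # Ties are resolved in favor of the positive sign.
        if pvariance(plus) >= pvariance(minus):
            composite, w = plus, mag
        else:
            composite, w = minus, -mag
        weights.append(w)
    return composite, weights

# Example data (made up): two complementary presence-absence style sequences.
demo_terms = [[1, 0, 1, 0], [0, 1, 0, 1]]
demo_composite, demo_weights = greedy_signed_composition(demo_terms, [1.0, 1.0])
```

In this example, adding the second sequence with a positive sign would flatten the composite to a constant, so the greedy step picks the negative sign instead. Each step evaluates only two candidates, which is why the approach is cheaper than a global search over all sign combinations.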
Regarding claim 9, Hueter in view of Ma teaches the method of claim 1, further comprising: identifying, by the computing device, a set of ranges for numerical values within the training data; and converting numerical data into categorical data by replacing a numerical value with a range (Hueter paras. 0057-0063, “Collectively, the set of m-dimensional attribute tuples Z may be transformed to an n-dimensional space of real-valued n-tuples Q, with a function Q=f(Z). The invention may then be applied to the data using the  The function can incorporate the mapping of categorical or binary variables to real numbers, as described above, thereby allowing software implementations to treat all attributes consistently.” Hueter teaches the set of m-dimensional attribute tuples Z may be transformed to an n-dimensional space of real-valued n-tuples Q, with a function Q=f(Z) (i.e. identifying, by the computing device, a set of ranges for numerical values within the training data) The function can incorporate the mapping of categorical or binary variables to real numbers (i.e. and converting numerical data into categorical data by replacing a numerical value with a range)).
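For reference, the limitation recited in claim 9 (replacing a numerical value with a range) can be illustrated with a short Python sketch. This illustrates the claim language only and is not drawn from Hueter or from the application: the function name `to_range_label`, the cut points, and the half-open range convention are all assumptions.

```python
from bisect import bisect_right

def to_range_label(value, edges):
    """Replace a numeric value with the label of the range that contains it.

    `edges` is an ascending list of cut points; ranges are half-open [low, high).
    """
    i = bisect_right(edges, value)
    if i == 0:
        return f"<{edges[0]}"
    if i == len(edges):
        return f">={edges[-1]}"
    return f"[{edges[i - 1]}, {edges[i]})"

ages = [17, 25, 42, 70]
age_edges = [18, 35, 65]  # hypothetical cut points, not taken from the record
labels = [to_range_label(a, age_edges) for a in ages]
```

Each numeric age is mapped to one of four categorical labels, so a downstream model sees a range rather than the raw number.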
Referring to independent claims 10 and 16, they are rejected on the same basis as independent claim 1 since they are analogous claims.
Referring to dependent claims 11-15, they are rejected on the same basis as dependent claims 2-6 since they are analogous claims.
Referring to dependent claims 17-20, they are rejected on the same basis as dependent claims 2-5 since they are analogous claims.
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Hueter et al. US 2016/0019587 A1 (“Hueter”) in view of Ma, Haiying. "A Study on Customer Segmentation for E-Commerce Using the Generalized Association Rules and Decision Tree." American Journal of Industrial and Business Management 5.12 (2015) (“Ma”) and in view of Carvalho, Deborah R et al. "A hybrid decision tree/genetic algorithm method for data mining." Information Sciences 163.1-3 (2004) (“Carvalho”).
Regarding claim 8, Hueter in view of Ma teaches the method of claim 4, but does not teach wherein the determining the optimal set of user segments further comprises: using a genetic algorithm to select an initial population of relevant conditions and create a set of user segments; iteratively performing operations comprising: determining a fitness score of each user segment in the set of user segments, based on the fitness score, combining two of the user segments from the set of user segments, and updating the set of user segments with a new relevant condition; and based on a threshold being met, providing a user segment from the set of user segments.
However, Carvalho teaches: wherein the determining the optimal set of user segments further comprises: using a genetic algorithm to select an initial population of relevant conditions and create a set of user segments (Carvalho, pgs. 15-17, “First, GAs work with a population of candidate solutions (individuals)…Intuitively, the ability of GAs to cope with attribute interaction makes them a potentially useful solution for the problem of small disjuncts….” Carvalho teaches First, GAs work with a population of candidate solutions (individuals) (i.e. using a genetic algorithm to select an initial population of relevant conditions) useful solution for the problem of small disjuncts (i.e. and create a set of user segments)); iteratively performing operations comprising: determining a fitness score of each user segment in the set of user segments, based on the fitness score (Carvalho, pgs. 17-20, “Let us now turn to the fitness function––i.e., to the function used to evaluate the quality of the candidate small-disjunct rule represented by an individual. In both GAs described in this paper, the fitness function is given by the formula: Fitness = (TP/(TP + FN)) * (TN/(FP + TN)) where TP, FN, TN and FP––standing for the number of true positives, false negatives, true negatives and false positives––are well-known variables often used to evaluate the performance…”), combining two of the user segments from the set of user segments, and updating the set of user segments with a new relevant condition (Carvalho, pgs. 25-27, “The pseudo-code of our GA with sequential niching is shown, at a high level of abstraction, in Fig. 6… First, it runs the GA, using TrainingSet-2 as the training data for the GA. The best rule found by the GA is added to RuleSet. Then the examples correctly covered by that rule are removed from TrainingSet-2, so that in the next iteration of the WHILE loop TrainingSet-2 will have a smaller cardinality.” Carvalho teaches The pseudo-code of our GA with sequential niching is shown in Fig. 6. The best rule found by the GA is added to RuleSet (i.e. combining two of the user segments from the set of user segments and updating the set of user segments with a new relevant condition)); and based on a threshold being met, providing a user segment from the set of user segments (Carvalho, pg. 27, “This process is iteratively performed while the number of examples in TrainingSet-2 is greater than a user-defined threshold, specified as 5 in our experiments. (It is assumed that when the cardinality of TrainingSet-2 is smaller than 5 there are too few examples to allow the discovery of a reliable classification rule.).”).
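The fitness formula quoted from Carvalho is the product of the true-positive rate and the true-negative rate (sensitivity times specificity). A minimal Python sketch of that formula follows. The function name `fitness` and the handling of empty denominators (returning 0.0) are assumptions of this sketch and are not stated in the quoted passage.

```python
def fitness(tp, fn, tn, fp):
    """Fitness = (TP / (TP + FN)) * (TN / (FP + TN)).

    The first factor is sensitivity and the second is specificity.
    A zero denominator is treated as a rate of 0.0 (assumption of this sketch).
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (fp + tn) if (fp + tn) else 0.0
    return sensitivity * specificity

# Made-up confusion-matrix counts for two candidate rules.
rule_a = fitness(tp=8, fn=2, tn=9, fp=1)   # 0.8 * 0.9
rule_b = fitness(tp=5, fn=5, tn=10, fp=0)  # 0.5 * 1.0
```

Because the two rates are multiplied, a candidate rule scores well only when it covers most positive examples and also excludes most negative ones; here `rule_a` outranks `rule_b`.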
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Hueter in view of Ma and in view of Carvalho to teach: wherein the determining the optimal set of user segments further comprises: using a genetic algorithm to select an initial population of relevant conditions and create a set of user segments; iteratively performing operations comprising: determining a fitness score of each user segment in the set of user segments, based on the fitness score, combining two of the user segments from the set of user segments, and updating the set of user segments with a new relevant condition; and based on a threshold being met, providing a user segment from the set of user segments. The motivation to do so would be to reduce the errors associated with .

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J. Huntley, can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



Adam Clark Standke
Assistant Examiner
Art Unit 2129
/MICHAEL J HUNTLEY/
Supervisory Patent Examiner, Art Unit 2129

1 According to the broadest reasonable interpretation (BRI), the use of alternative language amounts to the claim requiring one or more elements but not all.