DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending and have been examined.
The present application was filed on 03/07/2018.
Information Disclosure Statement

The information disclosure statements (IDS) were submitted on 09/26/2019 and 09/18/2020. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding Claim 1, 
Step 1 Analysis:  Claim 1 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a method of generating decision trees using historical information provided as a dataset.  The following limitation:
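generating … an ensemble of decision trees based at least in part on the historical data such that for each given leaf node of decision trees in the ensemble, the given leaf node is associated with a contingency table of action-outcome value pairs, wherein the contingency table associated with the given leaf node is generated based at least in part on the set of observed feature values, the observed action value, and the observed outcome value for a given set of instances associated with the given leaf node, and wherein the given set of instances is included in the set of instances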
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components (“by the one or more hardware processors”). For example, but for the generic computer components language, the above limitation in the context of this claim encompasses generating … an ensemble of decision trees based at least in part on the historical data such that for each given leaf node of decision trees in the ensemble, the given leaf node is associated with a contingency table of action-outcome value pairs, wherein the contingency table associated with the given leaf node is generated based at least in part on the set of observed feature values, the observed action value, and the observed outcome value for a given set of instances associated with the given leaf node, and wherein the given set of instances is included in the set of instances (corresponds to evaluation and judgment), which is considered a mental process. Accordingly, the claim recites an abstract idea.
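For illustration only, and not as part of the claim construction, the contingency table of action-outcome value pairs recited in the limitation above can be pictured as a table of counts built from the instances associated with one leaf node. The Python sketch below is not taken from the application. The function name `build_contingency_table`, the tuple layout of an instance, the sample data, and the choice of outcome values as rows and action values as columns are all assumptions made for this example.

```python
from collections import Counter

def build_contingency_table(instances, actions, outcomes):
    """Count (outcome, action) pairs for the instances at one leaf node.

    Each instance is assumed to be a (features, action, outcome) tuple.
    Rows are outcome values and columns are action values (an assumption).
    """
    counts = Counter((outcome, action) for _features, action, outcome in instances)
    return [[counts[(o, a)] for a in actions] for o in outcomes]

# Hypothetical instances that reached the same leaf node.
leaf_instances = [
    ({"age": 30}, "email", 1),
    ({"age": 41}, "email", 0),
    ({"age": 25}, "call", 1),
    ({"age": 37}, "email", 1),
]
table = build_contingency_table(leaf_instances, actions=["email", "call"], outcomes=[0, 1])
print(table)  # [[1, 0], [2, 1]]
```

Each entry is a tally of observed instances, which is consistent with the characterization above of the table as the result of observation and evaluation of the historical data.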
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process. See MPEP 2106. The additional element of “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … historical data for a set of instances, wherein for a particular instance in the set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing … the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent data access and storage steps that any application of the method would require. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
Regarding Claim 2, 
Step 1 Analysis:  Claim 2 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites the method of claim 1. The claim further recites wherein the generating the ensemble of decision trees comprises, for a particular decision tree in the ensemble: generating a subsample of instances from the set of instances; and generating the particular decision tree from a root non-leaf node to a plurality of leaf nodes based at least in part on the subsample of instances, wherein the generating the particular decision tree comprises, for each particular non-leaf node of the particular decision tree: generating a set of candidate features from the set of features; for each particular candidate feature in the set of candidate features, determining a set of possible splits for splitting a particular set of instances of the particular non-leaf node into a left set of instances for a left non-leaf node and a right set of instances for a right non-leaf node, determining, from the set of possible splits, an optimal split for the particular set of instances; associating the particular non-leaf node with a contingency table of action-outcome value pairs generated based at least in part on the particular set of instances of the particular non-leaf node; determining whether the optimal split satisfies a leaf node criterion; in response to determining that the optimal split does not satisfy the leaf node criterion, splitting the particular set of instances, according to the optimal split, into the left set of instances and the right set of instances; and in response to determining that the optimal split satisfies the leaf node criterion, designating the particular non-leaf node to be a leaf node. Nothing in the claim elements precludes these steps from practically being performed in the mind.
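For illustration only, the sequence of steps recited in claim 2 (subsample, candidate features, possible splits, optimal split, leaf node test, recursion) can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the applicant's implementation. The names `grow_node`, `grow_ensemble`, `count_pairs`, and `toy_score` are hypothetical, splits are assumed to be thresholds on numeric feature values, and `toy_score` is a stand-in criterion chosen only so that the example runs; it is not the split criterion of any claim.

```python
import random
from collections import Counter

def count_pairs(instances):
    """Contingency table as a Counter keyed by (action, outcome)."""
    return Counter((action, outcome) for _x, action, outcome in instances)

def grow_node(instances, features, score, is_leaf, rng, n_candidates):
    """Grow one node; `score` and `is_leaf` are caller-supplied (assumptions)."""
    node = {"table": count_pairs(instances), "leaf": True}
    best = None  # (score, feature, threshold, left, right)
    # Candidate features are drawn at random from the full feature set.
    for feature in rng.sample(features, min(n_candidates, len(features))):
        # Possible splits: one threshold per distinct observed value.
        for threshold in sorted({x[feature] for x, _a, _o in instances}):
            left = [inst for inst in instances if inst[0][feature] <= threshold]
            right = [inst for inst in instances if inst[0][feature] > threshold]
            if left and right:
                s = score(left, right)
                if best is None or s > best[0]:
                    best = (s, feature, threshold, left, right)
    # Leaf node test on the optimal split.
    if best is None or is_leaf(best[0]):
        return node
    _s, feature, threshold, left, right = best
    node.update(
        leaf=False, feature=feature, threshold=threshold,
        left=grow_node(left, features, score, is_leaf, rng, n_candidates),
        right=grow_node(right, features, score, is_leaf, rng, n_candidates),
    )
    return node

def grow_ensemble(instances, features, score, is_leaf,
                  n_trees=3, sample_size=None, n_candidates=1, seed=0):
    """Grow each tree from its own subsample of the instances."""
    rng = random.Random(seed)
    size = min(sample_size or len(instances), len(instances))
    return [
        grow_node(rng.sample(instances, size), features, score, is_leaf, rng, n_candidates)
        for _ in range(n_trees)
    ]

def toy_score(left, right):
    """Stand-in criterion: gap in mean outcome between the sides, minus a penalty."""
    def mean(part):
        return sum(o for _x, _a, o in part) / len(part)
    return abs(mean(left) - mean(right)) - 0.5

# Hypothetical data: the outcome is 1 exactly when x > 2.
data = [({"x": v}, "a", int(v > 2)) for v in range(6)]
trees = grow_ensemble(data, ["x"], toy_score, is_leaf=lambda s: s < 0, n_trees=2)
print(trees[0]["feature"], trees[0]["threshold"])
```

Passing `score` and `is_leaf` as parameters keeps the sketch neutral about which criterion is used, since claim 2 itself does not define one.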
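Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process. See MPEP 2106. The additional element of “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps: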
accessing … historical data for a set of instances, wherein for a particular instance in the set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing … the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent data access and storage steps that any application of the method would require. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
Regarding Claim 3, 
Step 1 Analysis:  Claim 3 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a method of generating decision trees using historical information provided as a dataset.  The following limitation:
wherein the determining the set of possible splits for the particular candidate feature, into the left set of instances and into the right set of instances, is based at least in part on a split criterion at least defined by:
$S(L, R; P) = C(P) - (C(L) + C(R))$,
where L represents the left set of instances, where R represents the right set of instances, where P represents the particular set of instances such that $P = L \cup R$, where for k number of actions,
$C(U) = \varphi(r_0(U) + 1) + \varphi(r_1(U) + 1) - \varphi(N(U) + 1) + \sum_{i=1}^{k} \varphi(c_i(U) + 1) - \varphi(a_{0i}(U) + 1) - \varphi(a_{1i}(U) + 1)$,
where $\varphi$ represents a log function, where U represents the contingency table associated with the particular non-leaf node, where $r_i(U)$ represents a sum of row i, $c_j(U)$ represents a sum of column j, where $a_{ij}(U)$ represents an entry of the contingency table U at row i, column j, and where N(U) represents a sum of row sums and column sums
as drafted, is a process that, under its broadest reasonable interpretation, includes no tangible structural elements as it covers the splitting operation based on a mathematical relationship with no limits on the claim scope.  The claim is directed to an abstract idea. 
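For illustration only, the mathematical relationship quoted above can be evaluated with a few lines of Python. The sketch below is not taken from the application and rests on several assumptions that are also marked in the comments: the log function is taken to be the natural logarithm, the summation over i is read as covering all three indexed terms, N(U) is computed literally as the sum of the row sums plus the sum of the column sums, and the table for P is taken to be the entrywise sum of the tables for L and R because the split partitions P. The names `cost`, `add_tables`, and `split_score` are hypothetical.

```python
import math

def cost(table):
    """C(U) for a table U with outcome rows 0 and 1 and k action columns."""
    phi = math.log  # assumption: the "log function" is the natural logarithm
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    # Assumption: literal reading of "a sum of row sums and column sums".
    n = sum(row_sums) + sum(col_sums)
    total = phi(row_sums[0] + 1) + phi(row_sums[1] + 1) - phi(n + 1)
    # Assumption: the summation over i covers all three indexed terms.
    for i, c_i in enumerate(col_sums):
        total += phi(c_i + 1) - phi(table[0][i] + 1) - phi(table[1][i] + 1)
    return total

def add_tables(left, right):
    """Table for P when L and R are disjoint: entrywise sum of counts."""
    return [[a + b for a, b in zip(lrow, rrow)] for lrow, rrow in zip(left, right)]

def split_score(left, right):
    """S(L, R; P) = C(P) - (C(L) + C(R))."""
    return cost(add_tables(left, right)) - (cost(left) + cost(right))

# Hypothetical 2 x 2 tables (rows: outcome 0 and 1; columns: two actions).
left_table = [[3, 0], [0, 3]]
right_table = [[0, 3], [3, 0]]
print(split_score(left_table, right_table))
```

The computation consists only of sums, logarithms, and a subtraction over table counts, which is consistent with the characterization of the limitation as a mathematical relationship.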
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional element of  “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … historical data for a set of instances, wherein for a particular instance in the set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing … the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent data access and storage steps that any application of the method would require. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
Regarding Claim 4, 
Step 1 Analysis:  Claim 4 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a method of generating decision trees using historical information provided as a dataset.  The following limitation:
wherein the determining the optimal split is further based at least in part on an optimal split criterion at least defined by:
$(L^*, R^*) = \arg\max_{(L, R)} S(L, R; P)$,
where $L^*$ represents the left set of instances according to the optimal split and $R^*$ represents the right set of instances according to the optimal split
as drafted, is a process that, under its broadest reasonable interpretation, includes no tangible structural elements as it covers the splitting operation based on a mathematical relationship with no limits on the claim scope.  The claim is directed to an abstract idea. 
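For illustration only, the optimal split criterion quoted above amounts to selecting, from the possible (L, R) pairs, the pair with the largest score. A minimal Python sketch follows. The name `best_split` and the scoring function `balance_score` are hypothetical, and `balance_score` is a stand-in used only to make the example runnable; it is not the claimed S(L, R; P).

```python
def best_split(candidate_splits, score):
    """Return the (L, R) pair that maximizes score(L, R), or None if empty."""
    return max(candidate_splits, key=lambda lr: score(lr[0], lr[1]), default=None)

def balance_score(left, right):
    """Stand-in score (an assumption): prefers sides of equal size."""
    return -abs(len(left) - len(right))

# Hypothetical set of instances P and every way to cut it in order.
P = [1, 2, 3, 4]
candidates = [(P[:i], P[i:]) for i in range(1, len(P))]
optimal_left, optimal_right = best_split(candidates, balance_score)
print(optimal_left, optimal_right)  # [1, 2] [3, 4]
```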
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional element of  “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … historical data for a set of instances, wherein for a particular instance in the set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing … the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent data access and storage steps that any application of the method would require. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept.
Regarding Claim 5, 
Step 1 Analysis:  Claim 5 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a method of generating decision trees using historical information provided as a dataset.  The following limitation:
wherein leaf node criterion specifies that the optimal split satisfies a condition at least defined by:
$S(L, R; P) < 0$
as drafted, is a process that, under its broadest reasonable interpretation, includes no tangible structural elements as it covers the splitting operation based on a mathematical relationship with no limits on the claim scope.  The claim is directed to an abstract idea. 
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process. See MPEP 2106. The additional element of “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … historical data for a set of instances, wherein for a particular instance in the set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing … the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent data access and storage steps that any application of the method would require. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept.
Regarding Claim 6, 
Step 1 Analysis:  Claim 6 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites the method of claim 2. The claim further recites wherein the determining the set of possible splits for each particular candidate feature in the set of candidate features comprises subgrouping values of the particular candidate features. Nothing in the claim elements precludes these steps from practically being performed in the mind.  
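For illustration only, subgrouping the values of a candidate feature can be pictured as listing every way of dividing the distinct values into two non-empty groups. The Python sketch below is one reading of the term and is not taken from the application. The name `subgroup_splits` and the sample values are hypothetical, and the treatment of the feature as categorical is an assumption.

```python
from itertools import combinations

def subgroup_splits(values):
    """Yield (left_values, right_values) for each two-way grouping of the distinct values."""
    distinct = sorted(set(values))
    if len(distinct) < 2:
        return  # nothing to split on
    first, rest = distinct[0], distinct[1:]
    # Pin the first value to the left group so mirror-image groupings are not repeated.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            left = {first, *extra}
            yield left, set(distinct) - left

splits = list(subgroup_splits(["red", "green", "blue", "red"]))
print(len(splits))  # 3
```

For a feature with only a handful of distinct values the list is short enough to write out by hand, which is consistent with the statement above that the step can practically be performed in the mind.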
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional element of  “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … historical data for a set of instances, wherein for a particular instance in the set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing … the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent data access and storage steps that any application of the method would require. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
Regarding Claim 7, 
Step 1 Analysis:  Claim 7 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites the method of claim 2. The claim further recites wherein the determining the set of possible splits for each particular candidate feature in the set of candidate features comprises ordering values of the particular candidate features. Nothing in the claim elements precludes these steps from practically being performed in the mind.  
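For illustration only, ordering the values of a candidate feature can be pictured as sorting the distinct values and taking a cut point between each neighboring pair. The Python sketch below is one reading of the term and is not taken from the application. The name `ordered_splits` is hypothetical, and placing each cut point at the midpoint between neighbors is an assumption.

```python
def ordered_splits(values):
    """Return cut points midway between consecutive distinct sorted values."""
    distinct = sorted(set(values))
    return [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]

thresholds = ordered_splits([5, 1, 3, 3])
print(thresholds)  # [2.0, 4.0]
```

Each cut point defines one possible split: instances at or below it form the left set and the remainder form the right set.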
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional element of  “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … historical data for a set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and
… storing … the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent steps required to access and store data, as all applications of the method would require accessing and storing data. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
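For illustration only, the ordering step recited in Claim 7 can be sketched in Python as below. The sketch is an example under stated assumptions and is not the applicant's implementation: the function names `candidate_splits` and `partition`, the use of midpoints between consecutive distinct values as thresholds, and the dictionary representation of an instance are hypothetical and do not appear in the claims.

```python
def candidate_splits(values):
    """Order the observed values of one candidate feature and return
    possible thresholds (assumption: midpoints between consecutive
    distinct values; the claim only recites ordering the values)."""
    ordered = sorted(set(values))
    return [(lo + hi) / 2 for lo, hi in zip(ordered, ordered[1:])]


def partition(instances, feature, threshold):
    """Split instances (hypothetical dicts of feature values) into a
    left set and a right set around a threshold."""
    left = [x for x in instances if x[feature] <= threshold]
    right = [x for x in instances if x[feature] > threshold]
    return left, right
```

For example, the feature values 3, 1, 2, 2 are ordered as 1, 2, 3, which yields two possible thresholds under the midpoint assumption.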
Regarding Claim 8, 
Step 1 Analysis:  Claim 8 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a method of generating decision trees using historical information provided as a dataset.  The following limitation:
generating … prediction data by processing the input data using the ensemble, the prediction data comprising a predicted action value for the set of new feature values

Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional element of  “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … model data comprising an ensemble of decision trees trained on historical data for a set of instances, wherein for a particular instance in the set of instances, the historical data comprises a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value, and wherein each leaf node of decision trees in the ensemble is associated with a contingency table of action-outcome value pairs generated based at least in part on the historical data; and 
accessing … input data comprising a set of new feature values for the set of features

Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
Regarding Claim 9, 
Step 1 Analysis:  Claim 9 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites the method of claim 8. The claim further recites wherein the generating the prediction data comprises: determining a set of particular contingency tables of action-outcome value pairs from leaf nodes of the decision trees in the ensemble by processing the set of new feature values using each particular decision tree, in the ensemble, to determine a particular contingency table of a leaf node of the particular decision tree; and determining the predicted action value for the set of new feature values by combining the set of particular contingency tables. Nothing in the claim elements precludes these steps from practically being performed in the mind.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional element of  “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … model data comprising an ensemble of decision trees trained on historical data for a set of instances, wherein for a particular instance in the set of instances, the historical data comprises a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value, and wherein each leaf node of decision trees in the ensemble is associated with a contingency table of action-outcome value pairs generated based at least in part on the historical data; and 
accessing … input data comprising a set of new feature values for the set of features
represent steps required to access data, as all applications of the method would require accessing data. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept.
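For illustration only, the per-tree lookup recited in Claim 9 (processing the new feature values with each decision tree to reach a leaf node and its contingency table) can be sketched as below. The nested-dictionary tree layout, the key names `feature`, `threshold`, `left`, `right` and `table`, and the function names are hypothetical and are not taken from the claims or the specification.

```python
def leaf_table(node, features):
    """Walk one tree from the root to a leaf and return the leaf's
    contingency table. Assumption: a leaf is a dict holding only a
    'table' key; a non-leaf node holds a feature name, a threshold,
    and 'left'/'right' children."""
    while "table" not in node:
        go_left = features[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["table"]


def leaf_tables(ensemble, features):
    """Collect one contingency table per tree in the ensemble."""
    return [leaf_table(tree, features) for tree in ensemble]
```

The list returned by `leaf_tables` corresponds to the "set of particular contingency tables" that the claim then combines into a predicted action value.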
Regarding Claim 10, 
Step 1 Analysis:  Claim 10 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites the method of claim 9. The claim further recites  wherein the combining the set of particular contingency tables comprises: determining a set of best action values for the set of particular contingency tables by determining, for each given contingency table in the set of particular contingency tables, a best action value voted for by a given decision tree in the ensemble based at least in part on the given contingency table associated with the given decision tree; determining a set of counts for one or more different action values by determining a count for each different action value in the set of best action values; and selecting the predicted action value, from the one or more different action values, based at least in part on the set of counts. Nothing in the claim elements precludes these steps from practically being performed in the mind.  
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional element of  “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … model data comprising an ensemble of decision trees trained on historical data for a set of instances, wherein for a particular instance in the set of instances, the historical data comprises a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value, and wherein each leaf node of decision trees in the ensemble is associated with a contingency table of action-outcome value pairs generated based at least in part on the historical data; and 
accessing … input data comprising a set of new feature values for the set of features
represent steps required to access data, as all applications of the method would require accessing data. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
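For illustration only, the vote-counting combination recited in Claim 10 can be sketched as below. The sketch rests on assumptions that are not in the claims: each contingency table is represented as a hypothetical dictionary mapping an action value to a pair (count of outcome 0, count of outcome 1), and the "best action value" of a table is taken to be the action with the highest observed rate of outcome 1. The claims do not state how the best action is chosen.

```python
from collections import Counter


def best_action(table):
    """Return the action a single tree votes for.
    Assumption: 'best' means the highest observed rate of outcome 1."""
    def rate(action):
        zeros, ones = table[action]
        total = zeros + ones
        return ones / total if total else 0.0
    return max(table, key=rate)


def predict_action(tables):
    """Count the votes across all tables and select the action with
    the highest count (ties go to the action that was voted for first)."""
    votes = Counter(best_action(t) for t in tables)
    return votes.most_common(1)[0][0]
```

With three tables whose best actions are "a", "a" and "b", the counts are 2 and 1, so the predicted action value is "a".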
Regarding Claim 11, 
Step 1 Analysis:  Claim 11 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites the method of claim 9. The claim further recites  wherein the combining the set of particular contingency tables comprises: determining a set of 
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional element of  “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … model data comprising an ensemble of decision trees trained on historical data for a set of instances, wherein for a particular instance in the set of instances, the historical data comprises a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value, and wherein each leaf node of decision trees in the ensemble is associated with a contingency table of action-outcome value pairs generated based at least in part on the historical data; and 
accessing … input data comprising a set of new feature values for the set of features
represent steps required to access data, as all applications of the method would require accessing data. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
Regarding Claim 12, 
Step 1 Analysis:  Claim 12 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites the method of claim 9. The claim further recites wherein the contingency table comprises a 2 x k contingency table, where k represents number of different action values. Nothing in the claim elements precludes these steps from practically being performed in the mind.  
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional element of “by the one or more hardware processors” as drafted, is reciting implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing … model data comprising an ensemble of decision trees trained on historical data for a set of instances, wherein for a particular instance in the set of instances, the historical data comprises a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value, and wherein each leaf node of decision trees in the ensemble is associated with a contingency table of action-outcome value pairs generated based at least in part on the historical data; and 
accessing … input data comprising a set of new feature values for the set of features
represent steps required to access data, as all applications of the method would require accessing data. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
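For illustration only, a 2 x k contingency table of the kind recited in Claim 12 can be tallied as sketched below. The sketch assumes a binary outcome coded 0 or 1, with rows indexed by outcome and columns indexed by action value; the claim recites only the 2 x k shape, so this orientation, the function name `contingency_table`, and the tuple representation of an instance are hypothetical.

```python
def contingency_table(instances, actions):
    """Tally (action, outcome) pairs into a 2 x k table.
    Assumption: row 0 counts outcome 0, row 1 counts outcome 1,
    and column j counts the j-th action value in `actions`."""
    column = {action: j for j, action in enumerate(actions)}
    table = [[0] * len(actions) for _ in range(2)]
    for action, outcome in instances:
        table[outcome][column[action]] += 1
    return table
```

For four instances ("a", 1), ("b", 0), ("a", 1), ("b", 1) and k = 2 action values, the tally is row 0 = [0, 1] and row 1 = [2, 1].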
Regarding Claim 13, 
Step 1 Analysis:  Claim 13 is directed to a computer-readable medium, which is directed to an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a method of generating decision trees using historical information provided as a dataset.  The following limitation:
generating … an ensemble of decision trees based at least in part on the historical data such that for each given leaf node of decision trees in the ensemble, the given leaf node is associated with a contingency table of action-outcome value pairs, wherein the contingency table associated with the given leaf node is generated based at least in part on the set of observed feature values, the observed action value, and the observed outcome value for a given set of instances associated with the given leaf node, and wherein the given set of instances is included in the set of instances
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgement, opinion)) but for the recitation of generic computer components (“instructions that, when executed by one or more hardware processors of a machine, cause the machine to perform operations”, “by the one or more hardware processors”). For example, but for the generic computer components language, the above limitation in the context of this claim encompasses generating … an ensemble of decision trees based at least in part on the historical data such that for each given leaf node of decision trees in the ensemble, the given leaf node is associated with a contingency table of action-outcome value pairs, wherein the contingency table associated with the given leaf node is generated based at least in part on the set of observed feature values, the observed action value, and the observed outcome value for a given set of instances associated with the given leaf node, and wherein the given set of instances is included in the set of instances (corresponds to evaluation and judgement), which is considered a mental process. Accordingly, the claim recites an abstract idea.
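Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process. See MPEP 2106. The following steps: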
accessing  historical data for a set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent steps required to access and store data, as all applications of the method would require accessing and storing data. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
Regarding Claim 14, 
Step 1 Analysis:  Claim 14 is directed to a computer-readable medium, which is directed to an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites the medium of claim 13. The claim further recites wherein the generating the ensemble of decision trees comprises, for a particular decision tree in the ensemble: generating a subsample of instances from the set of instances; and generating the particular decision tree from a root non-leaf node to a plurality of leaf nodes based at least in part on the subsample of instances, wherein the generating the particular decision tree comprises, for each particular non-leaf node of the particular decision tree: generating a set of candidate features from the set of features; for each particular candidate feature in the set of candidate features, determining a set of possible splits for splitting a particular set of instances of the particular non-leaf node into a left set of instances for a left non-leaf node and a right set of instances for a right non-leaf node, determining, from the set of possible splits, an optimal split for the particular set of instances; associating the particular non-leaf node with a contingency table of action-outcome value pairs based at least in part on the particular set of instances of the particular non-leaf node; determining whether the optimal split satisfies a leaf node criterion; in response to determining that the optimal split does not satisfy the leaf node criterion, splitting the particular set of instances, according to the optimal split, into the left set of instances and the right set of instances; and in response to determining that the optimal split satisfies the leaf node criterion, designating the particular non-leaf node to be a leaf node. Nothing in the claim elements precludes these steps from practically being performed in the mind.
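Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process. See MPEP 2106. The following steps: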
accessing  historical data for a set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent steps required to access and store data, as all applications of the method would require accessing and storing data. As such, these steps are considered insignificant extra-solution activity. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
Regarding Claim 15, 
Step 1 Analysis:  Claim 15 is directed to a computer-readable medium, which is directed to an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a method of generating decision trees using historical information provided as a dataset.  The following limitation:
wherein the determining the set of possible splits for the particular candidate feature, into the left set of instances and into the right set of instances, is based at least in part on a split criterion at least defined by:
$$S\big((L, R); P\big) = C(P) - \big(C(L) + C(R)\big),$$
where L represents the left set of instances, where R represents the right set of instances, where P represents the particular set of instances such that $P = L \cup R$, where for k number of actions,
$$C(U) = \varphi\big(r_0(U) + 1\big) + \varphi\big(r_1(U) + 1\big) - \varphi\big(N(U) + 1\big) + \sum_{i=1}^{k} \varphi\big(c_i(U) + 1\big) - \varphi\big(a_{0i}(U) + 1\big) - \varphi\big(a_{1i}(U) + 1\big),$$
where $\varphi$ represents a log function, where U represents the contingency table associated with the particular non-leaf node, where $r_i(U)$ represents a sum of row i, $c_j(U)$ represents a sum of column i, where $a_{ij}(U)$ represents an entry of the contingency table U at row i, column j, and where N(U) represents a sum of row sums and column sums
as drafted, is a process that, under its broadest reasonable interpretation, includes no tangible structural elements as it covers the splitting operation based on a mathematical relationship with no limits on the claim scope.  The claim is directed to an abstract idea. 
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process. See MPEP 2106. The following steps:
accessing  historical data for a set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent steps required to access and store data when all applications of the method would require accessing and storing data. As such, these steps are considered insignificant extra-solution activities. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 


Regarding Claim 16, 
Step 1 Analysis:  Claim 16 is directed to a computer-readable medium, which is directed to an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a method of generating decision trees using historical information provided as a dataset.  The following limitation:
wherein the determining the optimal split is further based at least in part on an optimal split criterion at least defined by: $(L^{*}, R^{*}) = \arg\max_{(L,R)} S((L,R); P),$
as drafted, is a process that, under its broadest reasonable interpretation, includes no tangible structural elements as it covers the splitting operation based on a mathematical relationship with no limits on the claim scope.  The claim is directed to an abstract idea. 
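For illustration only, the recited criterion selects, from a set of candidate splits (L, R) of a parent node P, the split that maximizes a score S((L, R); P). The sketch below assumes a finite list of candidates. The scoring function S is not defined in this portion of the record, so `balance_score` is a hypothetical placeholder, and the names `best_split`, `candidates`, and `parent` are likewise assumptions rather than terms of the claim.

```python
from typing import Callable, Iterable, Sequence, Tuple

Split = Tuple[Sequence, Sequence]  # (L, R): instances routed left and right


def best_split(candidates: Iterable[Split], parent: Sequence,
               score: Callable[[Split, Sequence], float]) -> Split:
    """Return the candidate (L, R) maximizing score((L, R), parent)."""
    candidates = list(candidates)
    if not candidates:
        raise ValueError("no candidate splits to choose from")
    return max(candidates, key=lambda split: score(split, parent))


def balance_score(split: Split, parent: Sequence) -> float:
    """Hypothetical stand-in for S: prefers evenly sized partitions."""
    left, right = split
    return -abs(len(left) - len(right))


parent = [1, 2, 3, 4]
candidates = [([1], [2, 3, 4]), ([1, 2], [3, 4]), ([1, 2, 3], [4])]
best = best_split(candidates, parent, balance_score)
```

Under these assumptions the arg max reduces to a single pass over the candidates, with the score evaluated once per candidate.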
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional elements of “instructions that, when executed by one or more hardware processors of a machine, cause the machine to perform operations” and “by the one or more hardware processors” as drafted, are reciting implementing mental processes in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing  historical data for a set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent steps required to access and store data when all applications of the method would require accessing and storing data. As such, these steps are considered insignificant extra-solution activities. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
Regarding Claim 17, 
Step 1 Analysis:  Claim 17 is directed to a computer-readable medium, which is directed to an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a method of generating decision trees using historical information provided as a dataset.  The following limitation:
wherein leaf node criterion specifies that the optimal split satisfies a condition at least defined by: $S((L,R); P) < 0$
as drafted, is a process that, under its broadest reasonable interpretation, includes no tangible structural elements as it covers the splitting operation based on a mathematical relationship with no limits on the claim scope.  The claim is directed to an abstract idea. 
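For illustration only, the recited leaf node criterion can be read as a stopping rule: a node becomes a leaf when even its best available split scores below zero. The sketch below is a minimal example under that reading. `propose_splits` and `gap_score` are hypothetical stand-ins (the record here does not define S or how splits are proposed), and treating a node with no candidate splits as a leaf is an added assumption.

```python
def propose_splits(values):
    """Hypothetical generator: cut a sorted list between neighbours."""
    ordered = sorted(values)
    for i in range(1, len(ordered)):
        yield ordered[:i], ordered[i:]


def gap_score(split, parent):
    """Hypothetical stand-in for S: gap across the cut minus a fixed cost of 2."""
    left, right = split
    return (right[0] - left[-1]) - 2


def grow(values):
    """Split recursively; stop when the best split has a negative score."""
    candidates = list(propose_splits(values))
    if candidates:
        best = max(candidates, key=lambda s: gap_score(s, values))
        if gap_score(best, values) >= 0:
            left, right = best
            return {"left": grow(left), "right": grow(right)}
    # Leaf criterion: the best available split scores below zero
    # (or, as an added assumption, there is nothing left to split).
    return {"leaf": sorted(values)}


tree = grow([1, 2, 10, 11])
```

In this example the root is split between 2 and 10, and each child becomes a leaf because its only remaining split scores below zero.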
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional elements of “instructions that, when executed by one or more hardware processors of a machine, cause the machine to perform operations” and “by the one or more hardware processors” as drafted, are reciting implementing mental processes in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing  historical data for a set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent steps required to access and store data when all applications of the method would require accessing and storing data. As such, these steps are considered insignificant extra-solution activities. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept.
Regarding Claim 18, 
Step 1 Analysis:  Claim 18 is directed to a computer-readable medium, which is directed to an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites the medium of claim 14. The claim further recites wherein the determining the set of possible splits for each particular candidate feature in the set of candidate features comprises subgrouping values of the particular candidate features. Nothing in the claim elements precludes these steps from practically being performed in the mind.  
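For illustration only, one possible reading of "subgrouping values" for a categorical candidate feature is enumerating every way of dividing its distinct values into two non-empty subgroups. The sketch below follows that reading. The function name `subgroup_splits` and the example country codes are hypothetical, and the claim does not state that the subgrouping must be binary.

```python
from itertools import combinations


def subgroup_splits(values):
    """Yield each two-way partition of the distinct values exactly once."""
    distinct = sorted(set(values))
    if len(distinct) < 2:
        return
    anchor, rest = distinct[0], distinct[1:]
    # Keeping the first value on the left avoids mirror-image duplicates.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            left = {anchor, *extra}
            right = set(distinct) - left
            yield left, right


splits = list(subgroup_splits(["US", "CA", "MX", "US"]))
```

For k distinct values this enumeration produces 2^(k-1) - 1 candidate splits, so three distinct values give three candidates.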
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional elements of “instructions that, when executed by one or more hardware processors of a machine, cause the machine to perform operations” and “by the one or more hardware processors” as drafted, are reciting implementing mental processes in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing  historical data for a set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent steps required to access and store data when all applications of the method would require accessing and storing data. As such, these steps are considered insignificant extra-solution activities. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
Regarding Claim 19, 
Step 1 Analysis:  Claim 19 is directed to a computer-readable medium, which is directed to an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites the medium of claim 14. The claim further recites wherein the determining the set of possible splits for each particular candidate feature in the set of candidate features comprises ordering values of the particular candidate features. Nothing in the claim elements precludes these steps from practically being performed in the mind.  
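For illustration only, one common way to derive possible splits by "ordering values" of a numeric candidate feature is to sort the distinct values and place a threshold midway between each pair of neighbours. The sketch below assumes that approach. `ordered_splits` and the example scores are hypothetical, and the claim does not specify midpoint thresholds.

```python
def ordered_splits(values):
    """Yield (threshold, left, right) for each cut between sorted distinct values."""
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [v for v in values if v <= threshold]
        right = [v for v in values if v > threshold]
        yield threshold, left, right


scores = [20, 90, 50, 50]
splits = list(ordered_splits(scores))
thresholds = [threshold for threshold, _, _ in splits]
```

Ordering first means only k - 1 thresholds need to be considered for k distinct values, rather than every possible subset of values.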

Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The additional elements of “instructions that, when executed by one or more hardware processors of a machine, cause the machine to perform operations” and “by the one or more hardware processors” as drafted, are reciting implementing mental processes in a computer environment and using a computer as a tool to perform a mental process. The following steps:
accessing  historical data for a set of instances, wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value; and 
… storing the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees
represent steps required to access and store data when all applications of the method would require accessing and storing data. As such, these steps are considered insignificant extra-solution activities. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process. Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept.
Regarding Claim 20,
Step 1 Analysis:  Claim 20 is directed to a system, which is directed to a machine, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a method of generating decision trees using historical information provided as a dataset.  The following limitation:
generating prediction data by processing the input data using the ensemble, the prediction data comprising a predicted action value for the set of new feature values
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgement, opinion)) but for the recitation of generic computer components (“one or more hardware processors; and a memory storing instructions configured to instruct the one or more hardware processors to perform operations”, “an Industrial Internet-of-Things (IIoT) device”). For example, but for the generic computer components language, the above limitation in the context of this claim encompasses generating prediction data by processing the input data using the ensemble, the prediction data comprising a predicted action value for the set of new feature values (corresponds to evaluation and judgement), which is considered a mental process. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.  In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, performing a mental process in a computer environment or using a computer as a tool to perform a mental process.  See MPEP 2106.  The 
accessing model data comprising an ensemble of decision trees trained on historical data for a set of instances, wherein for a particular instance in the set of instances, the historical data comprises a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value, and wherein each leaf node of decision trees in the ensemble is associated with a contingency table of action-outcome value pairs generated based at least in part on the historical data; 
accessing input data based at least in part on device data received from … , the input data comprising a set of new feature values for the set of features
represent steps required to access data when all applications of the method would require accessing data. As such, these steps are considered insignificant extra-solution activities. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, the additional element amounts to implementing a mental process in a computer environment and using a computer as a tool to perform a mental process.   Implementing a mental process in a computer environment or using a computer as a tool to perform a mental process cannot provide an inventive concept. 
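For illustration only, the recited step of generating prediction data from an ensemble whose leaf nodes carry contingency tables of action-outcome value pairs might be sketched as below. The claim does not state how a predicted action value is derived from the tables, so pooling the leaf counts across trees and choosing the action with the highest rate of a favourable outcome is an assumption. The trees, feature names, actions, and counts are all hypothetical.

```python
from collections import Counter


def predict_action(trees, features, good_outcome):
    """Pool leaf tables across trees and return the best-performing action."""
    totals = Counter()
    for tree in trees:
        # Each tree maps the input features to its leaf's contingency table.
        totals.update(tree(features))
    actions = sorted({action for action, _ in totals})
    if not actions:
        raise ValueError("no leaf statistics available")

    def success_rate(action):
        seen = sum(n for (a, _), n in totals.items() if a == action)
        return totals[(action, good_outcome)] / seen if seen else 0.0

    return max(actions, key=success_rate)


def tree_a(features):
    """Hypothetical one-split tree on temperature."""
    if features["temp"] > 80:
        return {("repair", "ok"): 8, ("repair", "fail"): 2,
                ("wait", "ok"): 3, ("wait", "fail"): 7}
    return {("repair", "ok"): 1, ("repair", "fail"): 4,
            ("wait", "ok"): 9, ("wait", "fail"): 1}


def tree_b(features):
    """Hypothetical one-split tree on vibration."""
    if features["vibration"] > 0.5:
        return {("repair", "ok"): 6, ("repair", "fail"): 1,
                ("wait", "ok"): 2, ("wait", "fail"): 5}
    return {("repair", "ok"): 2, ("repair", "fail"): 2,
            ("wait", "ok"): 5, ("wait", "fail"): 1}


hot = predict_action([tree_a, tree_b], {"temp": 95, "vibration": 0.7}, "ok")
cool = predict_action([tree_a, tree_b], {"temp": 60, "vibration": 0.2}, "ok")
```

With these hypothetical counts the pooled tables favour "repair" for the first input and "wait" for the second.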
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 8, and 13 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ma et al.  (US 2012/0066125 A1).
Regarding Claim 1,
Ma et al. teaches a method comprising:
accessing, by one or more hardware processors, historical data for a set of instances (paragraph 0019, “The modeling computer stores a data model in memory representing transaction features, transaction feature values, transaction acceptance decisions and rejection decisions that the reviewer could perform or did perform, based at least in part on the set of similar transactions” and paragraph 0026, “Many types of data are captured as part of a proposed online credit transaction and may be made available to a reviewer. As used herein, each type of data is termed a feature , such that each proposed transaction and historical transaction is a collection of data values, with one data value corresponding to one feature …” teach data model comprising historical online credit transactions stored on computer for access to a reviewer [accessing, by one or more hardware processors, historical data for a set of instances]),
wherein the historical data comprises, for a particular instance in the set of instances, a set of observed feature values for a set of features (paragraphs 0026-0036, “[0026] Many types of data are captured as part of a proposed online credit transaction and may be made available to a reviewer. As used herein, each type of data is termed a feature, such that each proposed transaction and historical transaction is a collection of data values, with one data value corresponding to one feature. Examples of features include: 
[0027] 1. Name or type of fraud scoring model that is used to score a transaction;
[0028] 2. Country identified in a billing address for the customer,
[0029] 3. Country identified in a shipping address for the transaction;
[0030] 4. Fraud score;
[0031] 5. One or more Decision rules that triggered the manual review;
[0032] 6. One or more Factor codes, … ;
[0033] 7. One or more Information codes, … ;
[0034] 8. Merchant identifier uniquely identifying a merchant of the goods or services involved in the transaction;
[0035] 9. Name, number or other identifier of the reviewer;
[0036] 10. Organization that is performing the review …”
teaches a set of attributes associated with an online credit card transaction with different possible numeric/non-numeric values per attribute [observed feature values for a set of features]),
an observed action value (paragraph 0056, “… the set of candidate features comprise features corresponding to factor codes and information codes. … each factor code and information code candidate feature may only take on values of one (corresponding to “fired) and zero (corresponding to “not fired.”)” teaches observed values of “fired” and “not fired” for factor code feature and information code feature [observed action value]), and 
an observed outcome value for the observed action value (paragraph 0048,  “In hierarchical decision tree 100, transaction features that are more discriminating in predicting a likelihood of accepting or rejecting a transaction under review are represented as nodes closer to the root of hierarchical decision tree 100 than transaction features that are less discriminative …” teaches transaction acceptance and rejection of a transaction [observed outcome value for the observed action value]);
generating, by the one or more hardware processors, an ensemble of decision trees based at least in part on the historical data such that for each given leaf node of decision trees in the ensemble, the given leaf node is associated with a contingency table of action-outcome value pairs (paragraph 0066, “At step 212, a feature from the set of remaining available/combined features is selected as the splitting node to be added to the sub-tree. In an embodiment, each splitting node is determined using a contingency table calculated for each candidate feature from the set of remaining available/ combined features. A contingency table is constructed using the set of historical transactions corresponding to the parent node of the sub-tree, with this set separated in subsets representing the acceptance or rejection the historical transactions, and separated by the data values of each candidate feature” teaches generating splitting nodes from each feature to be added to sub-tree, with each splitting node being determined using a contingency table [generating, by the one or more hardware processors, an ensemble of decision trees based at least in part on the historical data such that for each given leaf node of decision trees in the ensemble, the given leaf node is associated with a contingency table of action-outcome value pairs]),
wherein the contingency table associated with the given leaf node is generated based at least in part on the set of observed feature values, the observed action value, and the observed outcome value for a given set of instances associated with the given leaf node (paragraph 0056 “ … the set of candidate features comprise features corresponding to factor codes and information codes. In an embodiment each factor code and information code candidate feature may only take on values of one (corresponding to “fired) and zero (corresponding to “not fired.”)” and paragraph 0067, “In an embodiment, each candidate feature takes on data values of zero or one. A data value of one corresponds to the “firing of the candidate feature, and a data value of zero corresponds to the candidate feature “not firing.” “Firing in this context, means that a candidate feature in a particular transaction, which may be a historical transaction, has a data value of one with respect to a particular outcome for the particular transaction … A contingency table for this candidate feature could appear as follows:

[Image media_image1.png: contingency table for the candidate feature]

”   teaches values of one or zero for factor code feature and information code feature  [set of observed feature values], teaches “fired” vs “not fired” [observed action value], and teaches 
wherein the given set of instances is included in the set of instances (paragraph 0066 “At step 212, a feature from the set of remaining available/combined features is selected as the splitting node to be added to the sub-tree. In an embodiment, each splitting node is determined using a contingency table calculated for each candidate feature from the set of remaining available/ combined features. A contingency table is constructed using the set of historical transactions corresponding to the parent node of the sub-tree, with this set separated in subsets representing the acceptance or rejection the historical transactions, and separated by the data values of each candidate feature” teaches set of historical transactions associated with a given feature being used to construct the contingency table [wherein the given set of instances is included in the set of instances]); and 
	storing, by the one or more hardware processors, the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees (paragraph 0019 “The modeling computer stores a data model in memory representing transaction features, transaction feature values, transaction acceptance decisions and rejection decisions that the reviewer could perform or did perform, based at least in part on the set of similar transactions” and paragraph 0022 “In an embodiment, the data model is a decision tree represented by an XML file…” teaches storing a decision tree in an XML file containing historical transactions used to construct the model [storing, by the one or more hardware processors, the ensemble of decision trees as model data for subsequent access and use of the ensemble of decision trees]).
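For illustration only, the kind of contingency table described in the cited passages of Ma et al. (a binary candidate feature that "fired" or did "not fire", cross-tabulated against the reviewer's acceptance or rejection) can be sketched as below. The record layout, the feature name `factor_code_A`, and the four sample transactions are hypothetical and are not taken from the reference.

```python
from collections import Counter


def contingency_table(transactions, feature):
    """Count (feature value, reviewer decision) pairs into a 2x2 table."""
    counts = Counter(
        (t["features"][feature], t["decision"]) for t in transactions
    )
    return {
        fired: {decision: counts[(fired, decision)]
                for decision in ("accept", "reject")}
        for fired in (1, 0)
    }


# Hypothetical historical transactions: 1 means the feature fired, 0 means it did not.
history = [
    {"features": {"factor_code_A": 1}, "decision": "reject"},
    {"features": {"factor_code_A": 1}, "decision": "reject"},
    {"features": {"factor_code_A": 1}, "decision": "accept"},
    {"features": {"factor_code_A": 0}, "decision": "accept"},
]
table = contingency_table(history, "factor_code_A")
```

Each row of the resulting table corresponds to a feature value and each column to a decision, so combinations that never occur in the data appear as zero counts.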
Regarding Claim 8,
Ma et al. teaches a method comprising:
accessing, by one or more hardware processors, model data comprising an ensemble of decision trees trained on historical data for a set of instances (paragraph 0019, “The modeling computer stores a data model in memory representing transaction features, transaction feature values, transaction acceptance decisions and rejection decisions that the reviewer could perform or did perform, based at least in part on the set of similar transactions”, paragraph 0022, “ … the data model is a decision tree represented by an XML file …”  and paragraph 0026, “Many types of data are captured as part of a proposed online credit transaction and may be made available to a reviewer. As used herein, each type of data is termed a feature , such that each proposed transaction and historical transaction is a collection of data values, with one data value corresponding to one feature …” teaches decision tree comprising historical online credit transactions stored on computer for access to a reviewer [accessing, by one or more hardware processors, model data comprising an ensemble of decision trees trained on historical data for a set of instances]), 
wherein for a particular instance in the set of instances, the historical data comprises a set of observed feature values for a set of features (paragraphs 0026-0036, “[0026] Many types of data are captured as part of a proposed online credit transaction and may be made available to a reviewer. As used herein, each type of data is termed a feature, such that each proposed transaction and historical transaction is a collection of data values, with one data value corresponding to one feature. Examples of features include: 
[0027] 1. Name or type of fraud scoring model that is used to score a transaction;
[0028] 2. Country identified in a billing address for the customer,
[0029] 3. Country identified in a shipping address for the transaction;
[0030] 4. Fraud score;
[0031] 5. One or more Decision rules that triggered the manual review;
[0032] 6. One or more Factor codes, … ;
[0033] 7. One or more Information codes, … ;
[0034] 8. Merchant identifier uniquely identifying a merchant of the goods or services involved in the transaction;
[0035] 9. Name, number or other identifier of the reviewer;
[0036] 10. Organization that is performing the review …”
teaches a set of attributes associated with an online credit card transaction with different possible numeric/non-numeric values per attribute [observed feature values for a set of features]),
an observed action value (paragraph 0056, “… the set of candidate features comprise features corresponding to factor codes and information codes. … each factor code and information code candidate feature may only take on values of one (corresponding to “fired) and zero (corresponding to “not fired.”)” teaches observed values of “fired” and “not fired” for factor code feature and information code feature [observed action value]), and 
an observed outcome value for the observed action value (paragraph 0048,  “In hierarchical decision tree 100, transaction features that are more discriminating in predicting a likelihood of accepting or rejecting a transaction under review are represented as nodes closer to the root of hierarchical decision tree 100 than transaction features that are less discriminative …” teaches transaction acceptance and rejection of a transaction [observed outcome value for the observed action value]), and 
wherein each leaf node of decision trees in the ensemble is associated with a contingency table of action-outcome value pairs generated based at least in part on the historical data (paragraph 0066, “At step 212, a feature from the set of remaining available/combined features is selected as the splitting node to be added to the sub-tree. In an embodiment, each splitting node is determined using a contingency table calculated for each candidate feature from the set of remaining available/ combined features. A contingency table is constructed using the set of historical transactions corresponding to the parent node of the sub-tree, with this set separated in subsets representing the acceptance or rejection the historical transactions, and separated by the data values of each candidate feature” teaches generating splitting nodes from each feature to be added to sub-tree, with each splitting node being determined using a contingency table [wherein each leaf node of decision trees in the ensemble is associated with a contingency table of action-outcome value pairs generated based at least in part on the historical data]);
accessing, by the one or more hardware processors, input data comprising a set of new feature values for the set of features (paragraph 0018, “At a modeling computer, data items are collected related to a proposed online credit purchase transaction that has been recommended for review. A set of similar past online credit purchase transactions are identified. Each member of the set has one or more transaction features having transaction feature data values that are similar to the transaction data items of the proposed online credit purchase transaction, and a decision value specifying whether the member of the set was actually accepted or rejected by a reviewer after a review”  teaches receiving new credit purchase transaction comprising one or more transaction features at a computer [accessing, by the one or more hardware processors, input data comprising a set of new feature values for the set of features]); and
generating, by the one or more hardware processors, prediction data by processing the input data using the ensemble, the prediction data comprising a predicted action value for the set of new feature values (paragraphs 0018-0020, “[0018] At a modeling computer, data items are collected related to a proposed online credit purchase transaction that has been recommended for review. A set of similar past online credit purchase transactions are identified. Each member of the set has one or more transaction features having transaction feature data values that are similar to the transaction data items of the proposed online credit purchase transaction, and a decision value specifying whether the member of the set was actually accepted or rejected by a reviewer after review. 
[0019] The modeling computer stores a data model in memory representing transaction features, transaction feature values, transaction acceptance decisions and rejection decisions that the reviewer could perform or did perform, based at least in part on the set of similar transactions.
[0020]  … the data model is used to automatically determine a likelihood value representing a particular decision of whether the proposed online credit card transaction would be accepted or rejected by the reviewer of the merchant if the reviewer actually reviewed the transaction data”  and paragraph 0022 “ … the data model is a decision tree represented by an XML file …”  teaches computing likelihood value using decision tree representing a decision regarding proposed credit card transaction based upon action values of transaction features [generating, by the one or more hardware processors, prediction data by processing the input data using the ensemble, the prediction data comprising a predicted action value for the set of new feature values]). 
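For illustration only, the contingency table described in paragraph 0066 of Ma et al. (historical transactions separated by accept/reject decision and by the data values of a candidate feature) might be sketched as follows. The function name `contingency_table`, the record layout, and the sample data are hypothetical and do not come from either the claims or the cited reference.

```python
from collections import Counter


def contingency_table(transactions, feature):
    """Count historical transactions by (decision, feature value).

    `transactions` is a list of dicts; each dict holds a "decision" key
    and one key per feature. This layout is an assumption made for the
    sketch, not a structure taken from Ma et al.
    """
    counts = Counter((t["decision"], t[feature]) for t in transactions)
    decisions = sorted({d for d, _ in counts})
    values = sorted({v for _, v in counts})
    # Rows are decisions, columns are the observed feature values.
    return {d: {v: counts.get((d, v), 0) for v in values} for d in decisions}


# Hypothetical history: "fired" is a binary candidate feature.
history = [
    {"decision": "accept", "fired": 0},
    {"decision": "accept", "fired": 0},
    {"decision": "accept", "fired": 1},
    {"decision": "reject", "fired": 1},
    {"decision": "reject", "fired": 1},
]
table = contingency_table(history, "fired")
```

In this sketch each row of the table corresponds to a reviewer decision and each column to a value of the candidate feature, which is one plausible reading of the quoted passage.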
Regarding Claim 13,

	Ma et al.  further teaches a non-transitory computer-readable medium comprising instructions that, when executed by one or more hardware processors of a machine, cause the
machine to perform operations (paragraphs 0116-0117  “[0116] The term “storage media' as used herein refers to any media that store data and/or instructions that cause a machine to operation in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 710. Volatile media includes dynamic memory. Such as main memory 706. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM,
any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, and EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.  [0117] Storage media is distinct from but may be used in conjunction with transmission media …”).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 2, 6-7, 14, and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Ma et al.  (US 2012/0066125 A1) in view of López-Chau et al. (“Fisher’s decision tree”).
Regarding Claim 2,
Ma et al. teaches the method of claim 1. 
	Ma et al. further teaches wherein the generating the ensemble of decision trees comprises, for a particular decision tree in the ensemble: generating a subsample of instances from the set of instances (paragraph 0018 “At a modeling computer, data items are collected related to a proposed online credit purchase transaction that has been recommended for review. A set of similar past online credit purchase transactions are identified. Each member of the set has one or more transaction features having transaction feature data values that are similar to the transaction data items of the proposed online credit purchase transaction, and a decision value specifying whether the member of the set was actually accepted or rejected by a reviewer after review” teaches identifying a subset of past online credit card transactions similar to the proposed online credit card transaction recommended for review [generating a subsample of instances from the set of instances]),
generating the particular decision tree from a root non-leaf node to a plurality of leaf nodes based at least in part on the subsample of instances (paragraph 0046, “… creating and storing the stored data model may result in creating and storing a hierarchical decision tree.  In an embodiment, one pre-determined transaction feature is associated with a root node of the decision tree.  The root node corresponds to a top-most decision rule to be applied to data values of a first transaction feature.  A selected set of child nodes associated with corresponding transaction features are also pre-determined” teaches determining a selected set of child nodes of a root node associated with transaction feature [generating the particular decision tree from a root non-leaf node to a plurality of leaf nodes based at least in part on the subsample of instances ]),
wherein the generating the particular decision tree comprises, for each particular non-leaf node of the particular decision tree: generating a set of candidate features from the set of features (paragraph 0048, “ … In hierarchical decision tree 100, a subset of transaction features have been preselected for use at the top levels of hierarchical decision tree 100. Alternatively, determining and selecting the subset of transaction features, which are more discriminating, could be performed using the methods described below for the determination of sub-trees…”  and  paragraph 0056,  “…  the set of candidate features comprise features corresponding to factor codes and information codes. In an embodiment each factor code and information code candidate feature may only take on values of one (corresponding to “fired) and Zero (corresponding to “not fired.”)”  teaches determining a subset of candidate transaction features [generating a set of candidate features from the set of features]); and
… associating the particular non-leaf node with a contingency table of action-outcome value pairs generated based at least in part on the particular set of instances of the particular non-leaf node ( paragraph 0066 “At step 212, a feature from the set of remaining available/combined features is selected as the splitting node to be added to the sub-tree. In an embodiment, each splitting node is determined using a contingency table calculated for each candidate feature from the set of remaining available/ combined features. A contingency table is constructed using the set of historical transactions corresponding to the parent node of the sub-tree, with this set separated in subsets representing the acceptance or rejection the historical transactions, and separated by the data values of each candidate feature” teaches a contingency table being constructed at each splitting node using the set of historical transactions corresponding to the parent node of sub-tree [associating the particular non-leaf node with a contingency table of action-outcome value pairs generated based at least in part on the particular set of instances of the particular non-leaf node]).
Ma et al. does not appear to explicitly teach for each particular candidate feature in the set of candidate features, determining a set of possible splits for splitting a particular set of instances of the particular non-leaf node into a left set of instances for a left non-leaf node and a right set of instances for a right non-leaf node, determining, from the set of possible splits, an optimal split for the particular set of instances; … determining whether the optimal split satisfies a leaf node criterion; in response to determining that the optimal split does not satisfy the leaf node criterion, splitting the particular set of instances, according to the optimal split, into the left set of instances and the right set of instances; and in response to determining that the optimal split satisfies the leaf node criterion, designating the particular non-leaf node to be a leaf node.
López-Chau et al. teaches for each particular candidate feature in the set of candidate features, determining a set of possible splits for splitting a particular set of instances of the particular non-leaf node into a left set of instances for a left non-leaf node and a right set of instances for a right non-leaf node (p. 6287, section 4, paragraph 8, “Once the tree has been induced each non terminal node contains the vector ω and the optimal split point, the label of unseen examples is predicted by following down the tree down to a leaf as follows:  
If a current node is a leaf then the label of the example is the class associated to the leaf, stop.
Otherwise, project the new example on ω, and go to left or right node according to the following rule
If projection is greater than split point go to right node, 
otherwise go to left node,
recursively repeat this process."
teaches right set of new examples having their projections greater than split point and left set of new examples having projections less than split point [for each particular candidate feature in the set of candidate features, determining a set of possible splits for splitting a particular set of instances of the particular non-leaf node into a left set of instances for a left non-leaf node and a right set of instances for a right non-leaf node]),
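determining, from the set of possible splits, an optimal split for the particular set of instances (p. 6287, Algorithm 2 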
[Image: López-Chau et al., p. 6287, Algorithm 2 (media_image2.png)]
);
determining whether the optimal split satisfies a leaf node criterion (p. 6285, Section 3.1, paragraph 4, “The general methodology to build a decision tree is as follows: beginning from root node (it contains X), split the data into two or more smaller disjoint subsets, each subset should contain all or most of its elements with the same class, however this is not necessary.
Each subset can be considered as a leaf whether certain criterion is satisfied (usually a minimum number of objects within the node or a level of impurity), in this case the partition process is stopped and the node is labeled according majority class in that node, otherwise the process is recursively applied on each subset. A branch or link is created from a node to each one of its partitions” teaches leaf node criterion satisfied based either on a minimum number of objects within the node or minimum level of impurity [determining whether the optimal split satisfies a leaf node criterion]);
in response to determining that the optimal split does not satisfy the leaf node criterion, splitting the particular set of instances, according to the optimal split, into the left set of instances and the right set of instances (p. 6287, section 4, paragraph 8, “Once the tree has been induced each non terminal node contains the vector ω and the optimal split point, the label of unseen examples is predicted by following down the tree down to a leaf as follows:  
If a current node is a leaf then the label of the example is the class associated to the leaf, stop.
Otherwise, project the new example on ω, and go to left or right node according to the following rule
If projection is greater than split point go to right node, 
otherwise go to left node,
recursively repeat this process.”
 teaches splitting of instances according to step 2 [in response to determining that the optimal split does not satisfy the leaf node criterion, splitting the particular set of instances, according to the optimal split, into the left set of instances and the right set of instances]); and 
in response to determining that the optimal split satisfies the leaf node criterion, designating the particular non-leaf node to be a leaf node (p. 6287, section 4, paragraph 5, “… When a node can not be split any more because it is pure or there are not enough objects, the node becomes a leaf of the tree. The leaf is assigned to the class having the highest probability.” teaches designation of node as a leaf node based on purity and having less than minimum number of objects [in response to determining that the optimal split satisfies the leaf node criterion, designating the particular non-leaf node to be a leaf node]).
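As a reading aid, the traversal rule and the leaf criterion quoted above from López-Chau et al. might be sketched as follows. The `Node` class, the `min_objects` default, and the sample tree are hypothetical; how ω and the split point are learned (Algorithm 2 of the reference) is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """A tree node: leaves carry a label, internal nodes carry omega and a split point."""
    label: Optional[str] = None
    omega: Optional[Sequence[float]] = None
    split_point: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def satisfies_leaf_criterion(labels, min_objects=2):
    """Leaf criterion from the quoted text: too few objects, or a pure node.

    The threshold value is an assumption chosen for the example.
    """
    return len(labels) < min_objects or len(set(labels)) <= 1


def predict(node, x):
    """Follow the tree down to a leaf, projecting x on omega at each node."""
    while node.label is None:
        projection = sum(w * xi for w, xi in zip(node.omega, x))
        # Greater than the split point goes right, otherwise left.
        node = node.right if projection > node.split_point else node.left
    return node.label


# Hypothetical one-level tree for demonstration.
tree = Node(
    omega=[1.0, -1.0],
    split_point=0.0,
    left=Node(label="reject"),
    right=Node(label="accept"),
)
```

Here `predict(tree, [2.0, 1.0])` projects to 1.0 and goes right, while `predict(tree, [1.0, 3.0])` projects to -2.0 and goes left.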
	Ma et al. and López-Chau et al. are considered analogous art because they are directed to training decision tree classifiers in order to produce comprehensible models. 
	In view of the teachings of Ma et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of López-Chau et al. at the time the application was filed in order to induce oblique decision trees whose accuracy, size, number of leaves, and training time are provided at a lower computational cost than univariate decision trees (cf. López-Chau et al., p. 6283, section 1, paragraph 7, “The oblique (or multivariate) trees are DT whose separating hyperplanes are not necessarily parallel to axes, in contrast, the hyperplanes can have an arbitrary direction. The induced oblique DT are usually smaller and in certain cases they achieve better classification accuracy than univariate DT…”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Ma et al. discloses this as a necessary activity for the taught invention (cf. Ma et al., paragraph 0079  “… the hierarchical decision tree is traversed using data for the transaction under review, to obtain a set of “neighbors' of the transaction under review that share the set of transaction feature values. For example, traversal involves starting at a root node of the decision tree, determining what feature the node represents, finding the value for that feature in the data for the transaction under review, …”).
Regarding Claim 6,
Ma et al. in view of López-Chau et al. teaches the method of claim 2.
Ma et al. further teaches wherein the determining the set of possible splits for each particular candidate feature in the set of candidate feature comprises subgrouping values of the particular candidate features (paragraph 0064, “ … In an embodiment, each feature of the pair of available features takes on data values of zero or one, and data values for the combined available feature are set equal to the logical 'or' value of the data values of the pair of the available features used to form the combined available feature.” teaches subgrouping of data values of features using logical “or” function [wherein the determining the set of possible splits for each particular candidate feature in the set of candidate feature comprises subgrouping values of the particular candidate features]).
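The logical "or" combination of two binary features quoted from paragraph 0064 of Ma et al. might be illustrated with the short sketch below. The function name `combine_features` and the sample columns are hypothetical, chosen only to show the operation.

```python
def combine_features(a, b):
    """Combine two binary feature columns element-wise with logical OR."""
    if len(a) != len(b):
        raise ValueError("feature columns must have the same length")
    return [int(bool(x) or bool(y)) for x, y in zip(a, b)]


# Hypothetical columns: 1 means "fired", 0 means "not fired".
factor_code = [0, 1, 0, 1]
info_code = [0, 0, 1, 1]
combined = combine_features(factor_code, info_code)
```

The combined column is 1 wherever either input feature is 1, so transactions are grouped by whether at least one of the two codes fired.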
Regarding Claim 7,
Ma et al. in view of López-Chau et al. teaches the method of claim 2.
Ma et al. further teaches wherein the determining the set of possible splits for each particular candidate feature in the set of candidate feature comprises ordering values of the particular candidate features (paragraph 0064, “ … In an embodiment, each feature of the pair of available features takes on data values of zero or one, and data values for the combined available feature are set equal to the logical 'or' value of the data values of the pair of the available features used to form the combined available feature.” teaches ordering values of features via assignment of zero or one to each feature value [wherein the determining the set of possible splits for each particular candidate feature in the set of candidate feature comprises ordering values of the particular candidate features]).
Regarding Claim 14,
	Claim 14 is substantially similar to claim 2 and therefore is rejected on the same ground as claim 2.  Claim 14 is directed to a “non-transitory computer-readable medium” that corresponds to the method of claim 2. 
	Ma et al.  further teaches a non-transitory computer-readable medium comprising instructions that, when executed by one or more hardware processors of a machine, cause the
machine to perform operations (paragraphs 0116-0117  “[0116] The term “storage media' as used herein refers to any media that store data and/or instructions that cause a machine to operation in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 710. Volatile media includes dynamic memory. Such as main memory 706. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM,
any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, and EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.  [0117] Storage media is distinct from but may be used in conjunction with transmission media …”).
Regarding Claim 18,
	Claim 18 is substantially similar to claim 6 and therefore is rejected on the same ground as claim 6.  Claim 18 is directed to a “non-transitory computer-readable medium” that corresponds to the method of claim 6. 
	Ma et al.  further teaches a non-transitory computer-readable medium comprising instructions that, when executed by one or more hardware processors of a machine, cause the
machine to perform operations (paragraphs 0116-0117  “[0116] The term “storage media' as used herein refers to any media that store data and/or instructions that cause a machine to operation in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 710. Volatile media includes dynamic memory. Such as main memory 706. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM,
any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, and EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.  [0117] Storage media is distinct from but may be used in conjunction with transmission media …”).
Regarding Claim 19,
	Claim 19 is substantially similar to claim 7 and therefore is rejected on the same ground as claim 7.  Claim 19 is directed to a “non-transitory computer-readable medium” that corresponds to the method of claim 7. 
	Ma et al.  further teaches a non-transitory computer-readable medium comprising instructions that, when executed by one or more hardware processors of a machine, cause the
machine to perform operations (paragraphs 0116-0117  “[0116] The term “storage media' as used herein refers to any media that store data and/or instructions that cause a machine to operation in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 710. Volatile media includes dynamic memory. Such as main memory 706. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM,
any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, and EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.  [0117] Storage media is distinct from but may be used in conjunction with transmission media …”).
Claims 3-5 and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Ma et al.  (US 2012/0066125 A1) in view of López-Chau et al. (“Fisher’s decision tree”) and in further view of Stokes et al. (US 2012/0323829 A1).
Regarding Claim 3, 
	Ma et al. in view of López-Chau et al. teaches the method of claim 2.
	Ma et al. in view of López-Chau et al. does not appear to explicitly teach wherein the determining the set of possible splits for the particular candidate feature, into the left set of instances and into the right set of instances, is based at least in part on a split criterion at least defined by: $S((L, R); P) = C(P) - (C(L) + C(R))$, where L represents the left set of instances, where R represents the right set of instances, where P represents the particular set of instances such that $P = L \cup R$, where for k number of actions, $C(U) = \varphi(r_0(U) + 1) + \varphi(r_1(U) + 1) - \varphi(N(U) + 1) + \sum_{i=1}^{k} \varphi(c_i(U) + 1) - \varphi(a_{0i}(U) + 1) - \varphi(a_{1i}(U) + 1)$, where $\varphi$ represents a U represents the contingency table associated with the particular non-leaf node, where $r_i(U)$ represents a sum of row i, $c_j(U)$ represents a sum of column j, where $a_{ij}(U)$ represents an entry of the contingency table U at row i, column j, and where N(U) represents a sum of row sums and column sums. 
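The split criterion recited above might be evaluated numerically along the lines of the following sketch. Several points are assumptions made only for illustration: the definition of φ is incomplete in the text, so the log-gamma function (`math.lgamma`) is assumed; N(U) is taken as the grand total of the table entries; the summation is read as covering all three terms indexed by i; and the table is laid out as two outcome rows by k action columns. The function names `cost` and `split_score` are hypothetical.

```python
from math import lgamma


def cost(table, phi=lgamma):
    """C(U) for a 2 x k table (rows: outcomes 0 and 1; columns: actions).

    phi defaults to log-gamma, which is an assumption; the source text
    does not give a complete definition of phi.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)  # assumed reading of N(U)
    value = phi(row_sums[0] + 1) + phi(row_sums[1] + 1) - phi(total + 1)
    for i, c in enumerate(col_sums):
        value += phi(c + 1) - phi(table[0][i] + 1) - phi(table[1][i] + 1)
    return value


def split_score(left, right, phi=lgamma):
    """S((L, R); P) = C(P) - (C(L) + C(R)), with P the element-wise sum of L and R."""
    parent = [[a + b for a, b in zip(l_row, r_row)]
              for l_row, r_row in zip(left, right)]
    return cost(parent, phi) - (cost(left, phi) + cost(right, phi))


# Hypothetical tables for the left and right sets of instances.
left_table = [[1, 0], [0, 0]]
right_table = [[0, 0], [0, 1]]
score = split_score(left_table, right_table)
```

Under these assumptions C(U) equals the log of the hypergeometric probability of the table given its margins, so a candidate split can be scored by comparing the parent table against its two child tables.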
Stokes et al. teaches wherein the determining the set of possible splits for the particular candidate feature, into the left set of instances and into the right set of instances, is based at least in part on a split criterion at least defined by: $S((L, R); P) = C(P) - (C(L) + C(R))$, where L represents the left set of instances, where R represents the right set of instances, where P represents the particular set of instances such that $P = L \cup R$,
where for k number of actions, $C(U) = \varphi(r_0(U) + 1) + \varphi(r_1(U) + 1) - \varphi(N(U) + 1) + \sum_{i=1}^{k} \varphi(c_i(U) + 1) - \varphi(a_{0i}(U) + 1) - \varphi(a_{1i}(U) + 1)$,
where $\varphi$ represents a log function, where U represents the contingency table associated with the particular non-leaf node, where $r_i(U)$ represents a sum of row i, $c_j(U)$ represents a sum of column i, where $a_{ij}(U)$
represents an entry of the contingency table U at row i, column j, and where N(U) represents a sum of row sums and column sums (paragraphs 0028-0029 “[0028] For an example scenario considering the unpacked string feature, the training set may include a single example (e.g. file) containing the unpacked string "XYZ", and the file may be associated with a malware family Rbot. A classification algorithm may then learn to predict that all files containing string "XYZ" are likely to … belong to the Rbot family. To choose the best subset of features from the large number of potential features, a feature selection based on $2 \times 2$ contingency tables may be employed … The contingency table for the potential string feature "XYZ" has four elements: A, B, C, and D. A is the number of files which are not determined to be Rbot and do not include the string "XYZ", while D is the number of files of type Rbot which do include the string "XYZ". B and C are the number of files determined (not determined) to be of type Rbot which do not (do) include string "XYZ", respectively … After the contingency table has been computed for each potential feature f, a score R(f) can be evaluated according to: R(f) = Log Γ(A+1) + Log Γ(B+1) + Log Γ(C+1) + Log Γ(D+1) + Log Γ(A+B+C+D+4) - (Log Γ(A+B+2) + Log Γ(C+D+2) + Log Γ(A+C+2) + Log Γ(A+C+2) + Log Γ(B+D+2) + Log Γ(4)), where Log Γ(x) is the log of the Gamma function Γ of quantity x.
[0029] The set of potential features can be ranked according to the scores for each class, and the top features selected, which best discriminate between each class (malware family, generic malware, benign program family, and generic benign)” teaches split criterion defined by score R(f) [wherein the determining the set of possible splits for the particular candidate feature, into the left set of instances and into the right set of instances, is based at least in part on a split criterion at least defined by: $S((L, R); P) = C(P) - (C(L) + C(R))$], where R(f) determines the features that provide the best discrimination between malware family/generic malware [where L represents the left set of instances] and benign program family/generic benign [where R represents the right set of instances],
teaches training data comprising $A + B + C + D$ files [where P represents the particular set of instances such that $P = L \cup R$],
teaches whether the file contains/doesn’t contain the string “XYZ” as two separate actions [where for k number of actions],
teaches score R(f) representing a linear combination of log-gamma functions [$C(U) = \varphi(r_0(U) + 1) + \varphi(r_1(U) + 1) - \varphi(N(U) + 1) + \sum_{i=1}^{k} \varphi(c_i(U) + 1) - \varphi(a_{0i}(U) + 1) - \varphi(a_{1i}(U) + 1)$, where $\varphi$ represents a log function],
teaches a $2 \times 2$ contingency table constructed for each feature f [where U represents the contingency table associated with the particular non-leaf node],
teaches B + D as the sum of row 1 in the contingency table, representing the total number of files determined to be of type Rbot [where $r_i(U)$ represents a sum of row i], and A + C as the sum of row 2 in the contingency table, representing the total number of files NOT determined to be of type Rbot [where $r_i(U)$ represents a sum of row i],
teaches A + B as the sum of column 1 of the contingency table, representing the total number of files that do NOT include the string feature “XYZ” [$c_j(U)$ represents a sum of column i], and C + D as the sum of column 2 of the contingency table, representing the total number of files that do include the string feature “XYZ” [$c_j(U)$ represents a sum of column i],
teaches A as the entry in row 2, column 1 of the contingency table, being the number of files NOT determined to be Rbot that do NOT include the string “XYZ” [where $a_{ij}(U)$ represents an entry of the contingency table U at row i, column j],
teaches C as the entry in row 2, column 2 of the contingency table, being the number of files NOT determined to be Rbot that include the string “XYZ” [where $a_{ij}(U)$ represents an entry of the contingency table U at row i, column j],
teaches B as the entry in row 1, column 1 of the contingency table, being the number of files determined to be Rbot that do NOT include the string “XYZ” [where $a_{ij}(U)$ represents an entry of the contingency table U at row i, column j],
teaches D as the entry in row 1, column 2 of the contingency table, being the number of files determined to be Rbot that include the string “XYZ” [where $a_{ij}(U)$ represents an entry of the contingency table U at row i, column j], and
[where N(U) represents a sum of row sums and column sums]).
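For illustration only, the score R(f) quoted above from Stokes et al. can be sketched in Python. The function name `stokes_score` and its argument names are hypothetical, and `math.lgamma` stands in for Log Γ. The quoted passage lists Log Γ(A+C+2) twice; this sketch assumes that each of the four marginal terms (A+B, C+D, A+C, B+D) is meant to appear once, which is an assumption and not a statement of what the reference recites.

```python
from math import lgamma


def stokes_score(a, b, c, d):
    """Score R(f) for a 2x2 contingency table with counts A, B, C, D.

    Assumption: each marginal term appears exactly once in the
    subtracted group (the quoted text repeats the A+C term).
    """
    # Terms added: one per cell, plus the grand-total term.
    added = (lgamma(a + 1) + lgamma(b + 1) + lgamma(c + 1) + lgamma(d + 1)
             + lgamma(a + b + c + d + 4))
    # Terms subtracted: one per marginal, plus the constant Log Gamma(4).
    subtracted = (lgamma(a + b + 2) + lgamma(c + d + 2)
                  + lgamma(a + c + 2) + lgamma(b + d + 2) + lgamma(4))
    return added - subtracted
```

Under this sketch, a table whose counts fall on one diagonal, e.g. `stokes_score(10, 0, 0, 10)`, scores higher than an evenly spread table with the same total, e.g. `stokes_score(5, 5, 5, 5)`.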
Any limitation that recites “at least” has been interpreted as requiring one of the alternatives and not all of the alternatives.
Ma et al., López-Chau et al., and Stokes et al. are considered analogous art because they are directed to reliable methods of classifying data with low false positive rates.
In view of the teachings of Ma et al. in view of López-Chau et al., it would have been obvious for a person of ordinary skill in the art to apply the teachings of Stokes et al. at the time the application was filed in order to induce decision trees that can improve automated malware detection (cf. Stokes et al., [0030], “Once the features for the baseline malware classification system are determined, one or more labeled datasets may be constructed based on logs extracted from an instrumented virtual machine. The logs for each individual file may be processed and any features corresponding to those selected by the feature selection system may be used to construct a training vector whose label is determined to be one of the predefined number of classes, the generic malware class, a specific benign program or the generic benign class…”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Ma et al. discloses this as a necessary activity for the taught invention (cf. Ma et al., paragraph 0079 “… the hierarchical decision tree is traversed using data for the transaction under review, to obtain a set of "neighbors" of the transaction under review that share the set of transaction feature values. For example, traversal involves starting at a root node of the decision tree, determining what feature the node represents, finding the value for that feature in the data for the transaction under review, and determining which edge to follow based on the value in comparison to a decision represented in the node. Following an edge leads to a next 
Regarding Claim 4,
Ma et al. in view of López-Chau et al. and in further view of Stokes et al. teaches the method of claim 3.
Ma et al. in view of López-Chau et al. does not appear to explicitly teach wherein the determining the optimal split is further based at least in part on an optimal split criterion at least defined by: $(L^*, R^*) = \arg\max_{(L, R)} S((L, R); P)$, where $L^*$ represents the left set of instances according to the optimal split and $R^*$ represents the right set of instances according to the optimal split.
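As a hedged illustration of the recited criterion, the following Python sketch evaluates S((L, R); P) = C(P) - (C(L) + C(R)) for candidate splits and returns the maximizing pair. Several points are assumptions made only for this example: the table layout (two outcome rows, k action columns), the subscripts read as r_0, r_1, a_{0i}, and a_{1i}, the summation covering all three terms that follow it, N(U) computed literally as row sums plus column sums, and the default φ = `math.log` (a caller may pass `math.lgamma` instead). All function names are hypothetical.

```python
from math import log


def contingency_table(instances, k):
    """Count (action, outcome) pairs into a 2 x k table.

    Assumed layout: rows are outcomes 0/1, columns are actions 0..k-1.
    """
    table = [[0] * k for _ in range(2)]
    for action, outcome in instances:
        table[outcome][action] += 1
    return table


def cost(table, phi=log):
    """C(U) under the assumptions stated in the lead-in."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums) + sum(col_sums)  # N(U), read literally
    total = phi(row_sums[0] + 1) + phi(row_sums[1] + 1) - phi(n + 1)
    for i, c in enumerate(col_sums):
        total += phi(c + 1) - phi(table[0][i] + 1) - phi(table[1][i] + 1)
    return total


def split_score(left, right, k, phi=log):
    """S((L, R); P) = C(P) - (C(L) + C(R)), with P = L union R."""
    parent = contingency_table(left + right, k)
    return cost(parent, phi) - (cost(contingency_table(left, k), phi)
                                + cost(contingency_table(right, k), phi))


def best_split(candidates, k, phi=log):
    """Return the (L, R) pair that maximizes S((L, R); P)."""
    return max(candidates, key=lambda lr: split_score(lr[0], lr[1], k, phi))
```

Each candidate is a pair of lists of `(action, outcome)` tuples; `best_split` simply applies the arg max over whatever candidate pairs the caller supplies.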
Stokes et al. teaches wherein the determining the optimal split is further based at least in part on an optimal split criterion at least defined by: $(L^*, R^*) = \arg\max_{(L, R)} S((L, R); P)$, where $L^*$ represents the left set of instances according to the optimal split and $R^*$
                    represents the right set of instances according to the optimal split (paragraphs 0028-0029 “[0028] For an example scenario considering the unpacked string feature, the training set may include a single example (e.g. file) containing the unpacked string "XYZ”, and the file may be associated with a malware family Rbot. A classification algorithm may then learn to predict that all files containing string “XYZ” are likely to … belong to the Rbot family.  To choose the best subset of features from the large number of potential features, a feature selection based on                         
$2 \times 2$
                     contingency tables may be employed … The contingency table for the potential string feature "XYZ has four elements: A, B, C, and D. A is the number of files which are not determined to be Rbot and do not include the string "XYZ", while D is the number of files of type Rbot which do include the string “XYZ”. B and C are the number of files determined (not determined) to be of type Rbot which do not (do) include string "XYZ”, respectively … After the contingency table has been computed for each potential feature f, a score R(f) can be evaluated according to:  R(f) = Log Г(A+1)+ Log Г(B+1)+ Log Г(C+1)+ Log Г(D+1)+ Log Г(A+B+C+D+4)-( Log Г(A+B+2)+ Log Г(C+D+2)+ Log Г(A+C+2)+ Log Г(A+C+2)+ Log Г(B+D+2)+ Log Г(4)), where Log Г(x) is the log of the Gamma function Г of quantity x.
[0029] The set of potential features can be ranked according to the scores for each class, and the top features selected, which best discriminate between each class (malware family, generic malware, benign program family, and generic benign)….” teaches that the maximum score R(f) determines the features that provide the best discrimination between the left set of instances and the right set of instances, wherein the left set of instances comprises malware family/generic malware and the right set of instances comprises benign program family/generic benign [wherein the determining the optimal split is further based at least in part on an optimal split criterion at least defined by: $(L^{*}, R^{*}) = \arg\max_{(L,R)} S((L,R); P)$, where $L^{*}$ represents the left set of instances according to the optimal split and $R^{*}$ represents the right set of instances according to the optimal split]).
Any limitation that recites “at least” has been interpreted as requiring one of the alternatives and not all of the alternatives.
	Ma et al., López-Chau et al., and Stokes et al. are combinable for the same rationale as set forth above with respect to claim 3.
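For illustration, the contingency-table score quoted above from Stokes et al. and the arg max selection over candidates can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of any cited reference: the names `score_2x2` and `best_candidate` are hypothetical, each of the four marginal terms is used once (the repeated Log Г(A+C+2) term in the quoted passage is assumed to be a transcription duplicate), and the candidate counts are made up.

```python
from math import lgamma

def score_2x2(a, b, c, d):
    """Log-gamma score for a 2x2 contingency table with cells A, B, C, D.

    Assumption: each marginal (A+B, C+D, A+C, B+D) contributes one term.
    """
    cells = lgamma(a + 1) + lgamma(b + 1) + lgamma(c + 1) + lgamma(d + 1)
    total = lgamma(a + b + c + d + 4)
    marginals = (lgamma(a + b + 2) + lgamma(c + d + 2)
                 + lgamma(a + c + 2) + lgamma(b + d + 2))
    return cells + total - (marginals + lgamma(4))

def best_candidate(tables):
    """Return the key whose table has the maximum score (the arg max)."""
    return max(tables, key=lambda name: score_2x2(*tables[name]))

# Hypothetical candidates with made-up (A, B, C, D) counts.
candidates = {
    "XYZ": (10, 0, 0, 10),  # presence of the feature tracks the class
    "ABC": (5, 5, 5, 5),    # presence of the feature is unrelated to the class
}
print(best_candidate(candidates))
```

In this sketch a table whose counts concentrate on the diagonal scores higher than a table with evenly spread counts, so the arg max picks the more discriminating candidate.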
Regarding Claim 5,
Ma et al. in view of López-Chau et al. and in further view of Stokes et al. teaches the method of claim 4.
Ma et al. in view of López-Chau et al. does not appear to explicitly teach wherein the leaf node criterion specifies that the optimal split satisfies a condition at least defined by: $S((L,R); P) < 0$.
	Stokes et al. teaches wherein the leaf node criterion specifies that the optimal split satisfies a condition at least defined by: $S((L,R); P) < 0$
(paragraph 0028 “For an example scenario considering the unpacked string feature, the training set may include a single example (e.g. file) containing the unpacked string "XYZ”, and the file may be associated with a malware family Rbot. A classification algorithm may then learn to predict that all files containing string “XYZ” are likely to … belong to the Rbot family.  To choose the best subset of features from the large number of potential features, a feature selection based on $2 \times 2$ contingency tables may be employed … The contingency table for the potential string feature "XYZ has four elements: A, B, C, and D. A is the number of files which are not determined to be Rbot and do not include the string "XYZ", while D is the number of files of type Rbot which do include the string “XYZ”. B and C are the number of files determined (not determined) to be of type Rbot which do not (do) include string "XYZ”, respectively … After the contingency table has been computed for each potential feature f, a score R(f) can be evaluated according to:  R(f) = Log Г(A+1)+ Log Г(B+1)+ Log Г(C+1)+ Log Г(D+1)+ Log Г(A+B+C+D+4)-( Log Г(A+B+2)+ Log Г(C+D+2)+ Log Г(A+C+2)+ Log Г(A+C+2)+ Log Г(B+D+2)+ Log Г(4)), where Log Г(x) is the log of the Gamma function Г of quantity x” teaches score R(f) being a negative quantity in most instances since Г(x) is an increasing function of x and Log (Г(x)) is an increasing function of Г(x) [wherein the leaf node criterion specifies that the optimal split satisfies a condition at least defined by: $S((L,R); P) < 0$]).
	Any limitation that recites “at least” has been interpreted as requiring one of the alternatives and not all of the alternatives.
	Ma et al., López-Chau et al., and Stokes et al. are combinable for the same rationale as set forth above with respect to claim 3.
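For illustration, the leaf node criterion discussed above (stop splitting when the best available split scores below zero) can be sketched as follows. This is a minimal sketch under stated assumptions: `choose_split` and `toy_score` are hypothetical names, and `toy_score` is a made-up stand-in for $S((L,R); P)$, not the score of the claims or of any cited reference.

```python
def choose_split(instances, candidate_splits, score):
    """Return the best (left, right) partition, or None if the node is a leaf.

    Assumption: score((left, right), parent) plays the role of S((L,R); P).
    """
    best, best_score = None, float("-inf")
    for split in candidate_splits:
        left = [x for x in instances if split(x)]
        right = [x for x in instances if not split(x)]
        if not left or not right:
            continue  # a split must send instances to both sides
        s = score((left, right), instances)
        if s > best_score:
            best, best_score = (left, right), s
    # Leaf node criterion: even the optimal split scores below zero.
    if best is None or best_score < 0:
        return None
    return best

def toy_score(split, parent):
    """Made-up score: gap between mean outcomes minus a fixed penalty."""
    left, right = split
    mean = lambda rows: sum(y for _, y in rows) / len(rows)
    return abs(mean(left) - mean(right)) - 0.5

# Hypothetical (feature value, outcome) instances.
data = [(1, 0), (2, 0), (3, 1), (4, 1)]
print(choose_split(data, [lambda x: x[0] <= 2], toy_score))
print(choose_split(data, [lambda x: x[0] % 2 == 1], toy_score))
```

The first call returns a partition because its score is non-negative; the second returns `None`, so the node would be kept as a leaf.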
Regarding Claim 15,
	Claim 15 is substantially similar to claim 3 and therefore is rejected on the same ground as claim 3.  Claim 15 is directed to a “non-transitory computer-readable medium” that corresponds to the method of claim 3. 
	Ma et al.  further teaches a non-transitory computer-readable medium comprising instructions that, when executed by one or more hardware processors of a machine, cause the
machine to perform operations (paragraphs 0116-0117  “[0116] The term “storage media' as used herein refers to any media that store data and/or instructions that cause a machine to operation in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 710. Volatile media includes dynamic memory. Such as main memory 706. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM,
any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, and EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.  [0117] Storage media is distinct from but may be used in conjunction with transmission media …”).
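Regarding Claim 16,
	Claim 16 is substantially similar to claim 4 and therefore is rejected on the same ground as claim 4.  Claim 16 is directed to a “non-transitory computer-readable medium” that corresponds to the method of claim 4. 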
	Ma et al.  further teaches a non-transitory computer-readable medium comprising instructions that, when executed by one or more hardware processors of a machine, cause the
machine to perform operations (paragraphs 0116-0117  “[0116] The term “storage media' as used herein refers to any media that store data and/or instructions that cause a machine to operation in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 710. Volatile media includes dynamic memory. Such as main memory 706. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM,
any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, and EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.  [0117] Storage media is distinct from but may be used in conjunction with transmission media …”).
Regarding Claim 17,
	Claim 17 is substantially similar to claim 5 and therefore is rejected on the same ground as claim 5.  Claim 17 is directed to a “non-transitory computer-readable medium” that corresponds to the method of claim 5. 
	Ma et al.  further teaches a non-transitory computer-readable medium comprising instructions that, when executed by one or more hardware processors of a machine, cause the
machine to perform operations (paragraphs 0116-0117  “[0116] The term “storage media' as used herein refers to any media that store data and/or instructions that cause a machine to operation in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 710. Volatile media includes dynamic memory. Such as main memory 706. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM,
any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, and EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.  [0117] Storage media is distinct from but may be used in conjunction with transmission media …”).
Claims 9 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Ma et al.  (US 2012/0066125 A1) in view of Xiao et al. (US 2015/0324398 A1).
Regarding Claim 9,
	Ma et al. teaches the method of claim 8.
	Ma et al. further teaches determining a set of particular contingency tables of action-outcome value pairs from leaf nodes of the decisions trees in the ensemble by processing the set of new feature values using each particular decision tree, in the ensemble, to determine a particular contingency table of a leaf node of the particular decision tree (paragraph 0066 “At step 212, a feature from the set of remaining available/combined features is selected as the splitting node to be added to the sub-tree … each splitting node is determined using a contingency table calculated for each candidate feature from the set of remaining available/ combined features. A contingency table is constructed using the set of historical transactions corresponding to the parent node of the sub-tree, with this set separated in subsets representing the acceptance or rejection the historical transactions, and separated by the data values of each candidate feature” teaches construction of a set of contingency tables used to determine a feature from the set of remaining available features [determining a set of particular contingency tables of action-outcome value pairs from leaf nodes of the decisions trees in the ensemble by processing the set of new feature values using each particular decision tree, in the ensemble, to determine a particular contingency table of a leaf node of the particular decision tree]).
	Ma et al. does not appear to explicitly teach determining the predicted action value for the set of new feature values by combining the set of particular contingency tables.
	Xiao et al. teaches determining the predicted action value for the set of new feature values by combining the set of particular contingency tables (paragraph 0176 “In an operation 1310, the contingency table data received from each of the one or more computing devices of grid systems 132 is combined to form a single contingency table for each variable pair. For example, an overall frequency count value is determined for each variable value combination by adding the frequency count values from each matching variable value combination in each contingency table” and paragraph 0184 “… A user may execute use tree data application 800 that interacts with grid control application 1112 by requesting that grid control device 130 use tree data 126 to create contingency tables … By accelerating the counting task of constructing contingency tables and reducing an amount of memory to store data in a useable form for constructing contingency tables, statistical analysis of the data can be made more efficient” teaches adding frequency counts from multiple contingency tables for a particular action value to 
Ma et al. and Xiao et al. are considered analogous art because they are directed to reliable methods of classifying data with low false positive rates.
In view of the teachings of Ma et al., it would have been obvious for a person of ordinary skill in the art to apply the teachings of Xiao et al. at the time the application was filed in order to create contingency tables to build decision trees with little computational complexity (cf. Xiao et al., [0184], “… A user may execute use tree data application 800 that interacts with grid control application 1112 by requesting that grid control device 130 use tree data 126 to create contingency tables … By accelerating the counting task of constructing contingency tables and reducing an amount of memory to store data in a useable form for constructing contingency tables, statistical analysis of the data can be made more efficient”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Ma et al. discloses this as a necessary activity for the taught invention (cf. Ma et al., paragraph 0079  “… the hierarchical decision tree is traversed using data for the transaction under review, to obtain a set of “neighbors” of the transaction under review that share the set of transaction feature values. For example, traversal involves starting at a root node of the decision tree, determining what feature the node represents, finding the value for that feature in the data for the transaction under review, and determining which edge to follow based on the value in comparison to a decision represented in the node. Following an edge leads to a next node at which the process is repeated for another feature, until a terminal node of the tree is reached. The terminal node is associated with identifiers for other historic transactions having all the same transaction feature values that led to that terminal node; these historic transactions are neighbor 
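For illustration, combining contingency tables by adding the frequency counts of matching entries, as described in the quoted passage of Xiao et al., can be sketched as follows. This is a minimal sketch under stated assumptions: the names `combine_tables` and `predicted_action`, the (action, outcome) keys, and the counts are all hypothetical, and choosing the action with the highest share of a desired outcome is an assumed selection rule, not one taken from the cited references.

```python
from collections import Counter

def combine_tables(tables):
    """Add frequency counts for matching (action, outcome) pairs."""
    combined = Counter()
    for table in tables:
        combined.update(table)  # Counter.update adds counts key by key
    return combined

def predicted_action(combined, good_outcome):
    """Assumed rule: pick the action with the highest share of good outcomes."""
    totals, good = Counter(), Counter()
    for (action, outcome), n in combined.items():
        totals[action] += n
        if outcome == good_outcome:
            good[action] += n
    seen = [a for a in totals if totals[a] > 0]
    return max(seen, key=lambda a: good[a] / totals[a])

# Hypothetical per-tree tables with made-up counts.
t1 = {("approve", "ok"): 3, ("approve", "fraud"): 1, ("reject", "ok"): 1}
t2 = {("approve", "ok"): 2, ("reject", "ok"): 1, ("reject", "fraud"): 2}

combined = combine_tables([t1, t2])
print(combined[("approve", "ok")], predicted_action(combined, "ok"))
```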
Regarding Claim 12,
	Ma et al. in view of Xiao et al. teaches the method of claim 9.
Ma et al. further teaches wherein the contingency table comprises a 2 x k contingency table, where k represents number of different action values (paragraph 0067,   
    [image: media_image3.png]
teaches a 2 x 2 contingency table comprising action values of “fired” and “not fired” [wherein the contingency .
Claims 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Ma et al.  (US 2012/0066125 A1) in view of Xiao et al. (US 2015/0324398 A1) and in further view of Cowling et al. (“Emergent bluffing and inference with Monte Carlo Tree Search”).
Regarding Claim 10,
	Ma et al. in view of Xiao et al. teaches the method of claim 9.
	Ma et al. in view of Xiao et al. does not appear to explicitly teach wherein the combining the set of particular contingency tables comprises: determining a set of best action values for the set of particular contingency tables by determining, for each given contingency table in the set of particular contingency tables, a best action value voted for by a given decision tree in the ensemble based at least in part on the given contingency table associated with the given decision tree; determining a set of counts for one or more different action values by determining a count for each different action value in the set of best action values; and selecting the predicted action value, from the one or more different action values, based at least in part on the set of counts.
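For illustration, the three combining steps recited in the limitation above (a best action value per contingency table, a count per action value, and a selection based on the counts) can be sketched as a simple majority vote. This is a minimal sketch under stated assumptions: the names `best_action` and `vote`, the per-tree tables, and the rule for what makes an action "best" are hypothetical and are not taken from the claims or from any cited reference.

```python
from collections import Counter

def best_action(table, good_outcome):
    """Assumed rule: the action with the highest share of good outcomes."""
    totals, good = Counter(), Counter()
    for (action, outcome), n in table.items():
        totals[action] += n
        if outcome == good_outcome:
            good[action] += n
    seen = [a for a in totals if totals[a] > 0]
    return max(seen, key=lambda a: good[a] / totals[a])

def vote(tables, good_outcome):
    """Count each tree's best action and return the most voted action."""
    counts = Counter(best_action(t, good_outcome) for t in tables)
    winner = counts.most_common(1)[0][0]
    return winner, counts

# Hypothetical leaf tables from three trees, with made-up counts.
trees = [
    {("a", "good"): 4, ("a", "bad"): 1, ("b", "good"): 1, ("b", "bad"): 4},
    {("a", "good"): 2, ("a", "bad"): 3, ("b", "good"): 3, ("b", "bad"): 2},
    {("a", "good"): 3, ("a", "bad"): 2, ("b", "good"): 2, ("b", "bad"): 3},
]
winner, counts = vote(trees, "good")
print(winner, dict(counts))
```

Two of the three hypothetical trees vote for action "a", so "a" is selected; ties are not handled in this sketch.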
	Cowling et al. teaches wherein the combining the set of particular contingency tables comprises: determining a set of best action values for the set of particular contingency tables by determining, for each given contingency table in the set of particular contingency tables, a best action value voted for by a given decision tree in the ensemble based at least in part on the given contingency table associated with the given decision tree (p. 118, section VI, paragraph 10, “Denote the mean reward for an action a from the root of the decision tree by $\mu_a$ and the standard deviation $\sigma_a$. Choose a* with maximal number of visits from the root of the decision tree, as usual.  Let $A^{*} = \{a : \mu_{a^{*}} - \mu_a < \min(\sigma_{a^{*}}, \sigma_a)\}$, … i.e. A* is the set of actions whose average reward is within one standard deviation of the reward for the best action.  Now for each action in A*, sum the number of visits from the root across all the current player’s trees … and play the action for which this sum is maximal …” teaches A* as a set of best actions with each action in the set of best actions representing a set of visits from the root node of the decision tree whose average reward is within one standard deviation of the reward for the best action [determining a set of best action values for the set of particular contingency tables by determining, for each given contingency table in the set of particular contingency tables, a best action value voted for by a given decision tree in the ensemble based at least in part on the given contingency table associated with the given decision tree]);
	determining a set of counts for one or more different action values by determining a count for each different action value in the set of best action values (p. 118, section VI, paragraph 10, “Denote the mean reward for an action a from the root of the decision tree by $\mu_a$ and the standard deviation $\sigma_a$. Choose a* with maximal number of visits from the root of the decision tree, as usual.  Let $A^{*} = \{a : \mu_{a^{*}} - \mu_a < \min(\sigma_{a^{*}}, \sigma_a)\}$, … i.e. A* is the set of actions whose average reward is within one standard deviation of the reward for the best action.  Now for each action in A*, sum the number of visits from the root across all the current player’s trees … and play the action for which this sum is maximal …” teaches computing a sum of the number of visits from the root node of the decision tree for each action in A* [determining a set of counts for one or more different action values by determining a count for each different action value in the set of best action values]); and 
selecting the predicted action value, from the one or more different action values, based at least in part on the set of counts (p. 118, section VI, paragraph 10, “Denote the mean reward for an action a from the root of the decision tree by $\mu_a$ and the standard deviation $\sigma_a$. Choose a* with maximal number of visits from the root of the decision tree, as usual.  Let $A^* = \{a : \mu_{a^*} - \mu_a < \min(\sigma_{a^*}, \sigma_a)\}$, … i.e. A* is the set of actions whose average reward is within one standard deviation of the reward for the best action.  Now for each action in A*, sum the number of visits from the root across all the current player’s trees … and play the action for which this sum is maximal …” teaches selecting the action value in the set of best action values for which the sum of the number of visits from the root node of decision tree is maximum [selecting the predicted action value, from the one or more different action values, based at least in part on the set of counts]).
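By way of illustration only, and not as part of the cited reference, the selection rule quoted above from Cowling et al. can be sketched in Python. The function name, argument names, and dictionary layout are hypothetical, and the fallback that keeps a* in the candidate set when every standard deviation is zero is an added assumption that the quoted passage does not address.

```python
from typing import Dict, List


def select_action(
    mean: Dict[str, float],
    std: Dict[str, float],
    true_visits: Dict[str, int],
    visits_per_tree: List[Dict[str, int]],
) -> str:
    """Illustrative sketch of the quoted rule; all names are hypothetical."""
    # a*: the action with the maximal number of visits from the root.
    a_star = max(true_visits, key=true_visits.get)

    # A*: actions a satisfying mu_{a*} - mu_a < min(sigma_{a*}, sigma_a).
    candidates = [
        a for a in mean
        if mean[a_star] - mean[a] < min(std[a_star], std[a])
    ]

    # Assumption (not in the quoted text): always keep a* as a candidate so
    # the set is never empty, e.g. when all standard deviations are zero.
    if a_star not in candidates:
        candidates.append(a_star)

    # For each candidate, sum the root visit counts across all trees.
    totals = {
        a: sum(tree.get(a, 0) for tree in visits_per_tree)
        for a in candidates
    }

    # Play the candidate for which the summed visit count is maximal.
    return max(totals, key=totals.get)
```

In this sketch the per-action sums play the role that the rejection maps to the claimed "set of counts", and the final `max` corresponds to selecting the predicted action value based on those counts.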
Ma et al., Xiao et al., and Cowling et al. are considered analogous art because they are directed to reliable methods of classifying data with low false positive rates.
In view of the teachings of Ma et al. in view of Xiao et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Cowling et al. at the time the application was filed in order for a player in a game to effectively construct decision trees when bluffing is an available option (cf. Cowling et al., p. 120, section VIII, paragraph 1, “By including tree nodes for opponent decisions, tree search implicitly constructs an opponent model while it searches. We have shown that this opponent model can be re-used for inference. We have also shown that bluffing behaviours can be introduced simply by allowing the search to sample determinizations outside the current information set …”). The Examiner notes that a person of ordinary skill in the art would find a suggestion to perform this type of analysis since Ma et al. discloses this as a necessary activity for the taught invention (cf. Ma et al., paragraph 0079  “… the hierarchical decision tree is traversed using data for the transaction under review, to obtain a set of “neighbors” of the transaction under review that share the set of transaction 
Regarding Claim 11,
	Ma et al. in view of Xiao et al. teaches the method of claim 9.
	Ma et al. in view of Xiao et al. does not appear to explicitly teach wherein the combining the set of particular contingency tables comprises: determining a set of counts for one or more different action-outcome value pairs by determining a count for each different action-outcome value pair in the set of particular contingency tables; and selecting the predicted action value, from the one or more different action-outcome value pairs, based at least in part on the set of counts.
Cowling et al. teaches wherein the combining the set of particular contingency tables comprises: determining a set of counts for one or more different action-outcome value pairs by determining a count for each different action-outcome value pair in the set of particular contingency tables (p. 118, section VI, paragraph 10, “Denote the mean reward for an action a from the root of the decision tree by $\mu_a$ and the standard deviation $\sigma_a$. Choose a* with maximal number of visits from the root of the decision tree, as usual.  Let $A^* = \{a : \mu_{a^*} - \mu_a < \min(\sigma_{a^*}, \sigma_a)\}$, … i.e. A* is the set of actions whose average reward is within one standard deviation of the reward for the best action.  Now for each action in A*, sum the number of visits from the root across all the current player’s trees … and play the action for which this sum is maximal. In other words, the chosen action is the most visited across all self-determinizations that is not significantly worse than the most visited action in true determinizations; the bluff that is not significantly worse than the best non-bluff.” teaches computing a sum of the number of visits from the root node of the decision tree for each action in A*, with the outcome of the action being either success or failure of bluffing while playing a game [wherein the combining the set of particular contingency tables comprises: determining a set of counts for one or more different action-outcome value pairs by determining a count for each different action-outcome value pair in the set of particular contingency tables]); and 
selecting the predicted action value, from the one or more different action-outcome value pairs, based at least in part on the set of counts (p. 118, section VI, paragraph 10, “Denote the mean reward for an action a from the root of the decision tree by $\mu_a$ and the standard deviation $\sigma_a$. Choose a* with maximal number of visits from the root of the decision tree, as usual.  Let $A^* = \{a : \mu_{a^*} - \mu_a < \min(\sigma_{a^*}, \sigma_a)\}$, … i.e. A* is the set of actions whose average reward is within one standard deviation of the reward for the best action.  Now for each action in A*, sum the number of visits from the root across all the current player’s trees … and play the action for which this sum is maximal. In other words, the chosen action is the most visited across all self-determinizations that is not significantly worse than the most visited action in true determinizations; the bluff that is not significantly worse than the best non-bluff.” teaches selecting the action value in the set of best action values for which the sum of the number of visits from the root node of decision tree is maximum, with the outcome of the action being 
Ma et al., Xiao et al., and Cowling et al. are combinable for the same rationale as set forth above with respect to claim 10.
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Ma et al.  (US 2012/0066125 A1) in view of Tsai et al. (“Data Mining for Internet of Things: A Survey”).
Regarding Claim 20,
	Ma et al. teaches a system comprising: one or more hardware processors; and a memory storing instructions configured to instruct the one or more hardware processors to perform operations (paragraph 0112, “Computer system 700 also includes a main memory 706, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 702 for storing information and instructions to be executed by processor 704 …”) of:
accessing model data comprising an ensemble of decision trees trained on historical data for a set of instances, wherein for a particular instance in the set of instances, the historical data comprises a set of observed feature values for a set of features, an observed action value, and an observed outcome value for the observed action value (paragraph 0019, “The modeling computer stores a data model in memory representing transaction features, transaction feature values, transaction acceptance decisions and rejection decisions that the reviewer could perform or did perform, based at least in part on the set of similar transactions”, paragraph 0022, “ … the data model is a decision tree represented by an XML file …”  and paragraph 0026, “Many types of data are captured as part of a proposed online credit transaction and may be made available to a reviewer. As used herein, each type of data is termed a feature, such that each proposed transaction and historical transaction is a collection of data values, with one data value corresponding to one feature …” teaches decision tree comprising historical online credit transactions stored on computer for access to a reviewer [accessing model data comprising an ensemble of decision trees trained on historical data for a set of instances]),
wherein for a particular instance in the set of instances, the historical data comprises a set of observed feature values for a set of features (paragraphs 0026-0036, “[0026] Many types of data are captured as part of a proposed online credit transaction and may be made available to a reviewer. As used herein, each type of data is termed a feature, such that each proposed transaction and historical transaction is a collection of data values, with one data value corresponding to one feature. Examples of features include: 
[0027] 1. Name or type of fraud scoring model that is used to score a transaction;
[0028] 2. Country identified in a billing address for the customer,
[0029] 3. Country identified in a shipping address for the transaction;
[0030] 4. Fraud score;
[0031] 5. One or more Decision rules that triggered the manual review;
[0032] 6. One or more Factor codes, … ;
[0033] 7. One or more Information codes, … ;
[0034] 8. Merchant identifier uniquely identifying a merchant of the goods or services involved in the transaction;
[0035] 9. Name, number or other identifier of the reviewer;
[0036] 10. Organization that is performing the review … “

an observed action value (paragraph 0056, “… the set of candidate features comprise features corresponding to factor codes and information codes. … each factor code and information code candidate feature may only take on values of one (corresponding to “fired”) and zero (corresponding to “not fired.”)” teaches observed values of “fired” and “not fired” for factor code feature and information code feature [observed action value]), and 
an observed outcome value for the observed action value (paragraph 0048,  “In hierarchical decision tree 100, transaction features that are more discriminating in predicting a likelihood of accepting or rejecting a transaction under review are represented as nodes closer to the root of hierarchical decision tree 100 than transaction features that are less discriminative …” teaches transaction acceptance and rejection of a transaction [observed outcome value for the observed action value]), and
wherein each leaf node of decision trees in the ensemble is associated with a contingency table of action-outcome value pairs generated based at least in part on the historical data (paragraph 0066 “At step 212, a feature from the set of remaining available/combined features is selected as the splitting node to be added to the sub-tree. In an embodiment, each splitting node is determined using a contingency table calculated for each candidate feature from the set of remaining available/ combined features. A contingency table is constructed using the set of historical transactions corresponding to the parent node of the sub-tree, with this set separated in subsets representing the acceptance or rejection the historical transactions, and separated by the data values of each candidate feature”  teaches generating splitting nodes from each feature to be added to sub-tree, with each splitting node being determined using a 
… generating prediction data by processing the input data using the ensemble, the prediction data comprising a predicted action value for the set of new feature values (paragraphs 0018-0020, “[0018] At a modeling computer, data items are collected related to a proposed online credit purchase transaction that has been recommended for review. A set of similar past online credit purchase transactions are identified. Each member of the set has one or more transaction features having transaction feature data values that are similar to the transaction data items of the proposed online credit purchase transaction, and a decision value specifying whether the member of the set was actually accepted or rejected by a reviewer after review. 
[0019] The modeling computer stores a data model in memory representing transaction features, transaction feature values, transaction acceptance decisions and rejection decisions that the reviewer could perform or did perform, based at least in part on the set of similar transactions.
[0020]  … the data model is used to automatically determine a likelihood value representing a particular decision of whether the proposed online credit card transaction would be accepted or rejected by the reviewer of the merchant if the reviewer actually reviewed the transaction data”  and paragraph 0022 “ … the data model is a decision tree represented by an XML file …”  teaches computing likelihood value using decision tree representing a decision regarding proposed credit card transaction based upon action values of transaction features [generating prediction data by processing the input data using the ensemble, the prediction data comprising a predicted action value for the set of new feature values]). 
Ma et al. does not appear to explicitly teach accessing input data based at least in part on device data received from an Industrial Internet-of-Things (IIoT) device, the input data comprising a set of new feature values for the set of features.
	Tsai et al. teaches accessing input data based at least in part on device data received from an Industrial Internet-of-Things (IIoT) device, the input data comprising a set of new feature values for the set of features (pp. 78-79, section II, paragraph 5 “ … IoT collects data from different sources, which may contain data for the IoT itself. KDD, when applied to IoT, will convert the data collected by IoT into useful information that can then be converted into knowledge … not all the attributes of the data are useful for mining; so, feature selection is usually used to select the key attributes from each record in the database for mining …” teaches IoT data comprising key attributes selected as features [accessing input data based at least in part on device data received from an Industrial Internet-of-Things (IIoT) device, the input data comprising a set of new feature values for the set of features]).
	Ma et al. and Tsai et al. are considered analogous art because they are directed to mining input data to efficiently construct and train decision making classifiers.
In view of the teachings of Ma et al. it would have been obvious for a person of ordinary skill in the art to apply the teachings of Tsai et al. at the time the application was filed in order to convert the data generated or captured by IoT into knowledge to provide a more convenient environment to people (cf. Tsai et al., p. 90, section IV, part B, paragraph 5, “One of the promising researches on the smart object is for things to think by themselves. In practice, mining technologies have the potential to filter out the redundant data and to decide what kind of data or information needs to be uploaded to the system that will be useful for applications on a broad region with limited resources, such Ma et al. discloses this as a necessary activity for the taught invention (cf. Ma et al., paragraph 0079  “… the hierarchical decision tree is traversed using data for the transaction under review, to obtain a set of “neighbors” of the transaction under review that share the set of transaction feature values. For example, traversal involves starting at a root node of the decision tree, determining what feature the node represents, finding the value for that feature in the data for the transaction under review, and determining which edge to follow based on the value in comparison to a decision represented in the node. Following an edge leads to a next node at which the process is repeated for another feature, until a terminal node of the tree is reached. The terminal node is associated with identifiers for other historic transactions having all the same transaction feature values that led to that terminal node; these historic transactions are neighbor transactions, and each such neighbor transaction has an associated decision value representing a reviewer's actual historic decision for that transaction”).
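By way of illustration only, the traversal described in the quoted paragraph 0079 of Ma et al. can be sketched in Python. The `Node` class, its field names, and the sample feature names used with it are hypothetical. Computing the likelihood as the fraction of neighbor transactions that were accepted is an assumption made for the sketch; the quoted passages do not state how the likelihood value is derived.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    # Feature tested at this node; None marks a terminal node.
    feature: Optional[str] = None
    # One outgoing edge per feature value, leading to the next node.
    children: Dict[str, "Node"] = field(default_factory=dict)
    # Historic reviewer decisions for the neighbor transactions that
    # reached this terminal node ("accept" or "reject").
    decisions: List[str] = field(default_factory=list)


def find_neighbor_decisions(root: Node, transaction: Dict[str, str]) -> List[str]:
    """Walk from the root to a terminal node, following the edge that
    matches the transaction's value for each node's feature."""
    node = root
    while node.feature is not None:
        value = transaction[node.feature]
        node = node.children[value]
    return node.decisions


def accept_likelihood(decisions: List[str]) -> float:
    """Assumed measure: share of neighbor transactions that were accepted."""
    if not decisions:
        raise ValueError("terminal node has no neighbor transactions")
    return decisions.count("accept") / len(decisions)
```

A transaction with billing country "US" and a low fraud band would, under this sketch, descend two levels and return the decisions stored at the matching terminal node, from which the assumed likelihood is then computed.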
Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:  Shotton et al. (US 2012/0239174 A1) discloses the use of decision trees to predict joint positions of humans or animals in an image in order to control a computer game or other applications.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHIAKA CHUKWUMA OKOROH whose telephone number is (571)272-3710.  The examiner can normally be reached on M - F 7:30 AM - 4:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on 571-272-7796.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHIAKA CHUKWUMA OKOROH/Examiner, Art Unit 2125                                            