DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
The present application is being examined under the claims filed on 05/09/2022.
Claims 1, 3-9, 11-17, 19, and 20 are amended.
Claims 21-23 are new.
Claims 2, 10, and 18 are canceled.
Claims 1, 3-9, 11-17, 19-23 are rejected.
Claims 1, 3-9, 11-17, 19-23 are pending.

Drawings
The Drawings filed on 12/12/2018 are acceptable for examination purposes.

Specification
The Specification filed on 12/12/2018 is acceptable for examination purposes.

Response to Arguments
In reference to rejections under 35 USC § 101
Applicant asserts (pgs. 17-19) that the claimed limitations clearly integrate any alleged abstract idea into a practical application. Moreover, these limitations make clear that the claims are directed to significantly more than any alleged abstract idea.
Examiner respectfully disagrees.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application? No, the claim does not recite additional elements that integrate the judicial exception into a practical application; MPEP 2106.04(d).
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
Limitation “providing input data to the base models to generate a plurality of intermediate outputs, each intermediate output comprising a base prediction” is directed to receiving or transmitting data over a network. Receiving or transmitting data over a network is well-understood, routine, conventional activity as per MPEP 2106.05(d).
As to the limitation “training a plurality of base classification algorithms in a multi-layered machine learning system using training data to generate a plurality of base models, wherein training the base classification algorithms comprises using different training data or different machine learning techniques to specialize different base models differently such that the different base models are non-linear and complementary to one another, and wherein each of the base models is generated using a different one of the base classification algorithms”, the training of classifiers using different training data, outputs from different models, or different machine learning techniques is well-understood, routine, conventional activity as per Menahem et al. - US 20090182696 A1.
Applicant's arguments filed 05/09/2022 have been fully considered but they are not persuasive.

In reference to rejections under 35 USC § 102
Applicant asserts (pg. 20 to pg. 21) that Menahem does not disclose or suggest anything about a fusion model that defines how base predictions are combined using goodness factors associated with the base models, where the goodness factors identify average performances of the base models.
Examiner respectfully disagrees. Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0068]-[0088] discloses the Super Classifier (i.e., the fusion model), which processes the outputs from the Base Classifiers to generate a final decision. Examiner notes that Fig. 5 clearly shows that the Super Classifier is generated based on the Meta Classifiers, the Meta Classifiers are generated based on the Specialist Classifiers, and the Specialist Classifiers are generated based on the Base Classifiers. Menahem further discloses that the Super Classifier defines how the previous predictions are combined using an average performance; see at least Fig. 14, Fig. 15, Fig. 21, and Fig. 22.
Applicant's arguments filed 05/09/2022 have been fully considered but they are not persuasive.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1, 3-9, 11-17, 19-23 are rejected under 35 U.S.C. 101.
In reference to Claim 1
Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process.

Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Limitation “processing the intermediate outputs using a fusion model to generate a final output associated with the input data, wherein the fusion model defines how the base predictions are combined using goodness factors associated with the base models, wherein the goodness factors identify average performances of the base models, and wherein the fusion model is generated using a meta classification algorithm in the multi-layered machine learning system” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a). Examiner notes that the limitation only requires processing the intermediate outputs to generate a final output.

Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No, the claim does not recite additional elements that integrate the judicial exception into a practical application; MPEP 2106.04(d).

Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
Limitation “providing input data to the base models to generate a plurality of intermediate outputs, each intermediate output comprising a base prediction” is directed to receiving or transmitting data over a network. Receiving or transmitting data over a network is well-understood, routine, conventional activity as per MPEP 2106.05(d).
As to the limitation “training a plurality of base classification algorithms in a multi-layered machine learning system using training data to generate a plurality of base models, wherein training the base classification algorithms comprises using different training data or different machine learning techniques to specialize different base models differently such that the different base models are non-linear and complementary to one another, and wherein each of the base models is generated using a different one of the base classification algorithms”, the training of classifiers using different training data, outputs from different models, or different machine learning techniques is well-understood, routine, conventional activity as per Menahem et al. - US 20090182696 A1.
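
For illustration only, the fusion step quoted under Step 2A Prong 1 might be sketched as a goodness-weighted vote over the base predictions. The sketch assumes that each goodness factor is a base model's average accuracy and that the fusion is a weighted sum per candidate label; neither assumption is taken from the claims or from Menahem, and the names `fuse_predictions` and `goodness` are hypothetical.

```python
from collections import defaultdict


def fuse_predictions(base_predictions, goodness):
    """Combine base predictions into one final label by weighted voting.

    base_predictions: one predicted label per base model.
    goodness: one weight per base model (assumed here to be the model's
              average accuracy; this is an illustrative assumption).
    """
    if len(base_predictions) != len(goodness):
        raise ValueError("one goodness factor is required per base model")
    scores = defaultdict(float)
    for label, weight in zip(base_predictions, goodness):
        # Each base model adds its goodness factor to the label it predicted.
        scores[label] += weight
    # The label with the largest accumulated weight is the final output.
    return max(scores, key=scores.get)


# Hypothetical data: three base models, the first one far more reliable.
final_output = fuse_predictions(["cat", "dog", "dog"], [0.9, 0.4, 0.3])
print(final_output)
```

In this toy run the single high-goodness model outweighs the two weaker models that agree with each other, which is the behavior a plain majority vote would not produce.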

In reference to Claim 3. The claim recites the additional limitations of “training the meta classification algorithm using outputs from the base classification algorithms or the base models based on the training data to generate the fusion model”. The training of classifiers using different training data, outputs from different models, or different machine learning techniques is well-understood, routine, conventional activity as per Menahem et al. - US 20090182696 A1.

In reference to Claim 4. The claim recites the additional limitations of “training each of the base classification algorithms using samples of training data that do not carry an equal amount of information as the other base classification algorithms”. The training of classifiers using different training data, outputs from different models, or different machine learning techniques is well-understood, routine, conventional activity as per Menahem et al. - US 20090182696 A1.

In reference to Claim 5. The claim recites the additional limitations of “the training data used by each of at least one of the base classification algorithms is selected based on an uncertainty associated with at least one other of the base classification algorithms”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).

In reference to Claim 6. The claim recites the additional limitations of “the training data used by a specific one of the base classification algorithms is selected based on the least-confident strategy comprising:
$$x^{*} = \underset{x,\,M}{\arg\max}\,\left(1 - P_{M}(\tilde{y} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).
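
As a rough illustration of the least-confident strategy quoted in Claim 6, the selection can be sketched as picking the sample whose most probable output has the lowest probability. The sketch assumes a single fixed base model M, treats the probable output as the most probable class, and uses a hypothetical callable `predict_proba` that returns class probabilities for a sample; none of these details are stated in the claim.

```python
def least_confident(pool, predict_proba):
    """Return the sample x* that maximizes 1 - P_M(y~ | x)."""
    def uncertainty(x):
        # Assumption: y~ is the most probable output, so P_M(y~ | x) is the
        # largest class probability the model assigns to x.
        return 1.0 - max(predict_proba(x))
    return max(pool, key=uncertainty)


# Hypothetical class probabilities for three unlabeled samples.
toy_probs_lc = {
    "a": [0.90, 0.10],
    "b": [0.55, 0.45],
    "c": [0.70, 0.30],
}
selected_lc = least_confident(list(toy_probs_lc), toy_probs_lc.__getitem__)
print(selected_lc)
```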

In reference to Claim 7. The claim recites the additional limitations of “the training data used by a specific one of the base classification algorithms is selected based on the least-reliable strategy comprising:
$$x^{*} = \underset{x,\,M}{\arg\min}\,\left(P_{M}(\tilde{y}_{2} \mid x) - P_{M}(\tilde{y}_{1} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}_{1}$ and $\tilde{y}_{2}$ represent two probable outputs for the training data x over the base model M, and P represents a probability function”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).
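
The least-reliable strategy quoted in Claim 7 can be illustrated, under stated assumptions, as a smallest-margin selection. The quoted text only says that the two outputs are "two probable outputs", so the sketch assumes they are the two most probable classes and that the difference is taken as most probable minus runner-up, which makes it a nonnegative margin. It also assumes a single fixed base model and a hypothetical `predict_proba` callable.

```python
def least_reliable(pool, predict_proba):
    """Return the sample x* with the smallest gap between its top two outputs."""
    def margin(x):
        # Assumption: the two probable outputs are the two highest-probability
        # classes; the margin is the top probability minus the runner-up.
        top, runner_up = sorted(predict_proba(x), reverse=True)[:2]
        return top - runner_up
    return min(pool, key=margin)


# Hypothetical class probabilities for three unlabeled samples.
toy_probs_lr = {
    "a": [0.90, 0.10],
    "b": [0.55, 0.45],
    "c": [0.50, 0.30, 0.20],
}
selected_lr = least_reliable(list(toy_probs_lr), toy_probs_lr.__getitem__)
print(selected_lr)
```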

In reference to Claim 8. The claim recites the additional limitations of “the training data used by a specific one of the base classification algorithms is selected based on the most output entropy strategy comprising:
$$x^{*} = \underset{x,\,M}{\arg\max}\,\left(P_{M}(\tilde{y} \mid x)\,\log P_{M}(\tilde{y} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).
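
For the most output entropy strategy quoted in Claim 8, a hedged sketch follows. The expression as quoted shows a single P log P term; the sketch instead uses the standard Shannon entropy, the negative sum of P log P over all outputs, which is an assumption about the intended meaning and not what the quoted expression literally states. A single fixed base model and a hypothetical `predict_proba` callable are also assumed.

```python
import math


def most_output_entropy(pool, predict_proba):
    """Return the sample x* whose predicted distribution has maximal entropy."""
    def entropy(x):
        # Shannon entropy (swapped in as an assumption); zero-probability
        # outputs contribute nothing and are skipped to avoid log(0).
        return -sum(p * math.log(p) for p in predict_proba(x) if p > 0.0)
    return max(pool, key=entropy)


# Hypothetical class probabilities for three unlabeled samples.
toy_probs_en = {
    "a": [0.90, 0.10],
    "b": [0.50, 0.50],
    "c": [0.60, 0.40],
}
selected_en = most_output_entropy(list(toy_probs_en), toy_probs_en.__getitem__)
print(selected_en)
```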

In reference to Claim 9
Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a machine.

Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Limitation “process the intermediate outputs using a fusion model to generate a final output associated with the input data, wherein the fusion model defines how the base predictions are combined using goodness factors associated with the base models, wherein the goodness factors identify average performances of the base models, and wherein the fusion model is generated using a meta classification algorithm in the multi-layered machine learning system” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a). Examiner notes that the limitation only requires processing the intermediate outputs to generate a final output.
The limitations of “at least one memory” and “at least one processor coupled to the at least one memory” are directed to generic computer elements recited at a high level of generality and merely use computers as a tool to perform the processes; MPEP 2106.04(a).

Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No, the claim does not recite additional elements that integrate the judicial exception into a practical application; MPEP 2106.04(d).

Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
Limitation “process input data to the base models to generate a plurality of intermediate outputs, each intermediate output comprising a base prediction” is directed to receiving or transmitting data over a network. Receiving or transmitting data over a network is well-understood, routine, conventional activity as per MPEP 2106.05(d).
As to the limitation “train a plurality of base classification algorithms in a multi-layered machine learning system using training data to generate a plurality of base models, wherein, to train the base classification algorithms, the at least one processor is configured to use different training data or different machine learning techniques to specialize different base models differently such that the different base models are non-linear and complementary to one another, and wherein each of the base models is generated using a different one of the base classification algorithms”, the training of classifiers using different training data, outputs from different models, or different machine learning techniques is well-understood, routine, conventional activity as per Menahem et al. - US 20090182696 A1.

In reference to Claim 11. The claim recites the additional limitations of “train the meta classification algorithm using outputs from the base classification algorithms or the base models based on the training data to generate the fusion model”. The training of classifiers using different training data, outputs from different models, or different machine learning techniques is well-understood, routine, conventional activity as per Menahem et al. - US 20090182696 A1.

In reference to Claim 12. The claim recites the additional limitations of “train each of the base classification algorithms using samples of training data that do not carry an equal amount of information as the other base classification algorithms”. The training of classifiers using different training data, outputs from different models, or different machine learning techniques is well-understood, routine, conventional activity as per Menahem et al. - US 20090182696 A1.

In reference to Claim 13. The claim recites the additional limitations of “select the training data used by each of at least one of the base classification algorithms based on an uncertainty associated with at least one other of the base classification algorithms”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).

In reference to Claim 14. The claim recites the additional limitations of “select the training data used by a specific one of the base classification algorithms based on the least-confident strategy comprising:
$$x^{*} = \underset{x,\,M}{\arg\max}\,\left(1 - P_{M}(\tilde{y} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).

In reference to Claim 15. The claim recites the additional limitations of “select the training data used by a specific one of the base classification algorithms based on the least-reliable strategy comprising:
$$x^{*} = \underset{x,\,M}{\arg\min}\,\left(P_{M}(\tilde{y}_{2} \mid x) - P_{M}(\tilde{y}_{1} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}_{1}$ and $\tilde{y}_{2}$ represent two probable outputs for the training data x over the base model M, and P represents a probability function”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).

In reference to Claim 16. The claim recites the additional limitations of “select the training data used by a specific one of the base classification algorithms based on the least-reliable strategy comprising:
$$x^{*} = \underset{x,\,M}{\arg\max}\,\left(P_{M}(\tilde{y} \mid x)\,\log P_{M}(\tilde{y} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}_{1}$ and $\tilde{y}_{2}$ represent two probable outputs for the training data x over the base model M, and P represents a probability function”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).

In reference to Claim 17
Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a machine.

Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Limitation “process the intermediate outputs using a fusion model to generate a final output associated with the input data, wherein the fusion model defines how the base predictions are combined using goodness factors associated with the base models, wherein the goodness factors identify average performances of the base models, and wherein the fusion model is generated using a meta classification algorithm in the multi-layered machine learning system” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a). Examiner notes that the limitation only requires processing the intermediate outputs to generate a final output.
The limitation of “a non-transitory computer readable medium containing computer readable program code that, when executed, causes an electronic device” is directed to generic computer elements recited at a high level of generality and merely uses computers as a tool to perform the processes; MPEP 2106.04(a).

Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No, the claim does not recite additional elements that integrate the judicial exception into a practical application; MPEP 2106.04(d).

Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
Limitation “process input data to the base models to generate a plurality of intermediate outputs, each intermediate output comprising a base prediction” is directed to receiving or transmitting data over a network. Receiving or transmitting data over a network is well-understood, routine, conventional activity as per MPEP 2106.05(d).
As to the limitation “train a plurality of base classification algorithms in a multi-layered machine learning system using training data to generate a plurality of base models, wherein the computer readable program code that when executed cause the electronic device to train the base classification algorithms comprise computer readable program code that when executed cause the electronic device to use different training data or different machine learning techniques to specialize different base models differently such that the different base models are non-linear and complementary to one another, and wherein each of the base models is generated using a different one of the base classification algorithms”, the training of classifiers using different training data, outputs from different models, or different machine learning techniques is well-understood, routine, conventional activity as per Menahem et al. - US 20090182696 A1.

In reference to Claim 19. The claim recites the additional limitations of “train the meta classification algorithm using outputs from the base classification algorithms or the base models based on the training data to generate the fusion model”. The training of classifiers using different training data, outputs from different models, or different machine learning techniques is well-understood, routine, conventional activity as per Menahem et al. - US 20090182696 A1.

In reference to Claim 20. The claim recites the additional limitations of “training each of the base classification algorithms using samples of training data that do not carry an equal amount of information as the other base classification algorithms”. The training of classifiers using different training data, outputs from different models, or different machine learning techniques is well-understood, routine, conventional activity as per Menahem et al. - US 20090182696 A1. The claim also recites the additional limitations of “select the training data used by each of at least one of the base classification algorithms based on an uncertainty associated with at least one other of the base classification algorithms”. These additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).

In reference to Claim 21. The claim recites the additional limitations of “the training data used by a specific one of the base classification algorithms is selected based on the least-confident strategy comprising:
$$x^{*} = \underset{x,\,M}{\operatorname{argmax}} \left(1 - P_{M}(\tilde{y} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).
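
For illustration only, the least-confident selection quoted for Claim 21 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the applicant's or Settles' implementation: the function name `least_confident_index` and the sample probabilities are hypothetical, a single base model M is held fixed, and the probable output ỹ is assumed to be the label the model considers most probable.

```python
def least_confident_index(probs):
    """Return the index of the sample x maximizing 1 - P_M(y~ | x).

    probs: list of per-sample class-probability lists produced by one
    base model M (hypothetical input format). y~ is ASSUMED to be the
    most probable label for each sample.
    """
    scores = [1.0 - max(p) for p in probs]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical posterior probabilities for three candidate samples.
probs = [[0.9, 0.1], [0.55, 0.45], [0.7, 0.3]]
selected = least_confident_index(probs)
print(selected)
```

The sample whose top label has the lowest probability is the one selected, which is the sense in which the strategy is "least confident".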

In reference to Claim 22. The claim recites the additional limitations of “the training data used by a specific one of the base classification algorithms is selected based on the least-reliable strategy comprising:
$$x^{*} = \underset{x,\,M}{\operatorname{argmin}} \left(P_{M}(\tilde{y}_{2} \mid x) - P_{M}(\tilde{y}_{1} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}_{1}$ and $\tilde{y}_{2}$ represent two probable outputs for the training data x over the base model M, and P represents a probability function”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).
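
For illustration only, the least-reliable selection quoted for Claim 22 can be sketched as a smallest-margin rule. The quoted text does not say which of the two probable outputs is the more probable one, so this sketch assumes they are the two most probable labels and takes the difference as a non-negative margin (top probability minus runner-up). The function name `smallest_margin_index` and the sample probabilities are hypothetical, and a single base model M is held fixed.

```python
def smallest_margin_index(probs):
    """Return the index of the sample with the smallest margin between
    its two most probable labels under one base model M.

    ASSUMPTION: the two probable outputs in the quoted formula are the
    top two labels, and the difference is taken as top minus runner-up.
    """
    def margin(p):
        top, runner_up = sorted(p, reverse=True)[:2]
        return top - runner_up

    margins = [margin(p) for p in probs]
    return min(range(len(margins)), key=margins.__getitem__)


# Hypothetical posterior probabilities for three candidate samples.
probs = [[0.5, 0.3, 0.2], [0.4, 0.35, 0.25], [0.8, 0.1, 0.1]]
selected = smallest_margin_index(probs)
print(selected)
```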

In reference to Claim 23. The claim recites the additional limitations of “the training data used by a specific one of the base classification algorithms is selected based on the most output entropy strategy comprising:
$$x^{*} = \underset{x,\,M}{\operatorname{argmax}} \left(P_{M}(\tilde{y} \mid x) \log P_{M}(\tilde{y} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function”. The additional limitations are directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper; MPEP 2106.04(a).
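
For illustration only, an entropy-based selection of the kind named in Claim 23 can be sketched as follows. The quoted expression shows a single P log P term; this sketch instead uses the standard Shannon entropy, the negative sum of P log P over all labels, which is an assumption and may differ from what the claim intends. The function name `max_entropy_index` and the sample probabilities are hypothetical, and a single base model M is held fixed.

```python
import math


def max_entropy_index(probs):
    """Return the index of the sample whose predicted label distribution
    has the highest Shannon entropy under one base model M.

    ASSUMPTION: entropy is -sum(P * log P) over all labels, not the
    single-term expression quoted in the claim.
    """
    def entropy(p):
        # Zero-probability labels contribute nothing to the entropy.
        return -sum(q * math.log(q) for q in p if q > 0.0)

    entropies = [entropy(p) for p in probs]
    return max(range(len(entropies)), key=entropies.__getitem__)


# Hypothetical posterior probabilities for three candidate samples.
probs = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
selected = max_entropy_index(probs)
print(selected)
```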

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 3, 4, 9, 11, 12, 17, and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Menahem et al. (hereinafter Menahem) US 20090182696 A1.
In reference to Claim 1. Menahem teaches a method comprising:
“training a plurality of base classification algorithms in a multi-layered machine learning system using training data to generate a plurality of base models, wherein training the base classification algorithms comprises using different training data or different machine learning techniques to specialize different base models differently such that the different base models are non-linear and complementary to one another, and wherein each of the base models is generated using a different one of the base classification algorithms” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0068]-[0088] discloses training the Base Classifiers using training data. Menahem further discloses training using different algorithms and using different training data. Menahem further discloses the different base models are non-linear, the base models are generated using a different base classification algorithm in a multi-layered machine learning system);
“providing input data to the base models to generate a plurality of intermediate outputs, each intermediate output comprising a base prediction” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0069]-[0088] discloses providing input data to base models to generate a plurality of intermediate outputs, and discloses the Base Classifiers outputs which comprise a prediction);
“processing the intermediate outputs using a fusion model to generate a final output associated with the input data, wherein the fusion model defines how the base predictions are combined using goodness factors associated with the base models, wherein the goodness factors identify average performances of the base models, and wherein the fusion model is generated using a meta classification algorithm in the multi-layered machine learning system” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0068]-[0088] discloses the Super Classifier (i.e., fusion model), which processes the outputs from the Base Classifiers to generate a final decision. Examiner notes that Fig. 5 clearly shows the Super Classifier is generated based on the Meta Classifiers, the Meta Classifiers are generated based on the Specialist Classifiers, and the Specialist Classifiers are generated based on the Base Classifiers. Menahem further discloses that the Super Classifier defines how the previous predictions are combined using an average performance; see at least Fig. 14, Fig. 15, Fig. 21, and Fig. 22).
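
For illustration only, one simple way to combine base predictions using per-model goodness factors is a weighted average, sketched below. This is a hypothetical example and is not asserted to be Menahem's Super Classifier or the applicant's fusion model: the function name `fuse`, the use of average accuracy as the goodness factor, and all numeric values are assumptions.

```python
def fuse(base_predictions, goodness):
    """Combine base-model predictions into one final output.

    base_predictions: probability of the positive class from each base
    model (hypothetical values).
    goodness: one average-performance score per base model, ASSUMED here
    to be an average accuracy used as a weight.
    """
    if len(base_predictions) != len(goodness):
        raise ValueError("one goodness factor is required per base model")
    total = sum(goodness)
    return sum(g * p for g, p in zip(goodness, base_predictions)) / total


# Three hypothetical base models with differing average performance.
final_output = fuse([0.9, 0.4, 0.6], [0.8, 0.6, 0.7])
print(round(final_output, 3))
```

Under this sketch a base model with a higher goodness factor pulls the final output further toward its own prediction.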

In reference to Claim 3. Menahem teaches the method of Claim 1 (as mentioned above), further comprising:
Menahem further discloses:
“training the meta classification algorithm using outputs from the base classification algorithms or the base models based on the training data to generate the fusion model” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0069]-[0088] discloses the Super Classifier (i.e., fusion model), which processes the outputs from the Base Classifiers to generate a final decision. Examiner notes that Fig. 5 clearly shows training the Super Classifier based on the Meta Classifiers, training the Meta Classifiers based on the Specialist Classifiers, and training the Specialist Classifiers based on the Base Classifiers).

In reference to Claim 4. Menahem teaches the method of Claim 1 (as mentioned above), wherein training the base classification algorithms comprises:
Menahem further discloses:
“training each of the base classification algorithms using samples of training data that do not carry an equal amount of information as the other base classification algorithms” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0069]-[0088] discloses training using different algorithms and using different training data. Examiner notes that the different training data do not carry an equal amount of information).

In reference to Claim 9. Menahem teaches an electronic device comprising:
“at least one memory” (Menahem in at least ¶ [0137] discloses computers; it is understood that computers have memory and a processor); and
“at least one processor coupled to the at least one memory, the at least one processor” (Menahem in at least ¶ [0137] discloses computers; it is understood that computers have memory and a processor) configured to:
“train a plurality of base classification algorithms in a multi-layered machine learning system using training data to generate a plurality of base models, wherein, to train the base classification algorithms, the at least one processor is configured to use different training data or different machine learning techniques to specialize different base models differently such that the different base models are non-linear and complementary to one another, and wherein each of the base models is generated using a different one of the base classification algorithms” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0068]-[0088] discloses training the Base Classifiers using training data. Menahem further discloses training using different algorithms and using different training data. Menahem further discloses the different base models are non-linear, the base models are generated using a different base classification algorithm in a multi-layered machine learning system);
“process input data to the base models to generate a plurality of intermediate outputs, each intermediate output comprising a base prediction” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0069]-[0088] discloses providing input data to base models to generate a plurality of intermediate outputs, and discloses the Base Classifiers outputs which comprise a prediction);
“process the intermediate outputs using a fusion model to generate a final output associated with the input data, wherein the fusion model defines how the base predictions are combined using goodness factors associated with the base models, wherein the goodness factors identify average performances of the base models, and wherein the fusion model is generated using a meta classification algorithm in the multi-layered machine learning system” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0068]-[0088] discloses the Super Classifier (i.e., fusion model), which processes the outputs from the Base Classifiers to generate a final decision. Examiner notes that Fig. 5 clearly shows the Super Classifier is generated based on the Meta Classifiers, the Meta Classifiers are generated based on the Specialist Classifiers, and the Specialist Classifiers are generated based on the Base Classifiers. Menahem further discloses that the Super Classifier defines how the previous predictions are combined using an average performance; see at least Fig. 14, Fig. 15, Fig. 21, and Fig. 22).

In reference to Claim 11. Menahem teaches the electronic device of Claim 9 (as mentioned above), wherein the at least one processor is further configured to implement the multi-layered machine learning system to:
Menahem further discloses:
“train the meta classification algorithm using outputs from the base classification algorithms or the base models based on the training data to generate the fusion model” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0069]-[0088] discloses the Super Classifier (i.e., fusion model), which processes the outputs from the Base Classifiers to generate a final decision. Examiner notes that Fig. 5 clearly shows training the Super Classifier based on the Meta Classifiers, training the Meta Classifiers based on the Specialist Classifiers, and training the Specialist Classifiers based on the Base Classifiers).

In reference to Claim 12. Menahem teaches the electronic device of Claim 9 (as mentioned above), wherein, to train the base classification algorithms, the at least one processor is configured to:
Menahem further discloses:
“train each of the base classification algorithms using samples of training data that do not carry an equal amount of information as the other base classification algorithms” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0069]-[0088] discloses training using different algorithms and using different training data. Examiner notes that the different training data do not carry an equal amount of information).

In reference to Claim 17. Menahem teaches a non-transitory computer readable medium containing computer readable program code (Menahem in at least ¶ [0137] discloses computers; it is understood that computers have memory and a processor) that, when executed, causes an electronic device to:
“train a plurality of base classification algorithms in a multi-layered machine learning system using training data to generate a plurality of base models, wherein the computer readable program code that when executed cause the electronic device to train the base classification algorithms comprise computer readable program code that when executed cause the electronic device to use different training data or different machine learning techniques to specialize different base models differently such that the different base models are non-linear and complementary to one another, and wherein each of the base models is generated using a different one of the base classification algorithms” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0068]-[0088] discloses training the Base Classifiers using training data. Menahem further discloses training using different algorithms and using different training data. Menahem further discloses the different base models are non-linear, the base models are generated using a different base classification algorithm in a multi-layered machine learning system);
“process input data to the base models to generate a plurality of intermediate outputs, each intermediate output comprising a base prediction” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0069]-[0088] discloses providing input data to base models to generate a plurality of intermediate outputs, and discloses the Base Classifiers outputs which comprise a prediction);
“process the intermediate outputs using a fusion model to generate a final output associated with the input data, wherein the fusion model defines how the base predictions are combined using goodness factors associated with the base models, wherein the goodness factors identify average performances of the base models, and wherein the fusion model is generated using a meta classification algorithm in the multi-layered machine learning system” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0068]-[0088] discloses the Super Classifier (i.e., fusion model), which processes the outputs from the Base Classifiers to generate a final decision. Examiner notes that Fig. 5 clearly shows the Super Classifier is generated based on the Meta Classifiers, the Meta Classifiers are generated based on the Specialist Classifiers, and the Specialist Classifiers are generated based on the Base Classifiers. Menahem further discloses that the Super Classifier defines how the previous predictions are combined using an average performance; see at least Fig. 14, Fig. 15, Fig. 21, and Fig. 22).

In reference to Claim 19. Menahem teaches the non-transitory computer readable medium of Claim 17 (as mentioned above), further containing computer readable program code that, when executed, causes the electronic device to:
Menahem further discloses:
“train the meta classification algorithm using outputs from the base classification algorithms or the base models based on the training data to generate the fusion model” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0069]-[0088] discloses the Super Classifier (i.e., fusion model), which processes the outputs from the Base Classifiers to generate a final decision. Examiner notes that Fig. 5 clearly shows training the Super Classifier based on the Meta Classifiers, training the Meta Classifiers based on the Specialist Classifiers, and training the Specialist Classifiers based on the Base Classifiers).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5-8, 13-16, and 20-23 are rejected under 35 U.S.C. 103 as being unpatentable over Menahem et al. (hereinafter Menahem) US 20090182696 A1 in view of Burr Settles (hereinafter Settles) “Active Learning Literature Survey”.
In reference to Claim 5. Menahem teaches the method of Claim 1 (as mentioned above), wherein:
Menahem does not explicitly disclose:
“the training data used by each of at least one of the base classification algorithms is selected based on an uncertainty associated with at least one other of the base classification algorithms”.
However, Settles discloses:
“the training data used by each of at least one of the base classification algorithms is selected based on an uncertainty associated with at least one other of the base classification algorithms” (Settles in at least § 3.1 pgs. 12-15 discloses selecting a classification algorithm based on an uncertainty).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Menahem and Settles. Menahem teaches a method for improving stacking schema for classification tasks, according to which predictive models are built, based on stacked-generalization meta-classifiers. Settles teaches using a probabilistic model for binary classification using uncertainty sampling. One of ordinary skill would have motivation to combine Menahem and Settles because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to Claim 6. Menahem and Settles teach the method of Claim 5 (as mentioned above), wherein:
Settles further teaches:
“the training data used by a specific one of the base classification algorithms is selected based on a least-confident strategy comprising:
$$x^{*} = \underset{x,\,M}{\operatorname{argmax}} \left(1 - P_{M}(\tilde{y} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function” (Settles in at least § 3.1 pgs. 12-15 discloses the limitation in at least the “least confident” equation $x^{*}_{LC} = \underset{x}{\operatorname{argmax}} \left(1 - P_{\theta}(\tilde{y} \mid x)\right)$).

In reference to Claim 7. Menahem and Settles teach the method of Claim 5 (as mentioned above), wherein:
Settles further teaches:
“the training data used by a specific one of the base classification algorithms is selected based on a least-reliable strategy comprising:
$$x^{*} = \underset{x,\,M}{\operatorname{argmin}} \left(P_{M}(\tilde{y}_{2} \mid x) - P_{M}(\tilde{y}_{1} \mid x)\right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}_1$ and $\tilde{y}_2$ represent two probable outputs for the training data x over the base model M, and P represents a probability function” (Settles in at least § 3.1 pgs. 12-15 discloses the limitation in at least the “margin sampling” equation $x^{*}_{M} = \operatorname*{argmin}_{x} \left( P_\theta(\tilde{y}_1 \mid x) - P_\theta(\tilde{y}_2 \mid x) \right)$).
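
For illustration only, margin-based selection of the kind quoted in this rejection can be sketched in Python. This is a minimal sketch and is not taken from Menahem, Settles, or the claims: the function name `margin_sampling` and the toy probability table are hypothetical, and the sketch assumes that the two outputs are the most probable and second most probable labels under the model.

```python
def margin_sampling(posteriors):
    """Return the index of the sample with the smallest margin.

    posteriors: list of per-sample class-probability lists (hypothetical
    model outputs P(y | x)); each list needs at least two classes.
    """
    def margin(probs):
        # Difference between the two highest class probabilities.
        top, second = sorted(probs, reverse=True)[:2]
        return top - second

    return min(range(len(posteriors)), key=lambda i: margin(posteriors[i]))


# Toy pool of three samples over three classes (made-up numbers).
pool = [
    [0.90, 0.05, 0.05],  # large margin: the model is fairly sure
    [0.40, 0.35, 0.25],  # small margin: the top two labels nearly tie
    [0.60, 0.30, 0.10],
]
selected = margin_sampling(pool)
```

In this toy pool the second sample is selected, because its two leading labels are nearly tied.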

In reference to Claim 8. Menahem and Settles teach the method of Claim 5 (as mentioned above), wherein:
Settles further teaches:
“the training data used by a specific one of the base classification algorithms is selected based on a most output entropy strategy comprising:
$$x^{*} = \operatorname*{argmax}_{x,M} \left( P_M(\tilde{y} \mid x) \log P_M(\tilde{y} \mid x) \right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function” (Settles in at least § 3.1 pgs. 12-15 discloses the limitation in at least the “entropy” equation $x^{*}_{H} = \operatorname*{argmax}_{x} -\sum_{i} \left( P_\theta(y_i \mid x) \log P_\theta(y_i \mid x) \right)$).
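
For illustration only, entropy-based selection of the kind quoted in this rejection can be sketched in Python. This is a minimal sketch under stated assumptions and is not taken from Menahem, Settles, or the claims: the names `entropy` and `max_entropy_sampling` and the toy probability table are hypothetical, and the sketch uses the summed form with the leading minus sign, as in the cited “entropy” equation.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of one class-probability list."""
    # Zero-probability classes contribute nothing and are skipped
    # to avoid evaluating log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def max_entropy_sampling(posteriors):
    """Return the index of the sample whose predicted distribution
    has the highest entropy."""
    return max(range(len(posteriors)), key=lambda i: entropy(posteriors[i]))


# Toy pool of three samples over three classes (made-up numbers).
pool = [
    [0.90, 0.05, 0.05],  # peaked distribution: low entropy
    [0.34, 0.33, 0.33],  # nearly uniform: high entropy
    [0.60, 0.30, 0.10],
]
selected = max_entropy_sampling(pool)
```

In this toy pool the nearly uniform second sample is selected, since a uniform distribution maximizes entropy.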

In reference to Claim 13. Menahem teaches the electronic device of Claim 9 (as mentioned above), wherein, to train the base classification algorithms, the at least one processor is configured to:
Menahem does not explicitly disclose:
“select the training data used by each of at least one of the base classification algorithms based on an uncertainty associated with at least one other of the base classification algorithms”.
However, Settles discloses:
“select the training data used by each of at least one of the base classification algorithms based on an uncertainty associated with at least one other of the base classification algorithms” (Settles in at least § 3.1 pgs. 12-15 discloses selecting a classification algorithm based on an uncertainty).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Menahem and Settles. Menahem teaches a method for improving stacking schema for classification tasks, according to which predictive models are built, based on stacked-generalization meta-classifiers. Settles teaches using a probabilistic model for binary classification using uncertainty sampling. One of ordinary skill would have motivation to combine Menahem and Settles because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to Claim 14. Menahem and Settles teach the electronic device of Claim 13 (as mentioned above), wherein, to train the base classification algorithms, the at least one processor is configured to:
Settles further teaches:
“select the training data used by a specific one of the base classification algorithms based on a least-confident strategy comprising:
$$x^{*} = \operatorname*{argmax}_{x,M} \left( 1 - P_M(\tilde{y} \mid x) \right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function” (Settles in at least § 3.1 pgs. 12-15 discloses the limitation in at least the “least confident” equation $x^{*}_{LC} = \operatorname*{argmax}_{x} \left( 1 - P_\theta(\tilde{y} \mid x) \right)$).
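
For illustration only, least-confident selection of the kind quoted in this rejection can be sketched in Python. This is a minimal sketch and is not taken from Menahem, Settles, or the claims: the function name `least_confident_sampling` and the toy probability table are hypothetical, and the sketch assumes that the probable output is the single most probable label under the model.

```python
def least_confident_sampling(posteriors):
    """Return the index of the sample whose most probable label
    has the lowest probability."""
    def uncertainty(probs):
        # One minus the probability of the most likely class.
        return 1.0 - max(probs)

    return max(range(len(posteriors)), key=lambda i: uncertainty(posteriors[i]))


# Toy pool of three samples over three classes (made-up numbers).
pool = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # least confident prediction
    [0.60, 0.30, 0.10],
]
selected = least_confident_sampling(pool)
```

In this toy pool the second sample is selected, because its best label has a probability of only 0.40.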

In reference to Claim 15. Menahem and Settles teach the electronic device of Claim 13 (as mentioned above), wherein, to train the base classification algorithms, the at least one processor is configured to:
Settles further teaches:
“select the training data used by a specific one of the base classification algorithms based on a least-reliable strategy comprising:
$$x^{*} = \operatorname*{argmin}_{x,M} \left( P_M(\tilde{y}_2 \mid x) - P_M(\tilde{y}_1 \mid x) \right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}_1$ and $\tilde{y}_2$ represent two probable outputs for the training data x over the base model M, and P represents a probability function” (Settles in at least § 3.1 pgs. 12-15 discloses the limitation in at least the “margin sampling” equation $x^{*}_{M} = \operatorname*{argmin}_{x} \left( P_\theta(\tilde{y}_1 \mid x) - P_\theta(\tilde{y}_2 \mid x) \right)$).

In reference to Claim 16. Menahem and Settles teach the electronic device of Claim 13 (as mentioned above), wherein, to train the base classification algorithms, the at least one processor is configured to:
Settles further teaches:
“select the training data used by a specific one of the base classification algorithms based on a most output entropy strategy comprising:
$$x^{*} = \operatorname*{argmax}_{x,M} \left( P_M(\tilde{y} \mid x) \log P_M(\tilde{y} \mid x) \right)$$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function” (Settles in at least § 3.1 pgs. 12-15 discloses the limitation in at least the “entropy” equation
                                    
                                        
                                            x
                                        
                                        
                                            H
                                        
                                        
                                            *
                                        
                                    
                                    =
                                    
                                        
                                            a
                                            r
                                            g
                                            m
                                            a
                                            x
                                        
                                        
                                            x
                                        
                                    
                                    -
                                    
                                        
                                            ∑
                                            
                                                i
                                            
                                        
                                        
                                            (
                                            
                                                
                                                    P
                                                
                                                
                                                    θ
                                                
                                            
                                            
                                                
                                                    
                                                        
                                                            y
                                                        
                                                        
                                                            i
                                                        
                                                    
                                                
                                                
                                                    x
                                                
                                            
                                            l
                                            o
                                            g
                                            
                                                
                                                    P
                                                
                                                
                                                    θ
                                                
                                            
                                            
                                                
                                                    
                                                        
                                                            y
                                                        
                                                        
                                                            i
                                                        
                                                    
                                                
                                                
                                                    x
                                                
                                            
                                        
                                    
                                
                            ).

In reference to Claim 20. Menahem teaches the non-transitory computer readable medium of Claim 17 (as mentioned above), wherein the computer readable program code that when executed causes the electronic device to train the base classification algorithms comprises: computer readable program code that when executed causes the electronic device to:
Menahem further discloses:
“train each of the base classification algorithms using samples of training data that do not carry an equal amount of information as the other base classification algorithms” (Menahem in at least Fig. 5, Fig. 7, Fig. 8, Figs. 14-16, ¶ [0005]-[0007], ¶ [0065], and ¶ [0069]-[0088] discloses training using different algorithms and using different training data. Examiner notes that the different training data do not carry an equal amount of information); and

Menahem does not explicitly disclose:
“select the training data used by each of at least one of the base classification algorithms based on an uncertainty associated with at least one other of the base classification algorithms”.
However, Settles discloses:
“select the training data used by each of at least one of the base classification algorithms based on an uncertainty associated with at least one other of the base classification algorithms” (Settles in at least § 3.1 pgs. 12-15 discloses selecting a classification algorithm based on an uncertainty).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Menahem and Settles. Menahem teaches a method for improving stacking schema for classification tasks, according to which predictive models are built, based on stacked-generalization meta-classifiers. Settles teaches using a probabilistic model for binary classification using uncertainty sampling. One of ordinary skill would have had motivation to combine Menahem and Settles because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to Claim 21. Menahem and Settles teach the non-transitory computer readable medium of Claim 20 (as mentioned above), wherein the computer readable program code that when executed causes the electronic device to train the base classification algorithms comprises:
Settles further teaches:
“computer readable program code that when executed causes the electronic device to select the training data used by a specific one of the base classification algorithms is selected based on a least-confident strategy comprising:
$x^* = \operatorname{argmax}_{x,M} \left( 1 - P_M(\tilde{y} \mid x) \right)$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function” (Settles in at least § 3.1 pgs. 12-15 discloses the limitation in at least the “least confident” equation $x_{LC}^* = \operatorname{argmax}_x \left( 1 - P_\theta(\tilde{y} \mid x) \right)$).
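
For illustration only, the “least confident” selection cited above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the applicant's or Settles' implementation: the function name `least_confident`, the `predict_proba` callable, and the toy probability table are all hypothetical stand-ins for a trained base model M that returns class probabilities for a sample x.

```python
def least_confident(pool, predict_proba):
    """Return the pool sample maximizing 1 - P(most probable label | x)."""
    def uncertainty(x):
        # 1 minus the probability of the most probable output for x.
        return 1.0 - max(predict_proba(x))
    return max(pool, key=uncertainty)


# Hypothetical class-probability table standing in for a trained model M.
toy_model = {
    "a": [0.90, 0.05, 0.05],
    "b": [0.40, 0.35, 0.25],
    "c": [0.60, 0.30, 0.10],
}

selected = least_confident(list(toy_model), toy_model.__getitem__)
print(selected)
```

In this toy table, sample "b" has the lowest top-class probability (0.40) and is therefore the one selected.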

In reference to Claim 22. Menahem and Settles teach the non-transitory computer readable medium of Claim 20 (as mentioned above), wherein the computer readable program code that when executed causes the electronic device to train the base classification algorithms comprises:
Settles further teaches:
“computer readable program code that when executed causes the electronic device to select the training data used by a specific one of the base classification algorithms is selected based on a least-reliable strategy comprising:
$x^* = \operatorname{argmin}_{x,M} \left( P_M(\tilde{y}_2 \mid x) - P_M(\tilde{y}_1 \mid x) \right)$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}_1$ and $\tilde{y}_2$ represent two probable outputs for the training data x over the base model M, and P represents a probability function” (Settles in at least § 3.1 pgs. 12-15 discloses the limitation in at least the “margin sampling” equation $x_M^* = \operatorname{argmin}_x \left( P_\theta(\tilde{y}_1 \mid x) - P_\theta(\tilde{y}_2 \mid x) \right)$).
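
For illustration only, the “margin sampling” selection cited above can be sketched as follows. This is a minimal sketch that assumes the first and second probable outputs are the two highest-probability labels under the model, and it follows the order of terms in the cited equation (first minus second). The function name `smallest_margin`, the `predict_proba` callable, and the toy probability table are hypothetical.

```python
def smallest_margin(pool, predict_proba):
    """Return the pool sample minimizing P(1st label | x) - P(2nd label | x)."""
    def margin(x):
        # Gap between the two most probable outputs for x.
        first, second = sorted(predict_proba(x), reverse=True)[:2]
        return first - second
    return min(pool, key=margin)


# Hypothetical class-probability table standing in for a trained model.
toy_model = {
    "a": [0.50, 0.49, 0.01],
    "b": [0.40, 0.30, 0.30],
    "c": [0.80, 0.15, 0.05],
}

selected = smallest_margin(list(toy_model), toy_model.__getitem__)
print(selected)
```

In this toy table, sample "a" is selected because its two leading probabilities are nearly tied, even though sample "b" has the lower top-class probability.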

In reference to Claim 23. Menahem and Settles teach the non-transitory computer readable medium of Claim 20 (as mentioned above), wherein the computer readable program code that when executed causes the electronic device to train the base classification algorithms comprises:
Settles further teaches:
“computer readable program code that when executed causes the electronic device to select the training data used by a specific one of the base classification algorithms is selected based on a most output entropy strategy comprising:
$x^* = \operatorname{argmax}_{x,M} \left( P_M(\tilde{y} \mid x) \log P_M(\tilde{y} \mid x) \right)$
where x represents a set of training data, x* represents samples of the training data from the set to be used by the specific base classification algorithm, M represents the base model associated with the specific base classification algorithm, $\tilde{y}$ represents a probable output for the training data x over the base model M, and P represents a probability function” (Settles in at least § 3.1 pgs. 12-15 discloses the limitation in at least the “entropy” equation $x_H^* = \operatorname{argmax}_x -\sum_i \left( P_\theta(y_i \mid x) \log P_\theta(y_i \mid x) \right)$).
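
For illustration only, the “entropy” selection cited above can be sketched as follows. This is a minimal sketch of the cited form, which sums over all labels with a leading minus sign. The function name `max_entropy`, the `predict_proba` callable, and the toy probability table are hypothetical, and natural logarithms are an assumption, since the choice of base does not change which sample is selected.

```python
import math


def max_entropy(pool, predict_proba):
    """Return the pool sample maximizing -sum_i P(y_i | x) * log P(y_i | x)."""
    def entropy(x):
        # Zero-probability labels contribute nothing and are skipped.
        return -sum(p * math.log(p) for p in predict_proba(x) if p > 0.0)
    return max(pool, key=entropy)


# Hypothetical class-probability table standing in for a trained model.
toy_model = {
    "a": [0.50, 0.49, 0.01],
    "b": [0.40, 0.30, 0.30],
    "c": [0.80, 0.15, 0.05],
}

selected = max_entropy(list(toy_model), toy_model.__getitem__)
print(selected)
```

In this toy table, sample "b" is selected because its probability mass is spread most evenly across the three labels.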

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Viker A. Lamardo whose telephone number is (571)270-5871. The examiner can normally be reached Mon. - Fri. 9 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J. Lo can be reached on (571)272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/VIKER A LAMARDO/Primary Examiner, Art Unit 2126