DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
The present application is being examined under the claims filed on 07/16/2019.
Claims 1-43 are rejected.
Claims 1-43 are pending.

Drawings
The Drawings filed on 07/16/2019 are acceptable for examination purposes.

Specification
The Specification filed on 12/05/2019 is acceptable for examination purposes.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
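A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
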
Claims 1-3, 7, 8, 10-15, 17-21, 24-28, 30-35, and 37-41 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Bonissone et al. (hereinafter Bonissone) US 20140188768 A1.
In reference to claim 1. Bonissone teaches a method for building and using an operational ensemble of machine learning systems that is robust against adversarial attacks (examiner notes that the system being robust against adversarial attacks is the intended result of the positively recited steps below and is not given patentable weight), the method comprising:
“training, with a computer system that comprises one or more processor units, a base ensemble having a plurality of machine-learning ensemble members such that the ensemble members have diversity with regard to sensitivity to changes in input variables, wherein the base ensemble comprises N > 1 different subsets of the plurality of machine-learning ensemble members, wherein each of the N subsets comprises one or more ensemble members of the plurality of machine-learning ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0056], ¶ [0059], and ¶ [0067]-[0072] training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models);
“including, by the computer system, P of the N subsets of the ensemble members in the operational ensemble, where 2 ≤ P ≤ N, based on whether the subsets pass a performance measure test and a diversity measure test, wherein the diversity measure test is based on a diversity measure for the subsets relative to each of the other subsets of the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] training an ensemble classifier by 
“performing an operational machine-learning task with the operational ensemble on a data item” (Bonissone in at least Fig. 3),
wherein performing the operational machine-learning task comprises:
“selecting, by the computer system, one of the P subsets of the ensemble members in the operational ensemble” (Bonissone in at least Fig. 3);
“processing, by the computer system, the data item with the selected subset of the ensemble members to generate a final result for the machine-learning task for the data item” (Bonissone in at least Fig. 3).

In reference to claim 2. Bonissone teaches the method of claim 1 (as mentioned above), wherein including the P subsets in the operational ensemble comprises:
Bonissone further discloses:
“computing, by the computer system, a performance measure of a first (n=1) subset of the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models. In at least ¶ [0059]-[0072] Bonissone discloses that each model selected for the ensemble meets a performance and diversity metric);
for n = 2 to J, where P ≤ J ≤ N, iteratively:
“computing, by the computer system, a performance measure for the n-th subset of the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ 
“computing, by the computer system, the diversity measure for the n-th subset of the ensemble members relative to each of the n = 1, …, (n-1) subsets of the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models. In at least ¶ [0059]-[0072] Bonissone discloses that each model selected for the ensemble meets a performance and diversity metric);
“determining, by the computer system, whether to include the n-th subset of the ensemble members in the operational ensemble based on the performance and diversity measures for the n-th subset of the ensemble members, such that following the n = j iteration, the operational ensemble comprises the P subsets of the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models. In at least ¶ [0059]-[0072] Bonissone discloses that each model selected for the ensemble meets a performance and diversity metric).
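
For illustration only, the iterative inclusion recited in claim 2 can be sketched as below. This is a minimal sketch and not the method of Bonissone or of the application: the function name, the two thresholds, and the callables `performance` and `diversity` are hypothetical, and the sketch assumes that diversity is checked only against subsets already included in the operational ensemble.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")  # a candidate subset of ensemble members (representation is assumed)


def select_operational_subsets(
    subsets: Sequence[S],
    performance: Callable[[S], float],
    diversity: Callable[[S, S], float],
    min_performance: float,
    min_diversity: float,
) -> List[S]:
    """Walk the candidate subsets in order and keep those passing both tests.

    Both thresholds are hypothetical; the claim only requires that some
    performance measure test and some diversity measure test be applied.
    """
    operational: List[S] = []
    for candidate in subsets:
        # Performance measure test for the n-th subset.
        if performance(candidate) < min_performance:
            continue
        # Diversity measure test relative to each subset kept so far.
        # For the first passing subset the list is empty, so it is kept.
        if all(diversity(candidate, kept) >= min_diversity for kept in operational):
            operational.append(candidate)
    return operational
```

With a toy representation in which each subset is a (score, position) pair, the call `select_operational_subsets(pairs, lambda s: s[0], lambda a, b: abs(a[1] - b[1]), 0.7, 0.5)` keeps only the pairs that score at least 0.7 and sit at least 0.5 apart from every pair already kept.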

In reference to claim 3. Bonissone teaches the method of claim 1 (as mentioned above), wherein, upon a condition that the selected subset comprises multiple ensemble members, the step of processing the data item comprises:
Bonissone further discloses:
“processing the data item with each of the multiple ensemble members of the selected subset” (Bonissone in at least Fig. 3);
“combining a result from each of the multiple ensemble members to generate the final result” (Bonissone in at least Fig. 3).
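
For illustration only, the operational steps recited in claims 1 and 3 (selecting one of the P subsets, processing the data item with each member, and combining the results) can be sketched as below. This is a minimal sketch under stated assumptions: random selection and averaging are assumed choices, since the claims recite only "selecting" and "combining", and the names `run_operational_task` and `Member` are hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence

# Hypothetical stand-in for a trained ensemble member: maps an input to a score.
Member = Callable[[float], float]


def run_operational_task(
    operational: Sequence[Sequence[Member]],
    data_item: float,
    rng: random.Random,
) -> float:
    """Select one subset, run every member on the data item, combine the outputs."""
    # Selecting one of the P subsets (random choice is an assumption).
    subset = rng.choice(operational)
    # Processing the data item with each member of the selected subset.
    outputs: List[float] = [member(data_item) for member in subset]
    # Combining the member results into a final result (averaging is an assumption).
    return mean(outputs)
```

A subset with a single member returns that member's output unchanged, so the same function covers the single-member and multi-member cases.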

In reference to claim 7. Bonissone teaches the method of claim 1 (as mentioned above), further comprising:
Bonissone further discloses:
“prior to training the base ensemble, building, by the computer system, the base ensemble from a base machine-learning network” (Bonissone in at least ¶ [0047], ¶ [0058], ¶ [0063], ¶ [0066] discloses prior to training the base ensemble, building, by the computer system, the base ensemble from a base machine-learning network. Examiner notes that building before training is being interpreted as determining which models are going to be part of the ensemble).

In reference to claim 8. Bonissone teaches the method of claim 7 (as mentioned above), wherein building the base ensemble comprises:
Bonissone further discloses:
“selecting, by the computer system, r selected network elements of a base-machine learning network, where r > 1” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]- r > 1);
“making, by the computer system, M copies of a base machine-learning network, where 2 ≤ M ≤ 2^r” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses making, by the computer system, M copies of a base machine-learning network, where 2 ≤ M ≤ 2^r. See at least ¶ [0062]);
“training, by the computer system, each of the M copies of the base machine-learning network such that each of the M copies of the base machine-learning network is trained to change its learned parameters in a different direction than any of the other M copies” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses training, by the computer system, each of the M copies of the base machine-learning network such that each of the M copies of the base machine-learning network is trained to change its learned parameters in a different direction than any of the other M copies);
“combining, by the computer system, the M copies of the base machine-learning network into the base ensemble” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses combining, by the computer system, the M copies of the base machine-learning network into the base ensemble).

In reference to claim 10. Bonissone teaches the method of claim 1 (as mentioned above), wherein training the base ensemble such that the ensemble members have diversity with regard to sensitivity to changes in input variables comprises:

“training each of the N subsets of the ensemble members with primary and secondary objectives, wherein the secondary objective is different for each of the N sets of ensemble members” (Bonissone in at least ¶ [0062]-[0064] discloses training each of the N subsets of the ensemble members with primary and secondary objectives, wherein the secondary objective is different for each of the N sets of ensemble members).

In reference to claim 11. Bonissone teaches the method of claim 1 (as mentioned above), further comprising:
Bonissone further discloses:
“for each subset of ensemble members that comprises more than one ensemble member of the base network, training the set comprises jointly training the ensemble members of the subset” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses for each subset of ensemble members that comprises more than one ensemble member of the base network, training the set comprises jointly training the ensemble members of the subset).

In reference to claim 12. Bonissone teaches the method of claim 11 (as mentioned above), wherein jointly training the ensemble members comprises:
Bonissone further discloses:
“adding a joint optimization network to the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses the optimization to the ensemble members).

In reference to claim 13. Bonissone teaches the method of claim 10 (as mentioned above), wherein:
Bonissone further discloses:
“for each of the n = 1, …, N subsets of the ensemble members, the secondary objective for the n-th subset of ensemble members trains the n-th subset of ensemble members such that partial derivatives of a differentiable function attempt to match a target input sensitivity value for each input variable for each training data item” (Bonissone in at least ¶ [0062]-[0064] discloses for each of the n = 1, …, N subsets of the ensemble members, the secondary objective for the n-th subset of ensemble members trains the n-th subset of ensemble members such that partial derivatives of a differentiable function attempt to match a target input sensitivity value for each input variable for each training data item);
“the differentiable function is different from a loss function for the primary objective” (Bonissone in at least ¶ [0062]-[0064] discloses the differentiable function is different from a loss function for the primary objective).

In reference to claim 14. Bonissone teaches the method of claim 13 (as mentioned above), wherein:
Bonissone further discloses:
“the target input sensitivity value is a vector that is different for each of the N sets of ensemble members” (Bonissone in at least ¶ [0062]-[0064] and ¶ [0082] discloses the target input sensitivity value is a vector that is different for each of the N sets of ensemble members).

In reference to claim 15. Bonissone teaches the method of claim 13 (as mentioned above), wherein training the N subsets with the primary objectives comprises, for each of the n = 1, …, N subsets:
Bonissone further discloses:
for each of a plurality of training data examples:
“computing, by the computer system, output values of the n-th subset” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses computing, by the computer system, output values of the n-th subset);
“computing, by the computer system, a partial derivative of the differentiable function of the output values for the n-th subset with respect to an input variable” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses computing, by the computer system, a partial derivative of the differentiable function of the output values for the n-th subset with respect to an input variable);
“computing, by the computer system, a partial derivative of the secondary objective for the n-th subset, wherein the secondary objective is a function of one or more computed partial derivatives of the differentiable function” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses computing, by the computer system, a partial derivative of the secondary objective for the n-th subset, wherein the secondary objective is a function of one or more computed partial derivatives of the differentiable function);
“updating, by the computer system, a learned parameter for the n-th subset based on, in part, the computed partial derivatives of the secondary objective” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses updating, by the computer system, a learned parameter for the n-th subset based on, in part, the computed partial derivatives of the secondary objective).

In reference to claim 17. Bonissone teaches the method of claim 2 (as mentioned above), wherein the steps of computing the performance measure and the diversity measure for the n-th subset comprises:
Bonissone further discloses:
“computing, by the computer system, a value of an objective of an output of the n-th subset for each of a plurality of selected data items” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses computing, by the computer system, a value of an objective of an output of the n-th subset for each of a plurality of selected data items);
“accumulating, by the computer system, performance data for the n-th subset obtained for all of the selected data items” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses accumulating, by the computer system, performance data for the n-th subset obtained for all of the selected data items);
“computing, by the computer system, a diversity measure of input sensitivity for the n-th subset” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ 

In reference to claim 18. Bonissone teaches the method of claim 17 (as mentioned above), wherein:
Bonissone further discloses:
“the performance measure of the n-th subset is computed based on the accumulated performance data for the n-th subset” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses the performance measure of the n-th subset is computed based on the accumulated performance data for the n-th subset);
“the first subset of the ensemble members that passes a performance measure test is included in the operational set” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses the first subset of the ensemble members that passes a performance measure test is included in the operational set);
“the performance measure test is based on the performance measure” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses the performance measure test is based on the performance measure).

In reference to claim 19. Bonissone teaches the method of claim 18 (as mentioned above), wherein:
Bonissone further discloses:
“each subset after the first subset that passes both the performance measure test and a diversity test are included in the operational set, such that there are P subsets in the operational set, where 2 ≤ P ≤ J” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models. In at least ¶ [0059]-[0072] Bonissone discloses that each model selected for the ensemble meets a performance and diversity metric).

In reference to claim 20. Bonissone teaches the method of claim 19 (as mentioned above), wherein:
Bonissone further discloses:
“the diversity test for the n-th subset is based the diversity measure for the n-th subset” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models. In at least ¶ [0059]-[0072] Bonissone discloses that each model selected for the ensemble meets a performance and diversity metric).

In reference to claim 21. Bonissone teaches the method of claim 20 (as mentioned above), wherein the diversity test comprises:
Bonissone further discloses:
“a correlation of a classification gradient for the n-th subset to a classification gradient of each subset already included in the operational set” (Bonissone in at least Fig. 1, Fig. 2, Figs. 

In reference to claim 24. Bonissone teaches a computer system for building and using an operational ensemble of machine learning systems that is robust against adversarial attacks (examiner notes that the system being robust against adversarial attacks is the intended result of the positively recited steps below and is not given patentable weight), the computer system comprising one or more processor units (Bonissone Fig. 1) that are programmed to:
“train a base ensemble having a plurality of machine-learning ensemble members such that the ensemble members have diversity with regard to sensitivity to changes in input variables, wherein the base ensemble comprises N > 1 different subsets of the plurality of machine-learning ensemble members, wherein each of the N subsets comprises one or more ensemble members of the plurality of machine-learning ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0056], ¶ [0059], and ¶ [0067]-[0072] training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models);
“include P of the N subsets of the ensemble members in the operational ensemble, where 2 ≤ P ≤ N, based on whether the subsets pass a performance measure test and a diversity measure test, wherein the diversity measure test is based on a diversity measure for the subsets relative to each of the other subsets of the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] training an ensemble classifier by selecting the most diverse set of 
perform an operational machine-learning task with the operational ensemble on a data item (Bonissone in at least Fig. 3) by:
“selecting one of the P subsets of the ensemble members in the operational ensemble” (Bonissone in at least Fig. 3);
“processing the data item with the selected subset of the ensemble members to generate a final result for the machine-learning task for the data item” (Bonissone in at least Fig. 3).

In reference to claim 25. Bonissone teaches the computer system of claim 24 (as mentioned above), wherein the one or more processor units of the computer system are programmed to include the P subsets in the operational ensemble by:
Bonissone further discloses:
“computing a performance measure of a first (n=1) subset of the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models. In at least ¶ [0059]-[0072] Bonissone discloses that each model selected for the ensemble meets a performance and diversity metric);
for n = 2 to J, where P ≤ J ≤ N, iteratively:
“computing a performance measure for the n-th subset of the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ 
“computing the diversity measure for the n-th subset of the ensemble members relative to each of the n = 1, …, (n-1) subsets of the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models. In at least ¶ [0059]-[0072] Bonissone discloses that each model selected for the ensemble meets a performance and diversity metric);
“determining whether to include the n-th subset of the ensemble members in the operational ensemble based on the performance and diversity measures for the n-th subset of the ensemble members, such that following the n = j iteration, the operational ensemble comprises the P subsets of the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models. In at least ¶ [0059]-[0072] Bonissone discloses that each model selected for the ensemble meets a performance and diversity metric).

In reference to claim 26. Bonissone teaches the computer system of claim 24 (as mentioned above), wherein, upon a condition that the selected subset comprises multiple ensemble members, the computer system processes the data item by:
Bonissone further discloses:
“processing the data item with each of the multiple ensemble members of the selected subset” (Bonissone in at least Fig. 3);
“combining a result from each of the multiple ensemble members to generate the final result” (Bonissone in at least Fig. 3).

In reference to claim 27. Bonissone teaches the computer system of claim 24 (as mentioned above), wherein the one or more processor units of the computer system are further programmed to:
Bonissone further discloses:
“prior to training the base ensemble, build the base ensemble from a base machine-learning network” (Bonissone in at least ¶ [0047], ¶ [0058], ¶ [0063], ¶ [0066] discloses prior to training the base ensemble, building, by the computer system, the base ensemble from a base machine-learning network. Examiner notes that building before training is being interpreted as determining which models are going to be part of the ensemble).

In reference to claim 28. Bonissone teaches the computer system of claim 27 (as mentioned above), wherein the one or more processor units are programmed to build the base ensemble by:
Bonissone further discloses:
“selecting r selected network elements of a base-machine learning network, where r > 1” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ r > 1);
“making M copies of a base machine-learning network, where 2 ≤ M ≤ 2^r” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses making, by the computer system, M copies of a base machine-learning network, where 2 ≤ M ≤ 2^r. See at least ¶ [0062]);
“training each of the M copies of the base machine-learning network such that each of the M copies of the base machine-learning network is trained to change its learned parameters in a different direction than any of the other M copies” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses training, by the computer system, each of the M copies of the base machine-learning network such that each of the M copies of the base machine-learning network is trained to change its learned parameters in a different direction than any of the other M copies);
“combining the M copies of the base machine-learning network into the base ensemble” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses combining, by the computer system, the M copies of the base machine-learning network into the base ensemble).
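For illustration only, the copy-and-diverge limitation of claim 28 can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Bonissone or from the application: it assumes that each of the r selected elements is nudged in one of two directions, which would yield at most 2^r distinct direction patterns and so is one possible reading of the bound 2 ≤ M ≤ 2^r. The names direction_patterns, make_copies, and step are hypothetical.

```python
from itertools import islice, product


def direction_patterns(r, m):
    """Return m distinct sign patterns over r selected elements (assumed reading)."""
    if r <= 1:
        raise ValueError("r must be greater than 1")
    if not 2 <= m <= 2 ** r:
        raise ValueError("M must satisfy 2 <= M <= 2**r")
    # product((-1, 1), repeat=r) enumerates all 2**r sign patterns without repeats.
    return list(islice(product((-1, 1), repeat=r), m))


def make_copies(base_params, selected, m, step=0.1):
    """Copy the base parameters m times, pushing each copy in a different direction."""
    copies = []
    for pattern in direction_patterns(len(selected), m):
        params = list(base_params)  # independent copy of the base network parameters
        for index, sign in zip(selected, pattern):
            params[index] += sign * step  # hypothetical stand-in for directed training
        copies.append(params)
    return copies  # the list of copies plays the role of the base ensemble


if __name__ == "__main__":
    for copy in make_copies([0.0, 0.0, 0.0, 0.0], selected=[0, 2], m=4):
        print(copy)
```

With r = 2 selected elements the sketch permits at most four copies; requesting a fifth raises an error, which mirrors the upper bound recited in the claim.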

In reference to claim 30. Bonissone teaches the computer system of claim 24 (as mentioned above), wherein the one or more processor units are programmed to train the base ensemble such that the ensemble members have diversity with regard to sensitivity to changes in input variables by:
Bonissone further discloses:
“training each of the N subsets of the ensemble members with primary and secondary objectives, wherein the secondary objective is different for each of the N sets of ensemble members” (Bonissone in at least ¶ [0062]-[0064] discloses training each of the N subsets of the ensemble members with primary and secondary objectives, wherein the secondary objective is different for each of the N sets of ensemble members).

In reference to claim 31. Bonissone teaches the computer system of claim 24 (as mentioned above), wherein the one or more processor units are programmed further to:
Bonissone further discloses:
“for each subset of ensemble members that comprises more than one ensemble member of the base network, jointly train the ensemble members of the subset” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses for each subset of ensemble members that comprises more than one ensemble member of the base network, training the set comprises jointly training the ensemble members of the subset).

In reference to claim 32. Bonissone teaches the computer system of claim 31 (as mentioned above), wherein the one or more processor units are programmed to jointly train the ensemble members by:
Bonissone further discloses:
“adding a joint optimization network to the ensemble members” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses the optimization to the ensemble members).

In reference to claim 33. Bonissone teaches the computer system of claim 30 (as mentioned above), wherein:
Bonissone further discloses:
“for each of the n = 1, …, N subsets of the ensemble members, the secondary objective for the n-th subset of ensemble members trains the n-th subset of ensemble members such that partial derivatives of a differentiable function attempt to match a target input sensitivity value for each input variable for each training data item” (Bonissone in at least ¶ [0062]-[0064] discloses for each of the n = 1, …, N subsets of the ensemble members, the secondary objective for the n-th subset of ensemble members trains the n-th subset of ensemble members such that partial derivatives of a differentiable function attempt to match a target input sensitivity value for each input variable for each training data item);
“the differentiable function is different from a loss function for the primary objective” (Bonissone in at least ¶ [0062]-[0064] discloses the differentiable function is different from a loss function for the primary objective).

In reference to claim 34. Bonissone teaches the computer system of claim 33 (as mentioned above), wherein:
Bonissone further discloses:
“the target input sensitivity value is a vector that is different for each of the N sets of ensemble members” (Bonissone in at least ¶ [0062]-[0064] and ¶ [0082] discloses the target input sensitivity value is a vector that is different for each of the N sets of ensemble members).

In reference to claim 35. Bonissone teaches the computer system of claim 33 (as mentioned above), wherein the one or more processor units are programmed to train the N subsets with the primary objectives by, for each of the n = 1, …, N subsets:
Bonissone further discloses:
for each of a plurality of training data examples:
“computing, by the computer system, output values of the n-th subset” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses computing, by the computer system, output values of the n-th subset);
“computing, by the computer system, a partial derivative of the differentiable function of the output values for the n-th subset with respect to an input variable” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses computing, by the computer system, a partial derivative of the differentiable function of the output values for the n-th subset with respect to an input variable);
“computing, by the computer system, a partial derivative of the secondary objective for the n-th subset, wherein the secondary objective is a function of one or more computed partial derivatives of the differentiable function” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses computing, by the computer system, a partial derivative of the secondary objective for the n-th subset, wherein the secondary objective is a function of one or more computed partial derivatives of the differentiable function);
“updating, by the computer system, a learned parameter for the n-th subset based on, in part, the computed partial derivatives of the secondary objective” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses updating, by the computer system, a learned parameter for the n-th subset based on, in part, the computed partial derivatives of the secondary objective).
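For illustration only, the four recited steps of claim 35 can be sketched for a single training example. This is a minimal sketch under stated assumptions and does not reproduce anything disclosed in Bonissone or the application: the n-th subset is assumed to be a single linear model, the differentiable function is assumed to be the model output itself (so its partial derivative with respect to input i is simply the weight w[i]), the primary objective is assumed to be squared error, and the secondary objective is assumed to be the squared distance between the input sensitivities and a target sensitivity vector. All names, including train_step, secondary_objective, lr, and weight, are hypothetical.

```python
def secondary_objective(w, target):
    """Assumed secondary objective: squared gap between sensitivities and target."""
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def train_step(w, x, y, target, lr=0.05, weight=0.5):
    """One update of a linear model using primary and secondary objectives."""
    # Step 1: compute the output value of the (single-member) subset.
    output = sum(wi * xi for wi, xi in zip(w, x))
    # Step 2: partial derivative of the differentiable function (the output)
    # with respect to each input variable; for a linear model this is w[i].
    sensitivity = list(w)
    # Step 3: partial derivative of the secondary objective with respect to
    # each learned parameter, 2 * (sensitivity[i] - target[i]).
    secondary_grad = [2.0 * (s - t) for s, t in zip(sensitivity, target)]
    # Primary objective (assumed squared error) gradient, 2 * (output - y) * x[i].
    primary_grad = [2.0 * (output - y) * xi for xi in x]
    # Step 4: update the learned parameters based in part on the secondary gradient.
    return [wi - lr * (p + weight * s)
            for wi, p, s in zip(w, primary_grad, secondary_grad)]


if __name__ == "__main__":
    w = [0.0, 0.0]
    target = [1.0, -1.0]
    for _ in range(3):
        w = train_step(w, x=[1.0, 0.0], y=0.0, target=target)
        print(w, secondary_objective(w, target))
```

Giving each subset a different target vector, as recited in claim 34, would pull the subsets toward different input sensitivities while the primary objective is shared.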

In reference to claim 37. Bonissone teaches the computer system of claim 25 (as mentioned above), wherein the one or more processor units are programmed to compute the measure performance and the diversity measure for the n-th subset by:
Bonissone further discloses:
“computing, by the computer system, a value of an objective of an output of the n-th subset for each of a plurality of selected data items” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses computing, by the computer system, a value of an objective of an output of the n-th subset for each of a plurality of selected data items);
“accumulating, by the computer system, performance data for the n-th subset obtained for all of the selected data items” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses accumulating, by the computer system, performance data for the n-th subset obtained for all of the selected data items);
“computing, by the computer system, a diversity measure of input sensitivity for the n-th subset” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ 

In reference to claim 38. Bonissone teaches the computer system of claim 37 (as mentioned above), wherein:
Bonissone further discloses:
“the performance measure of the n-th subset is computed based on the accumulated performance data for the n-th subset” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses the performance measure of the n-th subset is computed based on the accumulated performance data for the n-th subset);
“the first subset of the ensemble members that passes a performance measure test is included in the operational set” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses the first subset of the ensemble members that passes a performance measure test is included in the operational set);
“the performance measure test is based on the performance measure” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses the performance measure test is based on the performance measure).

In reference to claim 39. Bonissone teaches the computer system of claim 38 (as mentioned above), wherein:
Bonissone further discloses:
“each subset after the first subset that passes both the performance measure test and a diversity test are included in the operational set, such that there are P subsets in the operational set, where 2 ≤ P ≤ J” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models. In at least ¶ [0059]-[0072] Bonissone discloses that each model selected for the ensemble meets a performance and diversity metric).

In reference to claim 40. Bonissone teaches the computer system of claim 39 (as mentioned above), wherein:
Bonissone further discloses:
“the diversity test for the n-th subset is based the diversity measure for the n-th subset” (Bonissone in at least Fig. 1, Fig. 2, Figs. 4-8, Abstract, ¶ [0005]-[0007], ¶ [0014], ¶ [0029], ¶ [0030], ¶ [0056], and ¶ [0059]-[0072] discloses training an ensemble classifier by selecting the most diverse set of models from the locally dominant models; examiner notes that an ensemble is a collection of models. In at least ¶ [0059]-[0072] Bonissone discloses that each model selected for the ensemble meets a performance and diversity metric).
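For illustration only, the selection logic recited across claims 38-40 can be sketched as a greedy pass over candidate subsets. This is a minimal sketch under stated assumptions and is not the method of Bonissone or of the application: the diversity test is assumed to be a threshold on the absolute Pearson correlation between a candidate's gradient vector and the gradient of every subset already selected, which is one way to read the correlation recited in claim 41. The thresholds and the names correlation and select_operational_set are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_operational_set(candidates, min_performance, max_correlation):
    """Greedily pick subsets that pass the performance test and the diversity test.

    candidates: iterable of (name, performance, gradient) tuples.
    """
    selected = []
    for name, performance, gradient in candidates:
        if performance < min_performance:
            continue  # fails the performance measure test
        # The first passing subset faces no diversity test (nothing selected yet);
        # later subsets must be weakly correlated with every selected subset.
        if all(abs(correlation(gradient, g)) <= max_correlation for _, g in selected):
            selected.append((name, gradient))
    return [name for name, _ in selected]


if __name__ == "__main__":
    candidates = [
        ("a", 0.90, [1.0, 2.0, 3.0]),
        ("b", 0.95, [2.0, 4.0, 6.0]),   # perfectly correlated with "a"
        ("c", 0.50, [3.0, 1.0, 2.0]),   # below the performance threshold
        ("d", 0.80, [1.0, -1.0, 0.5]),
    ]
    print(select_operational_set(candidates, min_performance=0.7, max_correlation=0.5))
```

In this sketch the number of selected subsets P can never exceed the number of candidates J, and the first subset is admitted on performance alone.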

In reference to claim 41. Bonissone teaches the computer system of claim 40 (as mentioned above), wherein the diversity test comprises:
Bonissone further discloses:
“a correlation of a classification gradient for the n-th subset to a classification gradient of each subset already included in the operational set” (Bonissone in at least Fig. 1, Fig. 2, Figs. .

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 4 and 5 are rejected under 35 U.S.C. 103 as being unpatentable over Bonissone et al. (hereinafter Bonissone) US 20140188768 A1 in view of Chen et al. (hereinafter Chen) US 20080228680 A1.
In reference to claim 4. Bonissone teaches the method of claim 1 (as mentioned above), wherein:
	Bonissone does not explicitly disclose:
“the at least one of the plurality of ensemble members comprises a neural network”.

However, Chen discloses:
“the at least one of the plurality of ensemble members comprises a neural network” (Chen in at least ¶ [0045], ¶ [0046], ¶ [0061]-[0068] discloses “neural network ensemble 702 formed by selecting multiple neural networks from candidate pool”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and Chen. Bonissone teaches a system for creating customized model ensembles on demand. Chen teaches a system to create a pool of neural networks trained on a first portion of a sparse data set; generate for each of various multi-objective functions a set of neural network ensembles that minimize the multi-objective function; select a local ensemble from each set of ensembles based on data not included in said first portion of said sparse data set; and combine a subset of the local ensembles to form a global ensemble. One of ordinary skill would have motivation to combine Bonissone and Chen because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to claim 5. Bonissone teaches the method of claim 1 (as mentioned above), wherein:
	Bonissone does not explicitly disclose:
“the each of the plurality of ensemble members comprises a neural network”.

However, Chen discloses:
“the each of the plurality of ensemble members comprises a neural network” (Chen in at least ¶ [0045], ¶ [0046], ¶ [0061]-[0068] discloses “neural network ensemble 702 formed by selecting multiple neural networks from candidate pool”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and Chen. Bonissone teaches a system for creating customized model ensembles on demand. Chen teaches a system to create a pool of neural networks trained on a first portion of a sparse data set; generate for each of various multi-objective functions a set of neural network ensembles that minimize the multi-objective function; select a local ensemble from each set of ensembles based on data not included in said first portion of said sparse data set; and combine a subset of the local ensembles to form a global ensemble. One of ordinary skill would have motivation to combine Bonissone and Chen because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Bonissone et al. (hereinafter Bonissone) US 20140188768 A1 in view of Chen et al. (hereinafter Chen2) US 20070011114 A1.
In reference to claim 6. Bonissone teaches the method of claim 1 (as mentioned above), wherein:
Bonissone does not explicitly disclose:
“the each of the plurality of ensemble members is a machine learning system training by back propagation of partial derivatives”.

However, Chen2 discloses:
“the each of the plurality of ensemble members is a machine learning system training by back propagation of partial derivatives” (Chen2 in at least ¶ [0032]-[0034] “the neural networks are based on a back-propagation architecture (backpropagation networks, or "BPN), one of the architectural parameters is the number of nodes in the hidden layer”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and Chen2. Bonissone teaches a system for creating customized model ensembles on demand. Chen2 teaches a system for creating and using robust neural network ensembles. One of ordinary skill would have motivation to combine Bonissone and Chen2 because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

Claims 9 and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Bonissone et al. (hereinafter Bonissone) US 20140188768 A1 in view of Gurwicz et al. (hereinafter Gurwicz) US 20190042917 A1.
In reference to claim 9. Bonissone teaches the method of claim 8 (as mentioned above), wherein:
Bonissone does not explicitly disclose:

“the base machine-learning network comprises a base neural network”;
“the base neural network comprises a plurality of nodes and plurality of directed arcs”;
“each directed arc is between two nodes of the base neural network”;
“the t selected network elements comprise u nodes of the base neural network and v directed arcs of the base neural network, where u and v are integers greater than or equal to zero, u + v = t”.

However, Gurwicz discloses:
“the base machine-learning network comprises a base neural network” (Gurwicz in at least ¶ [0017], ¶ [0018], ¶ [0024]-[0027], ¶ [0032]-[0034], and ¶ [0040]-[0044] discloses the base machine-learning network comprises a base neural network);
“the base neural network comprises a plurality of nodes and plurality of directed arcs” (Gurwicz in at least ¶ [0017], ¶ [0018], ¶ [0024]-[0027], ¶ [0032]-[0034], and ¶ [0040]-[0044] discloses the base neural network comprises a plurality of nodes and plurality of directed arcs);
“each directed arc is between two nodes of the base neural network” (Gurwicz in at least ¶ [0017], ¶ [0018], ¶ [0024]-[0027], ¶ [0032]-[0034], and ¶ [0040]-[0044] discloses each directed arc is between two nodes of the base neural network);
“the t selected network elements comprise u nodes of the base neural network and v directed arcs of the base neural network, where u and v are integers greater than or equal to zero, u + v = t” (Gurwicz in at least ¶ [0017], ¶ [0018], ¶ [0024]-[0027], ¶ [0032]-[0034], and ¶ [0040]-[0044] discloses the t selected network elements comprise u nodes of the base neural network and v directed arcs of the base neural network, where u and v are integers greater than or equal to zero, u + v = t).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and Gurwicz. Bonissone teaches a system for creating customized model ensembles on demand. Gurwicz teaches an ensemble of different neural network topologies, leading to improved classification capabilities. One of ordinary skill would have motivation to combine Bonissone and Gurwicz because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to claim 29. Bonissone teaches the computer system of claim 28 (as mentioned above), wherein:
Bonissone does not explicitly disclose:
“the base machine-learning network comprises a base neural network”;
“the base neural network comprises a plurality of nodes and plurality of directed arcs”;
“each directed arc is between two nodes of the base neural network”;
“the t selected network elements comprise u nodes of the base neural network and v directed arcs of the base neural network, where u and v are integers greater than or equal to zero, u + v = t”.


However, Gurwicz discloses:
“the base machine-learning network comprises a base neural network” (Gurwicz in at least ¶ [0017], ¶ [0018], ¶ [0024]-[0027], ¶ [0032]-[0034], and ¶ [0040]-[0044] discloses the base machine-learning network comprises a base neural network);
“the base neural network comprises a plurality of nodes and plurality of directed arcs” (Gurwicz in at least ¶ [0017], ¶ [0018], ¶ [0024]-[0027], ¶ [0032]-[0034], and ¶ [0040]-[0044] discloses the base neural network comprises a plurality of nodes and plurality of directed arcs);
“each directed arc is between two nodes of the base neural network” (Gurwicz in at least ¶ [0017], ¶ [0018], ¶ [0024]-[0027], ¶ [0032]-[0034], and ¶ [0040]-[0044] discloses each directed arc is between two nodes of the base neural network);
“the t selected network elements comprise u nodes of the base neural network and v directed arcs of the base neural network, where u and v are integers greater than or equal to zero, u + v = t” (Gurwicz in at least ¶ [0017], ¶ [0018], ¶ [0024]-[0027], ¶ [0032]-[0034], and ¶ [0040]-[0044] discloses the t selected network elements comprise u nodes of the base neural network and v directed arcs of the base neural network, where u and v are integers greater than or equal to zero, u + v = t).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and Gurwicz. Bonissone teaches a system for creating customized model ensembles on demand. Gurwicz teaches an ensemble of different neural network topologies, leading to improved classification capabilities. One of ordinary skill would have motivation to combine Bonissone and Gurwicz because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of .

Claims 16 and 36 are rejected under 35 U.S.C. 103 as being unpatentable over Bonissone et al. (hereinafter Bonissone) US 20140188768 A1 in view of Chen et al. (hereinafter Chen) US 20080228680 A1 in view of Chen et al. (hereinafter Chen2) US 20070011114 A1.
In reference to claim 16. Bonissone teaches the method of claim 15 (as mentioned above), wherein:
Bonissone does not explicitly disclose:
“each of the N subsets comprises a neural network”;
“the output-values of the n-th subset are computed through a forward computation through the neural network of n-th subset”;
“the partial derivative of the secondary objective for the n-th subset is computed through a forward propagation through the neural network of the n-th subset”.

However, Chen discloses:
“each of the N subsets comprises a neural network” (Chen in at least ¶ [0045], ¶ [0046], ¶ [0061]-[0068] discloses “neural network ensemble 702 formed by selecting multiple neural networks from candidate pool”);
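“the output-values of the n-th subset are computed through a forward computation through the neural network of n-th subset” (Chen in at least ¶ [0058] discloses the output-values of the n-th subset are computed through a forward computation through the neural network of n-th subset);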
“the partial derivative of the secondary objective for the n-th subset is computed through a forward propagation through the neural network of the n-th subset” (Chen in at least ¶ [0058] discloses the partial derivative of the secondary objective for the n-th subset is computed through a forward propagation through the neural network of the n-th subset).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and Chen. Bonissone teaches a system for creating customized model ensembles on demand. Chen teaches a system to create a pool of neural networks trained on a first portion of a sparse data set; generate for each of various multi-objective functions a set of neural network ensembles that minimize the multi-objective function; select a local ensemble from each set of ensembles based on data not included in said first portion of said sparse data set; and combine a subset of the local ensembles to form a global ensemble. One of ordinary skill would have been motivated to combine Bonissone and Chen because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

Bonissone and Chen do not explicitly disclose:
“the partial derivative of the differential function of the output values for the n-th subset is computed in a back-propagation through the neural network of n-th subset”;

However, Chen2 discloses:
“the partial derivative of the differential function of the output values for the n-th subset is computed in a back-propagation through the neural network of n-th subset” (Chen2 in at least ¶ [0032]-[0034] “the neural networks are based on a back-propagation architecture (backpropagation networks, or "BPN"), one of the architectural parameters is the number of nodes in the hidden layer”);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone, Chen, and Chen2. Bonissone teaches a system for creating customized model ensembles on demand. Chen teaches a system to create a pool of neural networks trained on a first portion of a sparse data set; generate for each of various multi-objective functions a set of neural network ensembles that minimize the multi-objective function; select a local ensemble from each set of ensembles based on data not included in said first portion of said sparse data set; and combine a subset of the local ensembles to form a global ensemble. Chen2 teaches a system for creating and using robust neural network ensembles. One of ordinary skill would have been motivated to combine Bonissone, Chen, and Chen2 because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.
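
For illustration only, the distinction drawn in claim 16 between a forward computation of output values and a back-propagation of partial derivatives can be sketched for a small network. The sketch below is not taken from Bonissone, Chen, or Chen2; the single tanh hidden layer, the function names forward and input_gradient, and all weights and inputs are assumptions chosen for this example.

```python
import math

def forward(x, W1, w2):
    """Forward computation: one tanh hidden layer followed by a linear scalar output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = sum(w * h for w, h in zip(w2, hidden))
    return y, hidden

def input_gradient(x, W1, w2):
    """Back-propagation: partial derivatives of the output y with respect to each input."""
    _, hidden = forward(x, W1, w2)
    # dy/dh_j = w2[j]; tanh'(z_j) = 1 - h_j^2, so delta_j = dy/dz_j.
    delta = [w * (1.0 - h * h) for w, h in zip(w2, hidden)]
    # dy/dx_i = sum over hidden units j of delta_j * W1[j][i].
    return [sum(d * row[i] for d, row in zip(delta, W1)) for i in range(len(x))]

# Hypothetical weights and input, chosen only for the example.
W1 = [[0.5, -0.3], [0.1, 0.8]]
w2 = [1.0, -2.0]
x = [0.2, -0.4]

y, _ = forward(x, W1, w2)
grad = input_gradient(x, W1, w2)
print("output:", y)
print("partial derivatives with respect to inputs:", grad)
```

The forward pass evaluates the network once from input to output, while the backward pass reuses the stored hidden activations to obtain every partial derivative in a single sweep from output to input.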

In reference to claim 36. Bonissone teaches the computer system of claim 35 (as mentioned above), wherein:
Bonissone does not explicitly disclose:
“each of the N subsets comprises a neural network”;
“the output-values of the n-th subset are computed through a forward computation through the neural network of n-th subset”;
“the partial derivative of the secondary objective for the n-th subset is computed through a forward propagation through the neural network of the n-th subset”.

However, Chen discloses:
“each of the N subsets comprises a neural network” (Chen in at least ¶ [0045], ¶ [0046], ¶ [0061]-[0068] discloses “neural network ensemble 702 formed by selecting multiple neural networks from candidate pool”);
“the output-values of the n-th subset are computed through a forward computation through the neural network of n-th subset” (Chen in at least ¶ [0058] discloses the output-values of the n-th subset are computed through a forward computation through the neural network of n-th subset);
“the partial derivative of the secondary objective for the n-th subset is computed through a forward propagation through the neural network of the n-th subset” (Chen in at least ¶ [0058] discloses the partial derivative of the secondary objective for the n-th subset is computed through a forward propagation through the neural network of the n-th subset).
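It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and Chen. Bonissone teaches a system for creating customized model ensembles on demand. Chen teaches a system to create a pool of neural networks trained on a first portion of a sparse data set; generate for each of various multi-objective functions a set of neural network ensembles that minimize the multi-objective function; select a local ensemble from each set of ensembles based on data not included in said first portion of said sparse data set; and combine a subset of the local ensembles to form a global ensemble. One of ordinary skill would have been motivated to combine Bonissone and Chen because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.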

Bonissone and Chen do not explicitly disclose:
“the partial derivative of the differential function of the output values for the n-th subset is computed in a back-propagation through the neural network of n-th subset”;

However, Chen2 discloses:
“the partial derivative of the differential function of the output values for the n-th subset is computed in a back-propagation through the neural network of n-th subset” (Chen2 in at least ¶ [0032]-[0034] “the neural networks are based on a back-propagation architecture (backpropagation networks, or "BPN), one of the architectural parameters is the number of nodes in the hidden layer”);
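It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone, Chen, and Chen2. Bonissone teaches a system for creating customized model ensembles on demand. Chen teaches a system to create a pool of neural networks trained on a first portion of a sparse data set; generate for each of various multi-objective functions a set of neural network ensembles that minimize the multi-objective function; select a local ensemble from each set of ensembles based on data not included in said first portion of said sparse data set; and combine a subset of the local ensembles to form a global ensemble. Chen2 teaches a system for creating and using robust neural network ensembles. One of ordinary skill would have been motivated to combine Bonissone, Chen, and Chen2 because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.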

Claims 22, 23, 42, and 43 are rejected under 35 U.S.C. 103 as being unpatentable over Bonissone et al. (hereinafter Bonissone) US 20140188768 A1 in view of VIRKAR et al. (hereinafter VIRKAR) US 20100063948 A1.
In reference to claim 22. Bonissone teaches the method of claim 21 (as mentioned above), wherein the performance test comprises:
Bonissone does not explicitly disclose:
“a one-sided null hypothesis test that the n-th subset performs at least as well as an average performance of other subsets that have the same number of ensemble members at the n-th subset”.

However, VIRKAR discloses:
“a one-sided null hypothesis test that the n-th subset performs at least as well as an average performance of other subsets that have the same number of ensemble members at the n-th subset” (VIRKAR in at least ¶ [0125], ¶ [0171], ¶ [0173], and ¶ [0185] discloses a one-sided null hypothesis test that the n-th subset performs at least as well as an average performance of other subsets that have the same number of ensemble members at the n-th subset).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and VIRKAR. Bonissone teaches a system for creating customized model ensembles on demand. VIRKAR teaches that selecting the best measures of performance and combining them into an effective ensemble will allow automated parameter selection. One of ordinary skill would have been motivated to combine Bonissone and VIRKAR because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.
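
For illustration only, one way a one-sided test of the kind recited in claim 22 might be carried out is sketched below. This is not VIRKAR's or Bonissone's procedure; the normal approximation, the function name one_sided_p_value, the accuracy figures, and the 0.05 significance level are all assumptions made for this example. The null hypothesis is that the n-th subset performs at least as well as the average of the other subsets of the same size, so only unusually low scores count as evidence against it.

```python
from statistics import NormalDist, mean, stdev

def one_sided_p_value(subset_score, other_scores):
    """P-value for H0: the subset performs at least as well as the average of the others.

    Uses a normal approximation; assumes at least two distinct values in other_scores.
    A small p-value is evidence that the subset is worse than average.
    """
    mu = mean(other_scores)
    sigma = stdev(other_scores)
    z = (subset_score - mu) / sigma
    # Lower-tail probability: only low scores count against H0.
    return NormalDist().cdf(z)

# Hypothetical validation accuracies of other subsets with the same number of members.
other_scores = [0.90, 0.92, 0.91, 0.89, 0.93]

p_weak = one_sided_p_value(0.85, other_scores)    # subset scoring well below average
p_strong = one_sided_p_value(0.92, other_scores)  # subset scoring slightly above average

alpha = 0.05  # assumed significance level
print("weak subset rejected:", p_weak < alpha)
print("strong subset rejected:", p_strong < alpha)
```

Because the test is one-sided, a subset that scores above the average can never be rejected, no matter how far above it lies.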

In reference to claim 23. Bonissone teaches the method of claim 1 (as mentioned above), wherein selecting one of the P subsets comprises:
Bonissone does not explicitly disclose:
“randomly selecting, by the computer system, one of the P subsets of the ensemble members in the operational ensemble”.

However, VIRKAR discloses:
“randomly selecting, by the computer system, one of the P subsets of the ensemble members in the operational ensemble” (VIRKAR in at least ¶ [0117] discloses randomly selecting, by the computer system, one of the P subsets of the ensemble members in the operational ensemble).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and VIRKAR. Bonissone teaches a system for creating customized model ensembles on demand. VIRKAR teaches that selecting the best measures of performance and combining them into an effective ensemble will allow automated parameter selection. One of ordinary skill would have been motivated to combine Bonissone and VIRKAR because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.
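
For illustration only, the random selection of one of P subsets recited in claim 23 can be sketched as follows. The sketch is not drawn from VIRKAR or Bonissone; the threshold classifiers standing in for ensemble members, the majority-vote combination, the value P = 3, and the names select_subset and predict are assumptions made for this example.

```python
import random
from collections import Counter

def select_subset(subsets, rng):
    """Pick one of the P candidate subsets uniformly at random."""
    return rng.choice(subsets)

def predict(x, subsets, rng):
    """Classify x with a randomly selected subset, combining its members by majority vote."""
    chosen = select_subset(subsets, rng)
    votes = Counter(member(x) for member in chosen)
    return votes.most_common(1)[0][0]

# Hypothetical ensemble members: simple threshold classifiers returning 0 or 1.
members = [lambda x, t=t: int(x > t) for t in (0.2, 0.4, 0.5, 0.6, 0.8)]

# P = 3 overlapping subsets of the operational ensemble (assumed for the example).
subsets = [members[0:3], members[1:4], members[2:5]]

rng = random.Random(0)  # seeded only so the example is repeatable
label_high = predict(0.9, subsets, rng)
label_low = predict(0.1, subsets, rng)
print(label_high, label_low)
```

In a deployed system the random source would typically not be seeded, so that an outside party cannot predict which subset handles a given input.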

In reference to claim 42. Bonissone teaches the computer system of claim 41 (as mentioned above), wherein the performance test comprises:
Bonissone does not explicitly disclose:
“a one-sided null hypothesis test that the n-th subset performs at least as well as an average performance of other subsets that have the same number of ensemble members at the n-th subset”.

However, VIRKAR discloses:
“a one-sided null hypothesis test that the n-th subset performs at least as well as an average performance of other subsets that have the same number of ensemble members at the n-th subset” (VIRKAR in at least ¶ [0125], ¶ [0171], ¶ [0173], and ¶ [0185] discloses a one-sided null hypothesis test that the n-th subset performs at least as well as an average performance of other subsets that have the same number of ensemble members at the n-th subset).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and VIRKAR. Bonissone teaches a system for creating customized model ensembles on demand. VIRKAR teaches that selecting the best measures of performance and combining them into an effective ensemble will allow automated parameter selection. One of ordinary skill would have been motivated to combine Bonissone and VIRKAR because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to claim 43. Bonissone teaches the computer system of claim 24 (as mentioned above), wherein the one or more processor units select one of the P subsets by:
Bonissone does not explicitly disclose:
“randomly selecting, by the computer system, one of the P subsets of the ensemble members in the operational ensemble”.

However, VIRKAR discloses:
“randomly selecting, by the computer system, one of the P subsets of the ensemble members in the operational ensemble” (VIRKAR in at least ¶ [0117] discloses randomly selecting, by the computer system, one of the P subsets of the ensemble members in the operational ensemble).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Bonissone and VIRKAR. Bonissone teaches a system for creating customized model ensembles on demand. VIRKAR teaches that selecting the best measures of performance and combining them into an effective ensemble will allow automated parameter selection. One of ordinary skill would have been motivated to combine Bonissone and VIRKAR because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Viker A. Lamardo whose telephone number is (571)270-5871.  The examiner can normally be reached on Mon. - Fri. 9 AM - 5 PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J. Lo, can be reached on (571)272-9767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/VIKER A LAMARDO/Examiner, Art Unit 2126