Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This communication is a final Office action on the merits.  Claims 1-20, after amendment, are presently pending and have been considered below.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 10/25/2018, 10/26/2018 and 8/19/2021 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.


Claims 1-4, 9-11, 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over US 2017/0068888 A1, Chung et al. (hereinafter Chung) in view of US 2017/0286809 A1, Pankanti et al. (hereinafter Pankanti) and further in view of US 2003/0063780 A1, Gutta et al. (hereinafter Gutta).


As to claim 1, Chung discloses an inter-classifier neural network, the inter-classifier neural network comprising: an input layer (Fig 1; pars 0021-0022, input layer); an output layer (Fig 1; pars 0021-0022, output layer); and a plurality of hidden layers constructed from hidden layers of different independent neural networks (Figs 1, 3; pars 0021-0022, one or more hidden layers being constructed between the input and output layers), wherein each of the independent neural networks is pre-trained to perform a classification task, and wherein the plurality of hidden layers is coupled to the input layer and the output layer (Figs 1-2; pars 0021-0022, pre-trained neural network for a cost sensitive classifier).  
Chung does not expressly disclose neural networks being independent from each other, and a neural network configured to perform a network selection function to select at least one of the independent neural networks as a best candidate to perform a classification. 
Pankanti, in the same or similar field of endeavor, further teaches different neural networks being trained independently (e.g. independent neural networks) for image classification (pars 0017, 0019, 0021, 0046-0047, 0051-0052). 
Gutta, in the same or similar field of endeavor, additionally teaches that it has been a common practice in the application of neural networks to train many different candidate networks and select the best based on performance on an independent validation set (par 0006).

Therefore, considering the teachings of Chung, Pankanti, and Gutta as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate into Chung’s teachings the teachings of Pankanti and Gutta about first training neural networks independently and then selecting the best-performing one, in order to improve training efficiency and performance.

As to claim 2, Chung as modified discloses the inter-classifier neural network of claim 1, wherein the plurality of hidden layers are convolutional layers (Chung: pars 0018, CNN; Pankanti: Figs 4, 6; pars 0042, 0044).  

As to claim 3, Chung as modified discloses the inter-classifier neural network of claim 1, wherein the different independent neural networks have different neural network architectures (Pankanti: Fig 5; pars 0038-0042, first and second training systems possess different neural network configurations/architectures). 
 
As to claim 4, Chung as modified discloses the inter-classifier neural network of claim 3, wherein the different neural network architectures comprises a single shot detector (SSD) neural network, a convolutional neural network (CNN) (Chung: par 0018; Pankanti: par 0042), regions with CNN (R-CNN), a faster R-CNN, a YOLO neural network, a neural network based on deformable part models, or a recurrent neural network.  

As to claim 9, Chung as modified discloses the inter-classifier neural network of claim 8, wherein the portion of the layer of the first neural network is selected by dropping out a plurality of neurons from the layer of the first neural network (Chung: pars 0022, 0050, 0089, layer removal).  

As to claim 10, Chung discloses a method for training an inter-classifier neural network, the method comprising: 
(a) selecting one or more hidden layers from a plurality of neural networks to construct an inter-classifier neural network engine (pars 0014, 0021-0022, 0025, one or more hidden layers being utilized or selected as part of the cost-sensitive deep neural network classifier), wherein each of the plurality of neural network is pre-trained to perform a classification task (Figs 1, 3; pars 0014, 0017-0018, 0021, 0025, a pre-trained neural network classifier with multiple hidden layers); 
(b) classifying a portion of a training data set using the inter-classifier neural network engine comprising of hidden layers selected from the plurality of neural networks (Figs 1-2; pars 0016, 0018, 0021-0022, 0046); 
(c) determining a performance score of the portion of the training data set that was classified (abstract; pars 0014, 0045, 0049, 0116); 
(d) re-selecting one or more layers from one or more networks of the plurality of neural networks to replace one or more layers of the inter-classifier neural network (pars 0022, 0045, 0049-0050, replacing certain layers of the neural network).
Chung does not expressly teach (e) repeating stages (b), (c), (d), until the performance score reaches a predetermined threshold.  
Pankanti, in the same or similar field of endeavor, further teaches (e) repeating stages (b), (c), and (d) until the performance score reaches a predetermined threshold (par 0050, repeat training and validation process until the accuracy of the neural network is above a predetermined acceptable value). 
Gutta, in the same or similar field of endeavor, additionally teaches (f) selecting at least one of neural network of the plurality of neural networks as a best candidate to perform a particular classification based on the performance score (see rejection in claim 1).

Therefore, considering the teachings of Chung, Pankanti and Gutta as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate into Chung’s method Pankanti’s teachings of training adaptation and termination with a predetermined criterion and Gutta’s teachings on network selection, in order to improve training efficiency and performance.
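
For illustration only, and not as a characterization of the claims or of any cited reference, the iterative procedure of stages (a) through (f) discussed above may be pictured with the following minimal sketch. Everything in it is an assumption made for the example: the layer pool POOL, the toy data DATA, the random re-selection strategy, and all function names are hypothetical and do not come from Chung, Pankanti, Gutta, or the application.

```python
import random

# Hypothetical stand-ins for hidden layers of two pre-trained networks;
# each "layer" is just a scalar function.
POOL = {
    "net_a": [lambda x: x + 1.0, lambda x: x * 2.0],
    "net_b": [lambda x: x - 1.0, lambda x: x * 0.5],
}

# Toy labelled portion of a training set: label is 1 when x > 1, else 0.
DATA = [(x, int(x > 1.0)) for x in (-2.0, -0.5, 0.5, 1.5, 2.0, 3.0)]


def classify(layers, x):
    """(b) Pass x through the selected layers and threshold the output."""
    for layer in layers:
        x = layer(x)
    return int(x > 0.0)


def score(layers, data):
    """(c) Performance score: fraction of correctly classified samples."""
    return sum(classify(layers, x) == y for x, y in data) / len(data)


def pick_layer(rng):
    """Select one hidden layer from any network in the pool."""
    name = rng.choice(sorted(POOL))
    return rng.choice(POOL[name])


def train(threshold=1.0, depth=2, max_iter=200, seed=0):
    """Stages (a)-(e); random re-selection is an assumed strategy."""
    rng = random.Random(seed)
    layers = [pick_layer(rng) for _ in range(depth)]      # (a) select
    for step in range(max_iter):                          # (e) repeat
        current = score(layers, DATA)                     # (b), (c)
        if current >= threshold:
            return layers, current, step
        layers[rng.randrange(depth)] = pick_layer(rng)    # (d) re-select
    return layers, score(layers, DATA), max_iter


def best_candidate(data):
    """(f) Pick the source network that scores highest on its own."""
    return max(sorted(POOL), key=lambda name: score(POOL[name], data))
```

In this sketch the loop stops either when the score reaches the threshold or when the iteration budget max_iter is exhausted; the budget is an added safeguard for the example and is not drawn from the record.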

As to claim 11, Chung as modified discloses the method of claim 10, wherein selecting the one or more layers from each of the plurality of neural networks comprises selecting all features within a selected layer (Chung: pars 0030, 0033, 0050, capturing input feature and generating a representation for the next layer as input; Pankanti: pars 0018-0020, 0042, each layer includes one or more feature detectors to capture/select features).  

As to claim 17, Chung as modified discloses the method of claim 10, wherein selecting the one or more layers from each of the plurality of neural networks comprises selecting a portion of a selected layer of a first neural network (Chung: pars 0021-0022, 0090, 0098).  

As to claim 18, Chung as modified discloses the method of claim 17, wherein selecting the portion of the selected layer of the first neural network comprises dropping out a plurality of neurons within the selected layer of first neural network (Chung: pars 0022, 0050).  
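
For illustration only, selecting a portion of a layer by dropping out a plurality of neurons, as discussed for claims 17 and 18, may be pictured with the following minimal sketch. The representation of a layer as a list of per-neuron weight lists and the function name select_portion are hypothetical choices made for the example, not taken from Chung or the application.

```python
import random


def select_portion(layer_weights, drop_count, seed=0):
    """Return the neurons kept after dropping `drop_count` at random.

    layer_weights: list of per-neuron weight lists (assumed representation).
    Returns (kept_neurons, dropped_indices).
    """
    if not 0 <= drop_count <= len(layer_weights):
        raise ValueError("drop_count must be between 0 and the layer size")
    rng = random.Random(seed)
    # Choose which neuron indices to drop, without repetition.
    dropped = set(rng.sample(range(len(layer_weights)), drop_count))
    kept = [w for i, w in enumerate(layer_weights) if i not in dropped]
    return kept, sorted(dropped)


# Usage: a 5-neuron layer with 2 weights per neuron, dropping 2 neurons.
layer = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
portion, dropped_idx = select_portion(layer, drop_count=2)
```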

As to claim 19, Chung as modified discloses the method of claim 18, wherein the plurality of neural networks comprises two or more of a single shot detector (SSD) neural network, a convolutional neural network (CNN) (Chung: par 0018; Pankanti: par 0042), regions with CNN (R-CNN), a faster R-CNN, a YOLO neural network, a neural network based on deformable part models, or a recurrent neural network.  

Claims 5-8, 12 are rejected under 35 U.S.C. 103 as being unpatentable over Chung in view of Pankanti and Gutta and further in view of US 2018/0129930 A1, Shin et al. (hereinafter Shin).


As to claim 5, Chung as modified discloses the inter-classifier neural network of claim 1, wherein the inter-classifier neural network is trained, using a training data set, by replacing one or more of the plurality of hidden layers (Chung: pars 0045, 0049-0050, 0092, 0096, replacing hidden layers for fine-tuning the training performance and parameter optimization to converge to relatively better local or global optima (e.g. above a certain performance threshold)) with one or more layers selected from the different independent neural networks until a performance score is above a threshold (Pankanti: pars 0016, 0024, 0042, 0049-0050, determining that the training performance (accuracy) is above a predetermined acceptable value).  
In addition, Shin further teaches replacing one or more non-consecutive hidden layers in a deep neural network (DNN) (pars 0022-0024, 0041).  
Therefore, considering the teachings of Chung as modified and Shin as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Shin’s teachings on replacing hidden layers of a DNN into the inter-classifier neural network of Chung as modified to allow DNN learning with a two-stage learning scheme with learned parameters. 

As to claim 6, Chung as modified discloses the inter-classifier neural network of claim 5, wherein replacing one or more of the plurality of hidden layers comprises selecting adjacent layers or layers farthest away from each other from a first neural network (Chung: pars 0022, 0030, 0045, 0050, 0096; Shin: pars 0022-0024, replacing non-consecutive hidden layers in a DNN).  

As to claim 7, Chung as modified discloses the inter-classifier neural network of claim 5, wherein replacing one or more of the plurality of hidden layers comprises using Bayesian optimization to select one or more layers from a first neural network of the plurality of neural networks to replace one or more of the plurality of hidden layers of the inter-classifier neural network (Chung: pars 0016, 0023, 0045, 0050, 0058, applying Bayes classifier/estimator; Shin: pars 0022-0024).    

As to claim 8, Chung as modified discloses the inter-classifier neural network of claim 1, wherein the inter-classifier neural network is trained, using a training data set, by replacing a portion of a layer of the plurality of hidden layers with a portion of a layer from a first neural network of the plurality of neural networks (Chung: pars 0022, 0045, 0050; Shin: pars 0022-0024).  

As to claim 12, Chung as modified discloses the method of claim 11, wherein selecting the one or more layers from each of the plurality of neural networks comprises selecting adjacent layers or layers farthest away from each other (Chung: pars 0022, 0030, 0045, 0050, 0096; Shin: pars 0022-0024, replacing non-consecutive hidden layers in a DNN).  


Claims 13-16, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chung in view of Pankanti and Gutta and further in view of US 2021/0004682 A1, Gong et al. (hereinafter Gong).


As to claim 13, Chung as modified discloses the method of claim 11, wherein selecting the one or more layers from each of the plurality of neural networks comprises selecting the one or more layers from a first neural network using Bayesian optimization (Chung: pars 0016, 0023, 0030, 0045, 0050, 0058, applying Bayes-Optimal decision).  Nevertheless, Gong, in the same or similar field of endeavor, additionally teaches selecting the number of layers, the number of nodes in each layer, etc. using Bayesian optimization (par 0162). Therefore, considering the teachings of Chung as modified and Gong as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Gong’s teachings on layer selection using Bayesian optimization into the teachings of Chung as modified for selecting proper hidden layers in a deep neural network.

As to claim 14, Chung as modified discloses the method of claim 13, wherein selecting the one or more layers from the first neural network using Bayesian optimization comprises: selecting, randomly, two or more layers from the first neural network (Chung: pars 0030, 0045, 0050, 0056, 0058, identifying K layers applying Bayes-Optimal Decision for prediction to minimize with randomly selected weights of layers); finding a maximum of an acquisition function of the Bayesian optimization (Chung: Fig 5, Objective Function Calculation Circuit; Fig 8, minimizing an objective function (e.g. maximizing an acquisition function); pars 0014, 0017, 0029, 0040-0042, 0045-0049, 0058; Gong: pars 0157, 0162); selecting a first new layer based on the maximum of the acquisition function (Chung: pars 0045, 0050, 0058; Gong: pars 0157, 0160, 0162); and updating an approximation of an objective function of the Bayesian optimization (Chung: pars 0030, 0045, 0050, 0058; Gong: pars 0152, 0158).  

As to claim 15, Chung as modified discloses the method of claim 14, further comprising: finding a new maximum of the acquisition function (Chung: pars 0045, 0049-0050, 0116); selecting a second new layer based on the new maximum of the acquisition function (Chung: pars 0097, 0103, 0116; Gong: par 0162); updating a second approximation of the objective function of the Bayesian optimization (Chung: pars 0030, 0045, 0050, 0058); repeating a process comprising a) finding a new maximum (Gong: pars 0157, 0160), b) selecting a new layer (Chung: pars 0097, 0103, 0116; Gong: par 0162), and c) updating a new approximation until a termination criterion is reached (Chung: pars 0029, 0049, 0059; Gong: pars 0157, 0160, 0162); eliminating selected layers having a performance score below a predetermined threshold (Chung: par 0050); using layers that are not eliminated to generate one or more layers of the inter-classifier neural network (Chung: par 0050, removing layers during deep neural network training). Note that neural network learning/training is an adaptive process which aims to achieve local or global optimization (cost function/error minimization or performance function maximization).

As to claim 16, Chung as modified discloses the method of claim 15, further comprising: evaluating the cost function of the inter-classifier neural network having layers from two or more of the plurality of neural networks; and retraining the inter-classifier neural network by replacing one or more layers with one or more new layers from the plurality of neural networks (Chung: pars 0045, 0050, 0096; Gong: par 0162).  

As to claim 20, Chung as modified discloses a method for training an inter-classifier neural network, the method comprising: (a) selecting one or more hidden layers from a plurality of neural networks to construct one or more hidden layers of the inter-classifier neural network engine, wherein each of the plurality of neural network is pre-trained to perform a classification task; (b) classifying contents of a training data set using the inter-classifier neural network; and repeating (a) and (b) until a performance score of the inter-classifier neural network is above a predetermined threshold (see rejection and motivation statement in claim 1); and (c) selecting at least one of neural network of the plurality of neural networks as a best candidate to perform a particular classification based on the performance score (see rejection in claim 1), but does not expressly teach wherein the performance score is determined by comparing the classification result with ground truth data of the training data set.  Gong, in the same or similar field of endeavor, further teaches that a machine-learned model may be trained by optimizing an objective function using a ground-truth dataset (par 0156).  Therefore, considering the teachings of Chung as modified and Gong as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Gong’s teachings on objective function optimization using a ground-truth dataset into the teachings of Chung as modified as an optimization criterion.

Response to Arguments
Applicant’s arguments have been considered but they are moot in light of new ground(s) of rejection.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Examiner’s Note
Examiner has cited particular columns, line numbers, paragraphs and/or figure(s) in the reference(s) as applied to the claims for the convenience of the Applicant. Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. Applicant is respectfully requested, in preparing responses, to fully consider the reference(s) in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner. 

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Qun Shen whose telephone number is (571) 270-7927.  The examiner can normally be reached Monday through Friday from 9:00 to 5:00. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Vincent Rudolph, can be reached at (571) 272-8243.  The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.  Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/QUN SHEN/
Primary Examiner, Art Unit 2661