DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 07/22/2022 has been entered. 

Response to Arguments
Applicant’s arguments, filed on 06/13/2022, with respect to 35 U.S.C. 103 have been fully considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3, 6, 13, 17-18, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Shoaib et al. (US 2016/0217390 A1) in view of Hench (US 2010/0017351 A1).

Regarding claim 1.
Shoaib teaches a system, comprising: a memory that stores computer executable components; a processor that executes computer executable components stored in the memory (see ¶ 29, "memory 108 can store instructions executable by the processor(s) 104 including an operating system (OS) 112, a machine learning module 114, and programs or applications 116 that are loadable and executable by processor(s)"),
wherein the computer executable components comprise: a determination component that: determines a first output for a first neural network of a set of neural networks based on first input data sub-sampled from an input (see ¶ 46, “complexity of the data may be determined based on the confidence levels of each of the particular machine learning models. For example, if the confidence level of an output class label resulting from the application of a particular machine learning model on data is beyond a threshold value, then CA module 502 may determine that the data has less than a particular complexity. On the other hand, if the confidence level of an output class label resulting from the application of a particular machine learning model on data is less than a threshold value, then CA module 502 may determine that the data has greater than a particular complexity. In the latter case, CA module 502 may apply one or more subsequent machine learning models (of increasing complexity) to the data until an output class label resulting from the application of a particular machine learning model is beyond a threshold value.”, also see ¶ 78), 
and iteratively, until an optimally complex neural network is determined, performs a process (see ¶ 46, “CA module 502 may apply one or more subsequent machine learning models (of increasing complexity) to the data until an output class label resulting from the application of a particular machine learning model is beyond a threshold value.”, also see ¶ 78, “Process 1400 may continue iteratively if, for example, the second level of complexity is not able to classify the input value. Then the machine learning model may apply a third level of complexity (more complex than the second level of complexity), and so on.”) comprising:
determines a second output for a second neural network of the set of neural networks, wherein the second neural network has greater complexity than the first neural network and previous second neural networks from the earlier iterations of the process (see ¶ 46, “SE classifier 500 includes a complexity assessment (CA) module 502 that determines complexity of input data 504 received by the SE classifier. For example, as described in detail below, CA module 502 may determine complexity of data by applying various machine learning models to the data. Each of the machine learning models, respectively, is able to categorize data having less than particular levels of complexity”, also see ¶ 78, “the machine learning model may apply a second level of complexity of the machine learning model to the input value. The second level of complexity is more complex than the first level of complexity.”),
and in response to a determination that the second output has consensus, according to a consensus criterion, with at least one of the first output of the first neural network or any previous second outputs from the previous second neural networks from the earlier iterations of the process (see ¶ 56, “Consensus module 806 of the ith stage receives output from + biased classifier 802 and − biased classifier 804 to produce output that is either a class label or input to a subsequent classifier stage (i+1). Whether the output is a class label or input to a subsequent classifier stage may be based, at least in part, on two criteria. First, if the biased classifiers 802 and 804 predict the same class (e.g., ++ or −−), then consensus module 806 determines a consensus and the corresponding label (e.g., + or −) is produced as output.”).

Shoaib does not explicitly teach: determines a second output for a second neural network of the set of neural networks based on second input data sub-sampled from the input, wherein the second input data is different from and has a larger size than the first input data and previous second input data from earlier iterations of the process, wherein the second neural network has greater complexity than the first neural network and previous second neural networks from the earlier iterations of the process, and complexity is a function of at least one of input pixel count or hidden layer sizes; determines that the second neural network is the optimally complex neural network. 
Hench teaches determines a second output for a second neural network of the set of neural networks based on second input data sub-sampled from the input, wherein the second input data is different from and has a larger size than the first input data and previous second input data from earlier iterations of the process, wherein the second neural network has greater complexity than the first neural network and previous second neural networks from the earlier iterations of the process, and complexity is a function of at least one of input pixel count or hidden layer sizes; determines that the second neural network is the optimally complex neural network (see figure 8c and ¶ 61, “FIG. 8C, after a one hidden layer network is trained by the algebraic method at operation 820, the algebraic method result initializes a one hidden layer neural network which then trained by the optimization method at operation 830. The resultant values are then used as the initializer in equations (20) for optimization of network having an addition hidden layer at operation 840. That network is then trained with an optimization method at operation 850. In still another embodiment, the method is applied recursively, with hidden layers being added one at a time and being initialized by the previous solution, as further shown in FIG. 8C.”, i.e. adding additional hidden layers corresponds to the different size of hidden layers and also corresponds to the optimization of the neural network based on different inputs).
Both Shoaib and Hench pertain to the problem of machine learning parameter estimation, thus being analogous. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine Shoaib and Hench to teach the above limitations. The motivation for doing so would be “initializing neural network coefficients for training by optimization of the neural network weights, minimizing a difference between the actual signal and the modeled signal based on an objective function containing both function evaluations and derivatives.” (See Hench abstract).
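
The staged evaluation described in the passages of Shoaib quoted above (¶ 46 and ¶ 78) can be pictured with a minimal sketch. It is a reading aid under stated assumptions only and is not code from Shoaib, Hench, or the application: the function name classify_with_scalable_effort, the (label, confidence) return convention, and the two toy stages are all hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical stage signature: input values -> (class label, confidence in [0, 1]).
Stage = Callable[[Sequence[float]], Tuple[str, float]]


def classify_with_scalable_effort(
    x: Sequence[float], stages: List[Stage], threshold: float
) -> Tuple[Optional[str], int]:
    """Apply stages in order of increasing complexity.

    Stops at the first stage whose confidence is beyond the threshold and
    returns (label, index of the stage that decided). If no stage is
    confident enough, returns (None, index of the last stage tried).
    """
    for index, stage in enumerate(stages):
        label, confidence = stage(x)
        if confidence > threshold:
            return label, index
    return None, len(stages) - 1


def simple_stage(x: Sequence[float]) -> Tuple[str, float]:
    # Toy "least complex" model (assumption): looks at the first value only.
    score = x[0]
    return ("+" if score >= 0 else "-", min(1.0, abs(score)))


def complex_stage(x: Sequence[float]) -> Tuple[str, float]:
    # Toy "more complex" model (assumption): uses every value.
    score = sum(x) / len(x)
    return ("+" if score >= 0 else "-", min(1.0, 2 * abs(score)))


stages = [simple_stage, complex_stage]
easy = classify_with_scalable_effort([0.9, -0.1], stages, threshold=0.8)
hard = classify_with_scalable_effort([0.1, 0.9], stages, threshold=0.8)
```

In this sketch the first input is labeled by the simple stage alone, while the second input is passed on to the more complex stage, which mirrors the "increasing complexity until the output is beyond a threshold value" behavior relied upon above.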

Regarding claim 3.
Shoaib and Hench teach: the system of claim 1.
Shoaib teaches wherein the first neural network is a least complex neural network of the set of neural networks (see ¶ 25, "If the confidence level is beyond a particular threshold value, the output class label produced by the current model is considered to be a final outcome. In this case, the test instance is not processed by any subsequent models in the sequence. Thus, relatively non-complex test instances are processed by only one or the initial few (least complex) model(s) in the sequence"; also see ¶ 48, "The confidence value determines whether or not the input is passed on to a subsequent next stage", i.e. examiner maps “machine learning model” in Shoaib to neural network in the claim because applicant merely recites the term).

Regarding claim 6.
Shoaib and Hench teach: the system of claim 1.
Shoaib further teaches: wherein an input size of the second input dataset is increased relative to a previous input size of a previous second input dataset at each iteration of the process (see ¶ 78, “Process 1400 may continue iteratively if, for example, the second level of complexity is not able to classify the input value. Then the machine learning model may apply a third level of complexity (more complex than the second level of complexity), and so on.” Examiner notes that, in the iterative process, the third level of complexity of input (mapped to the claimed “input size of the second input”) is more complex (mapped to “increased”) than the second level of complexity of input (mapped to “a previous second input”).).

Claim 13 recites a computer-implemented method to perform the system recited in claim 1. Therefore the rejection of claim 1 above applies equally here.


Regarding claim 17.
Shoaib and Hench teach: the method of claim 13.
Shoaib further teaches: further comprising determination of a consensus profile that comprises a distribution of consensus points for the first input data, the previous second input data, and the second input data (see ¶ 61, "SE classifier stage 900 may include a global consensus (GC) module 908, which aggregates outputs from all LC modules of the binary-class classifiers 902-906."; also see ¶ 73, "The consensus probability determines the confidence of the biased classifier stage while operating on the various training instances").

Regarding claim 18.
Shoaib and Hench teach: the method of claim 13.
Shoaib further teaches: wherein an input size of the second input data is increased relative to a previous input size of a previous second input data at each iteration of the process (see ¶ 78, “Process 1400 may continue iteratively if, for example, the second level of complexity is not able to classify the input value. Then the machine learning model may apply a third level of complexity (more complex than the second level of complexity), and so on.” Examiner notes that, in the iterative process, the third level of complexity of input (mapped to the claimed “input size of the second input”) is more complex (mapped to “increased”) than the second level of complexity of input (mapped to “a previous second input”).).

Claim 20 recites a computer program product comprising a computer readable storage medium to perform the system recited in claim 1. Therefore the rejection of claim 1 above applies equally here. Shoaib also teaches the additional elements of claim 20 not recited in claim 1 comprising a computer readable storage medium having program instructions embodied therewith (see ¶ 33, “Computer readable media may include computer storage media and/or communication media. Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data”), the program instructions executable by processor to cause the processor (see ¶ 29, "memory 108 can store instructions executable by the processor(s) 104 including an operating system (OS) 112, a machine learning module 114, and programs or applications 116 that are loadable and executable by processor(s)") to: generate a trained set of neural networks for respective inputs of varying sizes (see Fig. 13 and ¶ 16, "a training process for a scalable-effort classifier of a machine learning model, according to various example embodiments"; also see ¶ 1, “In the training phase, typical input examples are used to build decision models that characterize the data. In the testing phase, the learned model is applied to new data instances in order to infer different properties such as relevance and similarity”; also see ¶ 2, “a first classifier stage involves the simplest machine learning models and is able to classify input data that is relatively simple. Subsequent classifier stages have increasingly complex machine learning models and are able to classify more complex input data”).
Claim 21 recites a computer program product comprising a computer readable storage medium to perform the system recited in claim 18. Therefore the rejection of claim 18 above applies equally here.

Regarding claim 22. 
Shoaib and Hench teach: the computer program product of claim 20.
Shoaib further teaches: wherein the consensus criterion comprises an error threshold (see ¶ 56, “Whether the output is a class label or input to a subsequent classifier stage may be based, at least in part, on two criteria. First, if the biased classifiers 802 and 804 predict the same class (e.g., ++ or −−), then consensus module 806 determines a consensus and the corresponding label (e.g., + or −) is produced as output. Second, if the biased classifiers 802 and 804 predict different classes (e.g., +− or −+), then consensus module 806 determines no consensus (NC). In this case input Ii to classifier stage 800 is considered to be too difficult to be classified by classifier stage 800 and the next-stage input Ii+1 is produced and provided to the next-stage classifier (not illustrated in FIG. 8).”; also see ¶ 68, “Negative consensus threshold values for a classifier stage may lead to input test instances being labeled by the stage even if the biased classifiers of the stage disagree on the individual class assignments. This may occur, for example, if confidence values (e.g., in the contradictory predictions) of each of the biased classifiers is jointly greater than the consensus threshold. In this fashion, the consensus threshold may directly control the fraction of inputs classified by a stage. To achieve computational efficiency, the consensus threshold value may be optimized during training time such that the consensus threshold value minimizes the total number of misclassifications.” Examiner notes that the two consensus criteria comprise the consensus threshold, which minimizes misclassifications (“error”).).
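
The two consensus criteria quoted from Shoaib ¶ 56, together with the consensus threshold discussed in ¶ 68, can be illustrated with a minimal sketch. The quoted text does not give a formula, so the scoring convention used here is an assumption made purely for illustration: each biased classifier is assumed to return a signed score (positive for class +, negative for class −), and the hypothetical parameter delta is compared against each score. The function name consensus is likewise hypothetical.

```python
from typing import Optional


def consensus(
    score_pos_biased: float, score_neg_biased: float, delta: float = 0.0
) -> Optional[str]:
    """Return "+" or "-" when the two biased classifiers reach consensus,
    or None ("no consensus", i.e. pass the input to the next stage).

    delta = 0 reproduces plain agreement on the predicted class.
    delta > 0 tightens the criterion; delta < 0 relaxes it (assumed model).
    """
    votes_plus = score_pos_biased > delta and score_neg_biased > delta
    votes_minus = score_pos_biased < -delta and score_neg_biased < -delta
    if votes_plus and not votes_minus:
        return "+"
    if votes_minus and not votes_plus:
        return "-"
    if votes_plus and votes_minus:
        # Only reachable when delta < 0: both scores lie inside the relaxed
        # band, so fall back on the sign of the combined score (assumption).
        return "+" if score_pos_biased + score_neg_biased >= 0 else "-"
    return None


agree = consensus(0.7, 0.3)                      # same class predicted
disagree = consensus(0.6, -0.1)                  # different classes predicted
relaxed = consensus(0.6, -0.1, delta=-0.2)       # labeled despite disagreement
tightened = consensus(0.7, 0.3, delta=0.5)       # agreement no longer suffices
```

Under these assumptions a negative delta lets a stage label an input even when the biased classifiers disagree, and a positive delta sends more inputs to the next stage, which is the sense in which the threshold controls the fraction of inputs classified by a stage.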

Claims 2 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Shoaib et al. (US 2016/0217390 A1) in view of Hench (US 2010/0017351 A1) and further in view of Ko et al. (US 2017/0278386 A1).

Regarding claim 2. 
Shoaib and Hench teach the system of claim 1.
Shoaib further teaches further comprising a training component that generates the set of neural networks trained for respective input data sub-sampled from the input of varying [input pixel counts] (see ¶ 73,  "FIG. 13 is a block diagram of a training process for an SE classifier of a machine learning model, according to various example embodiments. A set of training instances 1302 may be provided, via a combiner 1304, to a machine learning operation that generates classifiers 1306 that are biased based, at least in part, on the training instances. Train-biased classifiers 1306 may be subsequently used to compute consensus probability 1308. ", also see ¶¶ 21, 43 and 45).
Shoaib in view of Hench does not explicitly teach: input pixel counts.
However, Ko teaches: input pixel counts (see ¶ 86, “the number of neural layers which configure the neural networks may vary depending on the size of the image input.”).
Shoaib, Hench and Ko pertain to the problem of machine learning parameter estimation, thus being analogous. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine Shoaib, Hench and Ko to teach the above limitations. The motivation for doing so would be to increase the accuracy and efficiency of the classification of the object: “the traffic information collecting apparatus generates the object classification information using the pre-processed image to increase accuracy and efficiency of the classification of the object.” (See Ko ¶ 69).

Claim 14 recites a computer-implemented method to perform the system recited in claim 2. Therefore the rejection of claim 2 above applies equally here.


Regarding claim 15.
Shoaib, Hench and Ko teach: the method of claim 14.
Shoaib further teaches wherein the first neural network is a least complex neural network of the set of neural networks (see ¶ 25, "If the confidence level is beyond a particular threshold value, the output class label produced by the current model is considered to be a final outcome. In this case, the test instance is not processed by any subsequent models in the sequence. Thus, relatively non-complex test instances are processed by only one or the initial few (least complex) model(s) in the sequence"; also see ¶ 48, "The confidence value determines whether or not the input is passed on to a subsequent next stage", i.e. examiner maps “machine learning model” in Shoaib to neural network in the claim because applicant merely recites the term).

Claims 4-5, 8-12, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Shoaib et al. (US 2016/0217390 A1) in view of Hench (US 2010/0017351 A1) and further in view of Sarah (US 2016/0239736 A1).

Regarding claim 4.
Shoaib and Hench teach the system of claim 1.
Shoaib in view of Hench does not explicitly teach: further comprising an architecture component that forms a chain of increasingly complex classifiers by subsampling feature sizes of a most complex neural network of the set of neural networks based on at least one parameter comprising one or more of: successively decreasing sub-sampling rates, wherein decreasing the sub-sampling rates is increasing the complexity by successively reducing interval between selected feature components or bit precision of the input.
However, Sarah teaches: further comprising an architecture component that forms a chain of increasingly complex classifiers by subsampling feature sizes of a most complex neural network of the set of neural networks based on at least one parameter comprising one or more of: successively decreasing sub-sampling rates, wherein decreasing the sub-sampling rates is increasing the complexity by successively reducing interval between selected feature components or bit precision of the input (see ¶ 64, “Decreasing the pooling window size may cause fewer values to be included in the operation. As less pooling is performed, less data may be sub-sampled and more information may be preserved and passed to the next layer in the DCN. The amount of information lost in the pooling layer may decrease, but the number of computations that are performed may increase.”, also see ¶ 79, “For instance, the architecture may be changed by increasing the number of convolution layers of the classifier, decreasing the stride of one or more convolution filters, by adding pooling layers or fully-connected layers, or by adjusting the size of the convolution layers or pooling layers.”).
Shoaib, Hench and Sarah pertain to the problem of machine learning parameter estimation, thus being analogous. It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine Shoaib, Hench and Sarah to teach the above limitations. The motivation for doing so would be to obtain a more accurate classification for the previous input: “Thus, a more accurate classification may be obtained for the previous input” (See Sarah ¶ 75).
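
The relationship quoted from Sarah ¶ 64, in which a smaller pooling window sub-samples less and preserves more information for the next layer, can be seen in a minimal sketch. The sketch assumes one-dimensional, non-overlapping max pooling; the function name max_pool_1d and the sample values are hypothetical and are not taken from Sarah.

```python
from typing import List, Sequence


def max_pool_1d(values: Sequence[float], window: int) -> List[float]:
    """Non-overlapping max pooling: keep one value per window of inputs."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        max(values[start:start + window])
        for start in range(0, len(values) - window + 1, window)
    ]


features = [3, 1, 4, 1, 5, 9, 2, 6]

# Successively decreasing the window reduces the sub-sampling rate, so more
# values survive and the next layer has more inputs to process.
coarse = max_pool_1d(features, window=4)   # 2 values kept
medium = max_pool_1d(features, window=2)   # 4 values kept
fine = max_pool_1d(features, window=1)     # all 8 values kept
```

Each step down in window size doubles the number of values handed to the next layer in this example, which corresponds to the trade-off noted in the quotation: less information lost in pooling, but more computation downstream.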

Regarding claim 5.
Shoaib in view of Hench and Sarah teaches: the system of claim 4.
Shoaib teaches: further comprising a profile component that determines a consensus profile that comprises a distribution of consensus points for the first input data, the previous second input data, and the second input data (see ¶ 61, "SE classifier stage 900 may include a global consensus (GC) module 908, which aggregates outputs from all LC modules of the binary-class classifiers 902-906."; also see ¶ 73, "The consensus probability determines the confidence of the biased classifier stage while operating on the various training instances").

Regarding claim 8.
Shoaib in view of Hench and Sarah teaches: the system of claim 4.
Shoaib further teaches: wherein a consensus profile is determined for the first input data, the previous second input data, and the second input data (see ¶ 57, "For all input test instances that are either below C+ or above C−, both biased classifiers provide identical class labels and thus a consensus, which may be determined by CA module").

Regarding claim 9.
Shoaib in view of Hench and Sarah teaches: the system of claim 4.
Shoaib further teaches: wherein a consensus profile across the first input data, the previous second input data, and the second input data is determined in an error-free state (see ¶ 61, "SE classifier stage 900 may include a global consensus (GC) module 908, which aggregates outputs from all LC modules of the binary-class classifiers 902-906."; also see ¶ 55,  "SE classifier stage 800 may, for example, be used for a binary classification algorithm with two possible class outcomes + and −. + biased classifier 802 and − biased classifier 804 may be trained to detect one particular class with high accuracy. For example, + biased classifier 802 is biased towards class + (denoted by C+). Thus, + biased classifier 802 may relatively frequently mispredict class labels for test instances from class −, but seldom mispredict class labels for test instances from class +"; also see ¶ 61, "GC module 908 may have a particular functionality such that if there is positive consensus (e.g., ++) in exactly one LC module, then GC module 908 outputs a class label corresponding to the consenting binary-classification unit (e.g., one of binary-class classifiers 902-906)"; also see ¶ 56, "First, if the biased classifiers 802 and 804 predict the same class (e.g., ++ or −−), then consensus module 806 determines a consensus and the corresponding label (e.g., + or −) is produced as output"; Examiner note: applicant discloses that the errors can be of different types. So, error-free can be considered where two classifiers predict the same class (e.g., ++ or −−).).

Regarding claim 10.
Shoaib in view of Hench and Sarah teaches: the system of claim 9.
Shoaib further teaches: wherein a consensus profile across the first input data, the previous second input data, and the second input data is determined in presence of errors (see ¶ 61, "SE classifier stage 900 may include a global consensus (GC) module 908, which aggregates outputs from all LC modules of the binary-class classifiers 902-906…On the other hand, if more than one LC module provides consensus, then the next SE classifier stage is invoked"; also see ¶ 56, "Second, if the biased classifiers 802 and 804 predict different classes (e.g., +− or −+), then consensus module 806 determines no consensus (NC). In this case input Ii to classifier stage 800 is considered to be too difficult to be classified by classifier stage 800 and the next-stage input Ii+1 is produced and provided to the next-stage classifier"; Examiner note: applicant discloses that the errors can be of different types. So, an error can be considered to exist where two classifiers predict different classes (e.g., +− or −+).).

Regarding claim 11.
Shoaib in view of Hench and Sarah teaches: the system of claim 10.
Shoaib further teaches: wherein a determination is made that a delta (Fig. 9, output from LC in element 902) of the profile consensus without error (Fig. 9, e.g., ++ in LC.0 in element 1000) is within a pre-determined threshold (Fig. 9, element LC, note that the threshold is a very small number larger than zero) of the consensus profile in presence of errors (Fig. 9, e.g., C+ in element 902), and wherein an error (Fig. 10, element OUTPUT, e.g., NC) is reported if the delta is greater than the threshold or a no-error (Fig. 10, element OUTPUT, e.g., CLASS 0) is reported if the delta is less than the threshold (see ¶ 67, "component classifier outputs may be combined over a continuum to either relax or tighten the consensus operations by using a consensus threshold, denoted by δ. In feature space 700, δ=0 and classifiers 702 and 704 not modified. In feature space 1100, δ<0 and classifiers 702 and 704 (shown as dashed lines) are modified by δ to be classifiers 1102 and 1104. In feature space 1200, δ>0 and classifiers 702 and 704 (shown as dashed lines) are modified by δ to be classifiers 1202 and 1204"; Examiner note: Shoaib teaches error as (+− or −+) and error-free as (++ or −−) in claims 9 and 10; Examiner considers that classifier 1104, as shown in Fig. 11, produces a + biased label (no error) in the presence of errors (+− or −+), and that δ<0 corresponds to the delta being less than the pre-determined threshold).

Regarding claim 12.
Shoaib in view of Hench and Sarah teaches: the system of claim 11.
Shoaib further teaches: wherein a minimum permissible operating voltage is determined as a function of the delta exceeding the threshold (see ¶ 20, "SE machine learning may expend computational effort (e.g., computational time and energy) that is proportional to the difficulty of the data. This approach provides a number of benefits, including faster computations and energy savings, as compared to fixed-effort machine learning"; also see ¶ 25, "If the confidence level is beyond a particular threshold value, the output class label produced by the current model is considered to be a final outcome…This approach provides a resource management technique for achieving scalability in computational effort at runtime"; Examiner note: for examination purposes, examiner has interpreted "voltage" as "energy" with the broadest reasonable interpretation.).

Claim 16 recites a computer-implemented method to perform the system recited in claim 4. Therefore the rejection of claim 4 above applies equally here.


Relevant Art
The prior art made of record and not relied upon, listed below, is considered pertinent to applicant's disclosure:
Doctor et al. (US 8515884 B2): teaches a data input step to input data from a plurality of first data sources into a first data bank, analyzing said input data by means of a first adaptive artificial neural network (ANN), the neural network including a plurality of layers having at least an input layer, one or more hidden layers and an output layer, each layer comprising a plurality of interconnected neurons, the number of hidden neurons utilized being adaptive, the ANN determining the most important input data and defining therefrom a second ANN, deriving from the second ANN a plurality of Type-1 fuzzy sets for each first data source representing the data source.
Williams et al. (US 10034645 B1): teaches to identify and classify complex networks using image data as appropriate to the requirements of specific applications.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to IMAD M KASSIM whose telephone number is (571) 272-2958. The examiner can normally be reached Monday through Friday, 7:30 to 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J. Huntley, can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/IMAD KASSIM/Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129