DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Status
Claims 1-6, 8-18, and 20 remain pending in this application.

Response to Arguments
Applicant’s arguments, filed on 04/08/2021, with respect to 35 U.S.C. 112(f) have been fully considered but they are not persuasive.
Applicant argues that “none of claims 1-5 recite the term “means” or “step.” Further, independent claim 1 recites a processor and a memory, which are clearly structural components. Furthermore, contrary to assertions in the Office Action, the recited components are not merely generic placeholders as evidenced by their detailed descriptions throughout the instant specification. The recited components are followed by recitation of sufficient acts for performing functions of the respective components.”
Examiner respectfully disagrees with Applicant’s argument for the following reasons:
Even though the claim limitations in claims 1-5 do not use the term “means” or “step”, they use the term “component” as a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning).
A processor and a memory are not sufficient to denote structure in computer-implemented means-plus-function limitations. For a computer-implemented 35 U.S.C. 112(f) claim limitation, the specification must disclose an algorithm for performing the claimed specific computer function, or else the claim is indefinite under 35 U.S.C. 112(b). See Net MoneyIN, Inc. v. Verisign, Inc., 545 F.3d 1359, 1367 (Fed. Cir. 2008). See also In re Aoyama, 656 F.3d 1293, 1297, 99 USPQ2d 1936, 1939 (Fed. Cir. 2011) ("[W]hen the disclosed structure is a computer programmed to carry out an algorithm, ‘the disclosed structure is not the general purpose computer, but rather that special purpose computer programmed to perform the disclosed algorithm.’") (quoting WMS Gaming, Inc. v. Int’l Game Tech., 184 F.3d 1339, 1349, 51 USPQ2d 1385, 1391 (Fed. Cir. 1999)). (See MPEP 2181 II.B)
The specification for the claim limitations does not provide a description sufficient to inform one of ordinary skill in the art that the term denotes structure. 

Applicant’s arguments, filed on 04/08/2021, with respect to 35 U.S.C. 112(a) have been fully considered but they are not persuasive.
Applicant argues that “The respective components recited in claims 1-5 are structural components as shown in Figures 1-5 and described by associated paragraphs, and the specification clear states that components can be hardware, software or a combination thereof. See paragraph [0053].”
Examiner respectfully disagrees with Applicant’s argument because Figures 1-5 and par. [0053] do not provide sufficient structure for the “component” claim limitations.
Original claims may lack written description when the claims define the invention in functional language specifying a desired result but the specification does not sufficiently describe how the function is performed or the result is achieved. For software, this can occur when the algorithm or steps/procedure for performing the computer function are not explained at all or are not explained in sufficient detail (simply restating the function recited in the claim is not necessarily sufficient). In other words, the algorithm or steps/procedure taken to perform the function must be described with sufficient detail so that one of ordinary skill in the art would understand how the inventor intended the function to be performed. (See MPEP 2161.01 I)

Applicant’s arguments, filed on 04/08/2021, with respect to 35 U.S.C. 112(b) have been fully considered but they are not persuasive.
Applicant argues that “The term “successively larger” in claim 1 is not indefinite after the herein amendment to this claim. In particular, the claim recites a set of neural networks, and a determination component that determines output for successively larger neural networks of the set, which clearly means that output is successively determined for larger neural networks within the set relative to other neural networks of the set.”
Examiner respectfully disagrees with Applicant’s argument. A claim is not indefinite if the specification provides examples or teachings that can be used to measure a degree even without a precise numerical measurement (e.g., a figure that provides a standard for measuring the meaning of the term of degree). If the specification does not provide some standard for measuring that degree, a determination must be made as to whether one of ordinary skill in the art could nevertheless ascertain the scope of the claim (e.g., a standard that is recognized in the art for measuring the meaning of the term of degree). For example, in Ex parte Oetiker, 23 USPQ2d 1641 (Bd. Pat. App. & Inter. 1992), the phrases "relatively shallow," "of the order of," "the order of about 5mm," and "substantial portion" were held to be indefinite because the specification lacked some standard for measuring the degrees intended. (See MPEP 2173.05(b))
The term “successively larger” is an indefinite term, and the specification does not provide examples or teachings that can be used to measure the degree of “successively larger” neural networks of a set. The specification does not provide, for example, how .

Applicant’s arguments, filed on 04/08/2021, with respect to 35 U.S.C. 103 have been fully considered but they are not persuasive. 
Applicant argues that neither Shoaib nor Gruenstein, alone or in combination, discloses or suggests the claim limitations “a consensus component that determines consensus between a first neural network and a second neural network of the set” and “wherein if consensus between the first and second neural network is achieved, then the second neural network is selected as optimal size.”
Examiner respectfully disagrees with Applicant’s argument. First, although Applicant argues that the claim limitation “a consensus component that determines consensus between a first neural network and a second neural network of the set” is not taught by Shoaib and Gruenstein, Examiner did not assert that the claim limitation is taught by Shoaib and Gruenstein. Instead, Examiner asserts that the claim limitation is taught by Takeo. Second, although Applicant argues that the claim limitation “a consensus component that determines consensus between a first neural network and a second neural network of the set” is not taught by Shoaib and Gruenstein, the claim limitation is taught by Gruenstein because Gruenstein teaches, in [0133] and [0134], that both neural networks [a first neural network and a second neural network] analyze a data set and determine whether the data set comprises a digital representation of a feature from the set of features [determine consensus]. So, it can be considered that consensus is reached where two or more neural networks yield the same data set as output. Examiner considers that size and complexity are used interchangeably based on Applicant’s disclosure of “the second neural network of the consensus set is deemed of optimal size or complexity” in [0039], which indicates that size and complexity mean the same thing. As an analogy, the statement “a value is 2 or 8/4” means the two terms are the same. Thus, the term “optimal size” in the claim language is considered equivalent to “optimal complexity.” Similarly, Examiner also considers that input size and complexity are used interchangeably.
 
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 

Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “determination component”, “consensus component”, “training component”, “selection component”, “architecture component”, and “profile component” in claims 1-5.
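Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.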
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-5 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 1 recites a determination component that determines output for successively larger neural networks of the set. The specification in par. [0004], [0028], [0038], and Fig. 7 merely recites the function and does not identify any specific structure. Therefore, claim 1 is rejected under 112(a) for lack of written description.
Claim 1 recites a consensus component that determines consensus between a first neural network and a second neural network of the set. The specification in par. [0003], [0028], [0038], and Fig. 8 merely recites the function and does not identify any specific structure. Therefore, claim 1 is rejected under 112(a) for lack of written description.
Claim 2 recites a training component that generates a trained set of neural networks for respective inputs of varying sizes. The specification in par. [0028] merely recites the function and does not identify any specific structure. Therefore, claim 2 is rejected under 112(a) for lack of written description.
Claim 3 recites a selection component that determines output for smallest sized input of a set of inputs on a least complex neural network of the set. The specification in par. [0028] merely recites the function and does not identify any specific structure. Therefore, claim 3 is rejected under 112(a) for lack of written description.
Claim 4 recites an architecture component that forms a chain of increasingly complex classifiers by subsampling feature sizes of a large neural network based on at least one parameter comprising one or more of: successively decreasing rates or bit precision. The specification in par. [0028] and [0038] merely recites the function and does not identify any specific structure. Therefore, claim 4 is rejected under 112(a) for lack of written description.
Claim 5 recites a profile component that determines a consensus profile that comprises a distribution of consensus points for the set of inputs. The specification in par. [0028] and Fig. 7 merely recites the function and does not identify any specific structure. Therefore, claim 5 is rejected under 112(a) for lack of written description.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
The term "successively larger" in claim 1 is a relative term which renders the claim indefinite.  The term "successively larger" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Claims 2-12 depend on claim 1, inherit the same issue, and are rejected under the same reasoning as above.
The term "least complex" in claim 3 is a relative term which renders the claim indefinite.  The term "least complex" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
In claim 4, the limitation "decreasing" is not specified (for example, by how much the rates or bit precision decrease), which renders the claim indefinite.
The term "large" in claim 4 is a relative term which renders the claim indefinite.  The term "large" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. 
The term "successively" in claim 4 is a relative term which renders the claim indefinite.  The term "successively" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
The term "successively larger" in claim 13 is a relative term which renders the claim indefinite.  The term "successively larger" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. 
Claims 14-19 depend on claim 13, inherit the same issue, and are rejected under the same reasoning as above.
The term "least complex" in claim 15 is a relative term which renders the claim indefinite.  The term "least complex" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
In claim 16, the limitation "decreasing" is not specified (for example, by how much the rates or bit precision decrease), which renders the claim indefinite.
The term "large" in claim 16 is a relative term which renders the claim indefinite.  The term "large" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
The term "successively" in claim 16 is a relative term which renders the claim indefinite.  The term "successively" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
The term "least complex" in claim 20 is a relative term which renders the claim indefinite.  The term "least complex" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
The term "successively larger" in claim 20 is a relative term which renders the claim indefinite.  The terms "successively larger" and "size" are not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1-3, 13-15, 17-18, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Shoaib (US 20160217390 A1), in view of Gruenstein (US 20150340032 A1), further in view of Takeo (US 5278755 A).

(Currently Amended) Regarding claim(s) 1 and 13
Taking claim 1 as exemplary, Shoaib teaches a system, comprising: a memory that stores computer executable components; a processor that executes computer executable components stored in the memory (Par. [0029], "memory 108 can store instructions executable by the processor(s) 104 including an operating system (OS) 112, a machine learning module 114, and programs or applications 116 that are loadable and executable by processor(s)"), 
a determination component that determines output for successively larger neural networks of the set (Par. [0078], "the machine learning model determines whether the first level of complexity is able to classify the input value"; "if the first level of complexity is not able to classify the input value, the machine learning model may apply a second level of complexity of the machine learning model to the input value. The second level of complexity is more complex than the first level of complexity"; Examiner note: Shoaib teaches “the first level of complexity” and “the second level of complexity” as machine learning models having different levels of complexity. So, the first and second levels of complexity can be considered as successively larger machine learning models.); and
Shoaib does not explicitly teach: 
a consensus component that determines consensus between a first neural network and a second neural network of the set.
Gruenstein teaches: 
a consensus component that determines consensus between a first neural network and a second neural network of the set (Par. [0133], "The process trains a second neural network, comprising a second quantity of nodes greater than the first quantity of nodes, to identify the set of features using a second training set (706), for a second quantity of iterations (708). The second quantity of iterations may be greater than the first quantity of iterations"; Par. [0134], "The process provides the first neural network and the second neural network to a user device that uses both neural networks to analyze a data set and determine whether the data set comprises a digital representation of a feature from the set of features"; Examiner note: applicant discloses that consensus is reached where two or more successively larger networks yield the same inference output. Similarly, Gruenstein teaches that both neural networks analyze a data set and determine whether the data set comprises a digital representation of a feature from the set of features. So, it can be considered that consensus is reached where two or more neural networks yield the same data set comprising a digital representation of a feature from the set of features.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the modeling technique selecting a classifier for energy-efficient machine learning of Shoaib to incorporate the consensus technique yielding the same data set from two or more neural networks of Gruenstein. The motivation/suggestion for doing this would be for the purpose of increasing accuracy of neural networks with less false positive rate (Gruenstein, par. [0005]).
Shoaib in view of Gruenstein does not explicitly teach:
wherein if consensus between the first and second neural network is achieved, then the second neural network is selected as optimal size.
Takeo teaches wherein if consensus between the first and second neural network is achieved, then the second neural network is selected as optimal size (Col. 7, ll. 25-40, "With the method for determining an image point in an object image in accordance with the present invention, a plurality of different neural networks are prepared for a plurality of different image recording menus. Each of the neural networks receives an image signal and generates outputs which represent an image point. A neural network, which is optimum for the predetermined image recording menu, is selected from the plurality of the neural networks. The selected neural network is employed in obtaining outputs, which represent the image point located in the region inside of the object image. Therefore, even if the image recording menu changes, an image point located in the region inside of an object image can be determined, which image point is optimum for the new radiation image recording menu"; Examiner note: applicant discloses that consensus is reached where two or more successively larger networks yield the same inference output. Similarly, Takeo teaches that each of the neural networks generates outputs which represent an image point. So, it can be considered that consensus is reached where each of the neural networks generates outputs which represent the image point. Examiner further notes that applicant discloses that the second neural network of the consensus set is deemed of optimal size or complexity in [0039], and Examiner considers that size and complexity are used interchangeably.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the modeling technique selecting a classifier for energy-efficient machine learning of Shoaib in view of Gruenstein to incorporate Takeo’s technique selecting an optimal size of a neural network among multiple neural networks where they generate the same outputs. The motivation/suggestion for doing this would be increasing the accuracy in determining an image point located in the region inside of an object image by the neural network (Takeo, Col. 6, ll. 39-41).
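For illustration only, the selection logic of claim 1 as read above (output is determined for successively larger networks, and the second network of an agreeing pair is selected) may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the name `select_by_consensus` is hypothetical, each "network" is assumed to be a plain callable returning a label, the networks are assumed to be ordered from smallest to largest, and "consensus" is assumed to mean identical outputs from two consecutive networks.

```python
from typing import Callable, Optional, Sequence

# Hypothetical stand-in: a "network" is any callable mapping an input to a label.
Network = Callable[[object], object]


def select_by_consensus(networks: Sequence[Network], x: object) -> Optional[int]:
    """Return the index of the second network of the first consecutive pair
    whose outputs agree on x, or None if no consecutive pair agrees.

    Assumption: `networks` is ordered from smallest to largest.
    """
    previous = None
    for i, net in enumerate(networks):
        output = net(x)
        # Consensus is only checked once a previous output exists (i > 0).
        if i > 0 and output == previous:
            return i
        previous = output
    return None


# Toy networks that ignore their input; the first agreeing pair is (1, 2).
toy_networks = [lambda x: "cat", lambda x: "dog", lambda x: "dog", lambda x: "dog"]
selected = select_by_consensus(toy_networks, None)
```

In this toy run the second and third networks agree, so the third network (index 2) is the one selected; larger networks are never evaluated.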


(Currently Amended) Regarding claim(s) 2 and 14
Taking claim 2 as exemplary, Shoaib in view of Gruenstein and Takeo teaches the system of claim 1.
Shoaib teaches further comprising a training component that generates a trained set of neural networks for respective inputs of varying sizes (Fig. 13 and par. [0016], "a training process for a scalable-effort classifier of a machine learning model, according to various example embodiments"; Par. [0001], “In the training phase, typical input examples are used to build decision models that characterize the data. In the testing phase, the learned model is applied to new data instances in order to infer different properties such as relevance and similarity”; Par. [0002], “a first classifier stage involves the simplest machine learning models and is able to classify input data that is relatively simple. Subsequent classifier stages have increasingly complex machine learning models and are able to classify more complex input data”; Examiner note: examiner considers “inputs of varying sizes” as input data having different complexities.).
As per the non-exemplary claim(s) 14, this/these claim(s) has/have similar limitations and is/are rejected based on the reasons given above.

(Currently Amended) Regarding claim(s) 3

Shoaib teaches further comprising a selection component that determines output for smallest sized input of a set of inputs on a least complex neural network of the set (Par. [0025], "If the confidence level is beyond a particular threshold value, the output class label produced by the current model is considered to be a final outcome. In this case, the test instance is not processed by any subsequent models in the sequence. Thus, relatively non-complex test instances are processed by only one or the initial few (least complex) model(s) in the sequence"; Par. [0048], "The confidence value determines whether or not the input is passed on to a subsequent next stage"; Examiner note: under the broadest reasonable interpretation, examiner maps “output class label” in Shoaib to output in the claim, “test instances” in Shoaib to inputs in the claim. Examiner further maps “machine learning model” in Shoaib to neural network in the claim because applicant merely recites the term).

(Currently Amended) Regarding claim(s) 15
Shoaib in view of Gruenstein and Takeo teaches the method of claim 14.
Shoaib teaches further comprising a selection component that determines output for smallest sized input of a set of inputs on a least complex neural network of the set (Par. [0025], "If the confidence level is beyond a particular threshold value, the output class label produced by the current model is considered to be a final outcome. In this case, the test instance is not processed by any subsequent models in the sequence. Thus, relatively non-complex test instances are processed by only one or the initial few (least complex) model(s) in the sequence"; Par. [0048], "The confidence value determines whether or not the input is passed on to a subsequent next stage"; Examiner note: under the broadest reasonable interpretation, examiner maps “output class label” in Shoaib to output in the claim, “test instances” in Shoaib to inputs in the claim. Examiner further maps “machine learning model” in Shoaib to neural network in the claim because applicant merely recites the term).
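For illustration only, the confidence-gated sequence quoted from Shoaib (Par. [0025] and [0048]) may be sketched as follows. This is a minimal sketch under stated assumptions, not Shoaib's implementation: the name `cascade_classify` is hypothetical, each stage is assumed to be a callable returning a label and a confidence value, stages are assumed to be ordered from least to most complex, and the last stage's label is assumed to be returned when no stage exceeds the threshold.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical stage: returns (label, confidence), with confidence in [0, 1].
Stage = Callable[[object], Tuple[str, float]]


def cascade_classify(
    stages: Sequence[Stage], x: object, threshold: float
) -> Tuple[str, int]:
    """Run stages in order and stop at the first one whose confidence
    exceeds `threshold`. Returns (label, index of the stage that decided).

    Assumption: if no stage is confident, the last stage's label is used.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    label = ""
    for i, stage in enumerate(stages):
        label, confidence = stage(x)
        if confidence > threshold:
            # Later (more complex) stages are never evaluated for this input.
            return label, i
    return label, len(stages) - 1


# Toy stages with fixed outputs, ordered least to most complex.
toy_stages = [lambda x: ("a", 0.4), lambda x: ("b", 0.9), lambda x: ("c", 0.99)]
easy_result = cascade_classify(toy_stages, None, 0.8)
hard_result = cascade_classify(toy_stages, None, 0.995)
```

With a threshold of 0.8 the second stage decides; with a threshold of 0.995 no stage is confident and the sketch falls back to the last stage.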

Regarding claim(s) 17
Shoaib in view of Gruenstein and Takeo teaches the method of claim 13. 
Shoaib teaches further comprising determination of a consensus profile that comprises a distribution of consensus points for the set of inputs (Par. [0061], "SE classifier stage 900 may include a global consensus (GC) module 908, which aggregates outputs from all LC modules of the binary-class classifiers 902-906."; Par. [0073], "The consensus probability determines the confidence of the biased classifier stage while operating on the various training instances").

(Currently Amended) Regarding claim(s) 18
Shoaib in view of Gruenstein and Takeo teaches the method of claim 13. 
Shoaib further teaches if consensus between the first and second neural network is not achieved then input size relative to size of prior inputs is increased (Par. [0066], “identical labels from two biased classifiers of a stage imply consensus whereas contradicting labels imply non-consensus (NC). However, the biased classifiers may produce labels based, at least in part, on class probabilities associated with the labels. This provides an opportunity to design a slightly different consensus measure (or confidence value) called the "consensus threshold", which may, at least partially, control the number of test instances processed by a stage”; Par. [0067], “relatively small consensus threshold values for a classifier stage generally result in increasing the fraction of the input test instances that will be classified by the stage”; Examiner note: examiner maps “relatively small consensus threshold” in Shoaib to non-consensus and “increasing the fraction of the input test instances” in Shoaib to increasing input size.).
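For illustration only, the label-agreement notion of consensus quoted from Shoaib Par. [0066] (identical labels imply consensus, contradicting labels imply non-consensus) may be sketched as follows. This is a minimal sketch under stated assumptions: the names `has_consensus` and `fraction_classified` are hypothetical, the two biased classifiers are represented only by the pair of labels they emit, and the probability-based "consensus threshold" mentioned in the quotation is not modeled.

```python
from typing import Hashable, Sequence, Tuple


def has_consensus(label_a: Hashable, label_b: Hashable) -> bool:
    """Identical labels from the two biased classifiers imply consensus."""
    return label_a == label_b


def fraction_classified(label_pairs: Sequence[Tuple[Hashable, Hashable]]) -> float:
    """Fraction of test instances on which the stage reaches consensus,
    i.e. the fraction that the stage classifies itself rather than
    passing on as non-consensus (NC)."""
    if not label_pairs:
        return 0.0
    agreed = sum(1 for a, b in label_pairs if has_consensus(a, b))
    return agreed / len(label_pairs)


# Four toy test instances; the two classifiers disagree on the second one.
toy_pairs = [("pos", "pos"), ("pos", "neg"), ("neg", "neg"), ("neg", "neg")]
stage_fraction = fraction_classified(toy_pairs)
```

Here three of the four instances reach consensus, so the stage classifies a fraction of 0.75 of its inputs and one instance is left as non-consensus.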

(Currently Amended) Regarding claim(s) 20
Shoaib teaches a computer program product comprising a computer readable storage medium having program instructions embodied therewith (Par. [0033], “Computer readable media may include computer storage media and/or communication media. Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data”), the program instructions executable by processor to cause the processor (Par. [0029], "memory 108 can store instructions executable by the processor(s) 104 including an operating system (OS) 112, a machine learning module 114, and programs or applications 116 that are loadable and executable by processor(s)") to:
(Fig. 13 and par. [0016], "a training process for a scalable-effort classifier of a machine learning model, according to various example embodiments"; Par. [0001], “In the training phase, typical input examples are used to build decision models that characterize the data. In the testing phase, the learned model is applied to new data instances in order to infer different properties such as relevance and similarity”; Par. [0002], “a first classifier stage involves the simplest machine learning models and is able to classify input data that is relatively simple. Subsequent classifier stages have increasingly complex machine learning models and are able to classify more complex input data”);
determine output for smallest sized of the inputs on a least complex neural network of the set; determine output for successively larger sized neural networks of the set (Par. [0025], "If the confidence level is beyond a particular threshold value, the output class label produced by the current model is considered to be a final outcome. In this case, the test instance is not processed by any subsequent models in the sequence. Thus, relatively non-complex test instances are processed by only one or the initial few (least complex) model(s) in the sequence"; Par. [0048], "The confidence value determines whether or not the input is passed on to a subsequent next stage"; Examiner note: under the broadest reasonable interpretation, examiner maps “output class label” in Shoaib to output in the claim, “test instances” in Shoaib to inputs in the claim. Examiner further maps “machine learning model” in Shoaib to neural network in the claim because applicant merely recites the term); and 

Shoaib does not explicitly teach:
determine consensus between a first neural network and a second neural network of the set.
Gruenstein teaches: 
determine consensus between a first neural network and a second neural network of the set (Par. [0133], "The process trains a second neural network, comprising a second quantity of nodes greater than the first quantity of nodes, to identify the set of features using a second training set (706), for a second quantity of iterations (708). The second quantity of iterations may be greater than the first quantity of iterations"; Par. [0134], "The process provides the first neural network and the second neural network to a user device that uses both neural networks to analyze a data set and determine whether the data set comprises a digital representation of a feature from the set of features"; Examiner note: applicant discloses that consensus is reached where two or more successively larger networks yield the same inference output. Similarly, Gruenstein teaches that both neural networks are used to analyze a data set and to determine whether the data set comprises a digital representation of a feature from the set of features. So, it can be considered that consensus is reached where two or more neural networks yield the same determination that the data set comprises a digital representation of a feature from the set of features.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the modeling technique selecting a classifier for energy-efficient machine learning of Shoaib to incorporate the consensus (Gruenstein, par. [0005]).
Shoaib in view of Gruenstein does not explicitly teach:
wherein if consensus between the first and second neural network is achieved, then the second neural network is selected as optimal size.
Takeo teaches wherein if consensus between the first and second neural network is achieved the second neural network is selected as optimal size (Col. 7, ll. 25-40, "With the method for determining an image point in an object image in accordance with the present invention, a plurality of different neural networks are prepared for a plurality of different image recording menus. Each of the neural networks receives an image signal and generates outputs which represent an image point. A neural network, which is optimum for the predetermined image recording menu, is selected from the plurality of the neural networks. The selected neural network is employed in obtaining outputs, which represent the image point located in the region inside of the object image. Therefore, even if the image recording menu changes, an image point located in the region inside of an object image can be determined, which image point is optimum for the new radiation image recording menu"; Examiner note: applicant discloses that consensus is reached where two or more successively larger networks yield the same inference output. Similarly, Takeo teaches that each of the neural networks generates outputs which represent an image point. So, it can be considered that consensus is reached where each of the neural networks generates outputs which represent the image point.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the modeling technique selecting a classifier for energy-efficient machine learning of Shoaib in view of Gruenstein to incorporate Takeo’s technique selecting an optimal size of a neural network among multiple neural networks where they generate the same outputs. The motivation/suggestion for doing this would be increasing the accuracy in determining an image point located in the region inside of an object image by the neural network (Takeo, Col. 6, ll. 39-41).
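
For illustration only, and not as part of any cited reference or of applicant's disclosure, the selection step discussed above (models ordered from least to most complex, with the second model of the first agreeing pair selected) might be sketched as follows. The function name `select_on_consensus` and the stand-in models are hypothetical.

```python
from typing import Callable, Optional, Sequence

# A "model" here is any callable mapping an input to a class label (assumption).
Model = Callable[[Sequence[float]], int]


def select_on_consensus(models: Sequence[Model], x: Sequence[float]) -> Optional[int]:
    """Walk models in order of increasing complexity.

    Returns the index of the second model of the first consecutive pair
    whose outputs agree, or None if no consecutive pair agrees.
    """
    if not models:
        return None
    previous = models[0](x)
    for index in range(1, len(models)):
        current = models[index](x)
        if current == previous:  # consensus between model index-1 and model index
            return index
        previous = current
    return None


# Hypothetical stand-ins for three models of increasing size.
small = lambda x: 0
medium = lambda x: 1
large = lambda x: 1

selected = select_on_consensus([small, medium, large], [0.0])
```

In this sketch the small and medium models disagree, the medium and large models agree, and the large model (index 2) is therefore the one selected.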

Claim(s) 4-6, 8-12, and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Shoaib (US 20160217390 A1), in view of Gruenstein (US 20150340032 A1) and Takeo, and further in view of Shao (CN 108229534 A).

Regarding claim(s) 4
Shoaib in view of Gruenstein and Takeo teaches the method of claim 1.
Shoaib in view of Gruenstein and Takeo does not explicitly teach:
further comprising an architecture component that forms a chain of increasingly complex classifiers by subsampling feature sizes of a large neural network based on at least one parameter comprising one or more of: successively decreasing rates, wherein decreasing rates is increasing complexity by successively reducing interval between selected feature components or bit precision.
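Shao teaches further comprising an architecture component that forms a chain of increasingly complex classifiers by subsampling feature sizes of a large neural network based on at least one parameter comprising one or more of: successively decreasing rates, wherein decreasing rates is increasing complexity by successively reducing interval between selected feature components or bit precision (Abstract, ll. 4-7, “the size of the second neural network model is smaller than the size of each of the first neural network model, a first difference obtaining the characteristic layer of the at least two first neural network model extraction and the second feature layer of the neural network model between extracted features”; Description, par. 17, "the first neural network model can also be referred to large neural network model, it is possible to select the neural network model with high precision and high robustness as the neural network model of the embodiment of the invention, the second neural network model can also be referred to a small neural network model").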
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the technique of multiple classifier stages having increasing complexity of Shoaib in view of Gruenstein and Takeo to incorporate Shao’s technique of selecting a neural network model from among neural network models of different sizes. Shao’s technique selects the small neural network from at least two different neural networks that have been trained and produce the same output. The motivation/suggestion for doing this would be to improve the performance of the small neural network model while saving storage and computation resources (Shao, Description, par. 104).

Regarding claim(s) 5
Shoaib in view of Gruenstein and Takeo, further in view of Shao teaches the method of claim 4.
Shoaib teaches further comprising a profile component that determines a consensus profile that comprises a distribution of consensus points for the set of inputs (Par. [0061], "SE classifier stage 900 may include a global consensus (GC) module 908, which aggregates outputs from all LC modules of the binary-class classifiers 902-906."; Par. [0073], "The consensus probability determines the confidence of the biased classifier stage while operating on the various training instances").

(Currently Amended) Regarding claim(s) 6
Shoaib in view of Gruenstein and Takeo, further in view of Shao teaches the method of claim 4.
Shoaib teaches wherein if consensus between the first and second neural network is not achieved then input size is increased relative to a prior input size (Par. [0066], “identical labels from two biased classifiers of a stage imply consensus whereas contradicting labels imply non-consensus (NC). However, the biased classifiers may produce labels based, at least in part, on class probabilities associated with the labels. This provides an opportunity to design a slightly different consensus measure (or confidence value) called the "consensus threshold", which may, at least partially, control the number of test instances processed by a stage”; Par. [0067], “relatively small consensus threshold values for a classifier stage generally result in increasing the fraction of the input test instances that will be classified by the stage”; Examiner note: examiner maps “relatively small consensus threshold” in Shoaib to non-consensus and “increasing the fraction of the input test instances” in Shoaib to increasing input size.).

Regarding claim(s) 8
Shoaib in view of Gruenstein and Takeo, further in view of Shao teaches the method of claim 4.
Shoaib teaches wherein consensus is determined for all inputs (Par. [0057], "For all input test instances that are either below C+ or above C−, both biased classifiers provide identical class labels and thus a consensus, which may be determined by CA module").

Regarding claim(s) 9
Shoaib in view of Gruenstein and Takeo, further in view of Shao teaches the method of claim 4.
Shoaib teaches wherein a consensus profile across all inputs is determined in an error-free state (Par. [0061], "SE classifier stage 900 may include a global consensus (GC) module 908, which aggregates outputs from all LC modules of the binary-class classifiers 902-906."; Par. [0055], "SE classifier stage 800 may, for example, be used for a binary classification algorithm with two possible class outcomes + and −. + biased classifier 802 and − biased classifier 804 may be trained to detect one particular class with high accuracy. For example, + biased classifier 802 is biased towards class + (denoted by C+). Thus, + biased classifier 802 may relatively frequently mispredict class labels for test instances from class −, but seldom mispredict class labels for test instances from class +"; Par. [0061], "GC module 908 may have a particular functionality such that if there is positive consensus (e.g., ++) in exactly one LC module, then GC module 908 outputs a class label corresponding to the consenting binary-classification unit (e.g., one of binary-class classifiers 902-906)"; Par. [0056], "First, if the biased classifiers 802 and 804 predict the same class (e.g., ++ or −−), then consensus module 806 determines a consensus and the corresponding label (e.g., + or −) is produced as output"; Examiner note: applicant discloses that the errors can be of different types. So, error-free can be considered where two classifiers predict the same class (e.g., ++ or −−).).

Regarding claim(s) 10
Shoaib in view of Gruenstein and Takeo, further in view of Shao teaches the method of claim 9.
Shoaib teaches wherein a consensus profile across all inputs is determined in presence of errors (Par. [0061], "SE classifier stage 900 may include a global consensus (GC) module 908, which aggregates outputs from all LC modules of the binary-class classifiers 902-906."; Par. [0061], "On the other hand, if more than one LC module provides consensus, then the next SE classifier stage is invoked"; Par. [0056], "Second, if the biased classifiers 802 and 804 predict different classes (e.g., +− or −+), then consensus module 806 determines no consensus (NC). In this case input Ii to classifier stage 800 is considered to be too difficult to be classified by classifier stage 800 and the next-stage input Ii+1 is produced and provided to the next-stage classifier"; Examiner note: applicant discloses that the errors can be of different types. So, an error can be considered to occur where two classifiers predict different classes (e.g., +− or −+).).
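
As a minimal sketch of the consensus behavior quoted from Shoaib, Par. [0056] (identical labels from the two biased classifiers yield that label; differing labels yield no consensus and the input moves to the next stage), the following may help. It is an illustration under stated assumptions, not Shoaib's implementation: the names `stage_consensus` and `run_cascade` and the toy threshold classifiers are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

Classifier = Callable[[float], int]  # returns +1 or -1 (assumed label encoding)
NC = None  # marker for "no consensus"


def stage_consensus(plus_biased: Classifier, minus_biased: Classifier, x: float) -> Optional[int]:
    """Return the shared label on ++ or --, and NC on +- or -+."""
    a, b = plus_biased(x), minus_biased(x)
    return a if a == b else NC


def run_cascade(stages: Sequence[Tuple[Classifier, Classifier]], x: float) -> Optional[int]:
    """Try each stage in order; stop at the first stage that reaches consensus."""
    for plus_biased, minus_biased in stages:
        label = stage_consensus(plus_biased, minus_biased, x)
        if label is not NC:
            return label
    return NC  # every stage disagreed


# Toy stages (hypothetical): stage 1 is uncertain on 0 < x <= 2, stage 2 splits at x = 1.
stage1 = (lambda x: 1 if x > 0 else -1, lambda x: 1 if x > 2 else -1)
stage2 = (lambda x: 1 if x > 1 else -1, lambda x: 1 if x > 1 else -1)

easy = run_cascade([stage1, stage2], 3.0)   # decided by stage 1
hard = run_cascade([stage1, stage2], 0.5)   # stage 1 gives NC, decided by stage 2
```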

Regarding claim(s) 11
Shoaib in view of Gruenstein and Takeo, further in view of Shao teaches the method of claim 10.
Shoaib teaches wherein a determination is made that a delta (Fig. 9, output from LC in element 902) of the profile consensus without error (Fig. 9, e.g., ++ in LC.0 in element 1000) is within a pre-determined threshold (Fig. 9, element LC, note that the threshold is a very small number larger than zero) of the consensus profile in presence of errors (Fig. 9, e.g., C+ in element 902), and wherein an error (Fig. 10, element OUTPUT, e.g., NC) is reported if the delta is greater than the threshold or a no-error (Fig. 10, element OUTPUT, e.g., CLASS 0) is reported if the delta is less than the threshold (Par. [0067], "component classifier outputs may be combined over a continuum to either relax or tighten the consensus operations by using a consensus threshold, denoted by δ. In feature space 700, δ=0 and classifiers 702 and 704 not modified. In feature space 1100, δ<0 and classifiers 702 and 704 (shown as dashed lines) are modified by δ to be classifiers 1102 and 1104. In feature space 1200, δ>0 and classifiers 702 and 704 (shown as dashed lines) are modified by δ to be classifiers 1202 and 1204"; Examiner note: Shoaib teaches error as (+− or −+) and error-free as (++ or −−), as discussed for claims 9 and 10; Examiner considers that classifier 1104, as shown in Fig. 11, produces a + biased label (no error) in the presence of errors (+− or −+), and treats δ<0 as the delta being less than the pre-determined threshold).
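
The effect of a consensus threshold δ that relaxes or tightens consensus, as quoted from Shoaib, Par. [0067], can be illustrated with a small sketch. The scoring rule below is an assumption made purely for illustration and is not taken from Shoaib: each biased classifier is assumed to emit a signed score, and a label is issued only when both scores clear δ on the same side. The names `threshold_consensus` and `fraction_classified` and the sample scores are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def threshold_consensus(s_plus: float, s_minus: float, delta: float) -> Optional[int]:
    """Assumed rule: +1 if both scores are >= delta, -1 if both are <= -delta, else None (NC)."""
    if min(s_plus, s_minus) >= delta:
        return 1
    if max(s_plus, s_minus) <= -delta:
        return -1
    return None


def fraction_classified(scores: Sequence[Tuple[float, float]], delta: float) -> float:
    """Fraction of instances for which the stage reaches consensus at this delta."""
    decided = sum(1 for sp, sm in scores if threshold_consensus(sp, sm, delta) is not None)
    return decided / len(scores)


# Hypothetical (score_plus_biased, score_minus_biased) pairs for four test instances.
scores = [(0.9, 0.8), (0.3, -0.2), (-0.7, -0.9), (0.1, 0.05)]

tight = fraction_classified(scores, 0.5)     # delta > 0 tightens consensus
neutral = fraction_classified(scores, 0.0)   # delta = 0 leaves it unmodified
relaxed = fraction_classified(scores, -0.5)  # delta < 0 relaxes consensus
```

Under this assumed rule, a smaller δ lets the stage classify a larger fraction of the inputs, which is consistent with the passage of Par. [0067] quoted elsewhere in this action.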

Regarding claim(s) 12
Shoaib in view of Gruenstein and Takeo, further in view of Shao teaches the method of claim 11.
Shoaib teaches wherein a minimum permissible operating voltage is determined as a function of the delta exceeding the threshold (Par. [0020], "SE machine learning may expend computational effort (e.g., computational time and energy) that is proportional to the difficulty of the data. This approach provides a number of benefits, including faster computations and energy savings, as compared to fixed-effort machine learning"; Par. [0025], "If the confidence level is beyond a particular threshold value, the output class label produced by the current model is considered to be a final outcome"; Par. [0025], "This approach provides a resource management technique for achieving scalability in computational effort at runtime"; Examiner note: for examination purposes, examiner has interpreted "voltage" as "energy" with the broadest reasonable interpretation.).

Regarding claim(s) 16

Shoaib in view of Gruenstein and Takeo does not explicitly teach:
further comprising forming a chain of increasingly complex classifiers by subsampling feature sizes of a large neural network based on at least one parameter comprising one or more of: successively decreasing rates, wherein decreasing rates is increasing complexity by successively reducing interval between selected feature components or bit precision.
Shao teaches further comprising forming a chain of increasingly complex classifiers by subsampling feature sizes of a large neural network based on at least one parameter comprising one or more of: successively decreasing rates, wherein decreasing rates is increasing complexity by successively reducing interval between selected feature components or bit precision (Abstract, ll. 4-7,“the size of the second neural network model is smaller than the size of each of the first neural network model, a first difference obtaining the characteristic layer of the at least two first neural network model extraction and the second feature layer of the neural network model between extracted features”; Description, par. 17, "the first neural network model can also be referred to large neural network model, it is possible to select the neural network model with high precision and high robustness as the neural network model of the embodiment of the invention, the second neural network model can also be referred to a small neural network model").
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the modeling technique selecting a classifier for energy-efficient machine learning of Shoaib in view of Gruenstein and  (Shao, Description, par. 104).

Prior Art
The prior art made of record and not relied upon, listed below, is considered pertinent to applicant's disclosure:
Scarborough et al. (US-7472097-B1): teaches using a consensus group in employee selection where multiple neural networks predict the same outcome.
Sun et al. (US-20180157976-A1): teaches a determination method for a convolutional neural network model using three determination units (to determine the complexity of a database including multiple samples, to determine a classification capability, and to acquire the classification capability of each of multiple candidate CNN models).

Conclusion
Applicant’s amendment necessitated the new ground(s) of rejection presented in this Office Action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAEYONG J PARK whose telephone number is (571) 272-3898. The examiner can normally be reached on M-F 9:00 a.m. - 6:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at (571) 272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you 

/JAEYONG J PARK/Examiner, Art Unit 4164    
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126