DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 6/23/2022 has been entered.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the claims and remarks filed on 4/25/2022 and entered via the request for continued examination (RCE) filed on 6/23/2022. Claims 3-4, 6-7, 9, 12-13, 15-16 and 18-19 are pending and have been examined.

Response to Amendment
In the amendment, claims 3-4, 6-7, 12-13 and 15-16 were amended, claims 5 and 14 were cancelled, and no claims were added. Claims 2 and 11 were previously cancelled in the amendment filed 12/17/2021 and claims 1, 8, 10 and 17 were previously cancelled in the amendment filed 2/9/2021. Thus, claims 3-4, 6-7, 9, 12-13, 15-16 and 18-19 are pending and have been examined.
Applicant’s amendments to the claims have overcome the rejections of claims 3-7, 9, 12-16 and 18-19 under 35 U.S.C. 112(b) previously set forth in the final office action mailed on 2/25/2022 (hereinafter “the previous office action”). The cancellation of claims 5 and 14 has rendered the rejections of these claims moot.

Response to Arguments
Applicant's arguments filed 4/25/2022 with respect to the rejections of claims 3-7, 9, 12-16 and 18-19 under 35 U.S.C. 112(b) have been fully considered and are persuasive. The cancellation of claims 5 and 14 has rendered the rejections of these claims under 35 U.S.C. 112(b) moot.
Applicant's arguments filed 4/25/2022 with respect to the rejections of claims 3-7, 9, 12-16 and 18-19 under 35 U.S.C. 103 have been fully considered and are persuasive in part. The cancellation of claims 5 and 14 has rendered the rejections of these claims under 35 U.S.C. 103 moot.
However, as indicated in applicant’s remarks (see, e.g., applicant’s remarks, pages 5-6 noting cancellation of claims 5 and 14 and listing similar subject matter recited in amended independent claim 7) and as discussed below, subject matter of cancelled dependent claims 5 and 14 has been incorporated into independent claims 7 and 16, respectively, which remain rejected under 35 U.S.C. 103 using the combination of references (i.e., the combination of Mishra, Britto, Mansilla, Brun and Valencia) previously applied to claims 5 and 14.
With reference to amended independent claim 7, applicant asserts “Britto fails to disclose the feature ‘calculate, based on the another similarity and yet another similarity between each sample and a class to which the sample belongs, a task complexity score for the classification task to be used for selection of a classifier for the classification task' as currently recited independent claim 7. Further, by way of example, independent claim 7 currently recites: ‘wherein the one or more processing circuits is further configured to calculate a sample complexity score for each sample, and acquire the task complexity score for the classification task in a form of digits by taking a weighted average of sample complexity scores of the samples’ Britto fails to disclose such a task complexity score.” (applicant’s remarks, page 7, emphasis in original).
Next, applicant seemingly references unclaimed embodiments in applicant’s specification, but not features recited in any of the pending claims, by alleging that “the calculated similarity is a quantized specific digit, and thus the task complexity score is also a digit. On this basis, the selection of the classifier according to the task complexity score can be very accurate. In addition, the solution of the present application requires no training in advance, and thus can reduce the calculating load significantly.” (applicant’s remarks, pages 7-8). Arguments and remarks directed to disclosed but unclaimed embodiments are unpersuasive. That is, applicant’s apparent attempt to overcome the prior art rejections and distinguish the cited references by pointing to unclaimed examples from the specification is not persuasive.
Applicant then generally concludes that the “Other reference documents fail to disclose or imply the above mentioned technical features either. Therefore, the currently claim 7 possesses prominent substantive features and represents notable technical progress over the cited references and involves an inventive step.” and “none of the cited references disclose or imply every one of the features of current independent claim 7. Thus, current independent claim 7 involves an inventive step over the cited references.” (applicant’s remarks, page 8).
Accordingly, applicant asserts that the claim limitation “calculate, based on the another similarity and yet another similarity between each sample and a class to which the sample belongs, a task complexity score for the classification task to be used for selection of a classifier for the classification task” recited in amended independent claims 7 and 16 and the new limitation, which was largely recited in now-cancelled claims 5 and 14, “wherein the calculating, based on the similarities, the task complexity score for the classification task comprises: calculating a sample complexity score for each sample, and acquiring the task complexity score for the classification task in a form of digits by taking a weighted average of the sample complexity scores of the samples” added to claims 7 and 16 are not taught in the portions of Mishra, Britto, Mansilla and Brun references cited in the previous Office Action (Valencia was applied to dependent claims 5-6 and 14-15).
The examiner respectfully disagrees with applicant’s assertions, and points applicant to the below discussion of Britto and Valencia.
First, regarding applicant’s arguments vis-à-vis the limitation “calculate, based on the another similarity and yet another similarity between each sample and a class to which the sample belongs, a task complexity score for the classification task to be used for selection of a classifier for the classification task” recited in amended claims 7 and 16, the examiner points applicant to the following discussion of Britto.
Regarding the limitation calculating “based on the similarities, another similarity representing similarities between each sample and classes to which the sample does not belong” recited in amended independent claims 7 and 16, the examiner points to page 3670 and Algorithm 5 of Britto, which explicitly disclose that “Ψ is defined as the k-nearest neighbors of the unknown pattern in the training set. Then, the similarity function is used as a filter to preselect from Ψ, the samples for which the classifiers present similar behavior to that observed for the unknown sample t” [i.e., the unknown sample is a training sample that does not belong to a class], “Compute the vector MCBt as the class labels assigned to t” and “for each sample ψj in Ψ do Compute MCBψj as the class label assigned to ψ … Compute Sim as the similarity between MCBt and MCBψ” [i.e., compute/calculate another similarity Sim as/representing similarities between each sample ψj and classes/class labels in MCBt to which ψj does not belong].
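For illustration only, the similarity computation quoted above can be sketched in Python as follows. The sketch assumes that Sim is the fraction of classifiers in the pool that assign the same class label to both samples. That definition, the function names mcb_similarity and preselect, the threshold value and the example labels are assumptions made for this sketch; they are not taken from Britto or from applicant’s specification.

```python
def mcb_similarity(mcb_t, mcb_psi):
    """Fraction of classifiers assigning the same label to both samples.

    Assumed definition for illustration; the cited formula is not reproduced here.
    """
    if len(mcb_t) != len(mcb_psi) or len(mcb_t) == 0:
        raise ValueError("MCB vectors must be non-empty and of equal length")
    agreements = sum(1 for a, b in zip(mcb_t, mcb_psi) if a == b)
    return agreements / len(mcb_t)


def preselect(mcb_t, neighbor_mcbs, threshold):
    """Keep the neighbors whose similarity to the unknown sample t meets the threshold."""
    return [
        name
        for name, mcb in neighbor_mcbs.items()
        if mcb_similarity(mcb_t, mcb) >= threshold
    ]


# Hypothetical class labels assigned by a pool of M = 4 classifiers.
mcb_t = [0, 1, 1, 2]
neighbors = {"psi_1": [0, 1, 2, 2], "psi_2": [2, 0, 0, 1]}
selected = preselect(mcb_t, neighbors, threshold=0.5)
```

In this hypothetical example, three of the four classifiers agree on psi_1 and none agree on psi_2, so only psi_1 passes the filter.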
With regard to the calculating “based on the another similarity and yet another similarity between each sample and a class to which the sample belongs, a task complexity score for the classification task to be used for selection of a classifier for the classification task” limitation recited in claims 7 and 16, as explained in the section 103 rejections below, “a task complexity score for the classification task”, under the broadest reasonable interpretation (BRI), in light of the specification, is a record or measure of complexity of a classification task or problem.
With continued reference to the above-noted limitation of calculating a task complexity score, the examiner points to pages 3672 and 3676 of Britto, which explicitly disclose “us[ing] an adaptive classifier ensemble selection … [that] selects the ensemble with the optimal complexity for each test pattern from the initial pool of classifiers [i.e., used for selection of a classifier for the classification task], … Compute MOC(Oi) as the model with optimal complexity by using Ψ … Select the ensemble EoC*t and the weights for each classifier using MOC(oi)” [i.e., computing/calculating classifier model complexity based on each sample from Ψ], “we try to reveal how such a contribution could be related to the problem complexity” [i.e., classification problem/task complexity], “a set of complexity measures is used to describe the difficulty of a classification problem and relate it to the observed DS performance” [i.e., related to/based on the similarities] and “We then implemented a set of complexity measures for classification problems [3], composed of two measures of overlap between single feature values (F1 and F2), two measures of separability of classes (N2 and N3) … The measures used are described below based on their generalization to problems with multiple classes” [i.e., calculate complexity measure/score for the classification problem/task based on similarities/separability measures – including another similarity between each sample and a class in the multiple classes to which the sample belongs].
Second, regarding the limitation “calculate a sample complexity score for each sample” added to amended independent claims 7 and 16, aside from repeating the claim language - see e.g., paragraphs 42, 59 and 68, applicant’s specification does not define or provide examples of “a sample complexity score”. The plain meaning of complexity is the state or quality of being complex; intricacy. See https://www.dictionary.com/browse/complexity. Further, the plain meaning of “score” is the record of points or strokes made. See https://www.dictionary.com/browse/score. Therefore, “a sample complexity score”, under the BRI, in light of the specification, is any record or measure of complexity of a sample. 
With continued reference to the above-noted “calculate a sample complexity score for each sample” limitation, the examiner points to pages 3672 and 3676 of Britto, which explicitly disclose “Comput[ing] MOC(Oi) as the model with optimal complexity by using Ψ” [i.e., compute/calculate classifier model complexity using samples from Ψ], “We then implemented a set of complexity measures for classification problems [3], composed of two measures of overlap between single feature values (F1 and F2)” [i.e., implement/calculate sample complexity measure/score for complexity of the feature values/samples] and “a relation exists between the data complexity [i.e., data sample complexity] and the observed contribution of the DS approach”.
Third, regarding the new limitation “acquire the task complexity score for the classification task in a form of digits” added to claims 7 and 16, aside from repeating the previous claim language and mentioning “a complexity score for the classification task” - see e.g., paragraphs 5-7, 17, 28, 37, 55 and 64, applicant’s specification does not mention, let alone define a “task complexity score for the classification task in a form of digits”. Paragraph 19 of applicant’s specification states “the complexity is represented by a complexity score, so that the complexity of the classification task can be accurately measured in the form of digits” and paragraphs 33 and 42 of applicant’s specification disclose examples wherein “the complexity score for the classification task is obtained by taking a weighted average of the complexity scores of the respective samples”. The plain meaning of complexity is the state or quality of being complex; intricacy. See https://www.dictionary.com/browse/complexity. Further, the plain meaning of “score” is the record of points or strokes made. See https://www.dictionary.com/browse/score. Therefore, a “task complexity score for the classification task in a form of digits”, under the broadest reasonable interpretation (BRI), in light of the specification, is any numerical record, measure or average of complexity of a classification task or problem.
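For illustration only, the weighted-average language quoted above from paragraphs 33 and 42 of applicant’s specification can be sketched in Python as follows. The function name task_complexity_score, the default of equal weights and the example values are assumptions made for this sketch; the specification passages quoted in this action do not state how the weights are chosen.

```python
def task_complexity_score(sample_scores, weights=None):
    """Weighted average of per-sample complexity scores, returned as a single number."""
    if not sample_scores:
        raise ValueError("at least one sample complexity score is required")
    if weights is None:
        # Assumption for this sketch: equal weights when none are supplied.
        weights = [1.0] * len(sample_scores)
    if len(weights) != len(sample_scores):
        raise ValueError("one weight is required per sample score")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(s * w for s, w in zip(sample_scores, weights))
    return weighted_sum / total_weight


# Hypothetical per-sample scores and weights.
score = task_complexity_score([0.2, 0.8], weights=[1.0, 3.0])
```

With these hypothetical inputs the result is (0.2 × 1 + 0.8 × 3) / 4, i.e., approximately 0.65, a single numeric value.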
With continued reference to the above-noted calculating and acquiring limitations, the examiner points to FIG. 8 of Britto (reproduced below) showing “Pairwise combination of the complexity measures F1, N2, N3 and T2, considering the datasets” [i.e., numerical complexity measures F1, N2, N3 and T2 scores in the form of digits] and pages 3670, 3672 and 3677-3678 and Algorithm 5, which explicitly disclose that “we try to reveal how such a contribution could be related to the problem complexity”, “a set of complexity measures is used to describe the difficulty of a classification problem” [i.e., describe/acquire classification problem/task complexity], “After implementing the previously described complexity measures, they were applied to the dataset”, “we carried out an analysis in which the complexity measures were combined in a pairwise fashion” and “a relation exists between the data complexity and the observed contribution of the DS … this relation is based on some intrinsic aspects of the classification problem” [i.e., acquire the complexity measure/task complexity score for the classification problem/task – as shown in FIG. 8, reproduced below, the task complexity scores/measures are numeric values, in the form of digits].

[Image not reproduced: media_image1.png]

Britto FIG. 8
Lastly, with regard to the “calculating a sample complexity … by taking a weighted average of the sample complexity scores of the samples” limitation added to claims 7 and 16 from cancelled claims 5 and 14, as discussed above, “sample complexity scores”, under the BRI, in light of the specification, are any records or measures of complexities of samples. 
With continued reference to the above-noted calculating by taking a weighted average limitation, the examiner points to paragraphs 133, 136, 139, and 146 of Valencia, which explicitly disclose “us[ing] the full classifier model to generate a family of lean classifier models of varying levels of complexity (or ‘leanness’)”, “determining a priority and/or a complexity associated with the behavior that is to be analyzed” [i.e., determining/acquiring a behavior analysis task complexity, where there are varying complexity levels/scores], “determine a number (N) of unique test conditions [i.e., samples] … that may be tested in boosted decision stumps) that should be evaluated in the lean classifier model … compute or determine a weighted average of the results of applying the collected behavior information to each boosted decision stump in the lean classifier model” [i.e., taking a weighted average of sample results/scores] and “boost[ing] the weight of the incorrectly classified samples/test conditions” [i.e., boosting sample scores].
Further, as detailed below, the combination of Mishra, Britto, Mansilla, Brun and Valencia (i.e., Mishra in view of Britto, Mansilla and Brun and further in view of Valencia) teaches the limitations of amended independent claims 7 and 16 and dependent claims 3-4, 6, 9, 12-13, 15 and 18-19. 
Applicant’s amendments have necessitated the claim rejections under 35 U.S.C. 103 discussed below. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 3-4, 7, 9, 12-13, 16 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Mishra et al. (U.S. Patent Application Pub. No. 2017/0103267 A1, hereinafter “Mishra”) in view of non-patent literature Britto et al. ("Dynamic selection of classifiers—a comprehensive review." Pattern recognition 47.11 (2014): 3665-3680, hereinafter “Britto”), non-patent literature Mansilla et al. ("Domain of competence of XCS classifier system in complexity measurement space." IEEE Transactions on Evolutionary Computation 9.1 (2005): 82-104, hereinafter “Mansilla”), non-patent literature Brun et al. ("Contribution of data complexity features on dynamic classifier selection." 2016 International Joint Conference on Neural Networks (IJCNN). IEEE, July 2016: 4396-4403, previously furnished with the final Office Action mailed on 4/12/2021 and listed on form PTO-892 accompanying that Office Action, hereinafter “Brun”) and further in view of Valencia et al. (U.S. Patent Application Pub. No. 2016/0253498 A1, hereinafter “Valencia”).
Mishra was filed on December 16, 2016 as a Continuation of Application No. PCT/CA2015/050558, filed on June 18, 2015, which claims priority to U.S. Provisional Application No. 62/014,898, filed on June 20, 2014, and both of these dates are before the effective filing date of this application, i.e., December 1, 2016. Therefore, Mishra constitutes prior art under 35 U.S.C. 102(a)(2). 
With respect to independent claim 7, Mishra discloses the invention as claimed including an apparatus for evaluating complexity of a classification task (see, e.g., paragraphs 32, 41 and 92, “a learning platform 12 to generate and/or improve a set of classifiers 14” [i.e., platform 12 is an apparatus], “samples can be continuously refined through the addition of greater numbers of features and classifiers, in addition to evaluating more linear and non-linear combinations of features that generate more complex classifiers” [i.e., evaluate complexity of classification tasks performed by classifiers involving different numbers of features], “more or fewer stages may be used depending on the … complexity of the video content and the complexity of the computer vision techniques used.”), comprising:
one or more processing circuits, configured to (see, e.g., paragraphs 7 and 91, “learning platform … distributing learning algorithms over a cluster of processors” [i.e., hardware, processors], “a DCS platform since each module can be executed on a distinct computing/processing node such as a distinct CPU” [i.e., one or more CPUs/processors/processing circuits configured to execute algorithms/modules]):
perform classification, with a … classifier, on at least a part of training samples for the classification task (see, e.g., paragraphs 41-42 and 67, “thresholds used to determine positive from negative samples can be continuously refined through the addition of greater numbers of features and classifiers”, “training is used, which uses machine learning to estimate the parameters of an ensemble classifier 14. The classifier 14 can be considered a function approximation, where the parameter of the function uses a machine learning algorithm. For example, given a number of positive and negative samples 42, 44, a function f can be found” [i.e., at least part of training samples 42, 44], “features … are then extracted from the measurements to produce a feature vector used as an input into the classifier. The classifier, using the parameters obtained from the learning system for these features, then produces a detector score” [i.e., classifier 14 performs classification with a classifier on the parameters/part of training samples for the classification task]);
calculate … similarities between the sample and respective classes, respectively, based on a result of the classification (see, e.g., paragraphs 38, 40, 45 and 108, “The unrepresented sample space can be determined by identifying edge cases where the classifier created by the learning platform 12 does not generalize sufficiently to separate classes.” [i.e., a sample in the sample space and respective, separate classes], “The resulting classifier has parameters estimated from a larger sample size … and consequently has increased accuracy at classifying similar objects”, “principal component analysis is a applied on the feature vectors of a large set of similar objects to estimate a invariant feature space” [i.e., similarities between the sample and classes/sets of objects], “For example, in traffic videos, the movement of vehicles and pedestrians may be stored as well as classification of the vehicles … the client 312 can generate reports based on these results.” [i.e., calculate/generate similarities based on a classification result]).
Although Mishra substantially discloses the claimed invention, Mishra is not relied on for explicitly disclosing calculate, with respect to each sample of the at least a part of training samples, similarities between the sample and respective classes, respectively, based on a result of the classification; 
calculate, based on the similarities, another similarity representing similarities between each sample and classes to which the sample does not belong; and
calculate, based on the another similarity and yet another similarity between each sample and a class to which the sample belongs, a task complexity score for the classification task to be used for selection of a classifier for the classification task,
wherein the one or more processing circuits is further configured to calculate a sample complexity score for each sample, and acquire the task complexity score for the classification task in a form of digits by taking a weighted average of sample complexity scores of the samples.
In the same field, analogous art Britto teaches calculate, with respect to each sample of the at least a part of training samples, similarities between the sample and respective classes, respectively, based on a result of the classification (see, e.g., pages 3669-3670, Algorithm 5, “estimate the classifier capability of doing the correct classification. The competence of each classifier is computed considering the support it gives for the correct class of each validation sample” [i.e., based on a result of the classification with respect to each sample of the validation/training samples and correct, respective classes], “region Ψ is defined as the k-nearest neighbors of the unknown pattern in the training set [i.e., at least a part of training samples for the classification task]. Then, the similarity function is used as a filter to preselect from Ψ, the samples for which the classifiers present similar behavior to that observed for the unknown sample t. The remaining samples are used to select the most accurate classifier” [i.e., with respect to each sample], “Compute Sim as the similarity between MCBt and MCBψ”, “vector named MCB (Multiple Classifier Behavior) which can be defined as MCBψ … contains the class labels [i.e., respective classes] assigned to the sample ψ by the M classifiers in the pool. The measure of similarity Sim can be defined as 
[equation image not reproduced: media_image2.png]” [i.e., calculate/measure/compute similarities between the samples and classes from Ψ]); 
calculate, based on the similarities, another similarity representing similarities between each sample and classes to which the sample does not belong (see, e.g., page 3670, Algorithm 5, “Ψ is defined as the k-nearest neighbors of the unknown pattern in the training set. Then, the similarity function is used as a filter to preselect from Ψ, the samples for which the classifiers present similar behavior to that observed for the unknown sample t” [i.e., the unknown sample is a training sample that does not belong to a class], “Compute the vector MCBt as the class labels assigned to t”, “for each sample ψj in Ψ do Compute MCBψj as the class label assigned to ψ … Compute Sim as the similarity between MCBt and MCBψ” [i.e., compute/calculate another similarity Sim as/representing similarities between each sample ψj and classes/class labels in MCBt to which ψj does not belong]); and
calculate, based on the another similarity and yet another similarity between each sample and a class to which the sample belongs, a task complexity score for the classification task to be used for selection of a classifier for the classification task (aside from repeating the claim language - see e.g., paragraphs 5-7, 17, 28, 37, 55 and 64, applicant’s specification does not define “a task complexity score for the classification task”. Paragraph 19 of applicant’s specification states “the complexity is represented by a complexity score, so that the complexity of the classification task can be accurately measured in the form of digits” and paragraphs 33 and 42 of applicant’s specification disclose examples wherein “the complexity score for the classification task is obtained by taking a weighted average of the complexity scores of the respective samples”. The plain meaning of complexity is the state or quality of being complex; intricacy. See https://www.dictionary.com/browse/complexity. Further, the plain meaning of “score” is the record of points or strokes made. See https://www.dictionary.com/browse/score. 
Therefore, “a task complexity score for the classification task”, under the broadest reasonable interpretation (BRI), is a record or measure of complexity of a classification task or problem) (see, e.g., pages 3672 and 3676, “use an adaptive classifier ensemble selection … selects the ensemble with the optimal complexity for each test pattern from the initial pool of classifiers [i.e., used for selection of a classifier for the classification task], … Compute MOC(Oi) as the model with optimal complexity by using Ψ … Select the ensemble EoC*t and the weights for each classifier using MOC(oi)” [i.e., compute/calculate classifier model complexity based on each sample from Ψ], “we try to reveal how such a contribution could be related to the problem complexity” [i.e., classification problem/task complexity], “a set of complexity measures is used to describe the difficulty of a classification problem and relate it to the observed DS performance” [i.e., related to/based on the similarities], “We then implemented a set of complexity measures for classification problems [3], composed of two measures of overlap between single feature values (F1 and F2), two measures of separability of classes (N2 and N3) … The measures used are described below based on their generalization to problems with multiple classes” [i.e., calculate complexity measure/score for the classification problem/task based on similarities/separability measures – including another similarity between each sample and a class in the multiple classes to which the sample belongs]),
wherein the one or more processing circuits is further configured to calculate a sample complexity score for each sample (aside from repeating the claim language - see e.g., paragraphs 42, 59 and 68, applicant’s specification does not define or provide examples of “a sample complexity score”. The plain meaning of complexity is the state or quality of being complex; intricacy. See https://www.dictionary.com/browse/complexity. Further, the plain meaning of “score” is the record of points or strokes made. See https://www.dictionary.com/browse/score. Therefore, “a sample complexity score”, under the BRI, in light of the specification, is any numerical record or measure of complexity of a sample) (see, e.g., pages 3672 and 3676, “Compute MOC(Oi) as the model with optimal complexity by using Ψ” [i.e., compute/calculate classifier model complexity using samples from Ψ], “We then implemented a set of complexity measures for classification problems [3], composed of two measures of overlap between single feature values (F1 and F2)” [i.e., implement/calculate sample complexity measure/score for complexity of the feature values/samples], “a relation exists between the data complexity [i.e., data sample complexity] and the observed contribution of the DS approach”), and acquire the task complexity score for the classification task in a form of digits (aside from repeating the claim language - see e.g., paragraphs 5-7, 17, 28, 37, 55 and 64, applicant’s specification does not define a “task complexity score for the classification task in a form of digits”. Paragraph 19 of applicant’s specification states “the complexity is represented by a complexity score, so that the complexity of the classification task can be accurately measured in the form of digits” and paragraphs 33 and 42 of applicant’s specification disclose examples wherein “the complexity score for the classification task is obtained by taking a weighted average of the complexity scores of the respective samples”. The plain meaning of complexity is the state or quality of being complex; intricacy. See https://www.dictionary.com/browse/complexity. Further, the plain meaning of “score” is the record of points or strokes made. See https://www.dictionary.com/browse/score. Therefore, a “task complexity score for the classification task in a form of digits”, under the broadest reasonable interpretation (BRI), in light of the specification, is any numerical record, measure or average of complexity of a classification task or problem) (see, e.g., FIG. 8 of Britto, which is reproduced below, depicting “Pairwise combination of the complexity measures F1, N2, N3 and T2, considering the datasets” [i.e., numerical complexity measures F1, N2, N3 and T2 scores in the form of digits] and pages 3670, 3672 and 3677-3678 and Algorithm 5, “we try to reveal how such a contribution could be related to the problem complexity”, “a set of complexity measures is used to describe the difficulty of a classification problem” [i.e., describe/acquire classification problem/task complexity], “After implementing the previously described complexity measures, they were applied to the dataset”, “we carried out an analysis in which the complexity measures were combined in a pairwise fashion”, “a relation exists between the data complexity and the observed contribution of the DS … this relation is based on some intrinsic aspects of the classification problem” [i.e., acquire the complexity measure/task complexity score for the classification problem/task – as shown in FIG. 8, reproduced below, the task complexity scores/measures are numeric values, in the form of digits]).

[Image: media_image1.png (greyscale)]

Britto FIG. 8
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Mishra to incorporate the teachings of Britto to provide multiple classifier systems (MCSs) based on dynamic selection (DS) of classifiers (See, e.g., Britto, pages 3665-3666, Abstract and section 1). Doing so would have allowed Mishra to determine a relation between the DS contribution and the complexity of a classification problem (i.e., the complexity of a classification task) in order to determine when to apply DS, as suggested by Britto (See, e.g., Britto, page 3666). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.
Although Mishra in view of Britto substantially teaches the claimed invention, Mishra in view of Britto is not relied on for teaching that the classifier is a simple center classifier,
wherein the one or more processing circuits is configured to calculate a distance between each sample and a center of each of the one or more classes as the similarity between the sample and the class.
In the same field, analogous art Mansilla teaches a simple center classifier (paragraph 24 of applicant’s specification discloses “In the case of using the simple center classifier … the calculating sub-unit 1012 calculates a distance between each sample and a center of each class as a similarity between the sample and the class, where the distance may be a Euclidean distance”. Therefore, “a simple center classifier”, under the BRI, is a classifier that uses distances between samples and a class center, where the distances may be Euclidean distances) (see, e.g., pages 91-92, “Our study mainly focuses on the boundary complexity of a problem [i.e., complexity of a classification problem/task], where the design of a classifier may play an important role”, “class boundary: … we compute the minimum spanning tree (MST) [29] connecting all training samples, using the Euclidean distances between each pair of points” [i.e., using Euclidean distances between samples and a class boundary], “a (highest order) adherence subset associated with a point is the largest hyper-spherical neighborhood centered around p that contains only points with the same class as that of p … If all points of one class are distributed within a hypersphere centered at a certain point, only one adherence subset will be retained for that class” [i.e., a class center], “Then, the classifier system is trained” [i.e., the classifier uses distances between samples and a class center]),
wherein the one or more processing circuits is configured to calculate a distance between each sample and a center of each of the classes as the similarity between the sample and the class (see, e.g., pages 91-92, “Length of class boundary: This measure refers to the percentage of points in the dataset that lie near the class boundary. To calculate it, we compute the minimum spanning tree (MST) [29] connecting all training samples, using the Euclidean distances between each pair of points.” [i.e., calculate a distance between each training sample and each of the classes], “a (highest order) adherence subset associated with a point is the largest hyper-spherical neighborhood centered around p that contains only points with the same class as that of p … If all points of one class are distributed within a hypersphere centered at a certain point, only one adherence subset will be retained for that class” [i.e., a center point of each class], “This metric computes the dispersion of points within classes relative to the separability between classes” [i.e., computing/calculating similarities by calculating distances between samples and class centers as relative separability between classes/similarities]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Mishra in view of Britto to incorporate the teachings of Mansilla to provide a system for relating the behavior of different classifier schemes to the geometrical complexity of a classification problem (i.e., a complexity of a classification task) where results highlight certain regions of a complexity measurement space where a classifier scheme excels (See, e.g., Mansilla, page 82, Abstract). Doing so would have allowed Mishra in view of Britto to establish a first step toward determining the best classifier scheme for a given classification problem (i.e., a given classification task), as suggested by Mansilla (See, e.g., Mansilla, page 82, Abstract).
Although Mishra in view of Britto and Mansilla substantially teaches the claimed invention, Mishra in view of Britto and Mansilla is not relied on for teaching perform classification, with a … classifier, on at least a part of training samples for the classification task, to obtain a class center of each of one or more classes of the classification task, wherein the class center of a class is an average vector of representation vectors of training samples within the class.
In the same field, analogous art Brun teaches perform classification, with a … classifier, on at least a part of training samples for the classification task (see, e.g., Abstract and pages 4396 and 4399, “we select a classifier trained in subset of data showing similar complexity than that observed in neighborhood of the test instance”, “this similarity in terms of complexity allow us to select a more competent classifier” [i.e., a classifier], “we evaluate the contribution of features related to the level of difficulty of a classification problem” [i.e., evaluating complexity of a classification problem/task], “For each problem [i.e., a classification problem/task] a pool with 100 perceptrons was created using the Bagging technique [3]. Bags containing 10% or 20% of the training samples were used” [i.e., perform classification with a classifier on bags with at least part of training samples used for the classification problem/task]), to obtain a class center of each of one or more classes of the classification task, wherein the class center of a class is an average vector of representation vectors of training samples within the class (applicant’s specification does not explicitly define or describe what is meant by “obtain a class center of” classes of a classification task, much less that such a class center is “an average vector of representation vectors of training samples within the class”. Paragraphs 23-25 of applicant’s specification state “each sample is converted into a representation vector, and all the representation vectors have the same dimension”, “calculating sub-unit 1012 calculates a distance between each sample and a center of each class as a similarity between the sample and the class, where the distance may be a Euclidean distance” and “the center of the class is calculated, for example, the center of the class is an average vector of the representation vectors of the samples in the class.” Paragraph 39 of applicant’s specification states “A distance between each sample and each class center is calculated as a similarity between the sample and the class”. This is the sole mention of any “class center” in the instant specification. Therefore, “the class center of a class” that “is an average vector of representation vectors of training samples within the class”, under the BRI, is an average or center of distances between representations of training samples in a class of a classification problem or task, where the distances may be Euclidean distances) (see, e.g., pages 4397-4398, “The level of difficulty of a classification problem can be estimated using complexity measures applied on the data … measure expresses how separable are two classes according to a specific feature” and a “metric can be interpreted as the distance between the center of two classes, so that the larger its value, larger the separation between the classes.” [i.e., obtain/measure/calculate a center of each of the two classes of the classification problem/task], “for each subset of data generated, a vector composed of M complexity measures is computed … This feature set is used as an M-complexity signature for each data subset (DSi)” [i.e., representation signatures/vectors for subsets of training samples within the class], calculating “Similarity in terms of complexity: Given a testing sample t, the first step is to define its neighborhood Ƴt in the validation dataset … The similarity between the complexity signature sigγt with each training dataset complexity signature sigDSi is done by means of Euclidean distance” [i.e., a signature/average vector of representation vectors/signatures of training samples within the class], “Distance from the predicted class: Let us to consider yj as the class predicted by the classifier Ci for the test instance t, DSi as the dataset used to train Ci, and αij as the centroid of the predicted class yj in the training dataset DSi. We compute the distance of the test instance t to the centroid αij” [i.e., obtain/measure/calculate a distance between each sample t and a centroid αij of class yj/the class center of each class yj]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Mishra in view of Britto and Mansilla to incorporate the teachings of Brun to provide a technique for “select[ing] a classifier trained in subset of data showing similar complexity than that observed in neighborhood of the test instance” by “considering during the classifier evaluation the use of features related to the problem complexity” (i.e., a complexity of a classification problem/task) (See, e.g., Brun, page 4396, Abstract). Doing so would have allowed Mishra in view of Britto and Mansilla “to select a more competent classifier”, as suggested by Brun (See, e.g., Brun, page 4396, Abstract).
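For illustration only, the class center and distance limitations discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from applicant's specification or from any cited reference: each sample is assumed to already be a fixed-length representation vector, the class center is taken as the element-wise average of the vectors in the class, and the Euclidean distance to each center serves as the similarity measure. The function names `class_centers`, `euclidean`, `distances_to_centers` and `nearest_center` are hypothetical.

```python
import math
from collections import defaultdict


def class_centers(vectors, labels):
    """Return {label: average vector} over the samples carrying that label."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        for i, x in enumerate(vec):
            sums[label][i] += x
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distances_to_centers(vec, centers):
    """Distance from one sample to every class center (smaller = more similar)."""
    return {label: euclidean(vec, center) for label, center in centers.items()}


def nearest_center(vec, centers):
    """Assign the sample to the class whose center is closest (assumed rule)."""
    dists = distances_to_centers(vec, centers)
    return min(dists, key=dists.get)


if __name__ == "__main__":
    # Made-up training vectors for two classes, "a" and "b".
    vectors = [[0, 0], [2, 0], [10, 10], [12, 10]]
    labels = ["a", "a", "b", "b"]
    centers = class_centers(vectors, labels)
    print(centers)
    print(distances_to_centers([4, 4], centers))
    print(nearest_center([9, 9], centers))
```

In this sketch a smaller distance corresponds to a greater similarity; whether a given reference uses the distance directly or a transformed value is not addressed here.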
Although Mishra in view of Britto, Mansilla and Brun substantially teaches the claimed invention, Mishra in view of Britto, Mansilla and Brun is not relied on for teaching acquire the task complexity … by taking a weighted average of sample complexity scores of the sample.
In the same field, analogous art Valencia teaches acquire the task complexity (see, e.g., paragraphs 133 and 136, “use the full classifier model to generate a family of lean classifier models of varying levels of complexity (or ‘leanness’)”, “determining a priority and/or a complexity associated with the behavior that is to be analyzed” [i.e., determining/acquiring a behavior analysis task complexity]) … by taking a weighted average of sample complexity scores of the sample (as indicated above, “sample complexity scores”, under the BRI, in light of the specification, are any records or measures of complexities of samples) (see, e.g., paragraphs 133, 136, 139, and 146, “use the full classifier model to generate a family of lean classifier models of varying levels of complexity (or ‘leanness’)”, “determining a priority and/or a complexity associated with the behavior that is to be analyzed” [i.e., varying complexity levels/scores], “determine a number (N) of unique test conditions [i.e., samples] … that may be tested in boosted decision stumps) that should be evaluated in the lean classifier model … compute or determine a weighted average of the results of applying the collected behavior information to each boosted decision stump in the lean classifier model” [i.e., taking a weighted average of sample results/scores], “boost the weight of the incorrectly classified samples/test conditions” [i.e., boosting sample scores]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Mishra in view of Britto, Mansilla and Brun to incorporate the teachings of Valencia to generate a behavior vector characterizing observations, collected behavior information, and/or computing device behavior (i.e., samples) and use a full classifier model to generate a family of lean classifier models of varying levels of complexity (i.e., classifiers to handle different levels of classification task complexity) (See, e.g., Valencia, paragraph 133). Doing so would have allowed Mishra in view of Britto, Mansilla and Brun to dynamically determine behaviors to observe in greater detail, and to dynamically determine a precise level of detail required for observations, thus enabling efficient identification and prevention of problems without requiring the use of a large amount of processor or memory resources, as suggested by Valencia (See, e.g., Valencia, paragraph 164).
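For illustration only, the "weighted average of sample complexity scores" limitation can be sketched as below. This is a minimal sketch, not a characterization of how applicant or Valencia computes any value: the per-sample scores and the weights are made-up inputs, the function name `weighted_average` is hypothetical, and equal weights are assumed when none are supplied.

```python
def weighted_average(scores, weights=None):
    """Combine per-sample scores into one task-level number.

    scores  -- per-sample complexity scores (hypothetical inputs)
    weights -- optional per-sample weights; equal weights are assumed if omitted
    """
    if not scores:
        raise ValueError("at least one score is required")
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


if __name__ == "__main__":
    # Two samples; the second counts three times as much as the first.
    print(weighted_average([0.2, 0.8], [1, 3]))
    # Equal weighting reduces to the ordinary mean.
    print(weighted_average([1.0, 2.0, 3.0]))
```

The result is a single number, which is consistent with the interpretation above that a task complexity score "in a form of digits" is any numerical measure or average.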

With respect to independent claim 16, Mishra discloses the invention as claimed including a method for evaluating complexity of a classification task, comprising: (see, e.g., paragraphs 8, 41, 46 and 92, “a method of performing distributed learning” [i.e., a method], “samples can be continuously refined through the addition of greater numbers of features and classifiers, in addition to evaluating more linear and non-linear combinations of features that generate more complex classifiers” [i.e., evaluate complexity of classification tasks performed by classifiers involving different numbers of features], “the extracted feature vector corresponding to sample si=1:n, using a feature extractor method”, “more or fewer stages may be used depending on the … complexity of the video content and the complexity of the computer vision techniques used.”):
performing classification, with a … classifier, on at least a part of training samples for the classification task (see, e.g., paragraphs 41-42 and 67, “thresholds used to determine positive from negative samples can be continuously refined through the addition of greater numbers of features and classifiers”, “training is used, which uses machine learning to estimate the parameters of an ensemble classifier 14. The classifier 14 can be considered a function approximation, where the parameter of the function uses a machine learning algorithm. For example, given a number of positive and negative samples 42, 44, a function f can be found” [i.e., at least part of training samples 42, 44], “features … are then extracted from the measurements to produce a feature vector used as an input into the classifier. The classifier, using the parameters obtained from the learning system for these features, then produces a detector score” [i.e., performing classification with a classifier 14 on the parameters/part of training samples for the classification task]);
calculating … similarities between the sample and respective classes, respectively, based on a result of the classification (see, e.g., paragraphs 38, 40, 45 and 108, “The unrepresented sample space can be determined by identifying edge cases where the classifier created by the learning platform 12 does not generalize sufficiently to separate classes.” [i.e., a sample in the sample space and respective, separate classes], “The resulting classifier has parameters estimated from a larger sample size … and consequently has increased accuracy at classifying similar objects”, “principal component analysis is a applied on the feature vectors of a large set of similar objects to estimate a invariant feature space” [i.e., similarities between the sample and classes/sets of objects], “For example, in traffic videos, the movement of vehicles and pedestrians may be stored as well as classification of the vehicles … the client 312 can generate reports based on these results.” [i.e., calculate/generate similarities based on a classification result]).
Although Mishra substantially discloses the claimed invention, Mishra is not relied on for explicitly disclosing calculating, with respect to each sample of the at least a part of training samples, similarities between the sample and respective classes, respectively, based on a result of the classification; 
calculating, based on the similarities, another similarity representing similarities between each sample and classes to which the sample does not belong; and
calculating, based on the another similarity and yet another similarity between each sample and a class to which the sample belongs, a task complexity score for the classification task to be used for selection of a classifier for the classification task, 
wherein the calculating, based on the similarities, the task complexity score for the classification task comprises: calculating a sample complexity score for each sample, and acquiring the task complexity score for the classification task in a form of digits.
In the same field, analogous art Britto teaches calculating, with respect to each sample of the at least a part of training samples, similarities between the sample and respective classes, respectively, based on a result of the classification (see, e.g., pages 3669-3670, Algorithm 5, “estimate the classifier capability of doing the correct classification. The competence of each classifier is computed considering the support it gives for the correct class of each validation sample” [i.e., based on a result of the classification with respect to each sample of the validation/training samples and correct, respective classes], “region Ψ is defined as the k-nearest neighbors of the unknown pattern in the training set [i.e., at least a part of training samples for the classification task]. Then, the similarity function is used as a filter to preselect from Ψ, the samples for which the classifiers present similar behavior to that observed for the unknown sample t. The remaining samples are used to select the most accurate classifier” [i.e., with respect to each sample], “Compute Sim as the similarity between MCBt and MCBψ”, “vector named MCB (Multiple Classifier Behavior) which can be defined as MCBψ … contains the class labels [i.e., respective classes] assigned to the sample ψ by the M classifiers in the pool. The measure of similarity Sim can be defined as 
[Equation image: media_image2.png]
 [i.e., calculating/measuring/computing similarities between the samples and classes from Ψ]); 
calculating, based on the similarities, another similarity representing similarities between each sample and classes to which the sample does not belong (see, e.g., page 3670, Algorithm 5, “Ψ is defined as the k-nearest neighbors of the unknown pattern in the training set. Then, the similarity function is used as a filter to preselect from Ψ, the samples for which the classifiers present similar behavior to that observed for the unknown sample t” [i.e., the unknown sample is a training sample that does not belong to a class], “Compute the vector MCBt as the class labels assigned to t”, “for each sample ψj in Ψ do Compute MCBψj as the class label assigned to ψ … Compute Sim as the similarity between MCBt and MCBψ” [i.e., computing/calculating another similarity Sim as/representing similarities between each sample ψj and classes/class labels in MCBt to which ψj does not belong]); and
calculating, based on the another similarity and yet another similarity between each sample and a class to which the sample belongs, a task complexity score for the classification task to be used for selection of a classifier for the classification task (as indicated above, “a task complexity score for the classification task”, under the BRI, in light of the specification, is any numerical record or measure of complexity of a classification task or problem) (see, e.g., pages 3672 and 3676, “use an adaptive classifier ensemble selection … selects the ensemble with the optimal complexity for each test pattern from the initial pool of classifiers [i.e., used for selection of a classifier for the classification task], … Compute MOC(Oi) as the model with optimal complexity by using Ψ … Select the ensemble EoC*t and the weights for each classifier using MOC(oi)” [i.e., compute/calculate classifier model complexity based on each sample from Ψ], “we try to reveal how such a contribution could be related to the problem complexity” [i.e., classification problem/task complexity], “a set of complexity measures is used to describe the difficulty of a classification problem and relate it to the observed DS performance” [i.e., related to/based on the similarities], “We then implemented a set of complexity measures for classification problems [3], composed of two measures of overlap between single feature values (F1 and F2), two measures of separability of classes (N2 and N3) … The measures used are described below based on their generalization to problems with multiple classes” [i.e., calculating complexity measure/score for the classification problem/task based on similarities/separability measures – including another similarity between each sample and a class in the multiple classes to which the sample belongs]),
wherein the calculating, based on the similarities, the task complexity score for the classification task comprises: calculating a sample complexity score for each sample (as indicated above, “a sample complexity score”, under the BRI, in light of the specification, is any numerical record or measure of complexity of a sample) (see, e.g., pages 3672 and 3676-3676, “Compute MOC(Oi) as the model with optimal complexity by using Ψ” [i.e., compute/calculate classifier model complexity using samples from Ψ], “We then implemented a set of complexity measures for classification problems [3], composed of two measures of overlap between single feature values (F1 and F2)” [i.e., implement/calculate sample complexity measure/score for complexity of the feature values/samples], “a relation exists between the data complexity [i.e., data sample complexity] and the observed contribution of the DS approach”), and acquiring the task complexity score for the classification task in a form of digits (as indicated above, a “task complexity score for the classification task in a form of digits”, under the BRI, in light of the specification, is any numerical record, measure or average of complexity of a classification task or problem) (see, e.g., FIG. 8 of Britto, which is reproduced below, depicting “Pairwise combination of the complexity measures F1, N2, N3 and T2, considering the datasets” [i.e., numerical complexity measures F1, N2, N3 and T2 scores in the form of digits] and pages 3670, 3672 and 3677-3678 and Algorithm 5, “we try to reveal how such a contribution could be related to the problem complexity”, “a set of complexity measures is used to describe the difficulty of a classification problem” [i.e., describe/acquire classification problem/task complexity], “After implementing the previously described complexity measures, they were applied to the dataset”, “we carried out an analysis in which the complexity measures were combined in a pairwise fashion”, “a relation exists between the data complexity and the observed contribution of the DS … this relation is based on some intrinsic aspects of the classification problem” [i.e., acquire the complexity measure/task complexity score for the classification problem/task – as shown in FIG. 8, reproduced below, the task complexity scores/measures are numeric values, in the form of digits]).

[Image: media_image1.png (greyscale)]

Britto FIG. 8
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Mishra to incorporate the teachings of Britto to provide multiple classifier systems (MCSs) based on dynamic selection (DS) of classifiers (See, e.g., Britto, pages 3665-3666, Abstract and section 1). Doing so would have allowed Mishra to determine a relation between the DS contribution and the complexity of a classification problem (i.e., the complexity of a classification task) in order to determine when to apply DS, as suggested by Britto (See, e.g., Britto, page 3666). This is an example of “use of known technique to improve similar devices (methods, or products) in the same way.” See MPEP 2143.
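For illustration only, the MCB-based similarity quoted from Britto above can be sketched as follows. The formula for Sim appears in this action only as an image, so the sketch rests on an assumption that is labeled as such: Sim is taken to be the fraction of the M classifiers in the pool that assign the same class label to both samples. The function names `mcb_similarity` and `filter_neighbors` and the threshold parameter are hypothetical.

```python
def mcb_similarity(mcb_a, mcb_b):
    """Fraction of classifiers giving both samples the same label.

    ASSUMPTION: this agreement ratio stands in for the Sim measure,
    whose exact formula is not legible in this record.
    """
    if not mcb_a or len(mcb_a) != len(mcb_b):
        raise ValueError("MCB vectors must be non-empty and of equal length")
    agreements = sum(1 for a, b in zip(mcb_a, mcb_b) if a == b)
    return agreements / len(mcb_a)


def filter_neighbors(mcb_t, neighbor_mcbs, threshold):
    """Indices of neighbors whose classifier behavior resembles that of t."""
    return [
        j
        for j, mcb in enumerate(neighbor_mcbs)
        if mcb_similarity(mcb_t, mcb) >= threshold
    ]


if __name__ == "__main__":
    # Labels assigned by a pool of four classifiers (made-up values).
    mcb_t = [1, 2, 2, 0]
    neighbors = [[1, 2, 2, 0], [0, 0, 0, 0], [1, 2, 1, 0]]
    print([mcb_similarity(mcb_t, n) for n in neighbors])
    print(filter_neighbors(mcb_t, neighbors, 0.75))
```

The filtering step mirrors the quoted passage in which the similarity function is used "to preselect from Ψ, the samples for which the classifiers present similar behavior"; the threshold value is an illustrative choice only.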
Although Mishra in view of Britto substantially teaches the claimed invention, Mishra in view of Britto is not relied on for teaching that the classifier is a simple center classifier,
wherein calculating the similarities comprises calculating a distance between each sample and a class center of each of the one or more classes as the similarity between the sample and the class.
In the same field, analogous art Mansilla teaches a simple center classifier (paragraph 24 of applicant’s specification discloses “In the case of using the simple center classifier … the calculating sub-unit 1012 calculates a distance between each sample and a center of each class as a similarity between the sample and the class, where the distance may be a Euclidean distance”. Therefore, “a simple center classifier”, under the BRI, is a classifier that uses distances between samples and a class center, where the distances may be Euclidean distances) (see, e.g., pages 91-92, “Our study mainly focuses on the boundary complexity of a problem [i.e., complexity of a classification problem/task], where the design of a classifier may play an important role”, “class boundary: … we compute the minimum spanning tree (MST) [29] connecting all training samples, using the Euclidean distances between each pair of points” [i.e., using Euclidean distances between samples and a class boundary], “a (highest order) adherence subset associated with a point is the largest hyper-spherical neighborhood centered around p that contains only points with the same class as that of p … If all points of one class are distributed within a hypersphere centered at a certain point, only one adherence subset will be retained for that class” [i.e., a class center], “Then, the classifier system is trained” [i.e., the classifier uses distances between samples and a class center]),
wherein calculating the similarities comprises calculating a distance between each sample and a class center of each of the one or more classes as the similarity between the sample and the class (see, e.g., pages 91-92, “Length of class boundary: This measure refers to the percentage of points in the dataset that lie near the class boundary. To calculate it, we compute the minimum spanning tree (MST) [29] connecting all training samples, using the Euclidean distances between each pair of points.” [i.e., calculating a distance between each training sample and each of the classes], “a (highest order) adherence subset associated with a point is the largest hyper-spherical neighborhood centered around p that contains only points with the same class as that of p … If all points of one class are distributed within a hypersphere centered at a certain point, only one adherence subset will be retained for that class” [i.e., a center point of each class], “This metric computes the dispersion of points within classes relative to the separability between classes” [i.e., computing/calculating similarities by calculating distances between samples and class centers as relative separability between classes/similarities]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Mishra in view of Britto to incorporate the teachings of Mansilla to provide a system for relating the behavior of different classifier schemes to the geometrical complexity of a classification problem (i.e., a complexity of a classification task) where results highlight certain regions of a complexity measurement space where a classifier scheme excels (See, e.g., Mansilla, page 82, Abstract). Doing so would have allowed Mishra in view of Britto to establish a first step toward determining the best classifier scheme for a given classification problem (i.e., a given classification task), as suggested by Mansilla (See, e.g., Mansilla, page 82, Abstract).
Although Mishra in view of Britto and Mansilla substantially teaches the claimed invention, Mishra in view of Britto and Mansilla is not relied on for teaching performing classification, with a … classifier, on at least a part of training samples for the classification task, to obtain a class center of each of one or more classes of the classification task, wherein the class center of a class is an average vector of representation vectors of training samples within the class.
In the same field, analogous art Brun teaches performing classification, with a … classifier, on at least a part of training samples for the classification task (see, e.g., Abstract and pages 4396 and 4399, “we select a classifier trained in subset of data showing similar complexity than that observed in neighborhood of the test instance”, “this similarity in terms of complexity allow us to select a more competent classifier” [i.e., a classifier], “we evaluate the contribution of features related to the level of difficulty of a classification problem” [i.e., evaluating complexity of a classification problem/task], “For each problem [i.e., a classification problem/task] a pool with 100 perceptrons was created using the Bagging technique [3]. Bags containing 10% or 20% of the training samples were used” [i.e., perform classification with a classifier on bags with at least part of training samples used for the classification problem/task]), to obtain a class center of each of one or more classes of the classification task, wherein the class center of a class is an average vector of representation vectors of training samples within the class (as indicated above, “the class center of a class” that “is an average vector of representation vectors of training samples within the class”, under the BRI, is an average or center of distances between representations of training samples in a class of a classification problem or task, where the distances may be Euclidean distances) (see, e.g., pages 4397-4398, “The level of difficulty of a classification problem can be estimated using complexity measures applied on the data … measure expresses how separable are two classes according to a specific feature” and a “metric can be interpreted as the distance between the center of two classes, so that the larger its value, larger the separation between the classes.” [i.e., obtain/measure/calculate a center of each of the two classes of the classification problem/task], “for each subset of data generated, a vector composed of M complexity measures is computed … This feature set is used as an M-complexity signature for each data subset (DSi)” [i.e., representation signatures/vectors for subsets of training samples within the class], calculating “Similarity in terms of complexity: Given a testing sample t, the first step is to define its neighborhood Ƴt in the validation dataset … The similarity between the complexity signature sigγt with each training dataset complexity signature sigDSi is done by means of Euclidean distance” [i.e., a signature/average vector of representation vectors/signatures of training samples within the class], “Distance from the predicted class: Let us to consider yj as the class predicted by the classifier Ci for the test instance t, DSi as the dataset used to train Ci, and αij as the centroid of the predicted class yj in the training dataset DSi. We compute the distance of the test instance t to the centroid αij” [i.e., obtain/measure/calculate a distance between each sample t and a centroid αij of class yj/the class center of each class yj]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Mishra in view of Britto and Mansilla to incorporate the teachings of Brun to provide a technique for “select[ing] a classifier trained in subset of data showing similar complexity than that observed in neighborhood of the test instance” by “considering during the classifier evaluation the use of features related to the problem complexity” (i.e., a complexity of a classification problem/task) (See, e.g., Brun, page 4396, Abstract). Doing so would have allowed Mishra in view of Britto and Mansilla “to select a more competent classifier”, as suggested by Brun (See, e.g., Brun, page 4396, Abstract).
Although Mishra in view of Britto, Mansilla and Brun substantially teaches the claimed invention, Mishra in view of Britto, Mansilla and Brun is not relied on for teaching acquiring the task complexity … by taking a weighted average of sample complexity scores of the sample.
In the same field, analogous art Valencia teaches acquiring the task complexity (see, e.g., paragraphs 133 and 136, “use the full classifier model to generate a family of lean classifier models of varying levels of complexity (or ‘leanness’)”, “determining a priority and/or a complexity associated with the behavior that is to be analyzed” [i.e., determining/acquiring a behavior analysis task complexity]) … by taking a weighted average of sample complexity scores of the sample (as indicated above, “sample complexity scores”, under the BRI, in light of the specification, are any records or measures of complexities of samples) (see, e.g., paragraphs 133, 136, 139, and 146, “use the full classifier model to generate a family of lean classifier models of varying levels of complexity (or ‘leanness’)”, “determining a priority and/or a complexity associated with the behavior that is to be analyzed” [i.e., varying complexity levels/scores], “determine a number (N) of unique test conditions [i.e., samples] … that may be tested in boosted decision stumps) that should be evaluated in the lean classifier model … compute or determine a weighted average of the results of applying the collected behavior information to each boosted decision stump in the lean classifier model” [i.e., taking a weighted average of sample results/scores], “boost the weight of the incorrectly classified samples/test conditions” [i.e., boosting sample scores]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Mishra in view of Britto, Mansilla and Brun to incorporate the teachings of Valencia to generate a behavior vector characterizing observations, collected behavior information, and/or computing device behavior (i.e., samples) and use a full classifier model to generate a family of lean classifier models of varying levels of complexity (i.e., classifiers to handle different levels of classification task complexity) (See, e.g., Valencia, paragraph 133). Doing so would have allowed Mishra in view of Britto, Mansilla and Brun to dynamically determine behaviors to observe in greater detail, and to dynamically determine a precise level of detail required for observations, thus enabling efficient identification and prevention of problems without requiring the use of a large amount of processor or memory resources, as suggested by Valencia (See, e.g., Valencia, paragraph 164).
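For illustration only, a weighted average of per-sample complexity scores may be sketched as below. This is a minimal sketch under assumed inputs; the function name, the scores and the weights are hypothetical and are not taken from Valencia or from the application.

```python
def weighted_average(scores, weights):
    """Weighted average of scores; weights need not be normalized (hypothetical helper)."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(scores, weights)) / total

# Assumed per-sample complexity scores and their weights.
sample_scores = [0.2, 0.8, 0.5]
sample_weights = [1.0, 1.0, 2.0]

# Task complexity taken as the weighted average of the sample scores.
task_complexity = weighted_average(sample_scores, sample_weights)
```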

Regarding claims 3 and 12, as discussed above, Mishra in view of Britto, Mansilla, Brun and Valencia teaches the apparatus of claim 7 and the method of claim 16.
Although Mishra substantially discloses the claimed invention, Mishra is not relied on for explicitly disclosing wherein the another similarity representing the similarities between each sample and classes to which the sample does not belong is a maximum value of the similarities between the sample and the classes to which the sample does not belong.
In the same field, analogous art Britto teaches wherein the another similarity representing the similarities between each sample and classes to which the sample does not belong (see, e.g., pages 3669 and 3676, “Calculate LCA(i,j) as the percentage of correct labeled samples of class ωj by the classifier ci on Ψ … Select the best classifier for t as c*t = arg maxi{LCA(i,j)} … calculate the local class accuracy(LCA)” [i.e., a maximum value between the sample t and classes ωj in Ψ to which the sample t does not belong], “We then implemented a set of complexity measures for classification problems [3], composed of two measures of overlap between single feature values (F1 and F2) … (F2): this measure conducts a pairwise calculation of the overlap between the conditional distribution of classes … the overlap, considering two classes, ci and cj, and a T-dimension feature space, is calculated by finding the minimum and maximum values of each feature fk for both classes.” [i.e., the another similarity is a maximum value of the similarities/overlap between the sample and the classes, ci, cj, etc. to which the sample does not belong]). 

Regarding claims 4 and 13, as discussed above, Mishra in view of Britto, Mansilla, Brun and Valencia teaches the apparatus of claim 7 and the method of claim 16.
Although Mishra substantially discloses the claimed invention, Mishra is not relied on for explicitly disclosing wherein the another similarity representing the similarities between each sample and classes to which the sample does not belong is an average value of the similarities between the sample and the classes to which the sample does not belong.
In the same field, analogous art Britto teaches wherein the another similarity representing the similarities between each sample and classes to which the sample does not belong is an average value of the similarities between the sample and the classes to which the sample does not belong (see, e.g., pages 3676-3677, “We then implemented a set of complexity measures for classification problems [3], composed of two measures of overlap between single feature values (F1 and F2) … (F1): this well-known measure of class overlapping is calculated over each single feature dimension … where M is the number of classes and μ is the overall mean, while ni, μi and sij are the number of samples, the mean and the jth sample of the class i, respectively”, “(T2): this measure describes the density of spatial distributions of samples by computing the average number of instances per dimension.” [i.e., the another similarity is an overall mean μ/average value of the similarities/overlap measures between the sample and the classes of the M classes to which the sample does not belong]).
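For illustration only, the two aggregation choices recited in claims 3 and 12 (a maximum value) and in claims 4 and 13 (an average value) may be contrasted with the sketch below. It assumes that a similarity between the sample and each class has already been computed; the function name and the similarity values are hypothetical and are not drawn from Britto.

```python
def other_class_similarity(similarities, own_class, mode="max"):
    """Aggregate a sample's similarities to the classes it does not belong to."""
    others = [s for label, s in similarities.items() if label != own_class]
    if not others:
        raise ValueError("sample needs at least one class it does not belong to")
    if mode == "max":
        return max(others)
    if mode == "mean":
        return sum(others) / len(others)
    raise ValueError("mode must be 'max' or 'mean'")

# Assumed similarities between one sample (belonging to class A) and each class.
sims = {"A": 0.9, "B": 0.5, "C": 0.25}

max_other = other_class_similarity(sims, "A", mode="max")
mean_other = other_class_similarity(sims, "A", mode="mean")
```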

Regarding claims 6 and 15, as discussed above, Mishra in view of Britto, Mansilla, Brun and Valencia teaches the apparatus of claim 7 and the method of claim 16.
Although Mishra substantially discloses the claimed invention, Mishra is not relied on for explicitly disclosing wherein weights are adjusted based on a number of samples that are included in each of the classes.
In the same field, analogous art Britto teaches wherein weights are adjusted based on a number of samples that are included in each of the classes (see, e.g., pages 3669 and 3672, “the class posterior probability is weighted by δi, which represents the Euclidian distance between the sample ψi and the unknown pattern t”, “selects an ensemble of classifiers, considers continuous-valued outputs and weighted class supports”, “given the combination weights among the classifiers … Compute MOC(oi) as the model with optimal complexity by using Ψ … Select … the weights for each classifier using MOC(oi)” [i.e., weights are selected/adjusted based on number of samples in ψ in each of the classes]).
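For illustration only, one way in which weights could depend on the number of samples in each class is sketched below. The inverse-class-size rule used here is an assumption chosen for the example; it is not asserted to be the scheme of Britto or of the application, and the function name is hypothetical.

```python
from collections import Counter

def count_based_weights(labels):
    """Per-sample weights inversely proportional to class size, summing to 1.

    Assumed rule: each class receives an equal share of the total weight,
    split evenly among that class's samples.
    """
    counts = Counter(labels)
    num_classes = len(counts)
    return [1.0 / (num_classes * counts[label]) for label in labels]

# Three samples in class A and one in class B (assumed data).
labels = ["A", "A", "A", "B"]
weights = count_based_weights(labels)
```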

Regarding claims 9 and 18, as discussed above, Mishra in view of Britto, Mansilla, Brun and Valencia teaches the apparatus of claim 7 and the method of claim 16.
Mishra further discloses wherein the classifier is further configured to be trained based on the at least a part of training samples (see, e.g., paragraphs 34-35, “the learning platform 12 can be given training data … the trained classifier can be executed over the dataset on non-labelled data”, “learning platform 12 determines parameters for a classifier that tries to detect patterns in the data … as new data is collected, the classifier can be retrained” [i.e., the classifier is trained based on at least part of training data samples]).

Regarding claim 19, as discussed above, Mishra in view of Britto, Mansilla, Brun and Valencia teaches the method of claim 16.
Examiner’s Note: claim 19, as drafted, depends from claim 16. If applicant intended for claim 19 to be an independent claim, the examiner suggests that one way to do so is to amend the last step of claim 19 to explicitly recite the steps of claim 16 instead of the current recitation of a “non-transitory computer readable storage medium storing computer executable program codes, that when executed by a computer, cause the computer to perform the method of claim 16”.
Mishra further discloses a non-transitory computer readable storage medium storing computer executable program codes (see, e.g., paragraph 115 and claim 20, “any module or component exemplified herein that executes instructions may include or otherwise have access to computer readable media such as storage media, computer storage media … Any such computer storage media may be part of the system 10, any component of or related to the system 10 (e.g., the learning platform 12” [i.e., a computer readable storage medium storing code in system 10/platform 12], “A non-transitory computer readable medium comprising computer executable instructions for performing distributed learning” [i.e., a non-transitory computer readable medium storing computer executable instructions/codes executable by a computer to cause the computer to perform distributed learning]) according to claim 16 (as indicated above, Mishra in view of Britto, Mansilla, Brun and Valencia teaches the method of claim 16, see above citations to Mishra, Britto, Mansilla and Valencia regarding the limitations of claim 16).

Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant's disclosure. 
For example, non-patent literature Ho, Tin Kam ("Data Complexity Analysis for Classifier Combination." 2001, hereinafter “Ho”) discloses several numerical measures of geometrical complexity of different classes of data samples (see, e.g., Ho, pages 11-12 and Table 3).
The examiner requests, in response to this office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/R.K.B./Examiner, Art Unit 2125

/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125