DETAILED ACTION
This action is in response to the communications filed 03/05/2021 in which claims 1, 8, 15, and 22 are currently amended; claims 7, 14, 21, and 28 are cancelled; and claims 1-6, 8-13, 15-20, and 22-27 are still pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted 06/10/2020 has been considered by the examiner.
Response to Arguments
Applicant’s amendments and remarks filed 03/05/2021 have been fully considered by the examiner.
Regarding the rejection of the claims under 35 USC § 112(a) - New Matter (see pages 11-12), applicant’s arguments are not persuasive. The guidelines for determining and treating new matter under 35 USC § 112(a) are provided in MPEP § 2163.06, which requires that if new matter is added to the claims, a rejection under 35 USC § 112(a) should be made of record.
Regarding applicant’s remarks that support can be found in PGPub US 2016/0275414 specification paragraphs 0034-0036, the examiner respectfully disagrees. The recitations in the cited paragraphs are directed to the use of classifiers to process data sets, and there is no recitation directed to the use of a capacity threshold for generating data set updates as recited by applicant’s amended claim limitations. Therefore, the rejection has been maintained.


Regarding Applicant’s argument that the prior art made of record, Masud (US Pub No. 2012/0054184), does not teach the amended elements of claim 1 directed to the use of training samples deemed redundant, the examiner has cited new art to address this limitation. See the full rejection below.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification, as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
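As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;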
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Specifically, the following claims:
Claim 8 limitations as highlighted below:
means for receiving, after training the artificial neural network with a first set of existing training samples of an existing training set, a new training sample at the apparatus, the new training sample having a same class as the first set of existing training samples in the existing training set stored in a memory of the apparatus; 
means for calculating at least one of a first similarity metric, a second similarity metric, or a combination thereof for the new training sample, wherein the first similarity metric is associated with a first distance between the new training sample and the first set of existing training samples of the same class as the new training sample, and wherein the second similarity metric is associated with a second distance between the new training sample and a second set of existing training samples, in the training set, of a different class than the new training sample; 
means for selectively updating the first set of existing training samples to include the new training sample based on the first similarity metric or the second similarity metric, wherein updating the first set of existing training samples comprises: selecting an existing training sample from the first set of existing training samples for removal from the memory when updating the first set of existing training samples would cause the memory to exceed a capacity threshold, the existing training sample having been used to train the artificial neural network prior to receiving the new training sample, and the selected existing training sample is determined to be a redundant training sample for having a smallest distance to another existing training sample in comparison to a distance between pairs of existing training samples in the first set of existing training samples stored in the memory; removing the redundant training sample from the memory; and
means for retraining the artificial neural network with the updated first set of existing training samples.
Claim 9 limitations as highlighted below:
means for calculating at least one of a third similarity metric or a fourth similarity metric, wherein the third similarity metric is associated with a candidate training sample from the training set and existing training samples of a same class as the candidate training sample, and wherein the fourth similarity metric is associated with the candidate training sample and existing training samples of a different class than the candidate training sample; and 
means for selectively removing the candidate sample from the memory based at least in part on the at least one of the third similarity metric or the fourth similarity metric.
Claim limitations in this application (current claims 8 and 9) that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. The examiner notes that paragraphs [0052]-[0053] and [0098] of the published application, US Pub. No. 2016/0275414, teach a processor (CPU) 152 that carries out the instructions described in the specification for performing the algorithm shown in FIG. 8.

Claim Rejections - 35 USC § 112- Written Description and New Matter
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.


Claims 1-6, 8-13, 15-20, and 22-27 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Regarding claim 1, the claim recites the limitations “selecting an existing training sample from the first set of existing training samples for removal from the memory when updating the first set of existing training samples would cause the memory to exceed a capacity threshold, the existing training sample having been used to train the artificial neural network prior to receiving the new training sample, and the selected existing training sample is determined to be a redundant training sample for having a smallest distance to another existing training sample in comparison to a distance between pairs of existing training samples in the first set of existing training samples stored in the memory” that is not 
Regarding claims 8, 15, and 22, the claims recite limitations similar to those noted in claim 1 above and are rejected under the same rationale.

Regarding dependent claims 2-6 that depend from claim 1, the claims do not resolve the deficiencies noted in their independent claim above; therefore, the claims are appropriately rejected.
Regarding dependent claims 9-13 that depend from claim 8, the claims do not resolve the deficiencies noted in their independent claim above; therefore, the claims are appropriately rejected.
Regarding dependent claims 16-20 that depend from claim 15, the claims do not resolve the deficiencies noted in their independent claim above; therefore, the claims are appropriately rejected.
Regarding dependent claims 23-27 that depend from claim 22, the claims do not resolve the deficiencies noted in their independent claim above; therefore, the claims are appropriately rejected.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-4, 8-11, 15-18, 22-25, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Masud (US Pub No. 2012/0054184), in view of Yang et al. (US Pub. No. 2011/0222724, hereinafter ‘Yang’), and in further view of Law et al. (NPL: "An Adaptive Nearest Neighbor Classification Algorithm for Data Streams", hereinafter 'Law'), and in further view of Diao et al. (NPL: “Feature Selection Inspired Classifier Ensemble Reduction”, hereinafter ‘Diao’).

Regarding independent claim 1 limitations, Masud teaches a method of managing memory usage of an …, comprising: (Masud teaches a method for managing memory usage efficiently, per examples disclosed, in [0091]-[0093] & disclosed below):
receiving, after training … with a first set of existing training samples of an existing training set, a new training sample at a device, (Masud teaches, after training classifiers as trained models, classifying new data stream points from incoming data, in [0043]-[0045]: A goal of data stream classification may be to train classification models based on past labeled data, and to classify data from future incoming data streams using these trained models…As the model processes the training data, the model learns to categorize the data into classes, and any errors in the classification may be addressed…A classification model should adapt itself to the most recent concept in order to cope with concept-drift.; where the existing classes have been trained with data points of an existing class, which are considered the first set of existing training samples of each classification model in the ensemble, in [0051]-[0052]: … In one embodiment, classification models 106 comprise an ensemble of N models, and each model may be trained to classify data instances using a labeled, data chunk. The ensemble may also be continuously updated so that it represents the most recent concept in the stream. For example, the update may be performed in one embodiment as follows: when a new classification model is trained,… In addition, each classification model 106 in the ensemble may detect novel classes within data stream 104… A class may be defined as a novel class if none of the classification models 106 has been trained with that class. 
Otherwise, if one or more of the classification models 106 has been trained with that class, then that class may be considered an existing class…; where the existing set of data points is used to build existing class clusters, as the recited first set of existing training samples of an existing training set in an existing class that has been trained, in [0009]: …existing class clusters that have been built from training data in accordance with an illustrative embodiment; and receiving new data instances to detect new samples with the data set associated with the trained existing class data points, as the recited first set of existing training samples of an existing training set, in [0053]-[0056]: The detection and determination of a novel class may comprise the following main aspects. First, a decision boundary may be built during training of the models… For instance, when classifying a data point within the data stream 104, if the data point is determined to be inside the decision boundary of any classification model 106 in the ensemble, then that data point may be classified as an existing class instance using majority voting of the models. However, if that data point is outside the decision boundary of all the classification models 106, then the data point may be considered an F-outlier, and the data instance is temporarily stored in a buffer buf… The cohesion and separation analyzer 114 compares the F-outliers to each other and to the existing classes. In particular, the cohesion and separation analyzer 114 makes a determination as to whether the F-outliers represent data points that are well separated from the training data points of the existing classes...)
the new training sample having a same class as the first set of existing training samples in the existing training set stored in a memory of the device; (Masud teaches the classifiers and a trained set of N models where the new training data samples arrive as the data stream and fall inside the decision boundary, as the recited new training sample having a same class as the first set of existing training samples in the existing training set, in [0054]-[0055]: … a novel class determination engine 108 may comprise a decision boundary builder 110, an F-outlier identifier 112, and a cohesion and separation analyzer 114. Decision boundary builder 110 may be used to identify boundaries around the training data; where the training instances are saved into memory for determining the same class, as being closer to data points belonging to the same class and far from points belonging to other classes, in [0052]: … each classification model 106 in the ensemble may detect novel classes within data stream 104… A class may be defined as a novel class if none of the classification models 106 has been trained with that class. Otherwise, if one or more of the classification models 106 has been trained with that class, then that class may be considered an existing class…)
calculating at least one of a first similarity metric, a second similarity metric, or a combination thereof for the new training sample, wherein the first similarity metric is associated with a first distance between the new training sample and the first set of existing training samples of the same class as the new training sample, and wherein the second similarity metric is associated with a second distance between the new training sample and a second set of existing training samples, in the existing training set, of a different class than the new training sample; (Masud teaches calculating a first similarity metric as the cohesion distance measure for the sample being closer to the other data points in the same class and the second similarity metric as the separation distance measure for how far apart the data points are from other classes that are not of the same class, in [0052]: … Data points belonging to the same class should be closer to each other (cohesion) than other data points, and should be far apart from the data points belonging to other classes (separation).; or determining instances that belong to the same class using the nearest neighbor concepts for defining a first or second distance metric for each respective trained class set as the mean distance from labeled class c to the new sample instance x, where the class with the minimum distance is the class label, in [0080]-[0081]:

[Images media_image1.png and media_image2.png: excerpts of Masud, [0080]-[0081]]
)
selectively updating the first set of existing training samples to include the new training sample based on the first similarity metric or the second similarity metric, (Masud teaches selectively updating based on the selected min distance among the first and second similarity metric of class label options that are same class as the x-new training data in the existing c classes, including the first, second and plurality of c nearest distance metric, in [0080]-[0081].)
wherein updating the first set of existing training samples comprises: selecting an existing training sample from the first set of existing training samples for removal from the memory when updating the first set of existing training samples would cause the memory to exceed a capacity threshold, the existing training sample having been used to train the … prior to receiving the new training sample, and the selected existing training sample … having a smallest distance to another existing training sample in comparison to a distance between pairs of existing training samples in the first set of existing training samples stored in the memory; removing the … from the memory; (Masud teaches updating the first set of existing training samples, i.e., the classifiers trained earlier in the trained ensemble models before new instances are selected as an x class instance, where the older instances may be determined to be outdated and an existing sample in the oldest class of existing samples is removed, which is the updating step based on an age memory threshold of the existing training data class data set, in [0118]: … This means class c' may have been outdated, and in that case, L, may be removed from the ensemble. FIG. 6A is an example illustration of scenario (1 ), and shows an example of the impact of evolving class labels on ensemble. The classifier (s) in the ensemble may be sorted according to their age, with L1 being the oldest, and La being the youngest. Each classifier L, may be marked, with the classes with which, it has been trained…. Therefore, c1 may have become outdated, and L1 may be removed from the ensemble. 
In this way, it may be ensured that older classifiers have less impact in the voting process…; where the instance that is detected as part of the same class as an existing class c is at a minimum distance, measured as the mean distance to the class it is labeled with, which is considered the selected existing training sample having a smallest distance to another existing training sample in comparison to a distance between pairs of existing training samples in the first set of existing training samples stored in the memory, in [0080]-[0081]; where the classifiers are sorted by age and the oldest may be removed, as the claimed removing the … from the memory, in [0118]: … Therefore, c1 may have become outdated, and L1 may be removed from the ensemble. In this way, it may be ensured that older classifiers have less impact in the voting process….)
retraining the…..with the updated training set. (Masud teaches that the ensemble, the learning model trained with a learning algorithm, in [0209], is updated (i.e., retrained) on the latest training chunk, which is considered the updated training set, in [0211].)
While Masud does teach supplying the constructed feature vectors to the learning algorithm to train a model, which is considered retraining a model with an updated feature training set, in [0209], such that the model can be newly trained (i.e., retrained) with the latest training chunk, Masud does not expressly teach the following claim 1 limitations:
…after training the artificial neural network… [as the classification algorithm]
… the artificial neural network … [as the classification algorithm]
retraining the artificial neural network with …training set.
Yang does teach claim 1 limitations:
…after training the artificial neural network… [as the classification algorithm] and … the artificial neural network … [as the classification algorithm] (Yang teaches the trained set of convolutional neural network (CNN) classification models for classification of gender and age, in [0015]-[0016], as trained base learners that are periodically updated with new test videos, as a new training sample using supervised learning, in [0020]-[0021]: …the aligned faces and their correspondences given by visual tracking are stored as additional data which are used to update the Baseline CNN models periodically. Then, the updated CNN models are applied to new test videos. Note, we always update the Baseline models using the additional data collected online, not the latest updated CNN models, to avoid model drift...)
retraining the artificial neural network with …training set. (Yang teaches retraining the CNN to update the models with training set including new test videos, in [0020])
Masud and Yang are analogous art because they are directed towards methods and systems for classifying data.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to integrate the use of neural network classifiers and retraining artificial neural networks with training data disclosed by Yang with the method for training classifiers as disclosed by Masud.
One of ordinary skill in the art would have been motivated to integrate the disclosed methods in order to provide an improvement of updated baseline trained CNN classifier models to avoid model drift (Yang, [0007]); doing so will allow the learning analysis to adapt to specific sciences in the training incrementally to enhance (Yang, [0003]).
While Masud teaches setting a capacity threshold as the length of time exceeded for storing elements (e.g., an existing dataset) in memory when updating and removing training samples from memory, as disclosed above, Law expressly teaches the capacity threshold as the number of non-empty memory blocks. (Law teaches selecting an existing training sample, as a selected data array, to be removed from the existing classifier that is to be updated with a new training sample when the memory capacity threshold would be exceeded and the memory runs out, in pg. 114: Sec. 2.4: 2nd para & Algorithm 3: … Therefore, we only need to update the data array of these blocks and their classes if necessary. During the update process, the system may run out of memory as the number of nonempty blocks may increase. To deal with this, we may simply remove the finest data array…

[Image media_image3.png: excerpt of Law, Sec. 2.4 and Algorithm 3]

)
Masud, Yang, and Law are analogous art because they are directed towards methods and systems for classifying data.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to integrate the method for selecting existing training instances from an existing class based on a memory constraint defined by the number of memory blocks, as disclosed by Law, with the method for training classifiers as collectively disclosed by Masud and Yang.
One of ordinary skill in the arts would have been motivated to integrate the disclosed methods in order to provide an improvement of enabling incremental algorithms that are fast and gracefully st para.); doing so would allow for the update classifiers to adapt to concept drift behaviors with data streams as necessary (Law, Sec. 2.4).
While Masud in combination with Yang and Law discusses the process for updating classifiers by monitoring a capacity threshold associated with the memory for storing current training data samples, Masud, Yang, and Law do not expressly teach the samples in current class data (e.g., existing training data) as redundant samples to be removed.
Diao does expressly teach the samples in current class data (e.g., existing training data) as redundant samples to be removed, as recited in the claim limitation:
… selected existing training sample is determined to be a redundant training sample…; removing the redundant training sample from the memory (in pg. 1260: Left Col. 2nd full ¶ : … Each ensemble member is now transformed into an artificial feature in a newly constructed dataset, and the feature values are generated by collecting the classifiers’ predictions. FS algorithms can then be used to remove redundant features (now representing classifiers) in the present context, to select a minimal classifier subset while maintaining original ensemble diversity, and preserving ensemble prediction accuracy…)
Masud, Yang, Law, and Diao are analogous art because they are directed towards methods and systems for classifying data.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to integrate the method for selecting existing training instances as redundant samples to be removed, as disclosed by Diao, with the method for training classifiers as collectively disclosed by Masud, Yang, and Law.
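
For illustration only, the removal step recited in claim 1 can be sketched as follows. This is a minimal sketch and not the applicant's or any cited reference's implementation. It assumes that samples are numeric tuples, that distance is Euclidean, that the capacity threshold is a maximum sample count, and that the first member of the closest pair is the one removed; the names most_redundant_index and update_class_set are hypothetical.

```python
import math
from itertools import combinations


def most_redundant_index(samples):
    """Return the index of one member of the closest pair of samples.

    Assumption: "redundant" means belonging to the pair with the
    smallest pairwise Euclidean distance; the first member is chosen.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to compare pairs")
    i, _ = min(
        combinations(range(len(samples)), 2),
        key=lambda pair: math.dist(samples[pair[0]], samples[pair[1]]),
    )
    return i


def update_class_set(samples, new_sample, capacity):
    """Add new_sample, first evicting a redundant sample if the set
    would otherwise exceed capacity (assumed to be a sample count)."""
    updated = list(samples)
    if len(updated) + 1 > capacity and len(updated) >= 2:
        del updated[most_redundant_index(updated)]
    updated.append(new_sample)
    return updated
```

In this sketch the decision whether to admit the new sample at all (the "selectively updating" step based on the first or second similarity metric) is left out; only the capacity-driven eviction is shown.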


Regarding claim 2, the rejection of claim 1 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the method of claim 1, further comprising:
calculating at least one of a third similarity metric or a fourth similarity metric, wherein the third similarity metric is associated with a candidate training sample from the training set and existing training samples of a same class as the candidate training sample, and wherein the fourth similarity metric is associated with the candidate training sample and existing training samples of a different class than the candidate training sample; and (Masud teaches the calculating of a similarity metric as a vote from the existing classes L to see if the candidate training sample from the training set is not an outlier and is associated with an existing class, and further evaluating the test point if it is determined to be a potential outlier, in [0108]; where x input data from the data stream within the set of outliers is associated with a q-NSC metric, which includes an additional set of third, fourth, etc. similarity metrics used to remove outliers from the buffer, classify the test point as an existing class instance, and update the ensemble of existing classes, in [0110]-[0113].)
selectively removing the candidate training sample from the memory based at least in part on the at least one of the third similarity metric or the fourth similarity metric. (Masud teaches removing the F-outliers/x data instances (candidate training samples) from the buffer memory, that is, candidate training samples stored in the buffer, based on the q-neighborhood silhouette coefficient (q-NSC), that is, a third and fourth similarity metric, in [0111]-[0113].)
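
For illustration only, a generic silhouette-style score that combines a same-class distance and a different-class distance for a candidate sample can be sketched as follows. This is not presented as Masud's exact q-NSC definition; it assumes Euclidean distance, non-empty point sets, and a mean over the q nearest neighbors, and the function names are hypothetical.

```python
import math


def mean_q_nearest(x, points, q):
    """Mean distance from x to its q nearest points (points non-empty)."""
    nearest = sorted(math.dist(x, p) for p in points)[:q]
    return sum(nearest) / len(nearest)


def silhouette_style_score(x, same_class, other_class, q=3):
    """Score in [-1, 1]: positive when x is closer to its own class
    than to the other class, negative in the opposite case."""
    a = mean_q_nearest(x, same_class, q)   # same-class distance (third metric, assumed)
    b = mean_q_nearest(x, other_class, q)  # different-class distance (fourth metric, assumed)
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom
```

A caller might drop a candidate whose score falls below a chosen cutoff; that removal rule and the cutoff value are assumptions and are not taken from the cited references.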

Regarding claim 3, the rejection of claim 1 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the method of claim 1:
wherein at least one of the first similarity metric or the second similarity metric are computed based at least in part on a centroid of the first set of existing training samples. (Masud teaches that computing the decision boundary is based on whether the test points are associated with an existing class instance, considered the first set of test instances, in [0098]-[0099], based in part on the centroid of the points in the existing training instances, in [0096]-[0097]; and computing the distances for cohesion and separation based in part on a centroid as the mean distance of existing data points, in [0080]-[0081].)
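
For illustration only, a centroid-based distance of the kind recited in claim 3 can be sketched as follows. It assumes samples are equal-length numeric tuples and that distance is Euclidean; the function names are hypothetical and do not come from the application or the cited references.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def distance_to_centroid(x, points):
    """Euclidean distance from sample x to the centroid of points."""
    return math.dist(x, centroid(points))
```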

Regarding claim 4, the rejection of claim 1 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the method of claim 1:
wherein the first similarity metric comprises a difference between: a minimum distance between any two points of the first set of existing training samples and a minimum distance between points of the new training sample and all training samples of the first set of existing training samples. (Masud teaches the first similarity metric as a decision boundary used to determine classification of training data into groups associated with existing classes using existing training samples (the first set of existing training samples), in [0098]-[0099]; where the metric comprises comparing the distance of the test point and a minimum distance using the nearest neighbor distance rule, using the distance between two points and the distances with all points in the class of training examples in an existing class, as the minimum distance among all distances, to compute the nearest neighbor for the point x in the first set of existing training samples, in [0080]-[0083].)
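
For illustration only, the difference recited in claim 4 can be sketched as follows. It assumes Euclidean distance, a set of at least two existing samples, and that the difference is taken as the within-set minimum minus the new-sample minimum (the claim language does not fix the sign); the function names are hypothetical.

```python
import math
from itertools import combinations


def min_pairwise_distance(points):
    """Smallest distance between any two points of the set."""
    return min(math.dist(a, b) for a, b in combinations(points, 2))


def min_distance_to_set(x, points):
    """Smallest distance between x and any point of the set."""
    return min(math.dist(x, p) for p in points)


def first_similarity_metric(x, points):
    """Within-set minimum distance minus the new sample's minimum distance."""
    return min_pairwise_distance(points) - min_distance_to_set(x, points)
```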

(Masud teaches a method for managing memory usage of a training set, training chunks stored in memory for classification models, in [0093] & as disclosed below):
the claim limitations are similar to those in claim 1 and are rejected under the same rationale
Examiner notes that “means for” is associated with the structure in paragraphs [0077] & [0098] of applicant’s specification as hardware and/or software components on a processor or ASIC for performing corresponding functions. This is disclosed in the Masud reference as the processor for executing computer instructions for performing the disclosed associated functions, in [0142], [0249], [0257], [0261]-[0262].

Regarding claim 9, the rejection of claim 8 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the apparatus of claim 8:
the claim limitations are similar to those in claim 2 and are rejected under the same rationale.

Regarding claim 10, the rejection of claim 8 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the apparatus of claim 8:
the claim limitations are similar to those in claim 3 and are rejected under the same rationale.

Regarding claim 11, the rejection of claim 8 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the apparatus of claim 8:
the claim limitations are similar to those in claim 4 and are rejected under the same rationale.


(Masud teaches a method for managing memory usage of a training set, training chunks stored in memory for classification models, in [0093] & as disclosed below):
a memory; and at least one processor coupled to the memory and configured: (Masud teaches a processor associated with memory, in [0142] & [0256]-[0258].)
the claim limitations are similar to those in claim 1 and are rejected under the same rationale

Regarding claim 16, the rejection of claim 15 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the apparatus of claim 15:
the claim limitations are similar to those in claim 2 and are rejected under the same rationale

Regarding claim 17, the rejection of claim 15 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the apparatus of claim 15:
the claim limitations are similar to those in claim 3 and are rejected under the same rationale

Regarding claim 18, the rejection of claim 15 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the apparatus of claim 15:
the claim limitations are similar to those in claim 4 and are rejected under the same rationale

 for managing memory usage of …., the program code being executed by a processor and comprising: (Masud teaches a method for managing memory usage of a training set, training chunks stored in memory for classification models, in [0093] & as disclosed below; where the disclosure includes a computer memory for storing computer-usable program code executed on a computing device with a processor and memory unit, in [0256]-[0258]):
the claim limitations are similar to those in claim 1 and are rejected under the same rationale.


Regarding claim 23, the rejection of claim 22 is incorporated and Masud in combination with Yang and Law further teaches the computer-readable medium of claim 22:
the claim limitations are similar to those in claim 2 and are rejected under the same rationale.
(Masud teaches where processes are executed using computer program code, in [0261].)

Regarding claim 24, the rejection of claim 22 is incorporated and Masud in combination with Yang and Law further teaches the computer-readable medium of claim 22:
the claim limitations are similar to those in claim 3 and are rejected under the same rationale.
(Masud teaches where processes are executed using computer program code, in [0261].)

Regarding claim 25, the rejection of claim 22 is incorporated and Masud in combination with Yang and Law further teaches the computer-readable medium of claim 22:
the claim limitations are similar to those in claim 4 and are rejected under the same rationale.
(Masud teaches where processes are executed using computer program code, in [0261].)


Claims 5-6, 12-13, 19-20, and 26-27 are rejected under 35 U.S.C. 103 as being unpatentable over Masud (US Patent Application Publication No. 2012/0054184) in view of Yang et al. (US Pub. No. 2011/0222724, hereinafter ‘Yang’), in further view of Law et al. (NPL: “An Adaptive Nearest Neighbor Classification Algorithm for Data Streams”, hereinafter ‘Law’), and in further view of Diao et al. (NPL: “Feature Selection Inspired Classifier Ensemble Reduction”, hereinafter ‘Diao’), and in further view of Alcalde et al. (US Patent Application Publication No. 2008/0021851 hereinafter ‘Alcalde’).

Regarding claim 5, the rejection of claim 1 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the method of claim 1:
wherein the first similarity metric comprises a difference between: a … summed distance between any two points of the first set of existing training samples, and a … summed distance between points of the new training sample and all training samples of the first set of existing training samples. (Masud teaches that the decision boundary determining an x-sample instance class membership uses the average of the summed distances of the points of the new training sample, as the x instance belonging to the novel class c, or all the points in existing training sample c’ (the first set of data from the data stream after the models are trained), to determine the nearest neighbor to the points in new x training samples associated with an existing class, that uses the first similarity metric, in [0089]; where the comparison to determine the nearest neighbor for an x-new training instance sample is determined when the average summed distance of points in the training samples class c is smaller than the average summed distance of any of the existing class c’, in [0089]; where the samples in an existing class, that is, any two points of the existing training samples in class c’, are q nearest neighbors, the distance between all t points of the existing class is minimized, in [0080], that is then summed and averaged, in [0081], and labeled by an existing class label, in [0083].)
While Masud does disclose the use of an averaged distance among data points, in [0081], that can be expressed as a Euclidean distance, in [0100], Masud, Yang, Law, and Diao do not expressly teach the use of a maximized summed distance.
Alcalde does teach the use of a maximized summed distance. (Alcalde teaches the use of the maximized Euclidean distance among data points to select a maximized pair of songs, in [0105], that is expressed as a summed distance, in [0074], among all points in a database from the data sample seed song, in [0074]; where the selected song of possible pairs are used in determining high distance pairs, in [0110], that includes any two points of the existing training data samples, where the longest pairs, that is, a maximum Euclidean distance among any two points, are grouped, in [0118].)
Masud, Yang, Law, Diao, and Alcalde are analogous art because they are directed towards methods and systems for classifying data.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to integrate the use of maximized summed Euclidean distances disclosed by Alcalde with the method for training classification models collectively disclosed by Masud, Yang, Law, and Diao.
One of ordinary skill in the art would have been motivated to integrate the disclosed methods in order to provide an improvement to mathematically determine patterns associated with patterns 

Regarding claim 6, the rejection of claim 1 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the method of claim 1:
the first similarity metric comprises a … summed distance between each point of the first set of existing training samples, and all other sets of existing training samples in other classes, the second similarity metric comprises a … summed distance between points of the new training sample and all other sets of existing training samples in the other classes, and the new training sample is selectively added to the first set of existing training samples based at least in part on a difference between the first similarity metric and the second similarity metric. (Masud teaches selectively adding a new data instance x to the training set based on comparing differences between the mean Euclidean distances based on two distance metrics, that is, $\bar{D}(x, H)$ and $\bar{D}(x', H)$, in [0103]; where the second metric comprises the distance of the x test point (i.e. new training sample in the first set of data from the data stream after the classifiers are trained) that may be inside a decision boundary of existing instances in existing class H (the set of other classes in the ensemble of classifiers, in [0098]-[0099]), and the distance is computed between the xi point in existing cluster H and all points in existing training class H, that is then averaged over the summed distances (i.e. average summed distance), in [0100]-[0103]; and the first metric comprises the distance of x’ (i.e. an existing training sample) as the existing class data point, that is, a point associated with an existing training sample, and the xi points in existing cluster H, all points in existing training class H, in [0099]-[0100], that is then averaged over the summed distances, that is, an average summed distance, in [0103]; for determining whether the test instance x, the first set of training data, is associated (selectively added) with an existing class instance, the set of training data for training an existing class, in [0099]; based on testing the test instance using the decision boundary (using the first and second distance metric), in [0103]-[0104].)
While Masud does disclose the use of an averaged distance among data points, in [0081], that can be expressed as a Euclidean distance, in [0100], Masud, Yang, and Law do not expressly teach the use of a maximized summed distance.
Alcalde does teach the use of a maximized summed distance. (Alcalde teaches the use of the maximized Euclidean distance among data points to select a maximized pair of songs, in [0105], that is expressed as a summed distance, in [0074], among all points in a database from the data sample seed song, that is the second metric, in [0074]; where the selected song of possible pairs are used in determining high distance pairs, in [0110], that includes any two points of the existing training data samples, where the longest pairs, that is, a maximum Euclidean distance among any two points, are grouped, that is the first metric, in [0118].)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Masud, Yang, Law and Alcalde for the same reasons disclosed above in claim 5.
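To make the distinction drawn above concrete, the sketch below contrasts an averaged distance (the mean Euclidean distance from a point to every point of a cluster H) with one possible reading of a maximized summed distance (the largest, over a set of samples, of the summed distances to another set). Both readings are assumptions made for illustration; the function names and data are hypothetical and are not drawn from Masud, Alcalde, or the application.

```python
import math

def mean_distance(x, cluster):
    """Mean Euclidean distance from x to every point in the cluster (an averaged summed distance)."""
    return sum(math.dist(x, xi) for xi in cluster) / len(cluster)

def max_summed_distance(samples, others):
    """Assumed reading: for each sample, sum its distances to all points in
    `others`, then take the largest of those sums."""
    return max(sum(math.dist(s, o) for o in others) for s in samples)

# Hypothetical cluster H, a test point x, and an existing point x_prime.
H = [(0.0, 0.0), (0.0, 2.0)]
x = (0.0, 1.0)
x_prime = (0.0, 4.0)

print(mean_distance(x, H))                    # 1.0
print(mean_distance(x_prime, H))              # 3.0
print(max_summed_distance([x, x_prime], H))   # 6.0
```

A difference-based test of the kind recited in the claim would then compare two such values, for example `mean_distance(x_prime, H) - mean_distance(x, H)`.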
	
Regarding claim 12, the rejection of claim 8 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the apparatus of claim 8:
the claim limitations are similar to those in claim 5 and are rejected under the same rationale.



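Regarding claim 13, the rejection of claim 8 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the apparatus of claim 8:
the claim limitations are similar to those in claim 6 and are rejected under the same rationale.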

	
Regarding claim 19, the rejection of claim 15 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the apparatus of claim 15:
the claim limitations are similar to those in claim 5 and are rejected under the same rationale.

Regarding claim 20, the rejection of claim 15 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the apparatus of claim 15:
the first similarity metric comprises a … summed distance between each point of the first set of existing training samples, and all other sets of existing training samples in other classes, the second similarity metric comprises a … summed distance between points of the new training sample and all other sets of existing training samples in the other classes, and the new training sample is selectively added to the first set of existing training samples based at least in part on a difference between the first similarity metric and the second similarity metric. (Masud teaches selectively adding a new data instance x to the training set based on comparing differences between the mean Euclidean distances based on two distance metrics, that is, $\bar{D}(x, H)$ and $\bar{D}(x', H)$, in [0103]; where the second metric comprises the distance of the x test point (i.e. new training sample in the first set of data from the data stream after the classifiers are trained) that may be inside a decision boundary of existing instances in existing class H (the set of other classes in the ensemble of classifiers, in [0098]-[0099]), and the distance is computed between the xi point in existing cluster H and all points in existing training class H, that is then averaged over the summed distances (i.e. average summed distance), in [0100]-[0103]; and the first metric comprises the distance of x’ (i.e. an existing training sample) as the existing class data point, that is, a point associated with an existing training sample, and the xi points in existing cluster H, all points in existing training class H, in [0099]-[0100], that is then averaged over the summed distances, that is, an average summed distance, in [0103]; for determining whether the test instance x, the first set of training data, is associated (selectively added) with an existing class instance, the set of training data for training an existing class, in [0099]; based on testing the test instance using the decision boundary (using the first and second distance metric), in [0103]-[0104].)
While Masud does disclose the use of an averaged distance among data points, in [0081], that can be expressed as a Euclidean distance, in [0100], Masud, Yang, and Law do not expressly teach the use of a maximized summed distance.
Alcalde does teach the use of a maximized summed distance. (Alcalde teaches the use of the maximized Euclidean distance among data points to select a maximized pair of songs, in [0105], that is expressed as a summed distance, in [0074], among all points in a database from the data sample seed song, that is the second metric, in [0074]; where the selected song of possible pairs are used in determining high distance pairs, in [0110], that includes any two points of the existing training data samples, where the longest pairs, that is, a maximum Euclidean distance among any two points, are grouped, that is the first metric, in [0118].)


Regarding claim 26, the rejection of claim 22 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the computer-readable medium of claim 22:
the claim limitations are similar to those in claim 5 and are rejected under the same rationale.

Regarding claim 27, the rejection of claim 22 is incorporated and Masud in combination with Yang, Law, and Diao further teaches the computer-readable medium of claim 22:
the first similarity metric comprises a … summed distance between each point of the first set of existing training samples, and all other sets of existing training samples in other classes, the second similarity metric comprises a … summed distance between points of the new training sample and all other sets of existing training samples in the other classes, and the new training sample is selectively added to the first set of existing training samples based at least in part on a difference between the first similarity metric and the second similarity metric. (Masud teaches selectively adding a new data instance x to the training set based on comparing differences between the mean Euclidean distances based on two distance metrics, that is, $\bar{D}(x, H)$ and $\bar{D}(x', H)$, in [0103]; where the second metric comprises the distance of the x test point (i.e. new training sample in the first set of data from the data stream after the classifiers are trained) that may be inside a decision boundary of existing instances in existing class H (the set of other classes in the ensemble of classifiers, in [0098]-[0099]), and the distance is computed between the xi point in existing cluster H and all points in existing training class H, that is then averaged over the summed distances (i.e. average summed distance), in [0100]-[0103]; and the first metric comprises the distance of x’ (i.e. an existing training sample) as the existing class data point, that is, a point associated with an existing training sample, and the xi points in existing cluster H, all points in existing training class H, in [0099]-[0100], that is then averaged over the summed distances, that is, an average summed distance, in [0103]; for determining whether the test instance x, the first set of training data, is associated (selectively added) with an existing class instance, the set of training data for training an existing class, in [0099]; based on testing the test instance using the decision boundary (using the first and second distance metric), in [0103]-[0104].)
While Masud does disclose the use of an averaged distance among data points, in [0081], that can be expressed as a Euclidean distance, in [0100], Masud, Yang, and Law do not expressly teach the use of a maximized summed distance.
Alcalde does teach the use of a maximized summed distance. (Alcalde teaches the use of the maximized Euclidean distance among data points to select a maximized pair of songs, in [0105], that is expressed as a summed distance, in [0074], among all points in a database from the data sample seed song, that is the second metric, in [0074]; where the selected song of possible pairs are used in determining high distance pairs, in [0110], that includes any two points of the existing training data samples, where the longest pairs, that is, a maximum Euclidean distance among any two points, are grouped, that is the first metric, in [0118].)

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

The prior art made of record and not relied upon that is considered pertinent to applicant's disclosure is listed below:
Gao et al. (US Pub No 10,520,397): teaches removing redundant samples and using a max distance for classification of samples.
Lewis et al. (US Pat. No.  9111218) teaches a method for updating classifiers based on concept drift and changes in training data in real-time. 
Graepel et al. (US Patent Publication No. 7,167,849): Graepel teaches the use of maximized sum for extrapolating training objects, in 7:29-8:14.
Platt et al. (US Pub No. 2004/0002931): teaches using a trained classifier for making predictions using k-nearest neighbor estimations of input instances in the same class.
Deo et al. (US Pub No. 2009/0210368): teaches retraining a neural network classifier by examining the new training data instances with existing training data.
Ou et al. (NPL: “Multi-class pattern classification using neural networks”): teaches neural networks as an ensemble of classifiers used to generate a decision boundary of data classes associated with an existing set of training samples for the particular class label/class membership.
                                                                                                         
Any inquiry concerning this communication or earlier communications from the examiner should be directed to OLUWATOSIN ALABI whose telephone number is (571)272-0516.  The examiner can normally be reached on Monday-Friday, 8:00am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo can be reached on (571) 272-9767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact 






/O.O.A./Examiner, Art Unit 2126 
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126