DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Objections 
Claims 1, 13, and 25 are objected to.  The limitation “the representative data elements” should read “the one or more representative data elements”.  Appropriate correction is required.
Claims 1, 13, and 25 are objected to.  The limitation “the sequence” should read “the sequence of data elements”.  Appropriate correction is required.
Claim 35 is objected to.  The limitation “wherein said said data” should read “wherein said ”.  Appropriate correction is required.
The drawings are objected to under 37 CFR 1.83(a).  The drawings must show every feature of the invention specified in the claims.  Therefore, “a feature space”, “features of the respective data element”, “neighboring data elements within the feature”, “respective data element”, “representative data element”, “cluster of data elements”, “a sequence of data elements”, “labeled data elements”, “unlabeled data elements”, and “selected unlabeled data elements” must be shown or the features must be canceled from the claims 1-36.  No new matter should be entered.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejection – 35 U.S.C. § 112

The following is a quotation of 35 U.S.C. 112(b): 
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of pre-AIA  35 U.S.C. 112, second paragraph: 
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-12 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention. Claim 1 recites "the features of the respective data element". There is insufficient antecedent basis for this limitation in the claim.  Therefore, claim 1 and its dependent claims are indefinite and are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph.
Claims 1-36 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention. Claims 1, 13, and 25 recite “neighboring data elements within the feature”. It is noted that the claims recite multiple types of data elements, such as “respective data element”, “representative data element”, “a cluster of data elements”, “a sequence of data elements”, “labeled data elements”, “unlabeled data elements”, and “selected unlabeled data elements”.  Hence, it is not clear from the claim language which type of data element the “neighboring data elements” are neighbors of.  Therefore, claims 1, 13, and 25 and their dependent claims are indefinite and are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph.
Claim 11 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention. Claim 11 recites “The method of claim 11”. It is noted that a dependent claim should depend on a preceding claim, not on itself.  Hence, it is not clear which preceding claim claim 11 is intended to depend on.  Therefore, claim 11 is indefinite and is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under pre-AIA  35 U.S.C. 103(a) are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims under pre-AIA 35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was commonly owned at the time any inventions covered therein were made absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was not commonly owned at the time a later invention was made in order for the examiner to consider the applicability of pre-AIA 35 U.S.C. 103(c) and potential pre-AIA 35 U.S.C. 102(e), (f) or (g) prior art under pre-AIA 35 U.S.C. 103(a).

Claims 1-36 are rejected under 35 U.S.C. 103 as being unpatentable over Dasgupta et al. (US Patent 10,719,301 B1), (“Dasgupta”), in view of Garbow et al. (US Patent 7,484,132 B2), (“Garbow”).
Regarding claim 1, Dasgupta meets the claim limitations as follows.
A method (i.e. methods) [Dasgupta: col. 4, line 6] for classifying data elements (i.e. classifying a set of media samples) [Dasgupta: col. 3, line 25], the method being executed by a processor coupled to a computer memory (i.e. One or more non-transitory computer-accessible storage media storing program instructions that when executed on or across one or more processors implement) [Dasgupta: col. 65, line 38-40], the method (i.e. methods) [Dasgupta: col. 4, line 6] comprising: receiving unlabeled data set (i.e. receive a set of unlabeled media samples to be annotated with respective labels) [Dasgupta: Fig. 19A, Box 1910], the unlabeled data set having a plurality of data elements (i.e. a set of unlabeled media samples) [Dasgupta: Fig. 19A, Box 1910], each represented in a feature space (i.e. the feature vector as an intermediate representation of an input image) [Dasgupta: col. 35, line 57-58] comprising a set of values corresponding to the features of the respective data element ((i.e. feature vectors in two-dimensional space, three-dimensional space, or spaces of higher dimensionality. In some embodiments, the user interface may include a button 1436 that allows users to configure which features should be used to make the scatter plot, or how the scatter plot should be displayed) [Dasgupta: col. 35, line 62-64]; (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9]); selecting from the unlabeled data set (i.e. the selection is done from a large corpus of unlabeled samples) [Dasgupta: col. 32, line 6-7], one or more representative data elements to represent corresponding clusters of data elements (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9], the selecting based on proximity of the representative data elements to neighboring data elements within the feature space ((i.e. At operation 1930, a classification model is initialized based on the user's annotations of the seed samples. In some embodiments, the classification model may be the classification model 1380 of FIG. 13. Depending on the embodiment, the classification model may employ one of a variety of different techniques. For example, different classification algorithms may include random forests, Support Vector Machines (SVM), logistic regression, a neural network, or k-NN (nearest neighbor) algorithms.) [Dasgupta: col. 41, line 19-27]; (i.e. one or more images or samples from each cluster (e.g., one or more images near center of the clusters)) [Dasgupta: col. 56, line 31-34]); labeling the representative data elements (i.e. the learner builds a classifier which is then executed over all the unlabeled examples. In some embodiments, samples that are difficult to classify are selected for labeling) [Dasgupta: col. 32, line 10-12] to identify the corresponding clusters (i.e. Initially the active learner is seeded with data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data.) [Dasgupta: col. 32, line 7-9]; and for each one of a sequence of data elements (i.e. for each data set) [Dasgupta: col. 45, line 7], beginning with said representative data elements (i.e. the classifier begins to annotate images from the training set) [Dasgupta: col. 37, line 8-9]: selecting a labeled data element in the sequence ((i.e. the labels selected by the classifier) [Dasgupta: col. 34, line 8]; (i.e. in some embodiments, the classifier model is able to select multiple labels for each sample) [Dasgupta: col. 38, line 64-66]; (i.e. the annotation system employs active learning techniques to interactively select the most informative samples to be annotated) [Dasgupta: col. 32, line 3-5]); selecting unlabeled data elements (i.e. selected from the set of unlabeled media samples) [Dasgupta: Fig. 19A, Box 1920] neighboring that labeled data element (i.e. selected closest training samples from a first class and from a second class) [Dasgupta: Fig. 28, Box 2840]; copying (i.e. moving) [Dasgupta: col. 34, line 18] a label from that labeled data element (i.e. the labels assigned to media samples) [Dasgupta: col. 14, line 53] to the selected unlabeled data elements (i.e. In some embodiments, the moving may be accomplished by updating an indication or designation of a training image in the unlabeled set to indicate that the image is now labeled) [Dasgupta: col. 34, line 18-21]; and Application No.: 16/280,690Page 4 of 11 Preliminary Amendment dated May 9, 2019 adding the selected unlabeled data elements to the sequence (i.e. adding the one or more incorrectly predicted images to the training data set) [Dasgupta: col. 63, line 35-36].
Garbow further discloses the claim limitations as follows:
the selecting based on proximity of the representative data elements to neighboring data elements within the feature space (i.e. assigning a rank value to each nearest neighbor based on the proximity of the nearest neighbor to the point; determining a mutual nearest neighbor score for each pair of nearest neighbor points in the set of points based on the rank values; and clustering the points based on the mutual nearest neighbor score) [Garbow: col. 12, line 19-25].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
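For illustration only, the label-propagation loop recited in claim 1 (select a labeled element in the sequence, select its unlabeled neighbors, copy the label, add those neighbors to the sequence) may be sketched as follows. This sketch is not taken from Dasgupta or Garbow. The function name `propagate_labels`, the parameter `k`, and the use of Euclidean distance with a k-nearest rule are assumptions made solely to give the claim language a concrete form.

```python
from collections import deque
from math import dist


def propagate_labels(points, seed_labels, k=2):
    """Hypothetical sketch of the claim 1 loop.

    points      -- list of coordinate tuples (the feature-space representation)
    seed_labels -- {index: label} for the labeled representative data elements
    k           -- assumed neighborhood size (number of closest neighbors)
    """
    labels = dict(seed_labels)
    sequence = deque(seed_labels)  # the sequence begins with the representatives
    while sequence:
        i = sequence.popleft()  # select a labeled data element in the sequence
        # Order every other element by distance to element i.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(points[i], points[j]),
        )
        for j in others[:k]:  # the k closest neighbors of element i
            if j not in labels:  # only unlabeled neighbors are selected
                labels[j] = labels[i]  # copy the label
                sequence.append(j)  # add the element to the sequence
    return labels
```

With a pure k-nearest rule and no distance cutoff, a label can spread across cluster boundaries if `k` is large relative to cluster size; a threshold proximity of the kind recited in claim 8 would limit that.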

Regarding claim 2, Dasgupta meets the claim limitations as set forth in claim 1. Dasgupta further meets the claim limitations as follows.
The method of claim 1 (i.e. methods) [Dasgupta: col. 4, line 6], wherein selecting the one or more representative data elements (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9; Figs. 22A-B, 25, 27] comprises assigning a mutuality score to each data element representing the portion of mutual neighbors among a threshold ranking of that data element's closest neighbors, wherein data elements are mutual neighbors if they are each among one another's closest neighbors ((i.e. selected one or more closest training samples from the top predicted class) [Dasgupta: Fig. 29, Box 2940]; (i.e. selected closest training samples from a first class and from a second class) [Dasgupta: Fig. 28, Box 2840]).
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The method of claim 1, wherein selecting the one or more representative data elements comprises assigning a mutuality score to each data element representing the portion of mutual neighbors among a threshold ranking of that data element's closest neighbors, wherein data elements are mutual neighbors if they are each among one another's closest neighbors. 
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
(i.e. Each nearest neighbor may also be given a proximity ranking based on the distance from a particular point. The nearer the neighbor to the point the lower the proximity ranking may be.) [Garbow: col. 8, line 43-46], wherein data elements are mutual neighbors if they are each among one another's closest neighbors (i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1, the sum of their proximity rankings with respect to each other. Similarly, the proximity rankings for each point pair may be determined.) [Garbow: col. 8, line 54-63]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
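For illustration only, the two quantities compared above may be sketched side by side: the pairwise MNN score described in the quoted Garbow passage (the sum of two proximity rankings) and the per-element mutuality score recited in claim 2 (the portion of an element's top-ranked neighbors that are mutual). The function names, the Euclidean distance, and the zero-based ranking are assumptions; the zero-based ranking is inferred from the quoted (A,C) example, in which one ranking is 0.

```python
from math import dist


def proximity_ranks(points):
    """ranks[i][j] is the rank of j among i's neighbors; the nearest has rank 0."""
    n = len(points)
    ranks = [dict() for _ in range(n)]
    for i in range(n):
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(points[i], points[j]),
        )
        for r, j in enumerate(order):
            ranks[i][j] = r
    return ranks


def mnn_score(ranks, i, j):
    """Pairwise score as in the quoted passage: sum of the two proximity rankings."""
    return ranks[i][j] + ranks[j][i]


def mutuality_score(ranks, i, k):
    """Claim 2 reading: share of i's k closest neighbors that also rank i in their top k."""
    top = [j for j, r in ranks[i].items() if r < k]
    mutual = [j for j in top if ranks[j][i] < k]
    return len(mutual) / len(top) if top else 0.0
```

The first function yields one number per pair of points, while the second yields one number per point; the sketch makes that difference in granularity explicit without taking a position on its legal significance.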

Regarding claim 3, Dasgupta meets the claim limitations as set forth in claim 2. Dasgupta further meets the claim limitations as follows.
The method of claim 2 (i.e. methods) [Dasgupta: col. 4, line 6], wherein selecting the one or more representative data elements comprises assigning a cluster score (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9; Figs. 22A-B, 25, 27] to each data element ((i.e. selected one or more closest training samples from the top predicted class) [Dasgupta: Fig. 29, Box 2940]; (i.e. selected closest training samples from a first class and from a second class) [Dasgupta: Fig. 28, Box 2840]), representing a combination of mutuality scores at multiple threshold ranking values.
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The method of claim 2, wherein selecting the one or more representative data elements comprises assigning a cluster score to each data element, representing a combination of mutuality scores at multiple threshold ranking values.
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
representing a combination of mutuality scores at multiple threshold ranking values ((i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1, the sum of their proximity rankings with respect to each other. Similarly, the proximity rankings for each point pair may be determined.) [Garbow: col. 8, line 54-63]; (i.e. Each nearest neighbor may also be given a proximity ranking based on the distance from a particular point. The nearer the neighbor to the point the lower the proximity ranking may be) [Garbow: col. 8, line 43-46]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
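For illustration only, the cluster score recited in claim 3 (a combination of mutuality scores at multiple threshold ranking values) may be sketched as follows. The claim does not specify the combination; an arithmetic mean is assumed here, and the names `mutuality`, `cluster_score`, and `ks` are hypothetical. Neither reference is represented as disclosing this computation.

```python
from math import dist


def neighbor_order(points, i):
    """Indices of all other elements, ordered from closest to farthest from i."""
    return sorted(
        (j for j in range(len(points)) if j != i),
        key=lambda j: dist(points[i], points[j]),
    )


def mutuality(points, i, k):
    """Share of i's k closest neighbors that also count i among their k closest."""
    top = neighbor_order(points, i)[:k]
    mutual = [j for j in top if i in neighbor_order(points, j)[:k]]
    return len(mutual) / len(top) if top else 0.0


def cluster_score(points, i, ks=(1, 2, 3)):
    """Assumed combination: mean of the mutuality scores over several thresholds k."""
    return sum(mutuality(points, i, k) for k in ks) / len(ks)
```

Under this reading, an element near the center of a dense group scores high at every threshold, while an isolated element scores low, which is also the intuition behind the outlier limitations of claims 10-12.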

Regarding claim 4, Dasgupta meets the claim limitations as set forth in claim 3. Dasgupta further meets the claim limitations as follows.
The method of claim 3 (i.e. methods) [Dasgupta: col. 4, line 6], wherein said sequence is ordered according to cluster score (i.e. In some embodiments, the to-do list may be sorted in the order of impact on a chosen accuracy measure, for example the F1 score) [Dasgupta: col. 48, line 60-62; Figs. 22A-B, 25, 27].

Regarding claim 5, Dasgupta meets the claim limitations as set forth in claim 3. Dasgupta further meets the claim limitations as follows.
The method of claim 3 (i.e. methods) [Dasgupta: col. 4, line 6], wherein selecting the one or more representative data elements comprises selecting data elements with a cluster score (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9; Figs. 22A-B, 25, 27] above a defined threshold (i.e. until the classifier's performance is above a certain threshold, or after the classifier) [Dasgupta: col. 39, line 11-12].

Regarding claim 6, Dasgupta meets the claim limitations as set forth in claim 5. Dasgupta further meets the claim limitations as follows.
The method of claim 5 (i.e. methods) [Dasgupta: col. 4, line 6], wherein selecting the one or more representative data elements comprises assigning a cluster score (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9; Figs. 22A-B, 25, 27] higher than the cluster scores of their mutual neighbors.
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The method of claim 5, wherein selecting the one or more representative data elements comprises assigning a cluster score higher than the cluster scores of their mutual neighbors.
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
higher than the cluster scores of their mutual neighbors (i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1) [Garbow: col. 8, line 54-63 – Note: Garbow’s method selects a higher cluster score 1]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22].

Regarding claim 7, Dasgupta meets the claim limitations as set forth in claim 1. Dasgupta further meets the claim limitations as follows.
The method of claim 1 (i.e. methods) [Dasgupta: col. 4, line 6], further comprising receiving labels for each one of the representatives from a user input device, each label representing a class (i.e. The sample annotation interface 532 may be configured to allow a user to manually or programmatically annotate individual input samples (e.g. images). In some embodiments, the sample annotation interface 532 may be based on a trained classifier, which has been trained to annotate samples by observing user annotation behavior, and is automatically used to annotate incoming media samples) [Dasgupta: col. 19, line 23 – col. 20, line 2].

Regarding claim 8, Dasgupta meets the claim limitations as set forth in claim 1. Dasgupta further meets the claim limitations as follows.
The method of claim 1 (i.e. methods) [Dasgupta: col. 4, line 6], wherein the selecting unlabeled data elements comprises selecting data elements that are within a threshold proximity of the selected labeled one of the data elements ((i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9]; (i.e. The extrapolation process may be initiated, for example, because the accuracy level of the classifier in predicting user annotations have reached a certain threshold level) [Dasgupta: col. 35, line 2]).

Regarding claim 9, Dasgupta meets the claim limitations as set forth in claim 7. Dasgupta further meets the claim limitations as follows.
The method of claim 7 (i.e. methods) [Dasgupta: col. 4, line 6], wherein the threshold proximity (i.e. The extrapolation process may be initiated, for example, because the accuracy level of the classifier in predicting user annotations have reached a certain threshold level) [Dasgupta: col. 35, line 2] is defined as a threshold rank of the closest neighboring data elements.
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The method of claim 7, wherein the threshold proximity is defined as a threshold rank of the closest neighboring data elements. 
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
wherein the threshold proximity is defined as a threshold rank of the closest neighboring data elements ((i.e. Each nearest neighbor may also be given a proximity ranking based on the distance from a particular point. The nearer the neighbor to the point the lower the proximity ranking may be.) [Garbow: col. 8, line 43-46]; (i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1, the sum of their proximity rankings with respect to each other. Similarly, the proximity rankings for each point pair may be determined) [Garbow: col. 8, line 54-63]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 

Regarding claim 10, Dasgupta meets the claim limitations as set forth in claim 3. Dasgupta further meets the claim limitations as follows.
The method of claim 3 (i.e. methods) [Dasgupta: col. 4, line 6], comprising identifying one or more data elements as outliers based on proximity to neighboring data elements within said feature space (i.e. In some embodiments, the MDE implements data visualization techniques such as PCA (principal component analysis) and t-SNE (T-distributed Stochastic Neighbor Embedding) to help locate outlier media samples, allowing the user to easily identify and address these media samples) [Dasgupta: col. 6, line 15-20].
In the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
identifying one or more data elements as outliers based on proximity to neighboring data elements within said feature space (i.e. An r-cut value may be specified to indicate a threshold radius. Data points falling outside of the threshold radius, r-cut, may not be considered a near neighbor with respect to a reference data point. For example, if an r-cut value of 4 is specified, then only points B, C and D may be considered near neighbors of point A because they are at or within a distance of 4 from A. However, points E and F are not near neighbors of point A because they are at a distance greater than 4 from A (see table in FIG. 4)) [Garbow: col. 8, line 23-31; Fig. 4]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
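For illustration only, the r-cut test in the quoted Garbow passage (a point is a near neighbor of a reference point only if it lies at or within the threshold radius) may be sketched as follows. The table of FIG. 4 is not reproduced in this action, so the distances below are hypothetical values chosen only to be consistent with the quoted outcome for an r-cut of 4; the function name `near_neighbors` is likewise an assumption.

```python
def near_neighbors(distances, reference, r_cut):
    """Return the points at or within r_cut of the reference point, sorted by name."""
    return sorted(p for p, d in distances[reference].items() if d <= r_cut)


# Hypothetical distances from point A; not the values of Garbow's FIG. 4.
distances = {"A": {"B": 2, "C": 3, "D": 4, "E": 6, "F": 7}}

# With an r-cut of 4, only B, C, and D qualify as near neighbors of A.
neighbors_of_a = near_neighbors(distances, "A", r_cut=4)
```

The quoted passage applies the radius to decide which points count as near neighbors; treating a point with no near neighbors as an outlier would be a further step that this sketch does not attribute to Garbow.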

Regarding claim 11, Dasgupta meets the claim limitations as set forth in claim 11. Dasgupta further meets the claim limitations as follows.
The method of claim 11 (i.e. methods) [Dasgupta: col. 4, line 6], wherein identifying said one or more outliers comprises identifying data elements (i.e. In some embodiments, the MDE implements data visualization techniques such as PCA (principal component analysis) and t-SNE (T-distributed Stochastic Neighbor Embedding) to help locate outlier media samples, allowing the user to easily identify and address these media samples) [Dasgupta: col. 6, line 15-20] having a mutuality score below a threshold value (i.e. In some embodiments, if one or more performance metrics fall below a specified threshold, an aberration may be detected) [Dasgupta: col. 30, line 5-7].

Regarding claim 12, Dasgupta meets the claim limitations as set forth in claim 10. Dasgupta further meets the claim limitations as follows.
The method of claim 10 (i.e. methods) [Dasgupta: col. 4, line 6], wherein selecting said one or more outliers comprises selecting data elements (i.e. In some embodiments, the MDE implements data visualization techniques such as PCA (principal component analysis) and t-SNE (T-distributed Stochastic Neighbor Embedding) to help locate outlier media samples, allowing the user to easily identify and address these media samples) [Dasgupta: col. 6, line 15-20] having cluster scores lower than the cluster scores of their mutual neighbors.
In the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
wherein selecting said one or more outliers comprises selecting data elements having cluster scores lower than the cluster scores of their mutual neighbors (i.e. In one embodiment, the MNN score may be used to cluster pairs of data points. In one aspect, using the MNN score clusters are created based on the mutual strength of the bond between a pair of points. A threshold MNN score value may be specified so that only those points with the strongest mutual bond are clustered. In FIG. 6, the lower the MNN score, the stronger the bond between a pair of points. Therefore, the pairs of points with the lowest MNN score within the threshold may be clustered first. For example, with an MNN threshold value of 2, all pairs of points in FIG. 6 will be clustered together, with the points with an MNN score of 1 being clustered first. However, if a threshold value of 1 was used, then points B and C will not be clustered because the MNN score for the point pair including points B and C is 2. FIGS. 7A-7F illustrate a series of clustering steps to cluster pairs of points in table 601 for an MNN threshold value of 2. Two clusters are formed as a result, and are illustrated in FIG. 7F. The first cluster consists of points A, B, and C; and the second cluster consists of points D, E and F. While the above description illustrates an agglomerative mutual nearest neighbor clustering algorithm being used to cluster data points, one skilled in the art will recognize that any appropriate clustering algorithm may be used. For example, the k-means clustering or the hierarchical clustering algorithms may be used instead of the agglomerative clustering described above.) [Garbow: col. 8, line 66 – col. 9, line 31; Fig. 6]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
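For illustration only, the threshold-based clustering described in the quoted Garbow passage (pairs with the lowest MNN score are clustered first, and only pairs within the MNN threshold are clustered) may be sketched with a simple union-find structure. The scores for pairs (A,C) and (B,C) follow the quoted text; the remaining scores are hypothetical, since the table of FIG. 6 is not reproduced in this action. The function name `cluster_by_mnn` and the union-find approach are assumptions, not Garbow's stated implementation.

```python
def cluster_by_mnn(pair_scores, threshold):
    """Merge point pairs in ascending MNN-score order, skipping pairs above threshold."""
    parent = {}

    def find(p):
        # Return the current cluster root of p, registering p on first sight.
        parent.setdefault(p, p)
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for a, b in pair_scores:  # register every point, including ones never merged
        find(a)
        find(b)
    for (a, b), score in sorted(pair_scores.items(), key=lambda kv: kv[1]):
        if score <= threshold:  # lower score means a stronger mutual bond
            parent[find(a)] = find(b)
    clusters = {}
    for p in parent:
        clusters.setdefault(find(p), set()).add(p)
    return sorted(sorted(c) for c in clusters.values())


# (A,C)=1 and (B,C)=2 are from the quoted text; (D,E) and (E,F) are hypothetical.
scores = {("A", "C"): 1, ("B", "C"): 2, ("D", "E"): 1, ("E", "F"): 2}
```

With these scores, a threshold of 2 yields the two clusters described in the quoted passage, and a threshold of 1 leaves B and C unclustered with each other. Because every qualifying pair is merged, the final clusters do not depend on merge order; the ordering matters only for the intermediate steps shown in Garbow's FIGS. 7A-7F.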

Regarding claim 13, Dasgupta meets the claim limitations as follows.
A computing system (i.e. computer system) [Dasgupta: col. 60, line 52] for classifying data elements (i.e. classifying a set of media samples) [Dasgupta: col. 3, line 25] comprising:
a memory (i.e. a system memory 3020) [Dasgupta: col. 60, line 59] storing an unlabeled data set ((i.e. In some embodiments, the feature vector may simply be stored as the representation of the test sample in the MDE, the feature vector is provided as input to the ML media model) [Dasgupta: col. 60, line 1-4]; (i.e. in some embodiments, the feature vectors of the training images may be stored with the training images in an image repository) [Dasgupta: col. 60, line 7-10]), the unlabeled data set having a plurality of data elements (i.e. a set of unlabeled media samples) [Dasgupta: Fig. 19A, Box 1910] represented in a feature space (i.e. the feature vector as an intermediate representation of an input image) [Dasgupta: col. 35, line 57-58], and storing executable instructions (i.e. System memory 3020 may be configured to store instructions and data accessible by processor(s) 3010) [Dasgupta: col. 61, line 8-9]; and
at least one processor configured to execute the executable instructions, the executable instructions causing the processor to (i.e. Processors 3010 may be any suitable processors capable of executing instructions) [Dasgupta: col. 60, line 66 – col. 61, line 1]: generate a labeled data set (i.e. Generate a dataset creation user interface to create datasets for a machine learning model using the exported labeled samples) [Dasgupta: Fig. 19, Box 1996] by:
selecting from the unlabeled data set (i.e. the selection is done from a large corpus of unlabeled samples) [Dasgupta: col. 32, line 6-7], one or more representative data elements to represent corresponding clusters of data elements (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9], the selecting based on proximity of the representative data elements to neighboring data elements within the feature space ((i.e. At operation 1930, a classification model is initialized based on the user's annotations of the seed samples. In some embodiments, the classification model may be the classification model 1380 of FIG. 13. Depending on the embodiment, the classification model may employ one of a variety of different techniques. For example, different classification algorithms may include random forests, Support Vector Machines (SVM), logistic regression, a neural network, or k-NN (nearest neighbor) algorithms.) [Dasgupta: col. 41, line 19-27]; (i.e. one or more images or samples from each cluster (e.g., one or more images near center of the clusters)) [Dasgupta: col. 56, line 31-34]);
labeling the representative data elements (i.e. the learner builds a classifier which is then executed over all the unlabeled examples. In some embodiments, samples that are difficult to classify are selected for labeling) [Dasgupta: col. 32, line 10-12] to identify the corresponding clusters (i.e. Initially the active learner is seeded with data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data.) [Dasgupta: col. 32, line 7-9]; and
for each one of a sequence of data elements (i.e. for each data set) [Dasgupta: col. 45, line 7], beginning with said representative data elements (i.e. the classifier begins to annotate images from the training set) [Dasgupta: col. 37, line 8-9]:
selecting a labeled data element in the sequence ((i.e. the labels selected by the classifier) [Dasgupta: col. 34, line 8]; (i.e. in some embodiments, the classifier model is able to select multiple labels for each sample) [Dasgupta: col. 38, line 64-66]; (i.e. the annotation system employs active learning techniques to interactively select the most informative samples to be annotated) [Dasgupta: col. 32, line 3-5]);
selecting unlabeled data elements (i.e. selected from the set of unlabeled media samples) [Dasgupta: Fig. 19A, Box 1920] neighboring that labeled data element (i.e. selected closest training samples from a first class and from a second class) [Dasgupta: Fig. 28, Box 2840];
copying (i.e. moving) [Dasgupta: col. 34, line 18] a label from that labeled data element (i.e. the labels assigned to media samples) [Dasgupta: col. 14, line 53] to the selected unlabeled data elements (i.e. In some embodiments, the moving may be accomplished by updating an indication or designation of a training image in the unlabeled set to indicate that the image is now labeled) [Dasgupta: col. 34, line 18-21]; and
adding the selected unlabeled data elements to the sequence (i.e. adding the one or more incorrectly predicted images to the training data set) [Dasgupta: col. 63, line 35-36].
Garbow further discloses the claim limitations as follows:
the selecting based on proximity of the representative data elements to neighboring data elements within the feature space (i.e. assigning a rank value to each nearest neighbor based on the proximity of the nearest neighbor to the point; determining a mutual nearest neighbor score for each pair of nearest neighbor points in the set of points based on the rank values; and clustering the points based on the mutual nearest neighbor score) [Garbow: col. 12, line 19-25].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
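For illustration only, the propagation loop recited in claim 13 (select a labeled element from the sequence, select its unlabeled neighbors, copy the label, add those neighbors to the sequence) can be sketched as follows. This is a minimal sketch and is not taken from Dasgupta, Garbow, or the application: the function name `propagate_labels`, the Euclidean distance, and the neighbor count `k` are assumptions chosen for the example.

```python
from collections import deque
import math

def propagate_labels(points, seed_labels, k=2):
    """Hypothetical sketch of the claimed loop.

    points      -- list of coordinate tuples in the feature space
    seed_labels -- dict {index: label} for the representative elements
    k           -- assumed number of nearest neighbors to consider
    """
    labels = dict(seed_labels)
    sequence = deque(seed_labels)          # the sequence begins with the representatives
    while sequence:
        current = sequence.popleft()       # select a labeled element in the sequence
        others = [i for i in range(len(points)) if i != current]
        others.sort(key=lambda i: math.dist(points[current], points[i]))
        for i in others[:k]:               # neighbors of that labeled element
            if i not in labels:            # only unlabeled neighbors are selected
                labels[i] = labels[current]    # copy the label
                sequence.append(i)             # add the element to the sequence
    return labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
result = propagate_labels(points, {0: "a", 3: "b"})
```

With the two seeds above, each group of three nearby points ends up carrying the label of its own representative.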

Regarding claim 14, Dasgupta meets the claim limitations as set forth in claim 13. Dasgupta further meets the claim limitations as follows.
The computing system of claim 13 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein selecting the one or more representative data elements (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9] comprises assigning a mutuality score to each data element representing the portion of mutual neighbors among a threshold ranking of that data element's closest neighbors, wherein data elements are mutual neighbors if they are each among one another's closest neighbors ((i.e. selected one or more closest training samples from the top predicted class) [Dasgupta: Fig. 29, Box 2940]; (i.e. selected closest training samples from a first class and from a second class) [Dasgupta: Fig. 28, Box 2840]).
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The computing system of claim 13, wherein selecting the one or more representative data elements comprises assigning a mutuality score to each data element representing the portion of mutual neighbors among a threshold ranking of that data element's closest neighbors, wherein data elements are mutual neighbors if they are each among one another's closest neighbors. 
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
(i.e. Each nearest neighbor may also be given a proximity ranking based on the distance from a particular point. The nearer the neighbor to the point the lower the proximity ranking may be.) [Garbow: col. 8, line 43-46], wherein data elements are mutual neighbors if they are each among one another's closest neighbors (i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1, the sum of their proximity rankings with respect to each other. Similarly, the proximity rankings for each point pair may be determined.) [Garbow: col. 8, line 54-63]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
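For illustration only, the mutual nearest neighbor (MNN) score quoted from Garbow (the sum of the two proximity rankings of a pair of points, with the nearest neighbor ranked 0) can be sketched as follows. The coordinates are hypothetical and do not reproduce Garbow's FIG. 5; they were chosen only so that the pair (A, C) yields the score of 1 given in the quoted passage. The function names and the Euclidean distance are likewise assumptions.

```python
import math

def proximity_ranks(points):
    """ranks[p][q] is the rank of q among p's neighbors (0 = nearest)."""
    ranks = {}
    for p, xy in points.items():
        ordered = sorted((q for q in points if q != p),
                         key=lambda q: math.dist(xy, points[q]))
        ranks[p] = {q: r for r, q in enumerate(ordered)}
    return ranks

def mnn_score(ranks, p, q):
    """Sum of the proximity rankings of p and q with respect to each other."""
    return ranks[p][q] + ranks[q][p]

# Hypothetical coordinates, not the values shown in Garbow's figures.
points = {"A": (0.0, 0.0), "B": (-1.0, 0.0), "C": (1.5, 0.0)}
ranks = proximity_ranks(points)
score_ac = mnn_score(ranks, "A", "C")
```

Here C is the second-nearest neighbor of A (rank 1) and A is the nearest neighbor of C (rank 0), so the score for the pair (A, C) is 1; a lower score indicates a stronger mutual bond.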

Regarding claim 15, Dasgupta meets the claim limitations as set forth in claim 14. Dasgupta further meets the claim limitations as follows.
The computing system of claim 14 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein selecting the one or more representative data elements comprises assigning a cluster score (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9; Figs. 22A-B, 25, 27] to each data element ((i.e. selected one or more closest training samples from the top predicted class) [Dasgupta: Fig. 29, Box 2940]; (i.e. selected closest training samples from a first class and from a second class) [Dasgupta: Fig. 28, Box 2840]), representing a combination of mutuality scores at multiple threshold ranking values.
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The computing system of claim 14, wherein selecting the one or more representative data elements comprises assigning a cluster score to each data element, representing a combination of mutuality scores at multiple threshold ranking values. 
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
representing a combination of mutuality scores at multiple threshold ranking values ((i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1, the sum of their proximity rankings with respect to each other. Similarly, the proximity rankings for each point pair may be determined.) [Garbow: col. 8, line 54-63]; (i.e. Each nearest neighbor may also be given a proximity ranking based on the distance from a particular point. The nearer the neighbor to the point the lower the proximity ranking may be) [Garbow: col. 8, line 43-46]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
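For illustration only, one possible reading of the claim language of claims 14 and 15 is sketched below: a mutuality score as the portion of an element's k closest neighbors that also count the element among their own k closest neighbors, and a cluster score that combines mutuality scores at several values of k. This sketch is not drawn from either reference. The use of the arithmetic mean as the "combination", the Euclidean distance, and all function names are assumptions made for the example.

```python
import math

def knn(points, i, k):
    """Indices of the k closest neighbors of element i (assumed Euclidean distance)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])

def mutuality_score(points, i, k):
    """Portion of i's k closest neighbors that also have i among their k closest."""
    neighbors = knn(points, i, k)
    mutual = [j for j in neighbors if i in knn(points, j, k)]
    return len(mutual) / k

def cluster_score(points, i, ks):
    """Assumed combination: the mean of the mutuality scores over several k values."""
    return sum(mutuality_score(points, i, k) for k in ks) / len(ks)

points = [(0, 0), (0, 1), (1, 0), (10, 10)]
dense = cluster_score(points, 0, (1, 2))     # element inside a tight group
isolated = cluster_score(points, 3, (1, 2))  # element far from all others
```

Under these assumptions an element in a tight group scores high, while an isolated element, which no other element counts among its closest neighbors, scores 0.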

Regarding claim 16, Dasgupta meets the claim limitations as set forth in claim 15. Dasgupta further meets the claim limitations as follows.
The computing system of claim 15 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein said sequence is ordered according to cluster score (i.e. In some embodiments, the to-do list may be sorted in the order of impact on a chosen accuracy measure, for example the F1 score) [Dasgupta: col. 48, line 60-62; Figs. 22A-B, 25, 27].

Regarding claim 17, Dasgupta meets the claim limitations as set forth in claim 15. Dasgupta further meets the claim limitations as follows.
The computing system of claim 15 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein selecting the one or more representative data elements comprises selecting data elements with a cluster score (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9; Figs. 22A-B, 25, 27] above a defined threshold (i.e. until the classifier's performance is above a certain threshold, or after the classifier) [Dasgupta: col. 39, line 11-12].

Regarding claim 18, Dasgupta meets the claim limitations as set forth in claim 17. Dasgupta further meets the claim limitations as follows.
The computing system of claim 17 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein selecting the one or more representative data elements comprises assigning a cluster score (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9; Figs. 22A-B, 25, 27] higher than the cluster scores of their mutual neighbors.
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The computing system of claim 17, wherein selecting the one or more representative data elements comprises assigning a cluster score higher than the cluster scores of their mutual neighbors. 
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
higher than the cluster scores of their mutual neighbors (i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1) [Garbow: col. 8, line 54-63 – Note: Garbow’s method selects a higher cluster score 1]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22].

Regarding claim 19, Dasgupta meets the claim limitations as set forth in claim 13. Dasgupta further meets the claim limitations as follows.
The computing system of claim 13 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein said instructions cause said processor to (i.e. One or more non-transitory computer-accessible storage media storing program instructions that when executed on or across one or more processors implement) [Dasgupta: col. 65, line 38-40] receive labels for each one of the representatives from a user input device, each label representing a class (i.e. The sample annotation interface 532 may be configured to allow a user to manually or programmatically annotate individual input samples (e.g. images). In some embodiments, the sample annotation interface 532 may be based on a trained classifier, which has been trained to annotate samples by observing user annotation behavior, and is automatically used to annotate incoming media samples) [Dasgupta: col. 19, line 23 – col. 20, line 2].

Regarding claim 20, Dasgupta meets the claim limitations as set forth in claim 13. Dasgupta further meets the claim limitations as follows.
The computing system of claim 13 (i.e. computer system) [Dasgupta: col. 60, line 52],  wherein the selecting unlabeled data elements comprises selecting data elements that are within a threshold proximity of the selected labeled one of the data elements ((i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9]; (i.e. The extrapolation process may be initiated, for example, because the accuracy level of the classifier in predicting user annotations have reached a certain threshold level) [Dasgupta: col. 35, line 2]).

Regarding claim 21, Dasgupta meets the claim limitations as set forth in claim 20. Dasgupta further meets the claim limitations as follows.
The computing system of claim 20 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein the threshold proximity (i.e. The extrapolation process may be initiated, for example, because the accuracy level of the classifier in predicting user annotations have reached a certain threshold level) [Dasgupta: col. 35, line 2] is defined as a threshold rank of the closest neighboring data elements.
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The computing system of claim 20, wherein the threshold proximity is defined as a threshold rank of the closest neighboring data elements. 
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
wherein the threshold proximity is defined as a threshold rank of the closest neighboring data elements ((i.e. Each nearest neighbor may also be given a proximity ranking based on the distance from a particular point. The nearer the neighbor to the point the lower the proximity ranking may be.) [Garbow: col. 8, line 43-46]; (i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1, the sum of their proximity rankings with respect to each other. Similarly, the proximity rankings for each point pair may be determined) [Garbow: col. 8, line 54-63]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 

Regarding claim 22, Dasgupta meets the claim limitations as set forth in claim 15. Dasgupta further meets the claim limitations as follows.
The computing system of claim 15 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein said instructions cause said processor to (i.e. One or more non-transitory computer-accessible storage media storing program instructions that when executed on or across one or more processors implement) [Dasgupta: col. 65, line 38-40] identify one or more data elements as outliers based on proximity to neighboring data elements within said feature space (i.e. In some embodiments, the MDE implements data visualization techniques such as PCA (principal component analysis) and t-SNE (T-distributed Stochastic Neighbor Embedding) to help locate outlier media samples, allowing the user to easily identify and address these media samples) [Dasgupta: col. 6, line 15-20].
In the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
identify one or more data elements as outliers based on proximity to neighboring data elements within said feature space (i.e. An r-cut value may be specified to indicate a threshold radius. Data points falling outside of the threshold radius, r-cut, may not be considered a near neighbor with respect to a reference data point. For example, if an r-cut value of 4 is specified, then only points B, C and D may be considered near neighbors of point A because they are at or within a distance of 4 from A. However, points E and F are not near neighbors of point A because they are at a distance greater than 4 from A (see table in FIG. 4)) [Garbow: col. 8, line 23-31; Fig. 4]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
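For illustration only, the r-cut test quoted from Garbow (a point is a near neighbor of a reference point only if it lies at or within the threshold radius) can be sketched as follows. The distances are hypothetical and do not reproduce the table in Garbow's FIG. 4; they were chosen only so that an r-cut of 4 keeps B, C and D and excludes E and F, as in the quoted passage. The function name is an assumption.

```python
def near_neighbors(distances, r_cut):
    """Names of points at or within r_cut of the reference point."""
    return sorted(name for name, d in distances.items() if d <= r_cut)

# Hypothetical distances from reference point A, not the values in Garbow's FIG. 4.
distances_from_a = {"B": 1.0, "C": 2.5, "D": 4.0, "E": 5.0, "F": 7.5}
kept = near_neighbors(distances_from_a, r_cut=4)
```

Point D lies exactly at the threshold radius and is kept because the quoted passage treats points "at or within" the r-cut distance as near neighbors.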

Regarding claim 23, Dasgupta meets the claim limitations as set forth in claim 22. Dasgupta further meets the claim limitations as follows.
The computing system of claim 22 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein said instructions cause said processor to (i.e. One or more non-transitory computer-accessible storage media storing program instructions that when executed on or across one or more processors implement) [Dasgupta: col. 65, line 38-40]  identify said one or more outliers comprises identifying data elements (i.e. In some embodiments, the MDE implements data visualization techniques such as PCA (principal component analysis) and t-SNE (T-distributed Stochastic Neighbor Embedding) to help locate outlier media samples, allowing the user to easily identify and address these media samples) [Dasgupta: col. 6, line 15-20] having a mutuality score below a threshold value (i.e. In some embodiments, if one or more performance metrics fall below a specified threshold, an aberration may be detected) [Dasgupta: col. 30, line 5-7].

Regarding claim 24, Dasgupta meets the claim limitations as set forth in claim 23. Dasgupta further meets the claim limitations as follows.
The computing system of claim 13 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein said instructions cause said processor to (i.e. One or more non-transitory computer-accessible storage media storing program instructions that when executed on or across one or more processors implement) [Dasgupta: col. 65, line 38-40] select said one or more outliers comprises selecting data elements (i.e. In some embodiments, the MDE implements data visualization techniques such as PCA (principal component analysis) and t-SNE (T-distributed Stochastic Neighbor Embedding) to help locate outlier media samples, allowing the user to easily identify and address these media samples) [Dasgupta: col. 6, line 15-20] having cluster scores lower than the cluster scores of their mutual neighbors.
In the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
select said one or more outliers comprises selecting data elements having cluster scores lower than the cluster scores of their mutual neighbors (i.e. In one embodiment, the MNN score may be used to cluster pairs of data points. In one aspect, using the MNN score clusters are created based on the mutual strength of the bond between a pair of points. A threshold MNN score value may be specified so that only those points with the strongest mutual bond are clustered. In FIG. 6, the lower the MNN score, the stronger the bond between a pair of points. Therefore, the pairs of points with the lowest MNN score within the threshold may be clustered first. For example, with an MNN threshold value of 2, all pairs of points in FIG. 6 will be clustered together, with the points with an MNN score of 1 being clustered first. However, if a threshold value of 1 was used, then points B and C will not be clustered because the MNN score for the point pair including points B and C is 2. FIGS. 7A-7F illustrate a series of clustering steps to cluster pairs of points in table 601 for an MNN threshold value of 2. Two clusters are formed as a result, and are illustrated in FIG. 7F. The first cluster consists of points A, B, and C; and the second cluster consists of points D, E and F. While the above description illustrates an agglomerative mutual nearest neighbor clustering algorithm being used to cluster data points, one skilled in the art will recognize that any appropriate clustering algorithm may be used. For example, the k-means clustering or the hierarchical clustering algorithms may be used instead of the agglomerative clustering described above.) [Garbow: col. 8, line 66 – col. 9, line 31; Fig. 6]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 

Regarding claim 25, Dasgupta meets the claim limitations as follows.
A computing device (i.e. computer system) [Dasgupta: col. 60, line 52] comprising:
a memory (i.e. a system memory 3020) [Dasgupta: col. 60, line 59] storing an unlabeled data set ((i.e. In some embodiments, the feature vector may simply be stored as the representation of the test sample in the MDE, the feature vector is provided as input to the ML media model) [Dasgupta: col. 60, line 1-4]; (i.e. in some embodiments, the feature vectors of the training images may be stored with the training images in an image repository.) [Dasgupta: col. 60, line 7-10]), the unlabeled data set having a plurality of data elements (i.e. a set of unlabeled media samples) [Dasgupta: Fig. 19A, Box 1910] represented in a feature space ((i.e. At operation 1930, a classification model is initialized based on the user's annotations of the seed samples. In some embodiments, the classification model may be the classification model 1380 of FIG. 13. Depending on the embodiment, the classification model may employ one of a variety of different techniques. For example, different classification algorithms may include random forests, Support Vector Machines (SVM), logistic regression, a neural network, or k-NN (nearest neighbor) algorithms.) [Dasgupta: col. 41, line 19-27]; (i.e. one or more images or samples from each cluster (e.g., one or more images near center of the clusters)) [Dasgupta: col. 56, line 31-34]);
a data element selector to select (i.e. the model diagnosis system to select) [Dasgupta: col. 59, line 46-47; Fig. 29] one or more representative data elements (i.e. a set of unlabeled media samples) [Dasgupta: Fig. 19A, Box 1910] to represent corresponding clusters of data elements (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9], based on proximity of the representative data elements to neighboring data elements within the feature space from the unlabeled data set ((i.e. At operation 1930, a classification model is initialized based on the user's annotations of the seed samples. In some embodiments, the classification model may be the classification model 1380 of FIG. 13. Depending on the embodiment, the classification model may employ one of a variety of different techniques. For example, different classification algorithms may include random forests, Support Vector Machines (SVM), logistic regression, a neural network, or k-NN (nearest neighbor) algorithms.) [Dasgupta: col. 41, line 19-27]; (i.e. one or more images or samples from each cluster (e.g., one or more images near center of the clusters)) [Dasgupta: col. 56, line 31-34]);
a label generator (i.e. computer system 3000 includes one or more processors 3010) [Dasgupta: col. 60, line 58-59] to label the representative data elements (i.e. the learner builds a classifier which is then executed over all the unlabeled examples. In some embodiments, samples that are difficult to classify are selected for labeling) [Dasgupta: col. 32, line 10-12] to identify the corresponding clusters (i.e. Initially the active learner is seeded with data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data.) [Dasgupta: col. 32, line 7-9], and to propagate (i.e. moving) [Dasgupta: col. 34, line 18] labels from labeled data elements in the data set (i.e. the labels assigned to media samples) [Dasgupta: col. 14, line 53] to unlabeled data elements in the data set by (i.e. In some embodiments, the moving may be accomplished by updating an indication or designation of a training image in the unlabeled set to indicate that the image is now labeled) [Dasgupta: col. 34, line 18-21], for each one of a sequence of data elements (i.e. for each data set) [Dasgupta: col. 45, line 7], beginning with said representative data elements (i.e. the classifier begins to annotate images from the training set) [Dasgupta: col. 37, line 8-9]:
selecting a labeled data element in the sequence ((i.e. the labels selected by the classifier) [Dasgupta: col. 34, line 8]; (i.e. in some embodiments, the classifier model is able to select multiple labels for each sample) [Dasgupta: col. 38, line 64-66]; (i.e. the annotation system employs active learning techniques to interactively select the most informative samples to be annotated) [Dasgupta: col. 32, line 3-5]);
selecting unlabeled data elements (i.e. selected from the set of unlabeled media samples) [Dasgupta: Fig. 19A, Box 1920] neighboring that labeled data element (i.e. selected closest training samples from a first class and from a second class) [Dasgupta: Fig. 28, Box 2840];
copying (i.e. moving) [Dasgupta: col. 34, line 18] a label from that labeled data element (i.e. the labels assigned to media samples) [Dasgupta: col. 14, line 53] to the selected unlabeled data elements (i.e. In some embodiments, the moving may be accomplished by updating an indication or designation of a training image in the unlabeled set to indicate that the image is now labeled) [Dasgupta: col. 34, line 18-21]; and
adding the selected unlabeled data elements to the sequence (i.e. adding the one or more incorrectly predicted images to the training data set) [Dasgupta: col. 63, line 35-36].
Garbow further discloses the claim limitations as follows:
the selecting based on proximity of the representative data elements to neighboring data elements within the feature space (i.e. assigning a rank value to each nearest neighbor based on the proximity of the nearest neighbor to the point; determining a mutual nearest neighbor score for each pair of nearest neighbor points in the set of points based on the rank values; and clustering the points based on the mutual nearest neighbor score) [Garbow: col. 12, line 19-25].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 

Regarding claim 26, Dasgupta meets the claim limitations as set forth in claim 25. Dasgupta further meets the claim limitations as follows.
The computing device of claim 25 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein selecting the one or more representative data elements (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9] comprises assigning a mutuality score to each data element representing the portion of mutual neighbors among a threshold ranking of that data element's closest neighbors, wherein data elements are mutual neighbors if they are each among one another's closest neighbors ((i.e. selected one or more closest training samples from the top predicted class) [Dasgupta: Fig. 29, Box 2940]; (i.e. selected closest training samples from a first class and from a second class) [Dasgupta: Fig. 28, Box 2840]).
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The computing device of claim 25, wherein selecting the one or more representative data elements comprises assigning a mutuality score to each data element representing the portion of mutual neighbors among a threshold ranking of that data element's closest neighbors, wherein data elements are mutual neighbors if they are each among one another's closest neighbors. 
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
(i.e. Each nearest neighbor may also be given a proximity ranking based on the distance from a particular point. The nearer the neighbor to the point the lower the proximity ranking may be.) [Garbow: col. 8, line 43-46], wherein data elements are mutual neighbors if they are each among one another's closest neighbors (i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1, the sum of their proximity rankings with respect to each other. Similarly, the proximity rankings for each point pair may be determined.) [Garbow: col. 8, line 54-63]. 
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.  
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
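
For illustration only, the proximity ranking and mutual nearest neighbor (MNN) score described in the quoted Garbow passage may be sketched as follows. This is a minimal sketch and not Garbow's implementation: the one-dimensional coordinates and the function names are hypothetical, and the point positions are not taken from Garbow's figures. Rankings are assumed to start at 0 for the nearest neighbor, consistent with the quoted example in which the ranking of point A with respect to point C is 0.

```python
from itertools import combinations

def proximity_rankings(points):
    """Return {point: {neighbor: rank}}, where rank 0 is the nearest neighbor."""
    rankings = {}
    for name, pos in points.items():
        others = sorted(
            (n for n in points if n != name),
            key=lambda n: abs(points[n] - pos),
        )
        rankings[name] = {n: rank for rank, n in enumerate(others)}
    return rankings

def mnn_scores(points):
    """MNN score of a pair = sum of the two proximity rankings of the pair."""
    ranks = proximity_rankings(points)
    return {
        (a, b): ranks[a][b] + ranks[b][a]
        for a, b in combinations(sorted(points), 2)
    }

# Hypothetical 1-D positions chosen so that C is A's second-nearest
# neighbor (rank 1) while A is C's nearest neighbor (rank 0).
points = {"A": 0.0, "B": -1.0, "C": 1.5}
scores = mnn_scores(points)
print(scores)
```

Under these assumed positions the pair (A, C) receives an MNN score of 1, matching the arithmetic in the quoted passage; a lower score indicates a stronger mutual bond.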

Regarding claim 27, Dasgupta meets the claim limitations as set forth in claim 26. Dasgupta further meets the claim limitations as follows.
The computing device of claim 26 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein selecting the one or more representative data elements comprises assigning a cluster score (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9; Figs. 22A-B, 25, 27] to each data element ((i.e. selected one or more closest training samples from the top predicted class) [Dasgupta: Fig. 29, Box 2940]; (i.e. selected closest training samples from a first class and from a second class) [Dasgupta: Fig. 28, Box 2840]), representing a combination of mutuality scores at multiple threshold ranking values.
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The computing device of claim 26, wherein selecting the one or more representative data elements comprises assigning a cluster score to each data element, representing a combination of mutuality scores at multiple threshold ranking values. 
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
representing a combination of mutuality scores at multiple threshold ranking values ((i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1, the sum of their proximity rankings with respect to each other. Similarly, the proximity rankings for each point pair may be determined.) [Garbow: col. 8, line 54-63]; (i.e. Each nearest neighbor may also be given a proximity ranking based on the distance from a particular point. The nearer the neighbor to the point the lower the proximity ranking may be) [Garbow: col. 8, line 43-46]). 
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.  
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
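
For illustration only, the claimed mutuality score and cluster score may be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's or Garbow's implementation: the data elements are hypothetical one-dimensional values, the threshold ranking is a neighbor count k, and the "combination" of mutuality scores is assumed to be a simple mean over several k values. All function names are hypothetical.

```python
def k_nearest(points, name, k):
    """Names of the k closest neighbors of `name` (1-D absolute distance)."""
    others = sorted(
        (n for n in points if n != name),
        key=lambda n: abs(points[n] - points[name]),
    )
    return set(others[:k])

def mutuality(points, name, k):
    """Portion of the k closest neighbors that also rank `name` in their own top k."""
    neighbors = k_nearest(points, name, k)
    mutual = [n for n in neighbors if name in k_nearest(points, n, k)]
    return len(mutual) / k

def cluster_score(points, name, ks=(1, 2)):
    """Assumed combination: mean of the mutuality scores at each threshold ranking."""
    return sum(mutuality(points, name, k) for k in ks) / len(ks)

# Hypothetical data: A, B, C lie close together; D is isolated.
points = {"A": 0.0, "B": 1.0, "C": 2.5, "D": 10.0}
for name in sorted(points):
    print(name, cluster_score(points, name))
```

In this sketch the isolated element D has no mutual neighbors at either threshold, so its cluster score is the lowest of the four.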

Regarding claim 28, Dasgupta meets the claim limitations as set forth in claim 27. Dasgupta further meets the claim limitations as follows.
The computing system of claim 27 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein the label generator is to order (i.e. One or more non-transitory computer-accessible storage media storing program instructions that when executed on or across one or more processors implement) [Dasgupta: col. 65, line 38-40] said sequence is ordered according to cluster score (i.e. In some embodiments, the to-do list may be sorted in the order of impact on a chosen accuracy measure, for example the F1 score) [Dasgupta: col. 48, line 60-62; Figs. 22A-B, 25, 27].

Regarding claim 29, Dasgupta meets the claim limitations as set forth in claim 27. Dasgupta further meets the claim limitations as follows.
The computing device of claim 27 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein data element selector is to (i.e. the model diagnosis system to) [Dasgupta: col. 59, line 46-47; Fig. 29] select the one or more representative data elements comprises selecting data elements with a cluster score (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9; Figs. 22A-B, 25, 27] above a defined threshold (i.e. until the classifier's performance is above a certain threshold, or after the classifier) [Dasgupta: col. 39, line 11-12].

Regarding claim 30, Dasgupta meets the claim limitations as set forth in claim 29. Dasgupta further meets the claim limitations as follows.
The computing device of claim 29 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein data element selector is to (i.e. the model diagnosis system to) [Dasgupta: col. 59, line 46-47; Fig. 29] select the one or more representative data elements comprises assigning a cluster score (i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9; Figs. 22A-B, 25, 27] higher than the cluster scores of their mutual neighbors.
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The computing device of claim 29, wherein data element selector is to select the one or more representative data elements comprises assigning a cluster score higher than the cluster scores of their mutual neighbors. 
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
higher than the cluster scores of their mutual neighbors (i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1) [Garbow: col. 8, line 54-63 – Note: Garbow’s method selects a higher cluster score 1]. 
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.  
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22].
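
For illustration only, the limitation of selecting data elements whose cluster score is higher than the cluster scores of their mutual neighbors may be sketched as a local-maximum test. This is a minimal sketch and not a method disclosed in either reference: the cluster scores, the mutual-neighbor lists, and the function name are all hypothetical inputs assumed to have been computed beforehand.

```python
def select_representatives(cluster_scores, mutual_neighbors):
    """Keep elements whose score strictly exceeds every mutual neighbor's score.

    Assumption: an element with no mutual neighbors passes the test vacuously.
    """
    return sorted(
        name
        for name, score in cluster_scores.items()
        if all(score > cluster_scores[n] for n in mutual_neighbors.get(name, ()))
    )

# Hypothetical scores and mutual-neighbor relationships for two groups.
cluster_scores = {"A": 0.9, "B": 0.6, "C": 0.5, "D": 0.8, "E": 0.4}
mutual_neighbors = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["E"],
    "E": ["D"],
}
representatives = select_representatives(cluster_scores, mutual_neighbors)
print(representatives)
```

With these assumed inputs, one element per group (A and D) is selected as a representative.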

Regarding claim 31, Dasgupta meets the claim limitations as set forth in claim 25. Dasgupta further meets the claim limitations as follows.
The computing device of claim 26 (i.e. computer system) [Dasgupta: col. 60, line 52], further comprising a user interface for receiving labels for each one of the representatives from a user input device, each label representing a class (i.e. The sample annotation interface 532 may be configured to allow a user to manually or programmatically annotate individual input samples (e.g. images). In some embodiments, the sample annotation interface 532 may be based on a trained classifier, which has been trained to annotate samples by observing user annotation behavior, and is automatically used to annotate incoming media samples) [Dasgupta: col. 19, line 23 – col. 20, line 2].

Regarding claim 32, Dasgupta meets the claim limitations as set forth in claim 26. Dasgupta further meets the claim limitations as follows.
The computing device of claim 26 (i.e. computer system) [Dasgupta: col. 60, line 52],  wherein the selecting unlabeled data elements comprises selecting data elements that are within a threshold proximity of the selected labeled one of the data elements ((i.e. data points that are chosen by identifying the centroid of unique clusters in the unlabeled pool of data) [Dasgupta: col. 32, line 8-9]; (i.e. The extrapolation process may be initiated, for example, because the accuracy level of the classifier in predicting user annotations have reached a certain threshold level) [Dasgupta: col. 35, line 2]).

Regarding claim 33, Dasgupta meets the claim limitations as set forth in claim 32. Dasgupta further meets the claim limitations as follows.
The computing device of claim 32 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein the threshold proximity (i.e. The extrapolation process may be initiated, for example, because the accuracy level of the classifier in predicting user annotations have reached a certain threshold level) [Dasgupta: col. 35, line 2] is defined as a threshold rank of the closest neighboring data elements.
Dasgupta does not explicitly disclose the following claim limitations (Emphasis added).
The computing device of claim 32, wherein the threshold proximity is defined as a threshold rank of the closest neighboring data elements. 
However, in the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
wherein the threshold proximity is defined as a threshold rank of the closest neighboring data elements ((i.e. Each nearest neighbor may also be given a proximity ranking based on the distance from a particular point. The nearer the neighbor to the point the lower the proximity ranking may be.) [Garbow: col. 8, line 43-46]; (i.e. A mutual nearest neighbor (MNN) score for each pair of points may be calculated for each pair of points based on the sum of the proximity rankings of the points with respect to each other. For example, in FIG. 5, the proximity ranking for point C with respect to point A is 1, whereas the proximity ranking for point A with respect to point C is 0. Therefore, the MNN score for the point pair (A,C) is 1, the sum of their proximity rankings with respect to each other. Similarly, the proximity rankings for each point pair may be determined) [Garbow: col. 8, line 54-63]). 
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.  
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
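
For illustration only, a threshold proximity defined as a threshold rank of the closest neighboring data elements may be sketched as follows. This is a minimal sketch under stated assumptions, not a method disclosed in either reference: the one-dimensional positions, the element names, and the function name are hypothetical.

```python
def within_rank(points, labeled, unlabeled, rank_threshold):
    """Return the unlabeled elements among the `rank_threshold` closest to `labeled`."""
    ordered = sorted(unlabeled, key=lambda n: abs(points[n] - points[labeled]))
    return ordered[:rank_threshold]

# Hypothetical positions: one labeled element "L" and four unlabeled elements.
points = {"L": 0.0, "u1": 0.5, "u2": 3.0, "u3": -1.0, "u4": 8.0}
selected = within_rank(points, "L", ["u1", "u2", "u3", "u4"], rank_threshold=2)
print(selected)
```

Here proximity is expressed as a rank cutoff (the two closest elements) rather than as a fixed distance, so the selection adapts to how densely the elements are spaced.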

Regarding claim 34, Dasgupta meets the claim limitations as set forth in claim 28. Dasgupta further meets the claim limitations as follows.
The computing device of claim 28 (i.e. computer system) [Dasgupta: col. 60, line 52], wherein said data element selector is to (i.e. the model diagnosis system to) [Dasgupta: col. 59, line 46-47; Fig. 29] identify one or more data elements as outliers based on proximity to neighboring data elements within said feature space (i.e. In some embodiments, the MDE implements data visualization techniques such as PCA (principal component analysis) and t-SNE (T-distributed Stochastic Neighbor Embedding) to help locate outlier media samples, allowing the user to easily identify and address these media samples) [Dasgupta: col. 6, line 15-20].
In the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
identify one or more data elements as outliers based on proximity to neighboring data elements within said feature space (i.e. An r-cut value may be specified to indicate a threshold radius. Data points falling outside of the threshold radius, r-cut, may not be considered a near neighbor with respect to a reference data point. For example, if an r-cut value of 4 is specified, then only points B, C and D may be considered near neighbors of point A because they are at or within a distance of 4 from A. However, points E and F are not near neighbors of point A because they are at a distance greater than 4 from A (see table in FIG. 4)) [Garbow: col. 8, line 23-31; Fig. 4]. 
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.  
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
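
For illustration only, the r-cut threshold radius described in the quoted Garbow passage may be sketched as follows. This is a minimal sketch and not Garbow's implementation: the distances from point A are hypothetical values chosen to be consistent with the quoted example (B, C, and D at or within a distance of 4; E and F beyond it) and are not taken from the table in Garbow's FIG. 4. The function name is hypothetical.

```python
def split_by_r_cut(distances, r_cut):
    """Partition points into near neighbors (distance <= r_cut) and excluded points."""
    near = sorted(n for n, d in distances.items() if d <= r_cut)
    excluded = sorted(n for n, d in distances.items() if d > r_cut)
    return near, excluded

# Hypothetical distances from reference point A.
distances_from_a = {"B": 1.0, "C": 2.0, "D": 4.0, "E": 5.0, "F": 7.0}
near, excluded = split_by_r_cut(distances_from_a, r_cut=4)
print(near, excluded)
```

The comparison is inclusive because the quoted passage treats points "at or within" the r-cut distance as near neighbors.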

Regarding claim 35, Dasgupta meets the claim limitations as set forth in claim 34. Dasgupta further meets the claim limitations as follows.
The computing device of claim 34 (i.e. computer system) [Dasgupta: col. 60, line 52] wherein said data element selector is to (i.e. the model diagnosis system to) [Dasgupta: col. 59, line 46-47; Fig. 29] identify said one or more outliers comprises identifying data elements (i.e. In some embodiments, the MDE implements data visualization techniques such as PCA (principal component analysis) and t-SNE (T-distributed Stochastic Neighbor Embedding) to help locate outlier media samples, allowing the user to easily identify and address these media samples) [Dasgupta: col. 6, line 15-20] having a mutuality score below a threshold value (i.e. In some embodiments, if one or more performance metrics fall below a specified threshold, an aberration may be detected) [Dasgupta: col. 30, line 5-7].

Regarding claim 36, Dasgupta meets the claim limitations as set forth in claim 35. Dasgupta further meets the claim limitations as follows.
The computing device of claim 35 (i.e. computer system) [Dasgupta: col. 60, line 52] wherein said data element selector is to select (i.e. the model diagnosis system to select) [Dasgupta: col. 59, line 46-47; Fig. 29] said one or more outliers comprises selecting data elements (i.e. In some embodiments, the MDE implements data visualization techniques such as PCA (principal component analysis) and t-SNE (T-distributed Stochastic Neighbor Embedding) to help locate outlier media samples, allowing the user to easily identify and address these media samples) [Dasgupta: col. 6, line 15-20] having cluster scores lower than the cluster scores of their mutual neighbors.
In the same field of endeavor Garbow further discloses the claim limitations and the deficient claim limitations as follows:
select said one or more outliers comprises selecting data elements having cluster scores lower than the cluster scores of their mutual neighbors (i.e. In one embodiment, the MNN score may be used to cluster pairs of data points. In one aspect, using the MNN score clusters are created based on the mutual strength of the bond between a pair of points. A threshold MNN score value may be specified so that only those points with the strongest mutual bond are clustered. In FIG. 6, the lower the MNN score, the stronger the bond between a pair of points. Therefore, the pairs of points with the lowest MNN score within the threshold may be clustered first. For example, with an MNN threshold value of 2, all pairs of points in FIG. 6 will be clustered together, with the points with an MNN score of 1 being clustered first. However, if a threshold value of 1 was used, then points B and C will not be clustered because the MNN score for the point pair including points B and C is 2. FIGS. 7A-7F illustrate a series of clustering steps to cluster pairs of points in table 601 for an MNN threshold value of 2. Two clusters are formed as a result, and are illustrated in FIG. 7F. The first cluster consists of points A, B, and C; and the second cluster consists of points D, E and F. While the above description illustrates an agglomerative mutual nearest neighbor clustering algorithm being used to cluster data points, one skilled in the art will recognize that any appropriate clustering algorithm may be used. For example, the k-means clustering or the hierarchical clustering algorithms may be used instead of the agglomerative clustering described above.) [Garbow: col. 8, line 66 – col. 9, line 31; Fig. 6]. 
It would have been obvious to one with an ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Dasgupta with Garbow to program the system to implement clustering algorithms, such as the Agglomerative Mutual Nearest Neighbor clustering algorithm.  
Therefore, the combination of Dasgupta with Garbow will enable the system to identify distinguishable clusters of data points within the set of data points [Garbow: col. 7, line 46 – col. 8, line 22]. 
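
For illustration only, the thresholded agglomerative clustering described in the quoted Garbow passage may be sketched as follows. This is a minimal sketch and not Garbow's implementation: the pair scores are hypothetical and are not the values of Garbow's FIG. 6, and the merging is done with a simple union-find structure, which is an assumed implementation choice. Pairs are processed from the lowest MNN score (strongest bond) upward and merged only when the score does not exceed the threshold.

```python
def mnn_cluster(pair_scores, threshold):
    """Merge point pairs in ascending MNN-score order, up to `threshold`."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), score in sorted(pair_scores.items(), key=lambda kv: kv[1]):
        root_a, root_b = find(a), find(b)
        if score <= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for point in list(parent):
        clusters.setdefault(find(point), []).append(point)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical MNN scores for pairs that are near neighbors of each other.
pair_scores = {
    ("A", "B"): 1, ("A", "C"): 1, ("B", "C"): 2,
    ("D", "E"): 1, ("E", "F"): 1, ("D", "F"): 2,
}
clusters = mnn_cluster(pair_scores, threshold=2)
print(clusters)
```

With these assumed scores and a threshold of 2, two clusters result, one containing A, B, and C and the other containing D, E, and F, which parallels the outcome described in the quoted passage.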

Reference Notice 
Additional prior art, included in the Notice of References Cited, made of record and not relied upon, is considered pertinent to applicant's disclosure.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Philip Dang whose telephone number is (408) 918-7529.  The examiner can normally be reached on Monday-Thursday between 8:30 am - 5:00 pm (PST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sath Perungavoor can be reached on 571-272-7455.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 
/Philip P. Dang/
Primary Examiner, Art Unit 2488