Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Application 16/859,107 filed 4/27/2020 has been examined.
In this Office Action, claims 1-20 are currently pending.

Examiner’s Note
It appears that independent claims 1, 18 and 19 each recite the following limitation in the final stanza:
“use the trained machine learning model to classify data points of the dataset.”
Claim 20, however, omits this limitation.
The Examiner recommends that Applicant check whether the limitation above was omitted from claim 20 by mistake or typographical error.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an
abstract idea without significantly more.
Claim 1 recites:
providing a classifier for data points to receive labels of the data points.
The limitation of providing a classifier for data points to receive labels of the data points, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of 
Further, these concepts also recite “Certain Methods of Organizing Human Activity” (such as
commercial or legal interactions, including agreements in the form of contracts; legal
obligations; advertising, marketing or sales activities or behaviors; and business relations), where
providing labels based on classification is a method of human activity in commercial or legal interactions or advertising/marketing activities.
Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim only
recites one additional element: using generic methods to perform the training, selecting, providing, repeating, and using steps. The methods in these steps are recited at a high level of generality (i.e., as a generic processor performing a generic method function of providing labels) such that they amount to no more than mere instructions to apply the

The claim does not include additional elements that are sufficient to amount to significantly more
than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a generic method to perform
the training, selecting, providing, repeating, and using steps amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic method or generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Dependent claims 2-17 merely add further details of the abstract steps/elements recited in
claim 1 without integrating the idea into a practical application, or including an improvement to
another technology or technical field, an improvement to the functioning of the computer itself,
or meaningful limitations beyond generally linking the use of an abstract idea to a particular
technological environment. Therefore, dependent claims 2-17 are also directed towards
nonstatutory subject matter.

Independent claims 18, 19 and 20 are also rejected as ineligible subject matter under 35
U.S.C. 101 for substantially the same reasons as method claim 1. The components (i.e.,
product/system) described in independent claims 18, 19 and 20 do not provide for integrating the abstract idea into a practical application. At best, the claims merely provide alternate
environments in which to implement the abstract idea.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-7 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Simard et al., US Pub. No. 2017/0039486.

As to claim 1 (and substantially similar claims 18, 19 and 20), Simard discloses
a method for matching data records of a dataset, the data records having values of a set of attributes, the method comprising:
training a machine learning model using a current set of labeled data points, each of the
data points being multiple data records, wherein a label of a data point indicates a classification
of the data point, the training resulting in a trained machine learning model that is configured to
classify a data point as representing a same entity or different entities;
(Simard [0044] The classifier is trained with the search query and the dictionary as input features. The classifier is utilized to determine new predicted labels for one or more of the data items. One or more data items having a discrepancy between a previous label and a new predicted label are identified.; see also [0079] Active learning is often viewed as a means to

features to produce a valuable classifier or schema extractor. With Big Data with lopsided classes, only a small fraction of the data will ever be observed and some nuggets of positive
or negative may never be discovered. )

selecting from a current set of unlabeled data points a subset of unlabeled data points
using classification results of classification of the current set of unlabeled data points using the
trained machine learning model, the current set of unlabeled data points without the selected
subset of unlabeled data points becoming the current set of unlabeled data points;
(Simard [0041] One or more tokens having a discrepancy between a previous label and a predicted label are identified.)

providing to a classifier the subset of unlabeled data points and in response to the
providing receiving labels of the subset of unlabeled data points;
(Simard [0044] The classifier is utilized to determine new predicted labels for one or more of the data items. One or more data items having a discrepancy between a previous label and a new predicted label are identified. Via the user interface, an indication is presented of the one or
more data items having the discrepancy between the previous label and the new predicted label. Via the user interface, a user selection of one or more features is received. The classifier is trained with the one or more user-selected features as input features.)

repeating training, selecting, and providing using the subset of labeled data points in
addition to the current set of labeled data points as the current set of labeled data points; 


and
using the trained machine learning model to classify data points of the dataset
(Simard see also [0038] previously labeled as examples of a particular class of data item. A classifier is utilized to determine predicted labels for one or more of the data items).
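
For illustration only, the iterative steps recited in claim 1 (training, selecting, providing for labeling, repeating, and using the trained model) can be sketched as a toy active-learning loop. The sketch below is not taken from Simard or from the application. The function names, the one-dimensional threshold model, and the use of a single number to stand in for a data point (for example, a similarity score of a pair of data records) are all assumptions made to keep the example short.

```python
# Illustrative sketch only: a toy active-learning loop mirroring the claimed steps.
# All names here (train, classify, select_uncertain, active_learning) are hypothetical.
from typing import Callable, List, Tuple

Point = float   # assumption: one number stands in for a data point (e.g., a record-pair score)
Model = float   # assumption: the "model" is a single decision threshold


def train(labeled: List[Tuple[Point, int]]) -> Model:
    """Training step: place the threshold midway between the two class means.

    Assumes the labeled set contains at least one example of each class.
    """
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def classify(model: Model, x: Point) -> int:
    """Using step: label 1 (same entity) at or above the threshold, else 0."""
    return 1 if x >= model else 0


def select_uncertain(model: Model, unlabeled: List[Point], k: int) -> List[Point]:
    """Selecting step: pick the k points closest to the decision threshold."""
    return sorted(unlabeled, key=lambda x: abs(x - model))[:k]


def active_learning(
    labeled: List[Tuple[Point, int]],
    unlabeled: List[Point],
    oracle: Callable[[Point], int],
    k: int = 2,
    iterations: int = 3,
) -> Tuple[Model, List[Tuple[Point, int]], List[Point]]:
    labeled = list(labeled)
    unlabeled = list(unlabeled)
    model = train(labeled)
    for _ in range(iterations):          # repeating step
        if not unlabeled:
            break
        batch = select_uncertain(model, unlabeled, k)
        for x in batch:                  # selected points leave the unlabeled set
            unlabeled.remove(x)
        # providing step: the oracle (e.g., a human labeler) returns labels
        labeled.extend((x, oracle(x)) for x in batch)
        model = train(labeled)           # retrain on the enlarged labeled set
    return model, labeled, unlabeled
```

The loop stops either after the given number of iterations or when no unlabeled points remain; other stopping rules could be substituted without changing the overall structure.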

It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to apply repeated training and machine learning for labelling as taught by Simard, since it was known in the art that machine learning systems provide active learning as a means to increase the efficiency of labeling on a fixed size set with a fixed number of features and provide an exploration tool that will help the user add labels and features to produce a valuable classifier or schema extractor (Simard [0079]).

As to claim 4, Simard discloses the method of claim 1, wherein:
the trained machine learning model that is used to classify data points of the dataset is the
trained machine learning model that results from a predefined number of iterations of training,
selecting, and providing (Simard [0104] In interactive machine learning, the number of labels and features varies over time, as labels and features are added. A classifier may be (re)trained successively with example counts of 10, 20, 40, 80, 160, as the labels are coming in).


for a predefined number of iterations; and until the set of unlabeled data points comprises a number of data points that is smaller than a predefined minimum number (Simard [0104] In interactive machine learning, the number of labels and features varies over time, as labels and features are added. A classifier may be (re)trained successively with example counts of 10, 20, 40, 80, 160, as the labels are coming in;
see also [0073] Coming back to FIG. 3, in one embodiment, a system samples by filtering data around P=0.75 to improve precision and around P=0.25 to improve recall. These
thresholds are adjustable. As previously mentioned, FIG. 5 depicts exemplary plots 500 of sampling distributions 510 as a function of score 520. This alternating strategy has proven
more useful than, for example, sampling uniformly for all the scores between 0 and 1.).

As to claim 6, Simard discloses the method of claim 5, wherein repeating training, selecting, and providing comprises:
in response to determining that the set of unlabeled data points comprises a number of data
points that is smaller than a predefined minimum number, waiting until the set of unlabeled data
points comprises a number of data points that is higher than or equal to the predefined minimum
number for the repeating
(Simard [0104] In interactive machine learning, the number of labels and features varies over time, as labels and features are added. A classifier may be (re)trained successively with example counts of 10, 20, 40, 80, 160, as the labels are coming in;
see also [0085] a. Estimating Reachability [0086] Reachability can be estimated based on the labeling strategy and the score distribution of the unlabeled examples. As an example of this, let L be the set of labels and U the universe, and let S be the patterns with score~, a

the score of the sample, i.e., one can compute for each document w ∈ U, the probability of sampling Ps = Pr[w ∈ L | score(w) = s].).

As to claim 7, Simard discloses the method of claim 1, further comprising:
receiving further unlabeled data points, wherein the current set of unlabeled data points in
addition to the received further unlabeled data points becomes the current set of unlabeled data
points
(Simard [0044] The classifier is utilized to determine new predicted labels for one or more of the data items. One or more data items having a discrepancy between a previous label and a new predicted label are identified. Via the user interface, an indication is presented of the one or
more data items having the discrepancy between the previous label and the new predicted label. Via the user interface, a user selection of one or more features is received. The classifier is trained with the one or more user-selected features as input features.).

As to claim 17, Simard discloses the method of claim 1, wherein selecting comprises ranking the data points and selecting the first ranked data points
(Simard [0300] A feature representation is a function of the raw representation, which captures the information that is relevant to a machine learning algorithm for performing a task on the item, such as classification, extraction, regression, ranking, and so forth.;
See also [0416] Each edge 1312 in trellis 1310 has a weight that is a function of features in the document. Using standard decoding algorithms (e.g., Viterbi), one can identify the
highest-weight path through the trellis 1310 and output the corresponding labeling of the tokens 1316 and transitions (edges) 1312. One can also train the weight functions so that the probability of any given path can be extracted.).
Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Simard et al., US Pub. No. 2017/0039486, in view of Heimann et al., US Patent No. 10,685,293.

As to claim 2, Simard does not disclose:  
deduplicating the data records of the dataset using the classification of the data points of
the dataset by the trained machine learning model; and
one of merging and keeping separate data records of each data point of the dataset based
on the classification of the data points of the dataset by the trained machine learning model;

However, Heimann discloses: the method of claim 1, further comprising one of:
deduplicating the data records of the dataset using the classification of the data points of
the dataset by the trained machine learning model; 
(Heimann col. 21 ln. 12-16: Finally, after aggregating queries by client and deduplicating
E2LDs across a rolling span of queries, platform 300 may ignore any inference window containing fewer than four distinct E2LDs.; see also col. 24 ln. 38-42: The following
section outlines and explains example training data, feature extraction, supervised learning classifier, the active learning system, the self-training system, and the context providing
categorization engine that may comprise Janus.)
and
one of merging and keeping separate data records of each data point of the dataset based
on the classification of the data points of the dataset by the trained machine learning model
(Heimann col.  Ln. 1-4: Cyber Active Learning Intelligence (CALI) may merge intelligent machines and humans to create a platform that autonomously and actively learns like a human, together with humans in some embodiments).




As to claim 3, Heimann discloses under the rationale above, the method of claim 2, further comprising:
storing deduplicated data records of the dataset using the classification of the data points
of the dataset by the trained machine learning model (Heimann col. 21 ln. 12-16: Finally, after aggregating queries by client and deduplicating E2LDs across a rolling span of queries, platform 300 may ignore any inference window containing fewer than four distinct E2LDs).

Claims 8-15 are rejected under 35 U.S.C. 103 as being unpatentable over Simard et al., US Pub. No. 2017/0039486, in view of Sharma et al., US Patent No. 10,977,518.

As to claim 8, Simard does not disclose:
wherein selecting the subset of unlabeled data points further comprises:
selecting from the current set of unlabeled data points an intermediate subset of unlabeled
data points using the classification results;
clustering the data points of the intermediate subset of unlabeled data points using a first
subset of attributes of the set of attributes, resulting in multiple clusters; 

for each cluster of the multiple clusters identifying a closest data point to the centroid of
the cluster, wherein the subset of unlabeled data points comprises the identified closest data points;

However, Sharma discloses:
the method of claim 1, wherein selecting the subset of unlabeled data points further comprises:
selecting from the current set of unlabeled data points an intermediate subset of unlabeled
data points using the classification results;
(Sharma col. 4 ln. 64-col. 5 ln. Generally, active learning is a machine learning-based
technique that can be useful in reducing the amount of human-annotated data required to achieve a target performance. Active learning often starts by incrementally training a ML model with a small, labeled dataset and then applying this model to the unlabeled data. For each unlabeled sample, the system estimates whether this sample includes information that has not been learned by the model. An example of active learning algorithm is to train an object detection model that takes an image as input and outputs a set of bounding boxes. To train such an object detection model, the training and validation images of the detector are annotated
with a bounding box per object and its category).
clustering the data points of the intermediate subset of unlabeled data points using a first
subset of attributes of the set of attributes, resulting in multiple clusters; 
(Sharma col. 10 ln. 28-64: In some embodiments, the element selector comprises a clustering ML model (e.g., implementing a k-means type algorithm) to identify different clusters of data elements and the element selector selects one or more data elements from each cluster (or from some subset of the clusters), such as centroids from the clusters.).
and
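for each cluster of the multiple clusters identifying a closest data point to the centroid of
the cluster, wherein the subset of unlabeled data points comprises the identified closest data points;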
(Sharma col. 10 ln. 57-col. 11 ln. 7: In some embodiments, the element selector comprises a clustering ML model (e.g., implementing a k-means type algorithm) to identify different clusters of data elements and the element selector selects one or more data elements from each cluster (or from some subset of the clusters), such as centroids from the clusters. Alternatively, the element selector may comprise a distance comparison engine that generates distances (or similarities) between pairs of the embeddings (using distance metrics/techniques known to those of skill in the art) and selecting one or more of the data elements from the
candidate set based on these distances (or similarities). For example, a first data element may be selected that has a highest overall combined distance to all other data elements, and potentially a second data element may be selected that has a second highest overall combined distance to all other data elements, etc.)
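
For illustration only, the clustering limitation discussed above (cluster the candidate data points on a subset of attributes, then keep the data point closest to each cluster centroid) can be sketched as follows. This is not Sharma's implementation. The plain k-means routine, the deterministic choice of the first k vectors as initial centroids, and the names `kmeans` and `representatives` are assumptions made for the example.

```python
# Illustrative sketch only: k-means on a chosen attribute subset, followed by
# picking the data point nearest each centroid. All names are hypothetical.
from math import dist
from typing import List, Sequence, Tuple

Vec = Tuple[float, ...]


def mean(vectors: Sequence[Vec]) -> Vec:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(component) / n for component in zip(*vectors))


def kmeans(vectors: Sequence[Vec], k: int, iterations: int = 10) -> Tuple[List[Vec], List[int]]:
    """Return (centroids, assignment), where assignment[j] is the cluster of vectors[j].

    Assumption: the first k vectors serve as initial centroids (deterministic).
    """
    centroids = list(vectors[:k])
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # assign every vector to its nearest centroid
        assignment = [min(range(k), key=lambda i: dist(v, centroids[i])) for v in vectors]
        # move each centroid to the mean of its members (empty clusters stay put)
        for i in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == i]
            if members:
                centroids[i] = mean(members)
    return centroids, assignment


def representatives(points: Sequence[Vec], k: int, attrs: Sequence[int]) -> List[Vec]:
    """Cluster on the attribute indices in attrs; return the point nearest each centroid."""
    vectors = [tuple(p[i] for i in attrs) for p in points]
    centroids, assignment = kmeans(vectors, k)
    chosen = []
    for i, centroid in enumerate(centroids):
        members = [j for j, a in enumerate(assignment) if a == i]
        if members:
            best = min(members, key=lambda j: dist(vectors[j], centroid))
            chosen.append(points[best])
    return chosen
```

Note that the sketch returns an actual data point per cluster rather than the centroid itself, since a centroid is generally a computed mean and need not coincide with any data point.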

It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to apply an annotation service using clustering as taught by Sharma, since it was known in the art that active/machine learning systems provide an annotation service that
may also solicit additional feedback from the annotators during the annotation of certain types of data elements to further improve the training of an annotating machine learning model and/or the ability to detect or process annotations for edge case data elements, bad/difficult data elements, etc. Accordingly, the annotation service can automatically and quickly identify useful examples of data elements to be provided as part of annotation job instructions, removing the need for job submitters to manually find such examples from often exceedingly large sets of data elements (Sharma col. 2 ln. 10-20).


selecting from the current set of unlabeled data points a first intermediate subset of
unlabeled data points using the classification results;
(Sharma col. 10 ln. 55-64: With these embeddings, the filtering module 122 may utilize a representative data element selector to identify a number of different embeddings corresponding to different data elements. In some embodiments, the element selector comprises a clustering ML model (e.g., implementing a k-means type algorithm) to
identify different clusters of data elements and the element selector selects one or more data elements from each cluster (or from some subset of the clusters), such as centroids from
the clusters.)
selecting from the first intermediate subset of unlabeled data points a second intermediate
subset of unlabeled data points using a metadata parameter descriptive of the data points;
(Sharma col. 8 ln. 51-62: This active learning model may comprise a ML model (such as a convolutional neural network (CNN) or other deep model) that is trained with actual input data
elements (e.g., images) using the consolidated annotations/labels, and optionally other metadata such as a confidence score of the consolidated annotations, individual annotations
of the involved annotators, and/or quality scores of the involved annotators. The ML model may thus be created, through this iterative training, to perform inference by examining a data element to predict/infer a label along with a corresponding confidence score for its label for the data element.)
clustering the data points of the second intermediate subset of unlabeled data points using
a first subset of attributes of the set of attributes, resulting in multiple clusters; and
for each cluster of the multiple clusters identifying a closest data point to the centroid of
the cluster, wherein the subset of unlabeled data points comprises the identified data points

candidate set based on these distances (or similarities). For example, a first data element may be selected that has a highest overall combined distance to all other data elements, and potentially a second data element may be selected that has a second highest overall combined distance to all other data elements, etc.).

As to claim 10, Sharma discloses under the rationale above the method of claim 1, wherein selecting the subset of unlabeled data points further comprises:
selecting from the current set of unlabeled data points an intermediate subset of unlabeled
data points using the classification results; 
(Sharma col. 10 ln. 55-64: With these embeddings, the filtering module 122 may utilize a representative data element selector to identify a number of different embeddings corresponding to different data elements. In some embodiments, the element selector comprises a clustering ML model (e.g., implementing a k-means type algorithm) to
identify different clusters of data elements and the element selector selects one or more data elements from each cluster (or from some subset of the clusters), such as centroids from
the clusters.)
and
selecting from the intermediate subset of unlabeled data points the subset of unlabeled data
points using a metadata parameter descriptive of the data points

elements (e.g., images) using the consolidated annotations/labels, and optionally other metadata such as a confidence score of the consolidated annotations, individual annotations
of the involved annotators, and/or quality scores of the involved annotators. The ML model may thus be created, through this iterative training, to perform inference by examining a data element to predict/infer a label along with a corresponding confidence score for its label for the data element.).


As to claim 11, Sharma discloses under the rationale above the method of claim 9, wherein the metadata parameter comprises at least one of: a last modification time of the data point; or a user priority value of the data point
(Sharma col. 7 ln. 39-44: The consolidated annotations (and optionally, confidences) may also be stored as part of job results 134 in the repository 114 (e.g., with an identifier of the corresponding data element, a time of annotation, etc.), and in some embodiments the individual annotations provided by the individual annotators may also be recorded;
See also col. 6 ln. 20-25: The AJC 112 may receive the request and begin to
configure the annotation job. For example, the AJC 112 may at circle (3) store job information 128 (e.g., the information provided via the interface 200 of FIG. 2, along with other metadata such as timestamps, a user identifier, etc.) in an annotation job repository).

As to claim 12, Sharma discloses under the rationale above the method of claim 1, further comprising:
clustering received data points using a second subset of attributes of the set of attributes,
resulting in multiple clusters; 

identify different clusters of data elements and the element selector selects one or more data elements from each cluster (or from some subset of the clusters), such as centroids from
the clusters.)
and
for each cluster of the multiple clusters identifying a closest data point to the centroid of
the cluster, wherein the set of unlabeled data points comprises the identified data points
(Sharma col. 10 ln. 57-col. 11 ln. 7: In some embodiments, the element selector comprises a clustering ML model (e.g., implementing a k-means type algorithm) to identify different clusters of data elements and the element selector selects one or more data elements from each cluster (or from some subset of the clusters), such as centroids from the clusters. Alternatively, the element selector may comprise a distance comparison engine that generates distances (or similarities) between pairs of the embeddings (using distance metrics/techniques known to those of skill in the art) and selecting one or more of the data elements from the
candidate set based on these distances (or similarities). For example, a first data element may be selected that has a highest overall combined distance to all other data elements, and potentially a second data element may be selected that has a second highest overall combined distance to all other data elements, etc.).

As to claim 13, Sharma discloses the method of claim 12, wherein the second subset of attributes is one of:
the same as the first subset of attributes; 

See also col. 6 ln. 64-69: In some embodiments, a same data element may be
provided to multiple annotators for annotation, and the multiple annotations may be "consolidated" into one "final" annotation)

and different from the first subset of attributes
(Sharma col. 13 ln. 40-50: At block 720, the operations 700 include sending, to a
client device of an annotator, the job instruction to be presented to the annotator for an annotation task involving first data element of the plurality of data elements, the job
instruction including or identifying the selected one or more data elements, wherein the first data element is different than the selected one or more data elements)

As to claim 14, Sharma discloses under the rationale above the method of claim 12, wherein the first subset of attributes is part of attributes of the second subset of attributes
(Sharma col. 8 ln. 37-45: As another example, the auto-example selection module
118 may analyze an annotated data element and if it detects that some threshold amount (e.g., two, three, etc.) annotators having "high" quality scores (above some threshold) provided back substantially or completely different annotations, the auto-example selection module 118 may mark that data element as a candidate to be an "edge case" example because a significant number of high-quality annotators labeled it incorrectly.).


(Sharma col. 13 ln. 40-50: At block 720, the operations 700 include sending, to a
client device of an annotator, the job instruction to be presented to the annotator for an annotation task involving first data element of the plurality of data elements, the job
instruction including or identifying the selected one or more data elements, wherein the first data element is different than the selected one or more data elements. The job instruction
may also include a textual description provided by the user indicating how to annotate each of the selected one or more data elements.
See also col. 10 ln. 49-58: The filtering module 122 may be implemented using a variety of techniques. As one example, the filtering module 122 may generate a representation of each data element (e.g., an embedding) via an embedding generator that is specific to the type of the data element (e.g., image, audio clip, etc.). Various types of embedding generators are known of those of skill in the art. With these embeddings, the filtering module 122 may utilize a representative data element selector to identify a number of different embeddings corresponding to different data elements).

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Simard et al., US Pub. No. 2017/0039486, in view of Hughes et al., US Pub. No. 2020/0202171.

As to claim 16, Simard does not disclose:
wherein selecting is performed using one of random sampling;
margin sampling; entropy sampling; and disagreement sampling.

However, Hughes discloses:

random sampling;
(Hughes [0012] random sampling algorithm;)
margin sampling; 
(Hughes [0012] minimum margin sampling algorithm;)
entropy sampling; 
(Hughes [0012] entropy sampling algorithm;)
and disagreement sampling
(Hughes [0154] Measures of quality may include precision, recall, average precision,
receiver operator characteristic scores, and F-beta scores, for example. Other measures of quality may be used. Examples of predictions where the models agree, as well as disagree
may be presented to a user through the reporting 328; 
see also [0012] According to any of the above aspects of the disclosure, the respective sampling algorithm can be selected from a density sampling algorithm; entropy sampling
algorithm; estimated error reduction sampling algorithm; exhaustive sampling algorithm; flagged predictions algorithm; hard negative mining sampling algorithm; high confidence sampling algorithm; linear sampling algorithm; map visualization sampling algorithm; metadata search sampling algorithm; minimum margin sampling algorithm; query by committee sampling algorithm; random sampling algorithm; review sampling algorithm; search sampling
algorithm; similarity sampling algorithm; sampling of samples for which the input was to skip the sample type algorithm; stratified sampling algorithm; most confident samples algorithm; or most uncertain samples algorithm.).
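
For illustration only, the four selection strategies named in claim 16 can be sketched using their commonly understood definitions. These are not Hughes's algorithms. The function names, the representation of classification results as per-class probability lists, and the treatment of disagreement sampling as vote disagreement among several models are assumptions made for the example.

```python
# Illustrative sketch only: textbook-style versions of four sampling strategies.
# All names and input shapes are hypothetical.
import math
import random
from typing import Dict, List


def select_random(ids: List[str], k: int, seed: int = 0) -> List[str]:
    """Random sampling: choose k candidates uniformly at random."""
    return random.Random(seed).sample(ids, k)


def select_margin(probs: Dict[str, List[float]], k: int) -> List[str]:
    """Margin sampling: smallest gap between the two most probable classes."""
    def margin(p: List[float]) -> float:
        top = sorted(p, reverse=True)
        return top[0] - top[1]
    return sorted(probs, key=lambda i: margin(probs[i]))[:k]


def select_entropy(probs: Dict[str, List[float]], k: int) -> List[str]:
    """Entropy sampling: highest Shannon entropy of the predicted distribution."""
    def entropy(p: List[float]) -> float:
        return -sum(q * math.log(q) for q in p if q > 0)
    return sorted(probs, key=lambda i: entropy(probs[i]), reverse=True)[:k]


def select_disagreement(votes: Dict[str, List[int]], k: int) -> List[str]:
    """Disagreement sampling: largest share of models voting against the majority label."""
    def disagreement(v: List[int]) -> float:
        majority_count = max(v.count(label) for label in set(v))
        return 1.0 - majority_count / len(v)
    return sorted(votes, key=lambda i: disagreement(votes[i]), reverse=True)[:k]
```

Margin and entropy sampling both rank candidates by a single model's uncertainty, whereas disagreement sampling needs predictions from more than one model.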

It would have been obvious to one having ordinary skill in the art at the time of the effective filing date to apply a selection of sampling algorithms as taught by Hughes since it was known in the art that active/machine learning systems provide an annotation system receiving a .


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

Dasgupta et al., US Patent No. 10,719,301, teaches methods to implement a model development environment (MDE) that allows a team of users to perform iterative model experiments to develop machine learning (ML) media models. In embodiments, the MDE implements a media data management interface that allows users to annotate and manage
training data for models. In embodiments, the MDE implements a model experimentation interface that allows users to configure and run model experiments, which include a
training run and a test run of a model. In embodiments, the MDE implements a model diagnosis interface that displays the model's performance metrics and allows users to visually inspect media samples that were used during the model experiment to determine corrective actions to improve model performance for later iterations of experiments. In embodiments, the MDE allows different types of users to collaborate on a series of model experiments to build an optimal media model.


CONTACT INFORMATION
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EVAN S ASPINWALL whose telephone number is (571)270-7723. The examiner can normally be reached Monday-Friday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Neveen Abel-Jalil, can be reached at 571-270-0474. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN US

/Evan Aspinwall/
Primary Examiner, Art Unit 2152
1/25/2022