DETAILED ACTION

Status of the Application

The present application is being examined under the pre-AIA first to invent provisions.
This action is in response to the application filed on 01/28/2022. Claims 1-5, 8-14, and 17-19 are currently pending. 
The claims filed 01/28/2022 are further amended by the Examiner’s amendment provided below. The Examiner’s amendment amends claims 1, 3, 10, 12, and 19 and cancels claims 2, 9, 11, and 18. Claims 1, 3-5, 8, 10, 12-14, 17, and 19 are allowed.

Allowable Subject Matter

Claims 1, 3-5, 8, 10, 12-14, 17, and 19 are allowed.

EXAMINER’S AMENDMENT

An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

	Please amend the claims listing dated 01/28/2022 as follows:
1.	(Currently Amended) A method for data acquisition for construction of classification models, the method comprising:
(a) receiving, using a hardware processor, a budget and a cost structure for constructing a classification model using a data set that includes positive and negative examples of a class of interest, wherein the classification model is set to a guided learning mode that receives selected instances of the class of interest that satisfy at least one criterion from the data set and wherein the cost structure includes a cost of a search performed by a human reviewer;
(b) in response to the classification model being set to the guided learning mode, transmitting, using the hardware processor, a definition of the class of interest over the Internet for review by a plurality of human reviewers;
(c) in response to the classification model being set to the guided learning mode, transmitting, using the hardware processor, instructions over the Internet to the plurality of human reviewers to search through the data set and select one or more instances of the class that satisfy at least one criterion;
(d) receiving, using the hardware processor, an indication over the Internet that the one or more instances from the data set have been selected by at least one of the plurality of human reviewers;
(e) training the classification model in the guided learning mode, using the hardware processor, with the one or more selected instances;
(f) estimating, using the hardware processor, a performance of the classification model after training the classification model with the one or more selected instances to construct a learning curve;
(g) determining, using the hardware processor, a rate of change in the estimated performance of the classification model as a function of the cost of at least one of the plurality of human reviewers performing an additional search in the guided learning mode, wherein the rate of change in the estimated performance of the classification model is determined based on a slope of the learning curve;
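(h) repeating (b)-(g) until it is determined at (g) that the rate of change in the estimated performance as a function of the cost of the at least one of the plurality of human reviewers performing the additional search in the guided learning mode is lower than a first predetermined threshold;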
(i) switching, using the hardware processor, the classification model from the guided learning mode to an active learning mode, which includes machine learning, that receives labelled instances from portions of the data set selected as being useful to the classification model in response to determining that the rate of change in the estimated performance as a function of the cost of the at least one of the plurality of human reviewers performing the additional search in the guided learning mode is lower than the first predetermined threshold at (h), until the estimated performance of the classification model is greater than a second predetermined threshold or the budget for constructing the classifier has been exhausted, wherein the plurality of human reviewers include human reviewers that are available to label one or more instances of the class of interest during training of the classification model using the active learning mode and human reviewers that are available to search for the one or more instances of the class of interest in response to the instruction transmitted at (c), and wherein the training of the classification model using the active learning mode is discontinued in response to determining that the rate of change in the estimated performance as a function of a cost of labeling performed by the plurality of human reviewers is lower than a third predetermined threshold;
(j) receiving, using the hardware processor, identifying information of a web page to be classified by the trained classification model;
(k) classifying, using the hardware processor, the web page using the trained classification model to determine whether the web page is a member of the class of interest; and
(l) transmitting information indicating whether the web page is a member of the class of interest to an advertiser, over the Internet, in response to receiving a request for classification information about the web page.

2.	(Cancelled)


3.	(Currently Amended) The method of claim 1, further comprising:
allocating, using the hardware processor, a portion of the budget to the plurality of human reviewers available for searching and a remaining portion of the budget to the plurality of human reviewers available for labeling.

4.	(Previously Presented) The method of claim 1, wherein the at least one criterion includes a criterion that the one or more instances to be selected are to be positive examples of the class of interest.

5.	(Previously Presented) The method of claim 1, wherein a subset of the data set is selected for presentation to the plurality of human reviewers by at least one of: uncertainty sampling and boosted disagreement with query-by-committee.

6-7.	(Previously Cancelled) 

8.	(Previously Presented) The method of claim 1, wherein the data set includes online resources containing pointers to the class of interest, and wherein the instructions to search through the data set further comprise instructions to query the online resources for examples of the class of interest that meet the at least one criterion.

9.	(Cancelled)

10.	(Currently Amended) A system for data acquisition for construction of classification models, the system comprising:
a processor that:
(a) receives a budget and a cost structure for constructing a classification model using a data set that includes positive and negative examples of a class of interest, wherein the classification model is set to a guided learning mode that receives selected instances of the class of interest that satisfy at least one criterion from the data set and wherein the cost structure includes a cost of a search performed by a human reviewer;
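(b) in response to the classification model being set to the guided learning mode, transmits a definition of the class of interest over the Internet for review by a plurality of human reviewers;
(c) in response to the classification model being set to the guided learning mode, transmits instructions over the Internet to the plurality of human reviewers to search through the data set and select one or more instances of the class that satisfy at least one criterion;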
(d) receives an indication over the Internet that the one or more instances from the data set have been selected by at least one of the plurality of human reviewers;
(e) trains the classification model in the guided learning mode with the one or more selected instances;
(f) estimates a performance of the classification model after training the classification model with the one or more selected instances to construct a learning curve;
(g) determines a rate of change in the estimated performance of the classification model as a function of the cost of at least one of the plurality of human reviewers performing an additional search in the guided learning mode, wherein the rate of change in the estimated performance of the classification model is determined based on a slope of the learning curve;
(h) repeats (a)-(g) until it is determined at (g) that the rate of change in the estimated performance as a function of the cost of the at least one of the plurality of human reviewers performing the additional search in the guided learning mode is lower than a first predetermined threshold;
(i) switches the classification model from the guided learning mode to an active learning mode, which includes machine learning, that receives labelled instances from portions of the data set selected as being useful to the classification model in response to determining that the rate of change in the estimated performance as a function of the cost of the at least one of the plurality of human reviewers performing the additional search in the guided learning mode is lower than the first predetermined threshold at (h), until the estimated performance of the classification model is greater than a second predetermined threshold or the budget for constructing the classifier has been exhausted, wherein the plurality of human reviewers include human reviewers that are available to label one or more instances of the class of interest during training of the classification model using the active learning mode and human reviewers that are available to search for the one or more instances of the class of interest in response to the instruction transmitted at (c), and wherein the training of the classification model using the active learning mode is discontinued in response to determining that the rate of change in the estimated performance as a function of a cost of labeling performed by the plurality of human reviewers is lower than a third predetermined threshold;
(j) receives identifying information of a web page to be classified by the trained classification model;

(k) classifies the web page using the trained classification model to determine whether the web page is a member of the class of interest; and
(l) transmits information indicating whether the web page is a member of the class of interest to an advertiser, over the Internet, in response to receiving a request for classification information of the web page.

11.	(Cancelled)

12.	(Currently Amended) The system of claim 10, wherein the processor is further configured to:
allocate a portion of the budget to the plurality of human reviewers available for searching and a remaining portion of the budget to the plurality of human reviewers available for labeling.

13.	(Previously Presented) The system of claim 10, wherein the at least one criterion includes a criterion that one or more instances to be selected are to be positive examples of the class of interest.

14.	(Previously Presented) The system of claim 10, wherein a subset of the data set is selected for presentation to the plurality of human reviewers by at least one of: uncertainty sampling and boosted disagreement with query-by-committee.

15-16. 	(Previously Cancelled)

17.	(Previously Presented) The system of claim 10, wherein the data set includes online resources containing pointers to the class of interest, and wherein the processor is further configured to instruct the plurality of human reviewers to query the online resources for examples of the class of interest that meet the at least one criterion.

18.	(Cancelled)


(a) receiving a budget and a cost structure for constructing a classification model using a data set that includes positive and negative examples of a class of interest, wherein the classification model is set to a guided learning mode that receives selected instances of the class of interest that satisfy at least one criterion from the data set and wherein the cost structure includes a cost of a search performed by a human reviewer;
(b) in response to the classification model being set to the guided learning mode, transmitting a definition of the class of interest over the Internet to a plurality of human reviewers;
(c) in response to the classification model being set to the guided learning mode, transmitting instructions to the plurality of human reviewers to search through the data set and select one or more instances of the class that satisfy at least one criterion;
(d) receiving an indication over the Internet that the one or more instances from the data set have been selected by at least one of the plurality of human reviewers;
(e) training the classification model in the guided learning mode with the one or more selected instances;
(f) estimating a performance of the classification model after training the classification model with the one or more selected instances to construct a learning curve;
(g) determining a rate of change in the estimated performance of the classification model as a function of the cost of at least one of the plurality of human reviewers performing an additional search in the guided learning mode, wherein the rate of change in the estimated performance of the classification model is determined based on a slope of the learning curve;
(h) repeating (b)-(g) until it is determined at (g) that the rate of change in the estimated performance as a function of the cost of the at least one of the plurality of human reviewers performing the additional search in the guided learning mode is lower than a first predetermined threshold;
(i) switching, using the hardware processor, the classification model from the guided learning mode to an active learning mode, which includes machine learning, that receives labelled instances from portions of the data set selected as being useful to the classification model in response to determining that the rate of change in the estimated performance as a function of the cost of the at least one of the plurality of human reviewers performing the additional search in the guided learning mode is lower than the first predetermined threshold at (h), until the estimated performance of the classification model is greater than a second predetermined threshold or the budget for constructing the classifier has been exhausted, wherein the plurality of human reviewers include human reviewers that are available to label one or more instances of the class of interest during training of the classification model using the active learning mode and human reviewers that are available to search for the one or more instances of the class of interest in response to the instruction transmitted at (c), and wherein the training of the classification model using the active learning mode is discontinued in response to determining that the rate of change in the estimated performance as a function of a cost of labeling performed by the plurality of human reviewers is lower than a third predetermined threshold;
(j) receiving identifying information of a web page to be classified by the trained classification model;
(k) classifying the web page using the trained classification model to determine whether the web page is a member of the class of interest; and
(l) transmitting information indicating whether the web page is a member of the class of interest to an advertiser, over the Internet, in response to receiving a request for classification information about the web page.
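
For illustration only, the control flow recited in steps (e)-(i) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the applicant's implementation: the names LearningCurve and acquire, the per-round costs, and the simulated performance gains are all hypothetical, and model training and human review are replaced by a supplied sequence of gains. The sketch records a learning curve of performance against cumulative cost, switches from the guided learning mode to the active learning mode when the slope falls below a first threshold, and stops when performance exceeds a second threshold, the labeling slope falls below a third threshold, or the budget is exhausted.

```python
from dataclasses import dataclass, field


@dataclass
class LearningCurve:
    """Hypothetical record of (cumulative cost, estimated performance) points."""
    points: list = field(default_factory=lambda: [(0.0, 0.0)])

    def add(self, cost_spent, performance):
        self.points.append((cost_spent, performance))

    def slope(self):
        """Performance gained per unit of cost over the most recent round."""
        (c0, p0), (c1, p1) = self.points[-2], self.points[-1]
        return (p1 - p0) / (c1 - c0) if c1 > c0 else 0.0


def acquire(budget, search_cost, label_cost, gains, t1, t2, t3):
    """Simulate the guided-then-active acquisition loop.

    gains is an assumed dict mapping "guided" / "active" to an iterator of
    per-round performance gains; it stands in for real training and review.
    Returns the list of modes used per round and the final performance.
    """
    curve = LearningCurve()
    spent, perf, mode, history = 0.0, 0.0, "guided", []
    while True:
        cost = search_cost if mode == "guided" else label_cost
        if spent + cost > budget:
            break  # budget exhausted
        spent += cost
        perf += next(gains[mode])  # stand-in for train + estimate performance
        curve.add(spent, perf)
        history.append(mode)
        if mode == "guided":
            if curve.slope() < t1:  # first threshold: searching no longer pays
                mode = "active"
        elif perf > t2 or curve.slope() < t3:  # second / third thresholds
            break
    return history, perf


if __name__ == "__main__":
    simulated = {
        "guided": iter([0.30, 0.20, 0.02, 0.01]),
        "active": iter([0.10] * 20),
    }
    print(acquire(20.0, 2.0, 1.0, simulated, t1=0.05, t2=0.8, t3=0.01))
```

In this sketch the slope is taken over the most recent round only; a smoothed slope over several rounds would be an equally plausible reading of "a slope of the learning curve".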

Reasons for Allowance

The claimed invention as set forth in the Examiner’s amendment overcomes the rejections under 35 U.S.C. 101 and 103; accordingly, the rejections have been withdrawn.
The claimed invention as set forth in the Examiner’s amendment above provides a practical application similar to that in Diamond v. Diehr by incorporating performance monitoring and switching between the guided and active machine learning modes.
Closest prior art(s) to the claimed invention include(s) Privault et al. (US 2010/0312725 A1), Horvitz et al.  (US 2010/0332281 A1), Hakkani-Tur et al. (US 7,835,910 B1), and Nigam et al. (US 2005/0210065 A1). None of the prior art(s) alone or 
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled "Comments on Statements of Reasons for Allowance".

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ayanna Minor whose telephone number is (571)272-3605. The examiner can normally be reached M-F, 9 am-5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jerry O'Connor, can be reached at 571-272-6787. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-

/A.M./Examiner, Art Unit 3624



/MEHMET YESILDAG/Primary Examiner, Art Unit 3624