Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

EXAMINER'S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

Authorization for this examiner’s amendment was given in an interview with Ariyeh Akmal on 5/2/22.

The application has been amended as follows: 
Claim 1
A system for document analysis comprising: 
a processor; 
a data store, comprising a first target corpus of electronic documents and a second target corpus of electronic documents; and 
a non-transitory computer readable medium comprising instructions for: 
receiving a definition of a first code in association with the first target corpus; 
creating a first dataset for the first code, wherein the first dataset comprises documents from the first target corpus; 
receiving an indication that the first code is to be boosted with a second code, wherein the second code is associated with the second target corpus and the second code is associated with a second dataset, the second dataset comprising a first set of positive signals associated with the second code and documents of the second corpus and a first set of negative signals associated with the second code and documents of the second target corpus, wherein a positive signal of the first set of positive signals indicates that a first associated document of the second target corpus belongs to the second code and a negative signal of the first set of negative signals indicates that a second associated document of the second target corpus does not belong to the second code; 
adding the second dataset associated with the second code and the second target corpus to the first dataset of the first code such that the first dataset comprises a boosting dataset including the second dataset comprising the first set of positive signals associated with the second code and documents of the second corpus and the first set of negative signals associated with the second code and documents of the second target corpus; 
training a first machine learning model for the first code on the boosting dataset of the first dataset, including training the first machine learning model based on the positive signal, the first associated document of the second target corpus, the [ first ] negative signal, and the second associated document of the second target corpus of the boosting dataset; 
generating predictive scores for the first code for documents of the first target corpus using the first machine learning model; and 
presenting the predictive scores in association with documents of the first target corpus to a user.

Claim 8
A method for document analysis comprising: 
receiving a definition of a first code in association with the first target corpus; 
creating a first dataset for the first code, wherein the first dataset comprises documents from the first target corpus; 
receiving an indication that the first code is to be boosted with a second code, wherein the second code is associated with the second target corpus and the second code is associated with a second dataset, the second dataset comprising a first set of positive signals associated with the second code and documents of the second corpus and a first set of negative signals associated with the second code and documents of the second target corpus, wherein a positive signal of the first set of positive signals indicates that a first associated document of the second target corpus belongs to the second code and a negative signal of the first set of negative signals indicates that a second associated document of the second target corpus does not belong to the second code; 
adding the second dataset associated with the second code and the second target corpus to the first dataset of the first code such that the first dataset comprises a boosting dataset including the second dataset comprising the first set of positive signals associated with the second code and documents of the second corpus and the first set of negative signals associated with the second code and documents of the second target corpus; 
training a first machine learning model for the first code on the boosting dataset of the first dataset, including training the first machine learning model based on the positive signal, the first associated document of the second target corpus, the [ first ] negative signal, and the second associated document of the second target corpus of the boosting dataset; 
generating predictive scores for the first code for documents of the first target corpus using the first machine learning model; and 
presenting the predictive scores in association with documents of the first target corpus to a user.

Claim 15
A non-transitory computer readable medium, comprising instructions for: 
receiving a definition of a first code in association with the first target corpus; 
creating a first dataset for the first code, wherein the first dataset comprises documents from the first target corpus; 
receiving an indication that the first code is to be boosted with a second code, wherein the second code is associated with the second target corpus and the second code is associated with a second dataset, the second dataset comprising a first set of positive signals associated with the second code and documents of the second corpus and a first set of negative signals associated with the second code and documents of the second target corpus, wherein a positive signal of the first set of positive signals indicates that a first associated document of the second target corpus belongs to the second code and a negative signal of the first set of negative signals indicates that a second associated document of the second target corpus does not belong to the second code; 
adding the second dataset associated with the second code and the second target corpus to the first dataset of the first code such that the first dataset comprises a boosting dataset including the second dataset comprising the first set of positive signals associated with the second code and documents of the second corpus and the first set of negative signals associated with the second code and documents of the second target corpus; 
training a first machine learning model for the first code on the boosting dataset of the first dataset, including training the first machine learning model based on the positive signal, the first associated document of the second target corpus, the [ first ] negative signal, and the second associated document of the second target corpus of the boosting dataset; 
generating predictive scores for the first code for documents of the first target corpus using the first machine learning model; and 
presenting the predictive scores in association with documents of the first target corpus to a user.
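
For illustration only, and forming no part of the claims or of the record, the steps recited in claims 1, 8 and 15 can be sketched as follows. Every name below (build_boosting_dataset, train, score, the sample documents) is hypothetical. The claims do not specify a model type, so a small naive Bayes text classifier is assumed here purely as a stand-in, and the first dataset is assumed to hold no reviewer signals yet.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def build_boosting_dataset(first_dataset, second_dataset):
    # Add the second code's positive and negative signals to the first code's dataset.
    return list(first_dataset) + list(second_dataset)

def train(dataset):
    # dataset: list of (document_text, belongs_to_code) pairs.
    counts = {True: Counter(), False: Counter()}
    docs = {True: 0, False: 0}
    for text, belongs in dataset:
        counts[belongs].update(tokenize(text))
        docs[belongs] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab

def score(model, text):
    # Predictive score in (0, 1): estimated probability the document belongs to the code.
    counts, docs, vocab = model
    log_odds = math.log((docs[True] + 1) / (docs[False] + 1))
    for label, sign in ((True, 1), (False, -1)):
        total = sum(counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                smoothed = (counts[label][tok] + 1) / (total + len(vocab))
                log_odds += sign * math.log(smoothed)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical first target corpus (documents to be scored for the first code).
first_corpus = ["merger agreement draft terms", "lunch menu for friday"]

# Hypothetical second dataset: signals for the second code on second-corpus documents.
second_dataset = [
    ("signed merger agreement", True),     # positive signal
    ("agreement on merger terms", True),   # positive signal
    ("friday lunch order", False),         # negative signal
    ("office lunch menu", False),          # negative signal
]

first_dataset = []  # assumption: no signals recorded yet for the first code
boosting_dataset = build_boosting_dataset(first_dataset, second_dataset)
model = train(boosting_dataset)

# Generate and present predictive scores for documents of the first corpus.
scores = {doc: score(model, doc) for doc in first_corpus}
for doc, value in scores.items():
    print(f"{value:.3f}  {doc}")
```

In this sketch the model for the first code is trained entirely on signals borrowed from the second code, which is the situation the boosting step appears intended to address.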



Reasons for Allowance
The following is an examiner’s statement of reasons for allowance:
The prior art does not reasonably teach or suggest a document analysis system operable to train a first machine learning model based on a first defined code functional to create a first data set comprising documents from a first target corpus wherein the first code is boosted with a second defined code associated with a second target corpus; the second corpus comprising a second dataset associated therewith; the second dataset comprising a positive signal within a first set of positive signals and a negative signal within a first set of negative signals associated with the second code, corpus, etc., the signals operable to define an association between second corpus documents and the second code, tag, label, etc., wherein the machine learning model is iteratively trained based on the boosting dataset which includes the second dataset comprising the first set of positive signals associated with the second code and documents of the second corpus and the first set of negative signals associated with the second code and documents of the second target corpus and further includes the positive signal, the first associated document of the second target corpus, the negative signal, and the second associated document of the second corpus and wherein the trained model operates to predict scores of a first code for documents of the first corpus.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL C MCCORD whose telephone number is (571)270-3701. The examiner can normally be reached 7:30-6:30 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VIVIAN CHIN, can be reached at (571)272-7848. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/PAUL C MCCORD/Primary Examiner, Art Unit 2654