DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 13 January 2021 in reference to application 17/148,020.  Claims 1-10 and 12-21 are pending and have been examined.

Response to Amendment
The amendment filed 19 January 2021 has been accepted and considered in this office action.  Claim 11 has been cancelled and claims 12-21 added.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-6, 10, 12-14 and 17-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Claim(s) 1, 10 and 17 recite(s) identifying attributes of input text, where the attribute can be different values, recognizing bias with respect to the attribute, and generating output text corresponding to the attribute imparting diversity and based on an optimization function.  The limitation of identifying attributes of input text, where the attribute can be different values, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than generic computer components such as “a memory” and “a processor” in claim 10 and “computer readable storage medium” in claim 17 nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the generic computer components, “identifying” in the context of this claim encompasses the user manually reading text data and marking attributes that could be changed to different values.
Similarly, the limitation of recognizing bias in the attributes, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than generic computer components such as “a memory” and “a processor” in claim 10 and “computer readable storage medium” in claim 17 nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the generic computer components, “recognizing” in the context of this claim encompasses the user manually tallying the value of attributes and identifying bias within those values.
Finally, the limitation of generating output text, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, other than generic computer components such as “a memory” and “a processor” in claim 10 and “computer readable storage medium” in claim 17 nothing in the claim element precludes the step from practically being performed in the mind. For example, but for the generic computer components, “generating” in the context of this claim encompasses the user manually creating new values for the attributes while choosing values that reduce a measure of loss. Accordingly the claims recite an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the additional elements claimed are generic computer components such as “a memory” and “a processor” in claim 10 and “computer readable storage medium” in claim 17 used to perform the various steps. The computer components are recited at a high level of generality (i.e., as a generic processor performing a generic computer function of manipulating information based on a cost function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of using computer components to perform the claimed steps amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. These claims are not patent eligible.

Claims 2, 12, and 18 are rejected as well as being directed toward an abstract idea.  These claims specify that recognizing bias is performed by determining whether possible values are present in the attributes of the training data.  Similar to above, a user could manually tally values of the training data and make the determination.  Additionally, based on the same rationale discussed above, these claims are not integrated into a practical application or amount to significantly more than the judicial exception.  These claims are not patent eligible. 

Claims 3, 4, 13, and 19 are rejected as well as being directed toward an abstract idea.  These claims specify that the input text is a sentence and the output text is a sentence, where the output text contains different attribute values.  Similar to above, a user could manually perform the steps claimed to text in sentence form.  Additionally, based on the same rationale discussed above, these claims are not integrated into a practical application or amount to significantly more than the judicial exception.  These claims are not patent eligible. 

Claims 5, 14, and 20 are rejected as well as being directed toward an abstract idea.  These claims specify adding the output to the corpus of training text to use to train a machine learning model.  Similar to above, a user could manually write the additional sentences into a training corpus.  Note that actually training the model is not claimed.  Additionally, based on the same rationale discussed above, these claims are not integrated into a practical application or amount to significantly more than the judicial exception.  These claims are not patent eligible. 

Claim 6 is rejected as well as being directed toward an abstract idea.  This claim specifies comparing the output to a bias model to evaluate performance.  Similar to above, a user could manually compare output values.  Additionally, based on the same rationale discussed above, this claim is not integrated into a practical application and does not amount to significantly more than the judicial exception.  This claim is not patent eligible.

Claims 7-9, 15-16, and 21 are NOT rejected as being directed toward a judicial exception because the calculation of the specified optimization function could not be meaningfully performed manually, and these claims therefore amount to significantly more than the abstract idea.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 10, 12-14, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (US Patent 10,796,104) in view of Beaver (US PAP 2020/0143794) and further in view of Wu et al. (Conditional BERT Contextual Augmentation).

Consider claim 1, Lee teaches a method (abstract) comprising: 
identifying an attribute of input text, the input text selected from a machine learning model training corpus, the attribute comprising one or more words of the input text, and the attribute corresponding to an attribute class encompassing a plurality of different possible class values (col 11 lines 7-36, identifying slots which can have different values other than the one in the training example); and 
generating output text corresponding to the attribute and imparting diversity with respect to the attribute class and relative to the input text (col 13 lines 10-57, identified slots have their values replaced with random values in order to impart diversity).
Lee does not specifically teach recognizing bias in the input text with respect to the attribute class.
In the same field of preparing training data, Beaver teaches recognizing bias in the input text with respect to the attribute class (0019-23, detecting bias within training data).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to detect bias as taught by Beaver in the system of Lee in order to generate a corpus that results in more accurately trained models (Beaver 0003-04).
Lee and Beaver do not specifically teach wherein generating the output text uses an optimization function based on a plurality of loss objectives to minimize loss in the generated output text as compared to the input text.
In the same field of data augmentation, Wu teaches wherein generating the output text uses an optimization function based on a plurality of loss objectives to minimize loss in the generated output text as compared to the input text (section 3, especially section 3.3, replacement for masked (i.e. slot to be replaced) word is generated based on loss functions comparing to labels of original sentence).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use cost optimization as taught by Wu in the system of Lee and Beaver in order to allow for context to be maintained in the augmented corpus (Wu abstract and introduction).

Consider claim 2, Beaver teaches the method of claim 1, wherein recognizing the bias comprises identifying that one possible class value of the plurality of different possible class values of the attribute class is represented by the input text and at least one other possible class value of the plurality of different possible class values of the attribute class is not represented by the input text (0023, single class bias, where one possible value is over or under represented).
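
For illustration only, the determination recited in claim 2 (one possible class value represented, at least one other not represented) can be sketched as a simple tally. This is a minimal sketch and not a characterization of how Beaver or the application implements the step; the function name `unrepresented_class_values`, the attribute class, and the example values are hypothetical.

```python
from collections import Counter


def unrepresented_class_values(observed_values, possible_values):
    """Return the possible class values that never occur among the observed attributes."""
    counts = Counter(observed_values)
    # Counter returns 0 for keys it has never seen, so no KeyError is raised.
    return [value for value in possible_values if counts[value] == 0]


# Hypothetical attribute class with three possible class values.
possible = ["he", "she", "they"]
# Hypothetical attribute values tallied from the input text.
observed = ["he", "he", "he"]

missing = unrepresented_class_values(observed, possible)
# Bias in the sense of claim 2: at least one value is present and at least one is absent.
biased = 0 < len(missing) < len(possible)
```

Under these assumed inputs, `missing` holds the two class values that the input text never uses, and `biased` is true.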

Consider claim 3, Lee teaches the method of claim 1, wherein the input text comprises a sentence and wherein the generated output text comprises one or more output sentences (figure 3, col 13 lines 60-67, text manipulated in sentence form).

Consider claim 4, Lee and Wu teach the method of claim 3, wherein each output sentence of the one or more output sentences represents a respective different possible class value of the plurality of different possible class values of the attribute class (Lee col 13 line 45 to col 14 line 17, slots replaced with either random values or values from a list.  Wu section 3, especially section 3.3, replacement for masked (i.e. slot to be replaced) word is generated based on loss functions comparing to labels of original sentence, and thus is a different value with the same context as the original sentence).
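
For illustration only, producing one output sentence per different possible class value can be sketched as a slot substitution over a list of values. This is a minimal sketch under stated assumptions: it uses plain list substitution rather than the random replacement described in Lee or the conditional masked-word prediction described in Wu, and the function name `diversify`, the slot position, and the example sentence are hypothetical.

```python
def diversify(sentence_tokens, slot_index, possible_values):
    """Produce one output sentence per class value other than the original slot value."""
    original = sentence_tokens[slot_index]
    outputs = []
    for value in possible_values:
        if value == original:
            continue  # the input sentence already represents this class value
        new_tokens = list(sentence_tokens)  # copy so the input is left unchanged
        new_tokens[slot_index] = value
        outputs.append(" ".join(new_tokens))
    return outputs


# Hypothetical input sentence whose attribute (slot) is the first token.
tokens = ["she", "will", "arrive", "late"]
outputs = diversify(tokens, 0, ["he", "she", "they"])
```

Each returned sentence differs from the input only at the slot, so the surrounding context is preserved while a different class value is represented.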

Consider claim 5, Lee teaches the method of claim 1, further comprising including the generated output text in the machine learning model training corpus to facilitate debiased training of a machine learning model using the machine learning model training corpus (col 14 lines 32-46, training a learning model using the artificially diversified corpus).

Consider claim 6, the above combination of Lee, Beaver, and Wu teaches the method of claim 1, but does not specifically teach further comprising using the generated output text as a baseline for comparison against output of a training corpus debiasing model to evaluate performance of the training corpus debiasing model in debiasing the machine learning model training corpus.
Wu additionally teaches using the generated output text as a baseline for comparison against output of a training corpus debiasing model to evaluate performance of the training corpus debiasing model in debiasing the machine learning model training corpus (section 4.2.3 and 4.2.4, comparing generated sentences to those output by other models).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to compare the generated output to that of other models as taught by Wu in the system of Lee, Beaver, and Wu in order to ensure the data augmentation is meeting desired benchmarks of quality.

Consider claim 10, Lee teaches a computer system (abstract) comprising: 
a memory (col 14 lines 40-44, RAM, ROM); and 
a processor in communication with the memory (col 14 lines 54-57, processor), wherein the computer system is configured to perform a method comprising: 
identifying an attribute of input text, the input text selected from a machine learning model training corpus, the attribute comprising one or more words of the input text, and the attribute corresponding to an attribute class encompassing a plurality of different possible class values (col 11 lines 7-36, identifying slots which can have different values other than the one in the training example); and 
generating output text corresponding to the attribute and imparting diversity with respect to the attribute class and relative to the input text (col 13 lines 10-57, identified slots have their values replaced with random values in order to impart diversity).
Lee does not specifically teach recognizing bias in the input text with respect to the attribute class.
In the same field of preparing training data, Beaver teaches recognizing bias in the input text with respect to the attribute class (0019-23, detecting bias within training data).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to detect bias as taught by Beaver in the system of Lee in order to generate a corpus that results in more accurately trained models (Beaver 0003-04).
Lee and Beaver do not specifically teach wherein generating the output text uses an optimization function based on a plurality of loss objectives to minimize loss in the generated output text as compared to the input text.
In the same field of data augmentation, Wu teaches wherein generating the output text uses an optimization function based on a plurality of loss objectives to minimize loss in the generated output text as compared to the input text (section 3, especially section 3.3, replacement for masked (i.e. slot to be replaced) word is generated based on loss functions comparing to labels of original sentence).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use cost optimization as taught by Wu in the system of Lee and Beaver in order to allow for context to be maintained in the augmented corpus (Wu abstract and introduction).

Claim 12 contains similar limitations as claim 2 and is therefore rejected for the same reasons. 

Claim 13 contains similar limitations as claim 4 and is therefore rejected for the same reasons. 

Claim 14 contains similar limitations as claim 5 and is therefore rejected for the same reasons. 

Consider claim 17, Lee teaches a computer program product comprising: 
a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing (col 14 lines 40-44, RAM, ROM) a method comprising:  
identifying an attribute of input text, the input text selected from a machine learning model training corpus, the attribute comprising one or more words of the input text, and the attribute corresponding to an attribute class encompassing a plurality of different possible class values (col 11 lines 7-36, identifying slots which can have different values other than the one in the training example); and 
generating output text corresponding to the attribute and imparting diversity with respect to the attribute class and relative to the input text (col 13 lines 10-57, identified slots have their values replaced with random values in order to impart diversity).
Lee does not specifically teach recognizing bias in the input text with respect to the attribute class.
In the same field of preparing training data, Beaver teaches recognizing bias in the input text with respect to the attribute class (0019-23, detecting bias within training data).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to detect bias as taught by Beaver in the system of Lee in order to generate a corpus that results in more accurately trained models (Beaver 0003-04).
Lee and Beaver do not specifically teach wherein generating the output text uses an optimization function based on a plurality of loss objectives to minimize loss in the generated output text as compared to the input text.
In the same field of data augmentation, Wu teaches wherein generating the output text uses an optimization function based on a plurality of loss objectives to minimize loss in the generated output text as compared to the input text (section 3, especially section 3.3, replacement for masked (i.e. slot to be replaced) word is generated based on loss functions comparing to labels of original sentence).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to use cost optimization as taught by Wu in the system of Lee and Beaver in order to allow for context to be maintained in the augmented corpus (Wu abstract and introduction).

Claim 18 contains similar limitations as claim 2 and is therefore rejected for the same reasons. 

Claim 19 contains similar limitations as claim 4 and is therefore rejected for the same reasons. 

Claim 20 contains similar limitations as claim 5 and is therefore rejected for the same reasons. 

Allowable Subject Matter
Claims 7-9 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Consider claim 7, Lee, Beaver, and Wu teach the method of claim 1. However, the prior art of record does not teach or fairly suggest the limitations of “wherein the plurality of loss objectives comprises (i) a label loss corresponding to cross entropy in the generated output text as compared to the input text, (ii) a proximity loss corresponding to loss of proximity of the generated output text as compared to the input text, (iii) an attribute loss corresponding to loss of attentiveness to the identified attribute in the generated output text as compared to the input text, and (iv) diversity loss corresponding to overlap in diversity with respect to the attribute class of the generated output text as compared to the input text” when combined with each and every other limitation of the claim and the base claim.  Therefore claim 7 contains allowable subject matter.
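
For illustration only, an optimization function based on the four named loss objectives can be sketched as a weighted sum that is minimized over candidate outputs. The claim language quoted above does not state how the objectives are combined, so the weighted sum, the equal default weights, the function name `combined_loss`, and the numeric loss values below are all assumptions made for this sketch.

```python
def combined_loss(label_loss, proximity_loss, attribute_loss, diversity_loss,
                  weights=(1.0, 1.0, 1.0, 1.0)):
    """Combine the four loss objectives into one scalar (assumed: weighted sum)."""
    losses = (label_loss, proximity_loss, attribute_loss, diversity_loss)
    return sum(w * l for w, l in zip(weights, losses))


# Hypothetical per-candidate values, ordered as
# (label loss, proximity loss, attribute loss, diversity loss).
candidates = {
    "candidate_a": (0.2, 0.1, 0.3, 0.4),
    "candidate_b": (0.1, 0.1, 0.2, 0.1),
}

# Select the candidate output text whose combined loss is smallest.
best = min(candidates, key=lambda name: combined_loss(*candidates[name]))
```

With these assumed values the second candidate is selected, because its combined loss is lower than that of the first.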

Claims 8 and 9 depend on and further limit claim 7 and therefore contain allowable subject matter as well.

Claim 15 contains similar limitations as claim 7 and therefore contains allowable subject matter as well.

Claim 16 depends on and further limits claim 15 and therefore contains allowable subject matter as well. 

Claim 21 contains similar limitations as claim 7 and therefore contains allowable subject matter as well.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Garimella et al. (US PAP 2022/0147713) discusses removing bias in textual models.  Gadde et al. (US PAP 2021/0390951) discusses data augmentation for slots as well.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451. The examiner can normally be reached 6:30am-5pm Monday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached on (571)272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

DOUGLAS GODBOLD
Examiner
Art Unit 2655



/DOUGLAS GODBOLD/Primary Examiner, Art Unit 2655