Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This office action is in response to application 16/870,917, which was filed 05/09/20. Claims 1-20 are pending in the application and have been considered.

35 USC § 101 Analysis (NOT A REJECTION)
Each of independent claims 1, 8, and 15 recites “… fine-tuning a language model using a training dataset”. As those skilled in the art of artificial intelligence would have recognized, this step invokes machine learning, which is considered to improve the functioning of the computer itself. Claims 1, 8, and 15 are therefore not directed to an abstract idea. Dependent claims 2-7, 9-14, and 16-20, which depend from and include the machine learning of parent claims 1, 8, and 15 respectively, are also not directed to an abstract idea. This is solely for clarity of the record and is NOT a rejection.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
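A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
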
Claims 1, 8, and 15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Peng et al. (“Data Augmentation for Spoken Language Understanding via Pretrained Models”. arXiv:2004.13952v1 [cs.CL] 29 Apr 2020).

Consider claim 1, Peng discloses a method implemented in a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor (implicit for implementing the AI data structures used in the experiments, Section 4, pages 4-5), the method comprising: fine-tuning a language model using a training dataset (“we finetune SC-GPT on the intent and slot-value labels in the training data”, Section 3, page 3); synthesizing a plurality of samples using the fine-tuned language model (“we augment the dialogue acts in the training data by replacing/inserting/deleting slot values to create more combinations. The finetuned model then generates multiple candidate utterances for each dialogue act”, Section 3, page 3); filtering the plurality of synthesized samples (“As the generated utterances may not always contain the required slot-value labels, we filter them to make sure that each utterance has all the required input slot-values”, Section 3, page 3); and generating an augmented training dataset comprising the training dataset and the filtered plurality of synthesized sentences (“After filtering, around 500 utterance-DA pairs are added to the original training split”, Section 4.2, Data augmentation, page 4).
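
For clarity of the record, the filtering and augmentation steps quoted from Peng can be sketched as follows. This is a minimal illustration only, not Peng's implementation: the data shapes and the function names contains_all_slot_values, filter_candidates, and augment are hypothetical, and a simple case-insensitive substring test is assumed as the check that an utterance "has all the required input slot-values".

```python
from typing import Dict, List, Tuple

# Assumed shape: a dialogue act is (intent, {slot: value}).
DialogueAct = Tuple[str, Dict[str, str]]
# An example pairs a generated or original utterance with its dialogue act.
Example = Tuple[str, DialogueAct]


def contains_all_slot_values(utterance: str, act: DialogueAct) -> bool:
    """Return True if every slot value of the dialogue act appears in the utterance.

    Assumption: a case-insensitive substring match stands in for the filter.
    """
    _, slots = act
    text = utterance.lower()
    return all(value.lower() in text for value in slots.values())


def filter_candidates(candidates: List[Example]) -> List[Example]:
    """Keep only synthesized examples that mention all required slot values."""
    return [(u, a) for u, a in candidates if contains_all_slot_values(u, a)]


def augment(training: List[Example], candidates: List[Example]) -> List[Example]:
    """Return the original training data plus the filtered synthesized examples."""
    return training + filter_candidates(candidates)
```

In this sketch, a candidate utterance that omits any slot value of its dialogue act is discarded, and the surviving utterance-DA pairs are appended to the original training split.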

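Consider claim 8, Peng discloses a system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor (implicit for implementing the AI data structures used in the experiments, Section 4, pages 4-5) to perform: fine-tuning a language model using a training dataset (“we finetune SC-GPT on the intent and slot-value labels in the training data”, Section 3, page 3); synthesizing a plurality of samples using the fine-tuned language model (“we augment the dialogue acts in the training data by replacing/inserting/deleting slot values to create more combinations. The finetuned model then generates multiple candidate utterances for each dialogue act”, Section 3, page 3); filtering the plurality of synthesized samples (“As the generated utterances may not always contain the required slot-value labels, we filter them to make sure that each utterance has all the required input slot-values”, Section 3, page 3); and generating an augmented training dataset comprising the training dataset and the filtered plurality of synthesized sentences (“After filtering, around 500 utterance-DA pairs are added to the original training split”, Section 4.2, Data augmentation, page 4).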


Consider claim 15, Peng discloses a computer program product comprising a non-transitory computer readable storage having program instructions embodied therewith, the program instructions executable by a computer (implicit for implementing the AI data structures used in the experiments, Section 4, pages 4-5), to cause the computer to perform a method comprising: fine-tuning a language model using a training dataset (“we finetune SC-GPT on the intent and slot-value labels in the training data”, Section 3, page 3); synthesizing a plurality of samples using the fine-tuned language model (“we augment the dialogue acts in the training data by replacing/inserting/deleting slot values to create more combinations. The finetuned model then generates multiple candidate utterances for each dialogue act”, Section 3, page 3); filtering the plurality of synthesized samples (“As the generated utterances may not always contain the required slot-value labels, we filter them to make sure that each utterance has all the required input slot-values”, Section 3, page 3); and generating an augmented training dataset comprising the training dataset and the filtered plurality of synthesized sentences (“After filtering, around 500 utterance-DA pairs are added to the original training split”, Section 4.2, Data augmentation, page 4).


Allowable Subject Matter
Claims 2-7, 9-14, and 16-20 are objected to as being dependent on a rejected base claim, but would be allowable if rewritten in independent form including all limitations of the base and any intervening claims.

The following is the examiner’s statement of reasons for indicating subject matter allowable over the prior art of record:

With regard to claim 2, the prior art does not fairly teach or suggest “… concatenating sentences in the training dataset according to: U* = y1 SEP x1 EOS y2 SEP x2 EOS y3 . . . yn SEP xn EOS, wherein SEP is an auxiliary token that separates between a class label and a corresponding sentence, and EOS is a token terminates a sentence and separates it from a label that follows; and synthesizing a plurality of samples comprises: generating a set of new labeled sentences starting from "y SEP"”. Claims 9 and 16 recite similar limitations. Dependent claims 3-7, 10-14, and 17-20 include the allowable subject matter of intervening dependent claims 2, 9, and 16 respectively.
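
For clarity of the record, the concatenation format recited in claim 2 can be illustrated with the following minimal sketch. It is not taken from the application: the token spellings "[SEP]" and "[EOS]", the use of a single space as a joiner, and the function names build_training_sequence and generation_prompt are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical spellings for the auxiliary tokens; the claim names them only SEP and EOS.
SEP = "[SEP]"
EOS = "[EOS]"


def build_training_sequence(examples: List[Tuple[str, str]]) -> str:
    """Concatenate (label, sentence) pairs as y1 SEP x1 EOS y2 SEP x2 EOS ... yn SEP xn EOS."""
    return " ".join(f"{label} {SEP} {sentence} {EOS}" for label, sentence in examples)


def generation_prompt(label: str) -> str:
    """Build the "y SEP" prefix from which a new labeled sentence would be generated."""
    return f"{label} {SEP}"
```

Under these assumptions, build_training_sequence yields the sequence U* used for fine-tuning, and generation_prompt yields the prefix from which the fine-tuned model would continue until it emits the EOS token.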


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 20210142181 A1 Liu discloses adversarial training of machine learning models
US 20200226212 A1 Tan discloses adversarial training data augmentation for text classifiers
US 20180018576 A1 Boyer discloses text classifier training
US 20200065384 A1 Costello discloses determining an intent class based on a second-layer ensemble
US 8165870 B2 Acero discloses a classification filter for processing data for creating a language model

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jesse Pullias whose telephone number is 571/270-5135. The examiner can normally be reached on M-F 8:00 AM - 4:30 PM. The examiner’s fax number is 571/270-6135.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew Flanders, can be reached at 571/272-7516.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained 


/Jesse S Pullias/
Primary Examiner, Art Unit 2655                                   03/08/22