Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
1.	This action is responsive to AFCP remarks filed 5/26/2022.
Response to Amendment
2.	Independent claims 1, 10, and 16 have been amended.
Response to Arguments
3.	Applicant's arguments filed 5/26/2022 have been considered and are persuasive.
Allowable Subject Matter
4.	Claims 1-20 are allowed.
5.	The following is an examiner’s statement of reasons for allowance: the claims are allowed because they recite:
A method of spoken language understanding, the method comprising: 
receiving audio data for a spoken language expression; 
encoding the audio data using a basic encoder of a multi-stage encoder to obtain character features, wherein the basic encoder is trained to generate the character features during a first training phase by appending a softmax layer to the basic encoder and comparing an output of the softmax layer to ground-truth character training data, and wherein the softmax layer is removed prior to encoding the audio data; 
encoding the character features using a sequential encoder of the multi-stage encoder to obtain token features, wherein the sequential encoder is trained to generate the token features during a second training phase based on ground-truth token training data; and 
decoding the token features to generate semantic information representing the spoken language expression.
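
For illustration only, the two training phases recited above can be sketched in Python. This is a minimal toy under stated assumptions, not the applicant's implementation. All names, the one-hot "audio frames", the linear layers, the fixed recurrence, the cross-entropy loss used in the second phase, the choice to keep the basic encoder frozen during the second phase, and the semantic labels are hypothetical and do not come from the record.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def linear(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def ce_step(W, x, target, lr=0.5):
    """Apply one cross-entropy gradient step to W through a temporary
    softmax and return the loss measured before the update."""
    probs = softmax(linear(W, x))
    for k, row in enumerate(W):
        grad = probs[k] - (1.0 if k == target else 0.0)
        for j, xj in enumerate(x):
            row[j] -= lr * grad * xj
    return -math.log(probs[target])

# Hypothetical toy data: one 3-dim "audio frame" per character id.
FRAMES = {0: [1.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [0.0, 0.0, 1.0]}
# Hypothetical utterances: (character ids in order, ground-truth token id).
UTTERANCES = [([0, 1], 0), ([1, 2], 1)]
SEMANTICS = {0: "play_music", 1: "set_alarm"}  # assumed semantic labels

# Phase 1: a softmax is appended to the basic encoder and its output is
# compared with ground-truth character labels.
W_basic = [[0.0] * 3 for _ in range(3)]
phase1_losses = []
for _ in range(20):
    phase1_losses.append(sum(ce_step(W_basic, x, c) for c, x in FRAMES.items()))

def basic_encoder(frame):
    # Softmax removed: return the pre-softmax character features.
    return linear(W_basic, frame)

def recurrent_state(frames):
    # Assumed fixed recurrence h_t = 0.5 * h_(t-1) + c_t over character features.
    h = [0.0, 0.0, 0.0]
    for frame in frames:
        h = [0.5 * hi + ci for hi, ci in zip(h, basic_encoder(frame))]
    return h

# Phase 2: train the sequential encoder on ground-truth token labels.
# The basic encoder weights are left untouched here (an assumption).
W_seq = [[0.0] * 3 for _ in range(2)]
for _ in range(20):
    for chars, token in UTTERANCES:
        ce_step(W_seq, recurrent_state([FRAMES[c] for c in chars]), token)

def sequential_encoder(frames):
    # Token features: one score per token id.
    return linear(W_seq, recurrent_state(frames))

def understand(frames):
    # Decode token features into a semantic label via argmax lookup.
    token_features = sequential_encoder(frames)
    best = max(range(len(token_features)), key=token_features.__getitem__)
    return SEMANTICS[best]
```

In this sketch, the softmax exists only inside ce_step during training, so basic_encoder passes raw character features to the sequential stage, which mirrors the recited removal of the softmax layer prior to encoding. Calling understand([FRAMES[0], FRAMES[1]]) on the toy data returns the label assumed for token 0.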

Regarding claim 1, Sypniewski, the closest art of record, teaches a method of spoken language understanding (abstract: systems and methods for speech recognition and classification; 91 end-to-end speech classification), the method comprising: 
receiving audio data for a spoken language expression (7; 57 audio input); 
encoding the audio data using a multi-stage encoder comprising a basic encoder and a sequential encoder, wherein the basic encoder is trained to generate character features during a first training phase and the sequential encoder is trained to generate token features during a second training phase, wherein the first training phase is based on ground-truth character features, and wherein the second training phase is based on ground-truth token features (abstract: multiple neural networks to form an end-to-end neural network; 56; 65 CNN can encode, CNN receives audio input … processes audio features to determine first set of features; phoneme representation; 
56; 80 RNN … sequential … receives acoustic features and outputs features related to words; 
[0109] Turning to the method of training the neural networks, in some embodiments, all layers and stacks of an end-to-end speech recognition system 200, end-to-end speech classification system 700, or end-to-end phoneme recognition system 800 are… trained … based on training data that contains audio and an associated ground-truth output; 
56: the features produced by CNN stack 202 are entirely learned in the training process, and RNN stack 204 learns relationships between sounds and words through training as well); and 
decoding the token features to generate semantic information representing the spoken language expression (7 semantic; 91-94; 91 end-to-end speech classification, semantic topic; 94: output neural network stack…probability spoken words corresponds to associated classification).  
Sypniewski teaches end-to-end neural networks for speech recognition and classification that determine acoustic features, word features, and semantic information.  Training is performed using ground truth information.  Paragraph 56 also teaches the features produced by CNN stack 202 are entirely learned in the training process, and RNN stack 204 learns relationships between sounds and words through training as well.

	Closely related NPL prior art teaches: 

Haghani:
(abstract) Conventional spoken language understanding systems consist of two main components: an automatic speech recognition module that converts audio to a transcript, and a natural language understanding module that transforms the resulting text (or top N hypotheses) into a set of domains, intents, and arguments. These modules are typically optimized independently. In this paper, we formulate audio to semantic understanding as a sequence-to-sequence problem [1]. We propose and compare various encoder-decoder based approaches that optimize both modules jointly, in an end-to-end manner.

Serdyuk teaches end-to-end spoken language understanding, and Chen teaches spoken language understanding without speech recognition.

	However, none of the closest references of record specifically teaches:
encoding the audio data using a basic encoder of a multi-stage encoder to obtain character features, wherein the basic encoder is trained to generate the character features during a first training phase by appending a softmax layer to the basic encoder and comparing an output of the softmax layer to ground-truth character training data, and wherein the softmax layer is removed prior to encoding the audio data; 
encoding the character features using a sequential encoder of the multi-stage encoder to obtain token features, wherein the sequential encoder is trained to generate the token features during a second training phase based on ground-truth token training data.

Therefore, the closest art of record does not teach or render obvious the limitations of the claim.

The additional independent claims are allowed for rationale and reasoning similar to that applied to claim 1.
The dependent claims are allowed as they further limit the parent claims.

6.	Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAUN A ROBERTS whose telephone number is (571)270-7541.  The examiner can normally be reached Monday-Friday 9-5 EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached at 571-272-7516.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHAUN ROBERTS/
Primary Examiner, Art Unit 2655