Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in a telephone interview with Lyssa-Michelle Morris (Reg. No. 74794) on 05/10/2022.
The claims have been amended as follows:
Claim 1: (Currently Amended) A computer-implemented method, comprising:
	initializing a model having a sequence to sequence network architecture, wherein the sequence to sequence network architecture comprises an encoder and a decoder; 
	training the model based on a training set comprising a plurality of training sequences, wherein training the model comprises:
generating a bidirectional encoding of each training sequence for the plurality of training sequences, wherein the bidirectional encoding for a training sequence comprises a forward encoding and a backward encoding;
selecting a subset of the bidirectional encodings;
appending an informative padding to each of the selected subset of bidirectional encodings, wherein the informative padding is a random sampling of encoded tokens from the plurality of training sequences;
generating a set of encoder sequences based on the forward encodings in the subset of the bidirectional encodings;
generating a set of decoder sequences based on the backward encodings in the subset of the bidirectional encodings;
prepending a start of sequence token to each of the encoder sequences;
appending an end of sequence token to each of the decoder sequences; and
for each of the selected subset of bidirectional encodings:
training the encoder based on the encoder sequence for a particular bidirectional encoding; and
training the decoder using the corresponding decoder sequence for the particular bidirectional encoding; and 
	generating a prediction based on input data using the trained model.
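For illustration only, the data-preparation steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the token ids `SOS` and `EOS`, the helper names, the choice to treat the backward encoding as the reversed forward encoding, and the use of a fixed `target_len` to decide how much padding to append are all hypothetical and do not appear in the record.

```python
import random

SOS, EOS = 1, 2  # hypothetical ids for the start/end of sequence tokens


def bidirectional_encoding(seq, vocab):
    """Return (forward, backward) token-id lists.

    Assumption: the backward encoding is the forward encoding reversed.
    """
    forward = [vocab[tok] for tok in seq]
    return forward, forward[::-1]


def informative_padding(encoding, token_pool, target_len, rng):
    """Append encoded tokens randomly sampled from the training set
    until the encoding reaches target_len (an assumed stopping rule)."""
    n_pad = max(0, target_len - len(encoding))
    return encoding + [rng.choice(token_pool) for _ in range(n_pad)]


def prepare_training_pairs(training_sequences, vocab, subset_size, target_len, seed=0):
    rng = random.Random(seed)
    # Bidirectional encoding of each training sequence.
    encodings = [bidirectional_encoding(s, vocab) for s in training_sequences]
    # Pool of encoded tokens drawn from the plurality of training sequences.
    token_pool = [tok for fwd, _ in encodings for tok in fwd]
    # Select a subset of the bidirectional encodings.
    subset = rng.sample(encodings, subset_size)
    pairs = []
    for fwd, bwd in subset:
        fwd = informative_padding(fwd, token_pool, target_len, rng)
        bwd = informative_padding(bwd, token_pool, target_len, rng)
        encoder_seq = [SOS] + fwd   # prepend start of sequence token
        decoder_seq = bwd + [EOS]   # append end of sequence token
        pairs.append((encoder_seq, decoder_seq))
    return pairs


# Toy usage with a hypothetical four-token vocabulary.
vocab = {"a": 3, "b": 4, "c": 5, "d": 6}
sequences = [["a", "b", "c"], ["b", "d"], ["c", "a", "d", "b"]]
pairs = prepare_training_pairs(sequences, vocab, subset_size=2, target_len=5)
```

Each resulting pair would then be used to train the encoder on the encoder sequence and the decoder on the corresponding decoder sequence; the training step itself is omitted because the record does not specify a framework.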
Claims 2-10 (Original) No amendments.
Claim 11: (Currently Amended) A device, comprising:
a processor; and
a memory in communication with the processor and storing instructions that, when read by the processor, cause the device to:
initialize a model having a sequence to sequence network architecture, wherein the sequence to sequence network architecture comprises an encoder and a decoder;
	train the model based on a training set comprising a plurality of training sequences, wherein training the model comprises:
generating a bidirectional encoding of each training sequence, wherein the bidirectional encoding for a training sequence comprises a forward encoding and a backward encoding;
selecting a subset of the bidirectional encodings;
appending an informative padding to each of the bidirectional encodings in the subset of bidirectional encodings, wherein the informative padding is a random sampling of encoded tokens from the plurality of training sequences;
generating a set of encoder sequences based on the forward encodings in the subset of the bidirectional encodings, wherein each encoder sequence comprises an attention weight;
generating a set of decoder sequences based on the backward encodings in the subset of the bidirectional encodings;
prepending a start of sequence token to each of the encoder sequences in the set of encoder sequences;
appending an end of sequence token to each of the decoder sequences in the set of decoder sequences; and
for each of the encoder sequences:
training the encoder based on the encoder sequence for a particular bidirectional encoding;
updating the attention weight for each of the encoder sequence based on the training; and
training the decoder using the corresponding decoder sequence for the particular bidirectional encoding; and
generate a prediction based on input data using the trained model.
Claims 12-17 (Original) No amendments. 
Claim 18: (Currently Amended) A computer-implemented method, comprising:
initializing a model having a sequence to sequence network architecture, wherein the sequence to sequence network architecture comprises an encoder and a decoder;
	training the model based on a training set comprising a plurality of training sequences, wherein training the model comprises:
generating a bidirectional encoding of each training sequence, wherein the bidirectional encoding for a training sequence comprises a forward encoding and a backward encoding;
selecting a subset of the bidirectional encodings;
appending an informative padding to each of bidirectional encodings in the selected subset of bidirectional encodings, wherein the informative padding is a random sampling of encoded tokens from the plurality of training sequences;
generating a set of encoder sequences based on the forward encodings in the subset of the bidirectional encodings;
generating a set of decoder sequences based on the backward encodings in the subset of the bidirectional encodings;
prepending a start of sequence token to each of the encoder sequences;
appending an end of sequence token to each of the decoder sequences; and
for each encoding of the encoder sequences:
training the encoder using the set of encoder sequences; and
training the decoder using the set of decoder sequences;
obtaining input data;
generating an input encoding of the input data;
generating an output sequence comprising a start of sequence token;
completing the output sequence by:
generating a next output sequence token by providing the input encoding to the trained model;
appending the next output sequence token to the output sequence;
iteratively generating next output sequence tokens by providing the input encoding to the trained model and appending each generated next output sequence token to the output sequence until the generated subsequent next output sequence token comprises an end of sequence token; and
generating a prediction based on the output sequence.
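For illustration only, the output-sequence completion recited in claim 18 can be sketched as a simple loop. This is a hedged sketch under stated assumptions: the `next_token` callable stands in for the trained model, it is assumed to receive the partial output sequence in addition to the input encoding, and the `max_len` guard is a safety limit that the claim does not recite. The token ids and `toy_model` are hypothetical.

```python
SOS, EOS = 1, 2  # hypothetical ids for the start/end of sequence tokens


def complete_output_sequence(input_encoding, next_token, max_len=50):
    """Start from a start of sequence token and append generated tokens
    until an end of sequence token is produced (or max_len is reached)."""
    output = [SOS]
    while len(output) < max_len:
        token = next_token(input_encoding, output)
        output.append(token)
        if token == EOS:
            break
    return output


def toy_model(input_encoding, output_so_far):
    """Stand-in for a trained model: emits the input reversed, then EOS."""
    position = len(output_so_far) - 1
    reversed_input = input_encoding[::-1]
    return reversed_input[position] if position < len(reversed_input) else EOS


result = complete_output_sequence([3, 4, 5], toy_model)
# result == [1, 5, 4, 3, 2]
```

A prediction would then be derived from the completed output sequence; how that mapping is done is not specified in the record and is left out of the sketch.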
Claims 19-20 (Original) No amendments.
Allowable Subject Matter
Examiner’s Reasons for Allowance
Claims 1-20 are allowed. 
(Claim 1) A computer-implemented method, comprising:
	initializing a model having a sequence to sequence network architecture, wherein the sequence to sequence network architecture comprises an encoder and a decoder; 
	training the model based on a training set comprising a plurality of training sequences, wherein training the model comprises:
generating a bidirectional encoding of each training sequence for the plurality of training sequences, wherein the bidirectional encoding for a training sequence comprises a forward encoding and a backward encoding;
selecting a subset of the bidirectional encodings;
appending an informative padding to each of the selected subset of bidirectional encodings, wherein the informative padding is a random sampling of encoded tokens from the plurality of training sequences;
generating a set of encoder sequences based on the forward encodings in the subset of the bidirectional encodings;
generating a set of decoder sequences based on the backward encodings in the subset of the bidirectional encodings;
prepending a start of sequence token to each of the encoder sequences;
appending an end of sequence token to each of the decoder sequences; and
for each of the selected subset of bidirectional encodings:
training the encoder based on the encoder sequence for a particular bidirectional encoding; and
training the decoder using the corresponding decoder sequence for the particular bidirectional encoding; and 
	generating a prediction based on input data using the trained model.
The following is an examiner's statement of reasons for allowance: Regarding claim 1, the prior art of record, specifically Henderson et al. (US Patent No. 10664527), teaches a model that learns from millions of examples what responses are appropriate in different conversational contexts. The model is used to rank a large index of responses that are known to be relevant, e.g., snippets and photos from reviews about a restaurant. The system then presents the responses visually to the user. As a result, there is no need to engineer a structured ontology or to solve the difficult task of general language generation (Col. 3, lines 48-56).
Wayne et al. (US 10872299) teaches a system that can be used for anomaly detection. For example, the system can be used to generate a database of different sequences of predicted observations of the environment. A previously unseen sequence of observations of the environment can be characterized as an anomaly if it is sufficiently different (according to some appropriate measure) from the sequences of predicted observations in the database. The system includes a memory. The memory is a logical data storage area or a physical data storage device. The data stored in the memory is an ordered collection of numerical values that can be represented as a matrix. At each of the multiple time steps, the system both reads (i.e., extracts) data from the memory and updates the memory. Generally, the system uses the memory to store information for use over multiple time steps (Col. 5, lines 63-67).
However, none of the prior art cited alone or in combination provides the motivation to teach selecting a subset of the bidirectional encodings; appending an informative padding to each of the selected subset of bidirectional encodings, wherein the informative padding is a random sampling of encoded tokens from the plurality of training sequences;
generating a set of encoder sequences based on the forward encodings in the subset of the bidirectional encodings; generating a set of decoder sequences based on the backward encodings in the subset of the bidirectional encodings;
prepending a start of sequence token to each of the encoder sequences;
appending an end of sequence token to each of the decoder sequences; and
for each of the selected subset of bidirectional encodings:
training the encoder based on the encoder sequence for a particular bidirectional encoding; and
training the decoder using the corresponding decoder sequence for the particular bidirectional encoding; and 
	generating a prediction based on input data using the trained model.
(Claim 11) A device, comprising:
a processor; and
a memory in communication with the processor and storing instructions that, when read by the processor, cause the device to:
initialize a model having a sequence to sequence network architecture, wherein the sequence to sequence network architecture comprises an encoder and a decoder;
train the model based on a training set comprising a plurality of training sequences, wherein training the model comprises:
generating a bidirectional encoding of each training sequence, wherein the bidirectional encoding for a training sequence comprises a forward encoding and a backward encoding;
selecting a subset of the bidirectional encodings;
appending an informative padding to each of the bidirectional encodings in the subset of bidirectional encodings;
generating a set of encoder sequences based on the forward encodings in the subset of the bidirectional encodings, wherein each encoder sequence comprises an attention weight;
generating a set of decoder sequences based on the backward encodings in the subset of the bidirectional encodings;
prepending a start of sequence token to each of the encoder sequences in the set of encoder sequences;
appending an end of sequence token to each of the decoder sequences in the set of decoder sequences; and
for each of the encoder sequences:
training the encoder based on the encoder sequence for a particular bidirectional encoding;
updating the attention weight for each of the encoder sequence based on the training; and
training the decoder using the corresponding decoder sequence for the particular bidirectional encoding; and
generate a prediction based on input data using the trained model.
The following is an examiner's statement of reasons for allowance: Regarding claim 11, the prior art of record, specifically Henderson et al. (US Patent No. 10664527), teaches a model that learns from millions of examples what responses are appropriate in different conversational contexts. The model is used to rank a large index of responses that are known to be relevant, e.g., snippets and photos from reviews about a restaurant. The system then presents the responses visually to the user. As a result, there is no need to engineer a structured ontology or to solve the difficult task of general language generation (Col. 3, lines 48-56).
Wayne et al. (US 10872299) teaches a system that can be used for anomaly detection. For example, the system can be used to generate a database of different sequences of predicted observations of the environment. A previously unseen sequence of observations of the environment can be characterized as an anomaly if it is sufficiently different (according to some appropriate measure) from the sequences of predicted observations in the database. The system includes a memory. The memory is a logical data storage area or a physical data storage device. The data stored in the memory is an ordered collection of numerical values that can be represented as a matrix. At each of the multiple time steps, the system both reads (i.e., extracts) data from the memory and updates the memory. Generally, the system uses the memory to store information for use over multiple time steps (Col. 5, lines 63-67).
However, none of the prior art cited alone or in combination provides the motivation to teach generating a set of encoder sequences based on the forward encodings in the subset of the bidirectional encodings, 
wherein each encoder sequence comprises an attention weight; generating a set of decoder sequences based on the backward encodings in the subset of the bidirectional encodings; 
prepending a start of sequence token to each of the encoder sequences in the set of encoder sequences; appending an end of sequence token to each of the decoder sequences in the set of decoder sequences; 
and for each of the encoder sequences: training the encoder based on the encoder sequence for a particular bidirectional encoding;
updating the attention weight for each of the encoder sequence based on the training; and training the decoder using the corresponding decoder sequence for the particular bidirectional encoding; and generate a prediction based on input data using the trained model.
(Claim 18) A computer-implemented method, comprising:
initializing a model having a sequence to sequence network architecture, wherein the sequence to sequence network architecture comprises an encoder and a decoder;
training the model based on a training set comprising a plurality of training sequences, wherein training the model comprises:
generating a bidirectional encoding of each training sequence, wherein the bidirectional encoding for a training sequence comprises a forward encoding and a backward encoding;
selecting a subset of the bidirectional encodings;
appending an informative padding to each of bidirectional encodings in the selected subset of bidirectional encodings;
generating a set of encoder sequences based on the forward encodings in the subset of the bidirectional encodings;
generating a set of decoder sequences based on the backward encodings in the subset of the bidirectional encodings;
prepending a start of sequence token to each of the encoder sequences;
appending an end of sequence token to each of the decoder sequences; and
for each encoding of the encoder sequences:
training the encoder using the set of encoder sequences; and
training the decoder using the set of decoder sequences;
obtaining input data;
generating an input encoding of the input data;
generating an output sequence comprising a start of sequence token;
completing the output sequence by:
generating a next output sequence token by providing the input encoding to the trained model;
appending the next output sequence token to the output sequence;
iteratively generating next output sequence tokens by providing the input encoding to the trained model and appending each generated next output sequence token to the output sequence until the generated subsequent next output sequence token comprises an end of sequence token; and
generating a prediction based on the output sequence.
The following is an examiner's statement of reasons for allowance: Regarding claim 18, the prior art of record, specifically Henderson et al. (US Patent No. 10664527), teaches a model that learns from millions of examples what responses are appropriate in different conversational contexts. The model is used to rank a large index of responses that are known to be relevant, e.g., snippets and photos from reviews about a restaurant. The system then presents the responses visually to the user. As a result, there is no need to engineer a structured ontology or to solve the difficult task of general language generation (Col. 3, lines 48-56).
Wayne et al. (US 10872299) teaches a system that can be used for anomaly detection. For example, the system can be used to generate a database of different sequences of predicted observations of the environment. A previously unseen sequence of observations of the environment can be characterized as an anomaly if it is sufficiently different (according to some appropriate measure) from the sequences of predicted observations in the database. The system includes a memory. The memory is a logical data storage area or a physical data storage device. The data stored in the memory is an ordered collection of numerical values that can be represented as a matrix. At each of the multiple time steps, the system both reads (i.e., extracts) data from the memory and updates the memory. Generally, the system uses the memory to store information for use over multiple time steps (Col. 5, lines 63-67).
However, none of the prior art cited alone or in combination provides the motivation to teach generating a set of decoder sequences based on the backward encodings in the subset of the bidirectional encodings; 
prepending a start of sequence token to each of the encoder sequences; appending an end of sequence token to each of the decoder sequences; 
and for each encoding of the encoder sequences: training the encoder using the set of encoder sequences; 
and training the decoder using the set of decoder sequences; obtaining input data; generating an input encoding of the input data; 
generating an output sequence comprising a start of sequence token; completing the output sequence by: 
generating a next output sequence token by providing the input encoding to the trained model; appending the next output sequence token to the output sequence; 
iteratively generating next output sequence tokens by providing the input encoding to the trained model and appending each generated next output sequence token to the output sequence until the generated subsequent next output sequence token comprises an end of sequence token; 
and generating a prediction based on the output sequence.



Conclusion


Any inquiry concerning this communication or earlier communications from the examiner should be directed to Akwasi M Sarpong, whose telephone number is (571) 270-3438. The examiner can normally be reached Mon-Fri, 8:00 am-4:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KING D POON, can be reached at 571-272-7440. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/AKWASI M SARPONG/
Primary Examiner, Art Unit 2675
05/19/2022