Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claim Rejections - 35 USC § 103
1.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

2.	Claims 1, 3, 6, and 8 are rejected under 35 U.S.C. 103 as being unpatentable over the submitted prior art Keung et al. (Adversarial Learning with Contextual Embeddings for Zero-resource Cross-lingual Classification and NER) in view of West et al. (2020/0257985).
As to claim 1, Keung teaches a computing device for generating a machine-learning model (p. 1355 – the multilingual version of BERT has been shown to perform very well in cross-lingual settings; there are many recent approaches to zero-resource cross-lingual classification and NER, including adversarial learning using a model pre-trained 
West teaches methods, apparatuses, and systems that realize a GAN-based communications system which is evaluated by a discriminator machine learning network ([0005]), the methods being performed by at least one processor, and computer storage devices, each configured to cause at least one operably connected processor to perform the actions of the methods ([0006]).
It would have been obvious before the effective filing date of the claimed invention to incorporate the teachings of West into the teachings of Keung for the purpose of implementing the methods on corresponding computer storage devices, each configured to cause at least one operably connected processor to perform the actions of the methods.
Application No. 16/699,477
As to claims 3 and 8, Keung teaches the computing device of Claim 1 and the method of claim 6, wherein the discriminator network comprises a bidirectional encoder representation from Transformer (BERT) model, and in the discriminator network, the processor is further configured to: generate a plurality of target training word-embedding vectors based on the target training sentence; and input the target training word-embedding vectors to the BERT model (Fig. 1, p. 1356, used the retrained cased 
As to claim 6, Keung teaches a method for a computing device to generate a machine-learning model (p. 1355 – multilingual version of BERT shown the model performs very well in cross lingual settings; there are many recent approaches to zero-resource cross-lingual classification and NER including adversarial learning using a model pre-trained on parallel text), a dictionary data (p. 1355 – multilingual version of BERT trained on Wikipedia from 100 languages and equipped with 110,000 shared word piece vocabulary; labeled English text and unlabeled non-English text are used during training and hyperparameters are selected using English evaluation sets; p. 1356 – use the labeled English data of each corpus; use the non-English text portion without the labels for the adversarial training) and a generative adversarial network (GAN) (Fig. 1, p. 1356, Language adversarial training), the dictionary data comprises a correspondence between a plurality of words of a source language and a plurality of words of a target language (p. 1355-1356), and the GAN comprises a generator network (Fig. 1 and p. 1356 –used the pre-trained cased multilingual BERT model as the initialization for all of our experiments and language adversarial training) and a discriminator network (Fig. 1 and p. 1356 – add a language discriminator module which uses the BERT embeddings to classify whether the input sentence was written in English or the non-English language), the method comprising: inputting, by the 
West teaches methods, apparatuses, and systems that realize a GAN-based communications system which is evaluated by a discriminator machine learning network ([0005]), the methods being performed by at least one processor, and computer storage devices, each configured to cause at least one operably connected processor to perform the actions of the methods ([0006]).

3.	Claims 2 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Keung and West in view of Chaoyin (CN 201811285344 A).
As to claims 2 and 7, Keung teaches the computing device of Claim 1 and the method of claim 6, wherein, in the generator network, the processor is further configured to: generate a training word sequence of the target language according to the source training sentence and the dictionary data; generate a plurality of training word-embedding vectors of the target language according to the training word sequence (p. 1355 – at least contextual word embeddings have been successfully applied to various NLP tasks, including named entity recognition, document classification, and textual entailment; the multilingual version of BERT is trained on Wikipedia articles from 100 languages and equipped with a 110,000 shared word piece vocabulary); and input the training word-embedding vectors to generate the target training sentence (Fig. 1; pages 1355-1356, the addition of a language adversarial task during fine tuning for multilingual BERT can significantly improve the zero-resource cross-lingual transfer performance). Keung does not explicitly discuss that the generator network comprises a Transformer model to generate the target training sentence.
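For illustration only, the first two generator-side steps recited in claims 2 and 7 (producing a target-language word sequence from the dictionary data, then producing word-embedding vectors from that sequence) can be pictured with the minimal sketch below. The dictionary entries, the two-dimensional embedding table, and the function names are hypothetical and are not taken from the application, Keung, West, or Chaoyin; the sketch makes no assumption about how any of them implements these steps.

```python
from typing import Dict, List

# Hypothetical dictionary data: source-language word -> target-language word.
DICTIONARY: Dict[str, str] = {"the": "le", "cat": "chat", "sleeps": "dort"}

# Hypothetical embedding table for target-language words (toy 2-d vectors).
EMBEDDINGS: Dict[str, List[float]] = {
    "le": [0.1, 0.0],
    "chat": [0.7, 0.2],
    "dort": [0.3, 0.9],
}
UNKNOWN: List[float] = [0.0, 0.0]  # fallback vector for out-of-table words


def to_target_word_sequence(source_sentence: str,
                            dictionary: Dict[str, str]) -> List[str]:
    """Word-by-word lookup; words absent from the dictionary pass through."""
    return [dictionary.get(w, w) for w in source_sentence.lower().split()]


def embed(words: List[str],
          table: Dict[str, List[float]]) -> List[List[float]]:
    """Map each target-language word to its embedding vector."""
    return [table.get(w, UNKNOWN) for w in words]


sequence = to_target_word_sequence("The cat sleeps", DICTIONARY)
vectors = embed(sequence, EMBEDDINGS)
```

In the claimed arrangement these vectors would then be input to the generator network to produce the target training sentence; that step is omitted here.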

It would have been obvious before the effective filing date of the claimed invention to incorporate the teachings of Chaoyin into the teachings of Keung and West for the purpose of solving corresponding electromagnetic transient equations according to different models.

4.	Claims 11-16 are rejected under 35 U.S.C. 103 as being unpatentable over Keung et al. in view of Chaoyin (CN 201811285344 A) and Artetxe et al. (Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond) and further in view of West et al. (2020/0257985).
As to claim 11, Keung teaches a machine-translation device (p. 1355 – multilingual version of BERT shown the model performs very well in cross lingual settings; there are many recent approaches to zero-resource cross-lingual classification and NER including adversarial learning using a model pre-trained on parallel text), comprising: a dictionary data (p. 1355 – multilingual version of BERT trained on Wikipedia from 100 languages and equipped with 110,000 shared word piece vocabulary; labeled English text and unlabeled non-English text are used during training and hyperparameters are selected using English evaluation sets; p. 1356 – use the labeled English data of each corpus; use the non-English text portion without the labels for the adversarial training), wherein the dictionary data comprises a correspondence 
Chaoyin teaches a network that comprises a generator model, a transformer model (abstract). It would have been obvious to generate the target sentence via the transformer model for the purpose of attending to different positions of the input sentence to compute a representation of the sentence.
Artetxe teaches generating a word sequence of the target language according to a source sentence of the source language and the dictionary data; generating a plurality of word-embedding vectors of the target language based on the word sequence (p. 3 – a single language agnostic BiLSTM encoder to build sentence embeddings, which is coupled with an auxiliary decoder and trained on parallel corpora; sentence embeddings used to initialize the decoder LSTM through a linear transformation and concatenated to its input embeddings).
West teaches methods, apparatuses, and systems that realize a GAN-based communications system which is evaluated by a discriminator machine learning network ([0005]), the methods being performed by at least one processor, and computer storage devices, 
It would have been obvious before the effective filing date of the claimed invention to incorporate the teachings of Chaoyin into the teachings of Keung for the purpose of solving corresponding electromagnetic transient equations according to different models; to incorporate the teachings of Artetxe into the teachings of Keung and Chaoyin for the purpose of obtaining the best results in zero-shot cross-lingual transfer for all languages; and to incorporate the teachings of West into the teachings of Keung, Chaoyin, and Artetxe for the purpose of implementing the methods on corresponding computer storage devices, each configured to cause at least one operably connected processor to perform the actions of the methods.
As to claim 12, Artetxe teaches the machine-translation device of Claim 11, wherein the processor is further configured to: generate a plurality of word-embedding vectors of the source language based on the word-embedding vectors (p. 1 – universal language agnostic sentence embeddings, that is, vector representations of sentences that are general with respect to two dimensions; p. 3 – a single, language agnostic BiLSTM encoder to build our sentence embeddings, which is coupled with an auxiliary decoder and trained on parallel corpora); input the word-embedding vectors of the source language to a bidirectional encoder representation from Transformer (BERT) model (p. 5 – BERT; training a classifier on top of our multilingual encoder using the combination of the two sentence embeddings), so as to obtain a sentence-embedding vector (sentence embedding); and Keung teaches inputting the training word-embedding vectors to generate the target training sentence (Fig. 1; pages 1355-1356).

West teaches methods, apparatuses, and systems that realize a GAN-based communications system which is evaluated by a discriminator machine learning network 
It would have been obvious before the effective filing date of the claimed invention to incorporate the teachings of West into the teachings of Keung for the purpose of implementing the methods on corresponding computer storage devices, each configured to cause at least one operably connected processor to perform the actions of the methods.
As to claim 14, Keung teaches the machine-translation device of Claim 13, wherein the processor is further configured to: generate a training word sequence of the target language according to the source training sentence and the dictionary data; generate a plurality of training word-embedding vectors of the target language according to the training word sequence (p. 1355, contextual word embeddings have been successfully applied to various NLP tasks, including named entity recognition, document classification, and textual entailment); and input the training word-embedding vectors to generate the target training sentence (Fig. 1; pages 1355-1356, the addition of a language adversarial task during fine tuning for multilingual BERT can significantly improve the zero-resource cross-lingual transfer performance). Keung does not explicitly discuss that the generator network comprises a Transformer model. Chaoyin teaches a network that comprises a generator model, a transformer model (abstract). It would have been obvious before the effective filing date of the claimed invention to incorporate the teachings of Chaoyin into the teachings of Keung and West for the 
As to claim 15, Keung teaches inputting the training word-embedding vectors to generate the target training sentence (Fig. 1; pages 1355-1356) and adding a language discriminator module which uses the BERT embeddings to classify whether the input sentence was written in English or the non-English language (Fig. 1 and p. 1356); Artetxe teaches generating a plurality of word-embedding vectors of the source language based on the source training sentence (p. 1 – universal language agnostic sentence embeddings, that is, vector representations of sentences that are general with respect to two dimensions; p. 3 – using a single, language agnostic BiLSTM encoder to build our sentence embeddings, which is coupled with an auxiliary decoder and trained on parallel corpora) and inputting the word-embedding vectors of the source language to another bidirectional encoder representation from Transformer (BERT) model (pages 3-6) so as to obtain a training sentence-embedding vector (p. 5 – sentence embedding); and Chaoyin teaches a network that comprises a generator model, a transformer model (abstract). It would have been obvious to generate the target sentence via the transformer model for the purpose of attending to different positions of the input sentence to compute a representation of the sentence.
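For illustration only, the role of a language discriminator of the kind cited from Keung (classifying whether an input sentence is English or non-English from its embedding) can be pictured with the minimal sketch below. It is a stand-in under stated assumptions: the two-dimensional "sentence embeddings" are hand-made toy values rather than BERT embeddings, and plain logistic regression trained by stochastic gradient descent replaces whatever classifier Keung actually uses. All names are hypothetical.

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (sentence embedding, 1 = English, 0 = non-English)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def prob_english(x: List[float], w: List[float], b: float) -> float:
    """Probability that embedding x comes from an English sentence."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_discriminator(samples: List[Sample], epochs: int = 200,
                        lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic-regression weights with per-sample gradient steps."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            grad = prob_english(x, w, b) - y  # d(log-loss)/d(logit)
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


# Toy, hand-made embeddings (assumption): English clusters near [1, 0],
# non-English near [0, 1].
DATA: List[Sample] = [
    ([1.0, 0.0], 1), ([0.9, 0.1], 1),
    ([0.0, 1.0], 0), ([0.1, 0.9], 0),
]
weights, bias = train_discriminator(DATA)
```

In adversarial training the upstream network would additionally be updated to make this classifier's task harder; that part is not shown.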
As to claim 16, Keung teaches the machine-translation device of Claim 13, wherein the discriminator network comprises a bidirectional encoder representation from Transformer (BERT) model, and in the discriminator network, the processor is further configured to: generate a plurality of target training word-embedding vectors based on the target training sentence; and input the target training word-embedding 
Allowable Subject Matter
5.	Claims 4-5, 9-10, and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims (claims 2 and 3, claims 7 and 8, and claim 15, respectively).
Conclusion
6.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to QUYNH H NGUYEN whose telephone number is (571)272-7489. The examiner can normally be reached Monday-Friday 7AM-3PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar, can be reached at 571-272-7488. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/QUYNH H NGUYEN/
Primary Examiner, Art Unit 2652