DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-7 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Niu (“Bi-Directional Differentiable Input Reconstruction for Low-Resource Neural Machine Translation”, cited in 11/29/20 Information Disclosure Statement).
Claim 1: A method for training a machine translation model, comprising:
acquiring a bidirectional translation model (Niu section 1, “Our approach builds on the bi-directional NMT model”) to be trained and training data, the training data comprising a source corpus and a target corpus corresponding to the source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus”);
training the bidirectional translation model for N cycles (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”; Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model), each cycle of training comprising a forward translation process of translating the source corpus into a pseudo target corpus and a reverse translation process of translating the pseudo target corpus into a pseudo source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”, including both source and target; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, including both source and target data and subjecting both to forward translation and backward reconstruction (into pseudo target and pseudo source)), and N being a positive integer greater than 1 (Niu section 3.2, “coherent intermediate translations”);
acquiring a forward translation similarity and a reverse translation similarity, the forward translation similarity being a similarity between the target corpus and the pseudo target corpus, and the reverse translation similarity being a similarity between the source corpus and the pseudo source corpus (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, translation quality measurement is determined by low distance (i.e. high similarity)); and
when a sum of the forward translation similarity and the reverse translation similarity converges, determining that training of the bidirectional translation model is completed (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, sufficient translation quality is determined by low distance (i.e. converged similarity); Niu section 4.2, “Training stops after 10 checkpoints without improvement”, converged final result).
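For illustration only, the two stopping conditions discussed for claim 1 can be sketched as follows: the claimed condition (the sum of the forward and reverse similarities converging) and the checkpoint-based stop quoted from Niu section 4.2. This is a minimal sketch and is not taken from Niu or from the application; the function names, the tolerance `tol`, and the `window` length are hypothetical choices.

```python
def has_converged(similarity_sums, tol=1e-3, window=3):
    """Return True when the last `window` changes in the summed
    similarity (forward + reverse) are all smaller than `tol`.
    `tol` and `window` are assumed values, not from the record."""
    if len(similarity_sums) < window + 1:
        return False
    recent = similarity_sums[-(window + 1):]
    return all(abs(b - a) < tol for a, b in zip(recent, recent[1:]))


def stopped_by_patience(checkpoint_scores, patience=10):
    """Return True when the best score (higher is better) is at least
    `patience` checkpoints old, i.e. `patience` checkpoints have
    passed without improvement."""
    if not checkpoint_scores:
        return False
    best_index = max(range(len(checkpoint_scores)),
                     key=checkpoint_scores.__getitem__)
    return len(checkpoint_scores) - 1 - best_index >= patience


# Per-cycle sums of forward and reverse similarity (made-up numbers).
sums = [0.10, 0.50, 0.80, 0.8001, 0.8002, 0.8003]
print(has_converged(sums))
```

The default patience of 10 mirrors the quoted “10 checkpoints without improvement”; the score being tracked is left abstract here.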
Claim 2: The method of claim 1 (see above), wherein training the bidirectional translation model for N cycles comprises:
setting a reconstructor in the bidirectional translation model (Niu section 1, “A single bi-directional model is used as a translator and a reconstructor”); and
implementing the reverse translation process through the reconstructor (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”).
Claim 3: The method of claim 2 (see above), wherein the training the bidirectional translation model for N cycles comprises:
acquiring, in the forward translation process, the pseudo target corpus through a differentiable sampling function (Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”).
Claim 4: The method of claim 3 (see above), wherein the training the bidirectional translation model for N cycles further comprises:
acquiring, in an ith cycle of training, an error between the target corpus and the pseudo target corpus through the differentiable sampling function, i being a positive integer greater than or equal to 1 and less than N (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, error determination; Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”); and
regulating, in the (i+1)th cycle of training, one or more training parameters of the bidirectional translation model based on the error acquired in the ith cycle of training (Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model).
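The cycle-to-cycle dependency recited in claim 4 (an error acquired in the ith cycle regulating a parameter in the (i+1)th cycle) can be illustrated with a toy example. The sketch below is an assumption-laden stand-in: it fits a single scalar weight by plain gradient descent rather than training a translation model, and the names `train`, `lr`, `x`, and `y` are hypothetical.

```python
def train(n_cycles, lr=0.1, x=1.0, y=2.0):
    """Toy loop: the gradient of the squared error measured in cycle i
    is applied to the parameter at the start of cycle i + 1."""
    w = 0.0                 # the single trainable parameter
    pending_gradient = 0.0  # error signal carried over from the previous cycle
    squared_errors = []
    for _ in range(n_cycles):
        # Regulate the parameter using the error from the previous cycle.
        w -= lr * pending_gradient
        # Acquire the error for the current cycle.
        error = w * x - y
        pending_gradient = 2.0 * error * x
        squared_errors.append(error ** 2)
    return w, squared_errors


w, history = train(n_cycles=60)
print(round(w, 3), history[0] > history[-1])
```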
Claim 5: The method of claim 3 (see above), wherein the differentiable sampling function comprises a Gumbel-Softmax function (Niu section 1, “Translations are sampled using the Straight-Through Gumbel Softmax (STGS) Estimator”).
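As background for the Gumbel-Softmax limitation, the forward pass of a straight-through Gumbel-Softmax sample can be sketched in plain Python. This is a generic illustration of the technique, not Niu's implementation; the function names and the temperature value are assumptions. In an automatic-differentiation framework the one-hot output would be used in the forward pass while gradients flow through the soft probabilities, which this standard-library sketch does not model.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Perturb each logit with Gumbel(0, 1) noise, then apply a
    temperature-scaled softmax. Returns a probability vector."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / temperature)
    peak = max(noisy)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def straight_through(probs):
    """Discretise the soft sample into a one-hot vector (forward pass only)."""
    k = max(range(len(probs)), key=probs.__getitem__)
    return [1.0 if i == k else 0.0 for i in range(len(probs))]


soft = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5, rng=random.Random(0))
print(straight_through(soft))
```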
Claim 6: The method of claim 1 (see above), wherein the acquiring the forward translation similarity and the reverse translation similarity comprises:
acquiring a value of a log-likelihood function of the target corpus and the pseudo target corpus, and a value of a log-likelihood function of the source corpus and the pseudo source corpus (Niu section 3.1, equations (3) & (4), log-likelihood functions applied to forward translation and backward reconstruction).
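A sequence log-likelihood of the general kind referred to above is the sum of the log-probabilities the model assigns to each reference token. The sketch below assumes that generic form; it does not reproduce Niu's equations (3) and (4), and the per-token probabilities are made-up values.

```python
import math


def sequence_log_likelihood(token_probs):
    """Sum of log-probabilities assigned to the reference tokens.
    Values closer to 0 mean the output is closer to the reference."""
    return sum(math.log(p) for p in token_probs)


# Hypothetical per-token probabilities for one sentence pair.
forward_ll = sequence_log_likelihood([0.9, 0.8, 0.95])  # target vs. pseudo target
reverse_ll = sequence_log_likelihood([0.7, 0.85, 0.9])  # source vs. pseudo source
combined = forward_ll + reverse_ll                       # quantity tracked for convergence
print(combined < 0.0)
```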
Claim 7: The method of claim 1 (see above), wherein the training data is set with a first language label (Niu section 3, “The language is marked by a tag”) or (Note: This is a recitation in the alternative, readable upon either option) a second language label, the training data set with the first language label is the source corpus and the training data set with the second language label is the target corpus, or (Note: This is a recitation in the alternative, readable upon either option), the training data set with the second language label is the source corpus and the training data set with the first language label is the target corpus.
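The corpus construction quoted from Niu section 3 (swapping the source and target sentences and appending the swapped version, with the language marked by a tag) can be sketched as below. The tag strings and their placement at the start of the input sentence are assumptions made for illustration; the Office action does not quote the exact tag format.

```python
def build_bidirectional_corpus(pairs, src_tag, tgt_tag):
    """Return the original parallel pairs plus their swapped copies.
    Each input sentence is prefixed with a tag naming the language
    to translate into (tag format and placement are assumed)."""
    forward = [(f"{tgt_tag} {src}", tgt) for src, tgt in pairs]
    backward = [(f"{src_tag} {tgt}", src) for src, tgt in pairs]
    return forward + backward


corpus = build_bidirectional_corpus([("hallo", "hello")], "<de>", "<en>")
print(corpus)
# [('<en> hallo', 'hello'), ('<de> hello', 'hallo')]
```

With this layout, either language label can mark the source side, which corresponds to the two alternatives recited in claim 7.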
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 8-19 are rejected under 35 U.S.C. 103 as being unpatentable over Niu in view of Lee (US 2020011771).
With respect to claim 8, Niu discloses:
Claim 8: A device for training a machine translation model, comprising:
a processor (see secondary reference below); and
memory storing instructions executable by the processor (see secondary reference below),
wherein when the instructions are executed by the processor, the processor is configured to:
acquire a bidirectional translation model (Niu section 1, “Our approach builds on the bi-directional NMT model”) to be trained and training data, the training data comprising a source corpus and a target corpus corresponding to the source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus”);
train the bidirectional translation model for N cycles (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”; Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model), each cycle of training comprising a forward translation process of translating the source corpus into a pseudo target corpus and a reverse translation process of translating the pseudo target corpus into a pseudo source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”, including both source and target; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, including both source and target data and subjecting both to forward translation and backward reconstruction (into pseudo target and pseudo source)), and N being a positive integer greater than 1 (Niu section 3.2, “coherent intermediate translations”);
acquire a forward translation similarity and a reverse translation similarity, the forward translation similarity being a similarity between the target corpus and the pseudo target corpus, and the reverse translation similarity being a similarity between the source corpus and the pseudo source corpus (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, translation quality measurement is determined by low distance (i.e. high similarity)); and
when a sum of the forward translation similarity and the reverse translation similarity converges, determine that training of the bidirectional translation model is completed (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, sufficient translation quality is determined by low distance (i.e. converged similarity); Niu section 4.2, “Training stops after 10 checkpoints without improvement”, converged final result).
Niu does not expressly disclose the elements annotated “see secondary reference below” above, i.e. the processor and memory.
Lee discloses:
…a processor (Lee paragraph 0063, processor); and
memory storing instructions executable by the processor (Lee paragraph 0063, stored software instructions),…
Niu and Lee are combinable because they are from the field of bidirectional machine language translator models (Lee paragraph 0063).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to use a processor and memory as taught by Lee to implement the Niu bidirectional machine language translator model.
The suggestion/motivation for doing so would have been to implement the bidirectional machine language translator model of 
Therefore, it would have been obvious to combine Niu with Lee to obtain the invention as specified in claim 8.
Applying these teachings to claims 9-19:
Claim 9: The device of claim 8 (see above), wherein the processor is further configured to set a reconstructor in the bidirectional translation model (Niu section 1, “A single bi-directional model is used as a translator and a reconstructor”), and the reverse translation process is implemented through the reconstructor (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”).
Claim 10: The device of claim 9 (see above), wherein the processor is further configured to:
acquire, in the forward translation process, the pseudo target corpus through a differentiable sampling function (Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”).
Claim 11: The device of claim 10 (see above), wherein the processor is further configured to:
acquire, in an ith cycle of training, an error between the target corpus and the pseudo target corpus through the differentiable sampling function, i being a positive integer greater than or equal to 1 and less than N (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, error determination; Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”); and
regulate, in the (i+1)th cycle of training, one or more training parameters of the bidirectional translation model based on the error acquired in the ith cycle of training (Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model).
Claim 12: The device of claim 10 (see above), wherein the differentiable sampling function comprises a Gumbel-Softmax function (Niu section 1, “Translations are sampled using the Straight-Through Gumbel Softmax (STGS) Estimator”).
Claim 13: The device of claim 8 (see above), wherein the processor is further configured to:
acquire a value of a log-likelihood function of the target corpus and the pseudo target corpus and a value of a log-likelihood function of the source corpus and the pseudo source corpus (Niu section 3.1, equations (3) & (4), log-likelihood functions applied to forward translation and backward reconstruction).
Claim 14: The device of claim 8 (see above), wherein the processor is further configured to:
set a first language label (Niu section 3, “The language is marked by a tag”) or (Note: This is a recitation in the alternative, readable upon either option) a second language label for the training data, the training data set with the first language label is the source corpus and the training data set with the second language label is the target corpus, or (Note: This is a recitation in the alternative, readable upon either option), the training data set with the second language label is the source corpus and the training data set with the first language label is the target corpus.
Claim 15: A non-transitory computer-readable storage medium having instructions stored therein (Lee paragraph 0063, stored software instructions) for execution by one or more processors (Lee paragraph 0063, processor) of a terminal to enable the terminal to execute a method for training machine translation model, the method comprising:
acquiring a bidirectional translation model (Niu section 1, “Our approach builds on the bi-directional NMT model”) to be trained and training data, the training data comprising a source corpus and a target corpus corresponding to the source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus”);
training the bidirectional translation model for N cycles (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”; Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model), each cycle of training comprising a forward translation process of translating the source corpus into a pseudo target corpus and a reverse translation process of translating the pseudo target corpus into a pseudo source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”, including both source and target; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, including both source and target data and subjecting both to forward translation and backward reconstruction (into pseudo target and pseudo source)), and N being a positive integer greater than 1 (Niu section 3.2, “coherent intermediate translations”);
acquiring a forward translation similarity and a reverse translation similarity, the forward translation similarity being a similarity between the target corpus and the pseudo target corpus, and the reverse translation similarity being a similarity between the source corpus and the pseudo source corpus (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, translation quality measurement is determined by low distance (i.e. high similarity)); and
when a sum of the forward translation similarity and the reverse translation similarity converges, determining that training of the bidirectional translation model is completed (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, sufficient translation quality is determined by low distance (i.e. converged similarity); Niu section 4.2, “Training stops after 10 checkpoints without improvement”, converged final result).
Claim 16: The non-transitory computer-readable storage medium of claim 15 (see above), wherein the training the bidirectional translation model for N cycles comprises:
setting a reconstructor in the bidirectional translation model (Niu section 1, “A single bi-directional model is used as a translator and a reconstructor”);
implementing the reverse translation process through the reconstructor (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”); and
acquiring, in the forward translation process, the pseudo target corpus through a differentiable sampling function (Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”).
Claim 17: The non-transitory computer-readable storage medium of claim 16 (see above), wherein the training the bidirectional translation model for N cycles further comprises:
acquiring, in an ith cycle of training, an error between the target corpus and the pseudo target corpus through the differentiable sampling function, i being a positive integer greater than or equal to 1 and less than N (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, error determination; Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”); and
regulating, in the (i+1)th cycle of training, one or more training parameters of the bidirectional translation model based on the error acquired in the ith cycle of training (Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model).
Claim 18: The non-transitory computer-readable storage medium of claim 16 (see above), wherein the differentiable sampling function comprises a Gumbel-Softmax function (Niu section 1, “Translations are sampled using the Straight-Through Gumbel Softmax (STGS) Estimator”).
Claim 19: The non-transitory computer-readable storage medium of claim 15 (see above), wherein the acquiring the forward translation similarity and the reverse translation similarity comprises:
acquiring a value of a log-likelihood function of the target corpus and the pseudo target corpus and a value of a log-likelihood function of the source corpus and the pseudo source corpus (Niu section 3.1, equations (3) & (4), log-likelihood functions applied to forward translation and backward reconstruction).
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Niu in view of Lee as applied to claim 1 above, and further in view of Foster (US 20090083023).
With respect to claim 20, Niu in view of Lee teaches:
Claim 20: A machine translation system implementing the method of claim 1 (see above), comprising one or more processing circuits configured to implement operations of the method (Lee paragraph 0063, processor), and at least one of a display screen and an audio component configured to output a translation result (see secondary reference below); wherein:
the reverse translation corpus is introduced into training to enrich corpuses (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, both source and target training data are used in forward translation and backward reconstruction), such that a model training efficiency is improved under condition of few resources for a minority (Niu section 1, “It achieves consistent improvements across various low-resource language pairs and directions, showing its effectiveness in making better use of limited parallel data.”); and
with introduction of the bidirectional translation method, the reverse translation model is trained at the same time, to thereby improve quality of the reverse translation model (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, both source and target training data are used in forward translation and backward reconstruction; Niu section 1, “It achieves consistent improvements across various low-resource language pairs and directions, showing its effectiveness in making better use of limited parallel data.”, improved quality).
Niu in view of Lee does not expressly disclose the elements annotated “see secondary reference below” above, i.e. the display screen or audio output.
Foster discloses:
…at least one of (Note: This is a recitation in the alternative, readable upon either option) a display screen (Foster paragraph 0045, translation unit presented via web page (i.e. via an internet device display)) and an audio component configured to output a translation result…
(Lee paragraph 0063, Foster paragraph 0019).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to use a web page display as taught by Foster to present the result of a translation arrangement as taught by Niu in view of Lee.
The suggestion/motivation for doing so would have been to present the results of translation via a standard web-accessible interface.
Therefore, it would have been obvious to combine Niu in view of Lee with Foster to obtain the invention as specified in claim 20.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Afrasiabi discloses an example of forward and backward translation models.
Any inquiry concerning the contents of this communication or earlier communications from the examiner should be directed to Stephen M. Brinich at 571-272-7430 (voice) or 571-273-7430 (fax).
Any inquiry relating to the status of this application, entry of papers into this application, or any other inquiries of 
The examiner can normally be reached on weekdays 7:30-4:00 Eastern Time.
If attempts to contact the examiner and the Customer Service Center are unsuccessful, supervisor Claire Wang can be contacted at 571-270-1051.
Hand-carried correspondence may be delivered to the Customer Service Window, located at the Randolph Building, 401 Dulany Street, Alexandria, VA 22314.
/Stephen M Brinich/
Examiner, Art Unit 2663