DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments (6/4/22 Remarks: page 10, line 19 - page 14, line 2) with respect to the rejection of claims 1-7 under 35 USC §102 and the rejections of claims 8-20 under 35 USC §103 have been fully considered but they are not persuasive.
Examiner notes that the rejection of claims 3-4 under 35 USC §103 and the rejection of claims 10-11 & 17 under 35 USC §103 have been obviated by the claims’ cancellation.
With respect to independent claims 1, 8, & 15, Applicant argues (6/4/22 Remarks: page 11, line 6 - page 12, line 3) that the art of record (Niu, “Bi-Directional Differentiable Input Reconstruction...”) discloses a method or means that operates according to the similarity between the source corpus and the pseudo source corpus, and does not teach or suggest the claimed arrangement in which both the similarity between the source corpus and the pseudo source corpus and the similarity between the target corpus and the pseudo target corpus are considered.
However, as noted in the claim mapping accompanying the 3/15/22 Office Action and the present Office Action, Niu discloses an arrangement in which both forward translation and backward reconstruction are performed, with source and target swapped and the swapped version appended to the original, so that a combination of both source corpus data and target corpus data is used in evaluating the similarities between corpus and pseudo corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”, including both source and target; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, including both source and target data and subjecting both to forward translation and backward reconstruction (into pseudo target and pseudo source)).
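For illustration only, the corpus construction quoted from Niu section 3 (swapping source and target and appending the swapped version to the original) can be sketched as follows. This is a minimal sketch and not Niu's implementation: the function name, the tag strings, and the choice to prepend the tag of the language to be produced are all assumptions made for the example.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (input sentence, output sentence)


def build_bidirectional_corpus(pairs: List[Pair], src_tag: str, tgt_tag: str) -> List[Pair]:
    """Swap source/target of each pair and append the swapped copies to the original.

    Assumption: each input is prefixed with a tag naming the language to produce.
    """
    # Original direction: source sentence in, target sentence out.
    forward = [(f"{tgt_tag} {src}", tgt) for src, tgt in pairs]
    # Swapped direction: target sentence in, source sentence out.
    backward = [(f"{src_tag} {tgt}", src) for src, tgt in pairs]
    return forward + backward


# Hypothetical one-pair parallel corpus with made-up tags.
corpus = build_bidirectional_corpus([("guten Morgen", "good morning")], "<de>", "<en>")
```

The resulting corpus contains both directions, so a single model trained on it sees source data and target data alike.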
With respect to independent claims 1, 8, & 15, Applicant argues (6/4/22 Remarks: page 12, lines 4-24) that Niu fails to teach or suggest a method or means that operates based on a convergence of the sum of forward and reverse translation similarity such that the sum approaches a value.
However, as noted in the claim mapping accompanying the 3/15/22 Office Action and the present Office Action, Niu discloses an arrangement in which a training objective is defined by a sum of forward and reverse reconstruction likelihood measures (defining a similarity measure of the reconstruction to the original being reconstructed) (Niu section 3.1 and Equations 3 & 4, “we use L = LT+LR as the final training objective for f→e”).
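For illustration only, a summed objective of the form L = LT + LR can be sketched as the sum of two log-likelihood terms. This is a minimal sketch under stated assumptions, not a reproduction of Niu Equations 3 & 4: the function names and the per-token probability inputs are hypothetical.

```python
import math
from typing import Sequence


def log_likelihood(token_probs: Sequence[float]) -> float:
    """Sum of log-probabilities assigned to each reference token."""
    return sum(math.log(p) for p in token_probs)


def combined_objective(forward_probs: Sequence[float],
                       reconstruction_probs: Sequence[float]) -> float:
    """Return L = L_T + L_R for one sentence pair (values closer to 0 are better)."""
    l_t = log_likelihood(forward_probs)         # forward translation term
    l_r = log_likelihood(reconstruction_probs)  # backward reconstruction term
    return l_t + l_r


# Hypothetical probabilities assigned to the reference tokens in each direction.
objective = combined_objective([0.5, 0.25], [0.5])
```

Because both terms enter a single sum, an improvement in either direction moves the combined value.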
With respect to the features originally recited in (cancelled) claim 4, Applicant argues (6/4/22 Remarks: page 12, line 25 - page 13, line 11) that Niu does not teach or suggest training using an error between the target corpus and the pseudo target corpus during a cycle of training, because Niu allegedly does not teach or suggest the claimed arrangement in which both the similarity between the source corpus and the pseudo source corpus and the similarity between the target corpus and the pseudo target corpus are considered.
As noted above and in the claim mapping accompanying the 3/15/22 Office Action and the present Office Action, Niu discloses an arrangement in which both forward translation and backward reconstruction are performed, with source and target swapped and the swapped version appended to the original, so that a combination of both source corpus data and target corpus data is used in evaluating the similarities between corpus and pseudo corpus.
Applicant argues (6/4/22 Remarks: page 14, lines 1-2) that the dependent claims (claims 2-3, 5-7, 9, 12-14, 16, & 18-20) are allowable for the same reasons advanced above with respect to claims 1, 8, & 15.
Applicant’s arguments with respect to claims 1, 8, & 15 are addressed above.
Claim Rejections - 35 USC § 102
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-2 & 5-7 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Niu (“Bi-Directional Differentiable Input Reconstruction for Low-Resource Neural Machine Translation”, cited in 11/29/20 Information Disclosure Statement).
Claim 1: A method for training a machine translation model, comprising:
acquiring a bidirectional translation model (Niu section 1, “Our approach builds on the bi-directional NMT model”) to be trained and training data, the training data comprising a source corpus and a target corpus corresponding to the source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus”);
training the bidirectional translation model for N cycles  (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”; Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model), each cycle of training comprising a forward translation process of translating the source corpus into a pseudo target corpus and a reverse translation process of translating the pseudo target corpus into a pseudo source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”, including both source and target; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, including both source and target data and subjecting both to forward translation and backward reconstruction (into pseudo target and pseudo source)), and N being a positive integer greater than 1 (Niu section 3.2, “coherent intermediate translations”);
acquiring a forward translation similarity and a reverse translation similarity, the forward translation similarity being a similarity between the target corpus and the pseudo target corpus, and the reverse translation similarity being a similarity between the source corpus and the pseudo source corpus (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, translation quality measurement is determined by low distance (i.e. high similarity)); and
when a sum of the forward translation similarity and the reverse translation similarity converges, determining that training of the bidirectional translation model is completed (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, sufficient translation quality is determined by low distance (i.e. converged similarity); Niu section 4.2, “Training stops after 10 checkpoints without improvement”, converged final result), wherein the sum of the forward translation similarity and the reverse translation similarity converges indicates the sum of the sum of the forward translation similarity and the reverse translation similarity approaches a value (Niu section 3.1 and Equations 3 & 4, “we use L = LT+LR as the final training objective for f->e.”),
wherein the training the bidirectional translation model for N cycles comprises:
acquiring, in the forward translation process, the pseudo target corpus through a differentiable sampling function (Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”),
acquiring, in an ith cycle of training, an error between the target corpus and the pseudo target corpus through the differentiable sampling function, i being a positive integer greater than or equal to 1 and less than N (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, error determination; Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”); and
regulating, in the (i+1)th cycle of training, one or more training parameters of the bidirectional translation model based on the error acquired in the ith cycle of training (Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model).
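For illustration only, the stopping behavior quoted from Niu section 4.2 (“Training stops after 10 checkpoints without improvement”) can be sketched as a patience test over per-checkpoint scores. This is a minimal sketch under stated assumptions: the function name is hypothetical, the score is assumed to be a validation metric where higher is better, and the quoted passage does not specify either.

```python
from typing import Sequence


def should_stop(checkpoint_scores: Sequence[float], patience: int = 10) -> bool:
    """True once the last `patience` checkpoints failed to beat the earlier best.

    Assumption: one score per checkpoint, higher is better.
    """
    if len(checkpoint_scores) <= patience:
        return False  # not enough history to judge
    best_before = max(checkpoint_scores[:-patience])
    best_recent = max(checkpoint_scores[-patience:])
    return best_recent <= best_before


# Hypothetical history: one good checkpoint followed by ten that do not improve.
stalled = should_stop([1.0] + [0.9] * 10)
```

Under this reading, training ends when the monitored value has settled, that is, when further checkpoints no longer change the best value reached.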
Claim 2: The method of claim 1 (see above), wherein training the bidirectional translation model for N cycles further comprises:
setting a reconstructor in the bidirectional translation model (Niu section 1, “A single bi-directional model is used as a translator and a reconstructor”); and
implementing the reverse translation process through the reconstructor (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”).
Claim 5: The method of claim 3 (see above), wherein the differentiable sampling function comprises a Gumbel-Softmax function (Niu section 1, “Translations are sampled using the Straight-Through Gumbel Softmax (STGS) Estimator”).
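For illustration only, a straight-through Gumbel-Softmax sample of the kind named in the quoted passage can be sketched in plain Python. This is a minimal sketch of the general technique, not Niu's code: the function name, the temperature parameter, and the example logits are assumptions, and no gradient computation is shown.

```python
import math
import random
from typing import List, Tuple


def gumbel_softmax_sample(logits: List[float], temperature: float,
                          rng: random.Random) -> Tuple[List[float], List[float]]:
    """Return (soft, hard): a relaxed sample and its one-hot argmax.

    In a straight-through estimator the one-hot vector is used in the forward
    pass while gradients are taken through the soft vector.
    """
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)       # avoid log(0)
        gumbel = -math.log(-math.log(u))   # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    peak = max(noisy)                      # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]
    k = soft.index(max(soft))
    hard = [1.0 if i == k else 0.0 for i in range(len(soft))]
    return soft, hard


# Hypothetical logits over a three-word vocabulary, with a seeded generator.
soft, hard = gumbel_softmax_sample([2.0, 0.5, -1.0], temperature=1.0, rng=random.Random(0))
```

The soft vector is what makes the sampling step differentiable, which is why such sampling allows error signals to be back-propagated through a sampled translation.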
Claim 6: The method of claim 1 (see above), wherein the acquiring the forward translation similarity and the reverse translation similarity comprises:
acquiring a value of a log-likelihood function of the target corpus and the pseudo target corpus, and a value of a log-likelihood function of the source corpus and the pseudo source corpus (Niu section 3.1, equations (3) & (4), log-likelihood functions applied to forward translation and backward reconstruction).
Claim 7: The method of claim 1 (see above), wherein the training data is set with a first language label (Niu section 3, “The language is marked by a tag”) or (Note: This is a recitation in the alternative, readable upon either option) a second language label, the training data set with the first language label is the source corpus and the training data set with the second language label is the target corpus, or (Note: This is a recitation in the alternative, readable upon either option), the training data set with the second language label is the source corpus and the training data set with the first language label is the target corpus.
Claim Rejections - 35 USC § 103
Claims 8-9, 12-16, & 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Niu in view of Lee (US 20200117715, cited in 3/15/22 Office Action).
With respect to claim 8, Niu discloses:
Claim 8: A device for training a machine translation model, comprising:
a processor (see secondary reference below); and
memory storing instructions executable by the processor (see secondary reference below),
wherein when the instructions are executed by the processor, the processor is configured to:
acquire a bidirectional translation model (Niu section 1, “Our approach builds on the bi-directional NMT model”) to be trained and training data, the training data comprising a source corpus and a target corpus corresponding to the source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus”);
train the bidirectional translation model for N cycles  (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”; Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model), each cycle of training comprising a forward translation process of translating the source corpus into a pseudo target corpus and a reverse translation process of translating the pseudo target corpus into a pseudo source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”, including both source and target; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, including both source and target data and subjecting both to forward translation and backward reconstruction (into pseudo target and pseudo source)), and N being a positive integer greater than 1 (Niu section 3.2, “coherent intermediate translations”);
acquire a forward translation similarity and a reverse translation similarity, the forward translation similarity being a similarity between the target corpus and the pseudo target corpus, and the reverse translation similarity being a similarity between the source corpus and the pseudo source corpus (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, translation quality measurement is determined by low distance (i.e. high similarity)); and
when a sum of the forward translation similarity and the reverse translation similarity converges, determining that training of the bidirectional translation model is completed (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, sufficient translation quality is determined by low distance (i.e. converged similarity); Niu section 4.2, “Training stops after 10 checkpoints without improvement”, converged final result), wherein the sum of the forward translation similarity and the reverse translation similarity converges indicates the sum of the sum of the forward translation similarity and the reverse translation similarity approaches a value (Niu section 3.1 and Equations 3 & 4, “we use L = LT+LR as the final training objective for f->e.”),
wherein the processor is further configured to:
acquire, in the forward translation process, the pseudo target corpus through a differentiable sampling function (Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”),
acquire, in an ith cycle of training, an error between the target corpus and the pseudo target corpus through the differentiable sampling function, i being a positive integer greater than or equal to 1 and less than N (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, error determination; Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”); and
regulate, in the (i+1)th cycle of training, one or more training parameters of the bidirectional translation model based on the error acquired in the ith cycle of training (Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model).
Niu does not expressly disclose the elements annotated “see secondary reference below” above (i.e. the processor and memory).
Lee discloses:
…a processor (Lee paragraph 0063, processor); and
memory storing instructions executable by the processor (Lee paragraph 0063, stored software instructions)…
Niu and Lee are combinable because they are from the field of bidirectional machine language translator models (Lee paragraph 0037).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to use a processor and memory as taught by Lee to implement the Niu bidirectional machine language translator model.
The suggestion/motivation for doing so would have been to implement the bidirectional machine language translator model of Niu using hardware similar to that taught as suitable for implementing the bidirectional machine language translator model of Lee.
Therefore, it would have been obvious to combine Niu with Lee to obtain the invention as specified in claim 8.
Applying these teachings to claims 9, 12-16, & 18-19:
Claim 9: The device of claim 8 (see above), wherein the processor is further configured to set a reconstructor in the bidirectional translation model (Niu section 1, “A single bi-directional model is used as a translator and a reconstructor”), and the reverse translation process is implemented through the reconstructor (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”).
Claim 12: The device of claim 10 (see above), wherein the differentiable sampling function comprises a Gumbel-Softmax function (Niu section 1, “Translations are sampled using the Straight-Through Gumbel Softmax (STGS) Estimator”).
Claim 13: The device of claim 8 (see above), wherein the processor is further configured to:
acquire a value of a log-likelihood function of the target corpus and the pseudo target corpus and a value of a log-likelihood function of the source corpus and the pseudo source corpus (Niu section 3.1, equations (3) & (4), log-likelihood functions applied to forward translation and backward reconstruction).
Claim 14: The device of claim 8 (see above), wherein the processor is further configured to:
set a first language label (Niu section 3, “The language is marked by a tag”) or (Note: This is a recitation in the alternative, readable upon either option) a second language label for the training data, the training data set with the first language label is the source corpus and the training data set with the second language label is the target corpus, or (Note: This is a recitation in the alternative, readable upon either option), the training data set with the second language label is the source corpus and the training data set with the first language label is the target corpus.
Claim 15: A non-transitory computer-readable storage medium having instructions stored therein (Lee paragraph 0063, stored software instructions) for execution by one or more processors (Lee paragraph 0063, processor) of a terminal to enable the terminal to execute a method for training machine translation model, the method comprising:
acquiring a bidirectional translation model (Niu section 1, “Our approach builds on the bi-directional NMT model”) to be trained and training data, the training data comprising a source corpus and a target corpus corresponding to the source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus”);
training the bidirectional translation model for N cycles  (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”; Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model), each cycle of training comprising a forward translation process of translating the source corpus into a pseudo target corpus and a reverse translation process of translating the pseudo target corpus into a pseudo source corpus (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”, including both source and target; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, including both source and target data and subjecting both to forward translation and backward reconstruction (into pseudo target and pseudo source)), and N being a positive integer greater than 1 (Niu section 3.2, “coherent intermediate translations”);
acquiring a forward translation similarity and a reverse translation similarity, the forward translation similarity being a similarity between the target corpus and the pseudo target corpus, and the reverse translation similarity being a similarity between the source corpus and the pseudo source corpus (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, translation quality measurement is determined by low distance (i.e. high similarity)); and
when a sum of the forward translation similarity and the reverse translation similarity converges, determining that training of the bidirectional translation model is completed (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, sufficient translation quality is determined by low distance (i.e. converged similarity); Niu section 4.2, “Training stops after 10 checkpoints without improvement”, converged final result), wherein the sum of the forward translation similarity and the reverse translation similarity converges indicates the sum of the sum of the forward translation similarity and the reverse translation similarity approaches a value (Niu section 3.1 and Equations 3 & 4, “we use L = LT+LR as the final training objective for f->e.”),
wherein the training the bidirectional translation model for N cycles further comprises:
acquiring, in the forward translation process, the pseudo target corpus through a differentiable sampling function (Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”),
acquiring, in an ith cycle of training, an error between the target corpus and the pseudo target corpus through the differentiable sampling function, i being a positive integer greater than or equal to 1 and less than N (Niu section 1, “Suppose sentence f is translated forward to e using model θfe and then translated back to f^ using model θef, then e is more likely to be a good translation if the distance between f^ and f is small”, error determination; Niu section 3.1, “We use differentiable sampling to side-step beam search and back-propagate error signals.”); and
regulating, in the (i+1)th cycle of training, one or more training parameters of the bidirectional translation model based on the error acquired in the ith cycle of training (Niu section 4.2, “we checkpoint the model every 1000 updates”, iterative updating of model).
Claim 16: The non-transitory computer-readable storage medium of claim 15 (see above), wherein the training the bidirectional translation model for N cycles comprises:
setting a reconstructor in the bidirectional translation model (Niu section 1, “A single bi-directional model is used as a translator and a reconstructor”);
implementing the reverse translation process through the reconstructor (Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”).
Claim 18: The non-transitory computer-readable storage medium of claim 16 (see above), wherein the differentiable sampling function comprises a Gumbel-Softmax function (Niu section 1, “Translations are sampled using the Straight-Through Gumbel Softmax (STGS) Estimator”).
Claim 19: The non-transitory computer-readable storage medium of claim 15 (see above), wherein the acquiring the forward translation similarity and the reverse translation similarity comprises:
acquiring a value of a log-likelihood function of the target corpus and the pseudo target corpus and a value of a log-likelihood function of the source corpus and the pseudo source corpus (Niu section 3.1, equations (3) & (4), log-likelihood functions applied to forward translation and backward reconstruction).
Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Niu in view of Lee as applied to claim 8 above, and further in view of Foster (US 20090083023, cited in 3/15/22 Office Action).
With respect to claim 20, Niu in view of Lee teaches:
Claim 20: A machine translation system implementing the method of claim 1 (see above), comprising one or more processing circuits configured to implement operations of the method (Lee paragraph 0063, processor), and at least one of a display screen and an audio component configured to output a translation result (see secondary reference below); wherein:
the reverse translation corpus is introduced into training to enrich corpuses (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, both source and target training data are used in forward translation and backward reconstruction), such that a model training efficiency is improved under condition of few resources for a minority language (Niu section 1, “It achieves consistent improvements across various low-resource language pairs and directions, showing its effectiveness in making better use of limited parallel data.”); and
with introduction of the bidirectional translation method, the reverse translation model is trained at the same time, to thereby improve quality of the reverse translation model (Niu section 3, “The training data corpus is then built by swapping the source and target sentences of a parallel corpus and appending the swapped version to the original.”; Niu section 3.1, “Our bi-directional model performs both forward translation and backward reconstruction.”, both source and target training data are used in forward translation and backward reconstruction; Niu section 1, “It achieves consistent improvements across various low-resource language pairs and directions, showing its effectiveness in making better use of limited parallel data.”, improved quality).
Niu in view of Lee does not expressly disclose the elements annotated “see secondary reference below” above (i.e. the display screen or audio output).
Foster discloses:
…at least one of (Note: This is a recitation in the alternative, readable upon either option) a display screen (Foster paragraph 0045, translation unit presented via web page (i.e. via an internet device display)) and an audio component configured to output a translation result…
Niu in view of Lee and Foster are combinable because they are from the field of bidirectional machine language translator models (Lee paragraph 0063, Foster paragraph 0019).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to use a web page display as taught by Foster to present the result of a translation arrangement as taught by Niu in view of Lee.
The suggestion/motivation for doing so would have been to present the results of translation via a standard web-accessible interface.
Therefore, it would have been obvious to combine Niu in view of Lee with Foster to obtain the invention as specified in claim 20.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
Any inquiry concerning the contents of this communication or earlier communications from the examiner should be directed to Stephen M. Brinich at 571-272-7430 (voice) or 571-273-7430 (fax).
Any inquiry relating to the status of this application, entry of papers into this application, or any other inquiries of a general nature concerning application processing should be directed to the Tech Center 2600 Customer Service Center at 571-272-2600 or to the USPTO Contact Center at 800-786-9199 or 571-272-1000.
The examiner can normally be reached on weekdays 7:30-4:00 Eastern Time.
If attempts to contact the examiner and the Customer Service Center are unsuccessful, supervisor Claire Wang can be contacted at 571-270-1051.
Hand-carried correspondence may be delivered to the Customer Service Window, located at the Randolph Building, 401 Dulany Street, Alexandria, VA 22314.
/S. M. B./
Examiner, Art Unit 2663

/SEAN M CONNER/
Primary Examiner, Art Unit 2663