DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Introduction
This Office action is in response to the communication filed on 10/02/2020. Claims 1-10 are pending and have been examined.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 10/02/2020 and 06/02/2021 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are:
Claim 6 recites “a translation sentence acquiring module, configured for”, “a reference optimal translation sentence determining module, configured for”, “a stitched translation sentence acquiring module, configured for” and “a finally-selected optimal translation sentence determining module, configured for”
Claim 7 recites “the stitched translation sentence acquiring module is specifically configured for”.
Claim 8 recites “the finally-selected optimal translation sentence determining module is specifically configured for”.
Structure for these limitations is found in the specification at Pg 9, Para 7, all lines (processor….memory….instructions).
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Independent Claims 1 and 6 recite “translating each of source sentences in a source paragraph by using at least one first translation model to obtain at least one translation sentence corresponding to each of the source sentences;”, “sequentially determining a target source sentence, and an upper adjacent source sentence and a lower adjacent source sentence of the target source sentence according to the sequence of the source sentences in the source paragraph,”, “taking a finally-selected optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence as a reference optimal translation sentence corresponding to the upper adjacent source sentence,”, “and according to the results of the at least one first translation model, taking an initially-selected optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence as a reference optimal translation sentence corresponding to the lower adjacent source sentence;”, “stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence and the reference optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence respectively, to obtain at least one stitched translation sentence;”, “obtaining an optimal stitched translation sentence based on the at least one stitched translation sentence,”, “and taking a target translation sentence corresponding to the optimal stitched translation sentence as a finally-selected optimal translation sentence of the target source sentence.”
Claim 6 also recites “a translation sentence acquiring module, configured for”, “a reference optimal translation sentence determining module, configured for”, “a stitched translation sentence acquiring module, configured for” and “a finally-selected optimal translation sentence determining module, configured for”
	The limitations “translating each of source sentences in a source paragraph by using at least one first translation model to obtain at least one translation sentence corresponding to each of the source sentences;”, “sequentially determining a target source sentence, and an upper adjacent source sentence and a lower adjacent source sentence of the target source sentence according to the sequence of the source sentences in the source paragraph,”, “taking a finally-selected optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence as a reference optimal translation sentence corresponding to the upper adjacent source sentence,”, “and according to the results of the at least one first translation model, taking an initially-selected optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence as a reference optimal translation sentence corresponding to the lower adjacent source sentence;”, “stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence and the reference optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence respectively, to obtain at least one stitched translation sentence;”, “obtaining an optimal stitched translation sentence based on the at least one stitched translation sentence,”, “and taking a target translation sentence corresponding to the optimal stitched translation sentence as a finally-selected optimal translation sentence of the target source sentence.”, as drafted, cover a mental process, as these steps could be performed mentally or by hand with pen and paper.
This judicial exception is not integrated into a practical application. Claim 6 recites “a translation sentence acquiring module, configured for”, “a reference optimal translation sentence determining module, configured for”, “a stitched translation sentence acquiring module, configured for” and “a finally-selected optimal translation sentence determining module, configured for”. These limitations are being interpreted under 35 U.S.C. 112(f) and therefore include the structural limitations provided by the specification. However, these limitations are directed to using a computer to perform the method and do not impose any meaningful limits on practicing the abstract idea. Claim 1 does not include any additional limitations.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The addition of the generic computer components recited above with regard to claim 6 does not amount to more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Claims 1 and 6 do not recite any other additional limitations. The claims, as drafted, are not patent eligible.

	Claims 2 and 7 recite the additional limitations of “stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence and the reference optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence respectively” and “according to the sequence of the source sentences”. These limitations cover mental processes, as they could be performed mentally or by hand with pen and paper.
	This judicial exception is not integrated into a practical application. Claim 7 recites “the stitched translation sentence acquiring module is specifically configured for”. This limitation is being interpreted under 35 U.S.C. 112(f) and therefore includes the structural limitations provided by the specification. However, this limitation is directed to using a computer to perform the method and does not impose any meaningful limits on practicing the abstract idea. Claim 2 does not include any additional limitations.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The addition of the generic computer components recited above with regard to claim 7 does not amount to more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Claims 2 and 7 do not recite any other additional limitations. The claims, as drafted, are not patent eligible.

Claims 3 and 8 recite the additional limitations of “stitching the target source sentence with the upper adjacent source sentence and the lower adjacent source sentence to obtain the stitched source sentence corresponding to the target source sentence;”, “combining the at least one stitched translation sentence with the stitched source sentence respectively to generate at least one parallel corpus pair;” (Claim 8 does not have the word “respectively” in that limitation), “inputting the at least one parallel corpus pair into a second translation model respectively to generate a score for each of the parallel corpus pairs;”, “and taking the stitched translation sentence corresponding to the parallel corpus pair with the highest score as the optimal stitched translation sentence”. These limitations cover mental processes, as they could be performed mentally or by hand with pen and paper.
This judicial exception is not integrated into a practical application. Claim 8 recites “the finally-selected optimal translation sentence determining module is specifically configured for”. This limitation is being interpreted under 35 U.S.C. 112(f) and therefore includes the structural limitations provided by the specification. However, this limitation is directed to using a computer to perform the method and does not impose any meaningful limits on practicing the abstract idea. Claim 3 does not include any additional limitations.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The addition of the generic computer components recited above with regard to claim 8 does not amount to more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Claims 3 and 8 do not recite any other additional limitations. The claims, as drafted, are not patent eligible.

Claim 4 recites the additional limitations of “the second translation model comprises an encoding layer and a decoding layer;”, “inputting the at least one parallel corpus pair into the second translation model respectively to generate a score for each of the parallel corpus pairs comprises: inputting the stitched source sentence into the encoding layer to generate a corresponding encoding vector;”, “generating a corresponding reference decoding vector according to each of the stitched translation sentences;”, “and inputting the encoding vector and the reference decoding vector into the decoding layer, to obtain a confidence level of each of the stitched translation sentences and taking the confidence level of each of the stitched translation sentences as the score for each of parallel corpus pairs.” These limitations cover mental processes, as they could be performed mentally or by hand with pen and paper.
This judicial exception is not integrated into a practical application, as Claim 4 comprises no additional limitations.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception, as Claim 4 does not recite any additional limitations. The claim, as drafted, is not patent eligible.

Claim 5 recites the additional limitations of “obtaining an optimal stitched translation sentence based on the at least one stitched translation sentence comprises: inputting the at least one stitched translation sentence into a language model respectively, to generate a score corresponding to each of the stitched translation sentences;”, “and taking the stitched translation sentence with the highest score as the optimal stitched translation sentence.” These limitations cover mental processes, as they could be performed mentally or by hand with pen and paper.
This judicial exception is not integrated into a practical application, as Claim 5 comprises no additional limitations.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception, as Claim 5 does not recite any additional limitations. The claim, as drafted, is not patent eligible.

Claim 9 recites the additional limitations of “reordering results of a translation model according to claim 1.” These limitations cover mental processes, as they could be performed mentally or by hand with pen and paper, as shown in the rejection of Claim 1 above.
This judicial exception is not integrated into a practical application. Claim 9 recites “A computing device, comprising a memory, a processor and computer instructions stored on the memory and capable of running on the processor, wherein the instructions are executed by the processor to implement steps of the method for”. These limitations are directed to using a computer to perform the method and do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The addition of the generic computer components recited above with regard to claim 9 does not amount to more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Claim 9 does not recite any other additional limitations. The claim, as drafted, is not patent eligible.

Claim 10 recites “the method for reordering results of a translation model according to claim 1”. These limitations cover mental processes, as they could be performed mentally or by hand with pen and paper, as shown in the rejection of Claim 1 above.
This judicial exception is not integrated into a practical application. Claim 10 recites “A computer-readable non-transitory storage medium with computer instructions stored thereon, wherein the instructions are executed by the processor to implement steps”. These limitations are directed to using a computer to perform the method and do not impose any meaningful limits on practicing the abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The addition of the generic computer components recited above with regard to claim 10 does not amount to more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Claim 10 does not recite any other additional limitations. The claim, as drafted, is not patent eligible.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-10 are rejected under 35 U.S.C. 103 as being unpatentable over Voita et al., “When a Good Translation is Wrong in Context” (hereinafter Voita), in view of Maruf et al., “Document Context Neural Machine Translation with Memory Networks” (hereinafter Maruf), and further in view of Song et al. (US 10255275 B2) (hereinafter Song).

Regarding Claim 1:
Voita teaches a method for reordering results of a translation model, comprising: translating source sentences in a source paragraph by using at least one first translation model to obtain at least one translation sentence corresponding to each of the source sentences(Pg 5, 4.2 MODEL, Ln 1-2, We introduce a two-pass framework: first, the sentence is translated with a context-agnostic model); 
determining a target source sentence, and an upper adjacent source sentence according to the sequence of the source sentences in the source paragraph(Pg 6, Fig 4, shows Source sentence and Context 1 sentence), 
taking a finally-selected optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence as a reference optimal translation sentence corresponding to the upper adjacent source sentence(Pg 6, Fig 4, Translation of ctx1. This is a final translation, as it is not labeled a First-pass Translation like the other sentence. First pass and second pass translations are discussed in Pg 5-6, 4.2 Model, Para 1, all lines), 
stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence respectively, to obtain at least one stitched translation sentence(Pg 6, Fig 4, shows that First-pass Translation, and Translation of ctx1, are input into CADec together); 
Voita does not explicitly teach “translating each of source sentences in a source paragraph….” or “sequentially determining….”.
In the same field of Machine Translation, Maruf teaches translating each of source sentences in a source paragraph….; sequentially determining…. (Pg 1277, Para 3, Decoding, Ln 5-12, block coordinate descent optimisation algorithm… we initialise the translation of each sentence using the base neural MT model … We then repeatedly visit each sentence in the document, and update its translation using our document-context dependent NMT model …while the translations of other sentences are kept fixed).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify Voita with the initial translation and iterative updating of translated sentences of Maruf, as it helps solve the optimization problem in translating a document (Pg 1277, Para 3, Decoding, Ln 1-6).
The combination of Voita and Maruf does not teach determining a lower adjacent source sentence of the target source sentence
and according to the results of the at least one first translation model, taking an initially-selected optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence as a reference optimal translation sentence corresponding to the lower adjacent source sentence;
stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence.
In the same field of Machine Translation, Maruf teaches determining a lower adjacent source sentence of the target source sentence(Pg 1277, 4 Context Dependent…, Para 2, Ln 1-8, generates target translation…conditions generation on… all other sentences of the document and their translations. This means that the translation and source side next sentence is included in the context sentences used)
and according to the results of the at least one first translation model, taking an initially-selected optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence as a reference optimal translation sentence corresponding to the lower adjacent source sentence(Pg 1277, Para 3, Decoding, Ln 5-12, block coordinate descent optimisation algorithm… we initialise the translation of each sentence using the base neural MT model);
stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence(Pg 1277, 4 Context Dependent…, Para 2, Ln 1-8, generates target translation…conditions generation on… all other sentences of the document and their translations. This means that the translation side next sentence is included in the context sentences used, as all of them are used. Also Para 3, Ln 1-4, the source and target document contexts as external memories, and attends to relevant parts of these external memories when generating the translation of a sentence. Voita teaches how context is attached and used with the other sentences, as in previous citations. Maruf teaches using translation side, future sentences as context).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita and Maruf with the source and translation side future sentence context of Maruf, as it improves performance (Abstract, Ln 18-24).
The combination of Voita and Maruf does not teach obtaining an optimal stitched translation sentence based on the at least one stitched translation sentence, and taking a target translation sentence corresponding to the optimal stitched translation sentence as a finally-selected optimal translation sentence of the target source sentence.
In the same field of Machine Translation, Song teaches obtaining an optimal stitched translation sentence based on the at least one stitched translation sentence, and taking a target translation sentence corresponding to the optimal stitched translation sentence as a finally-selected optimal translation sentence of the target source sentence (Abstract, Ln 5-15, generate translation probabilities from the text to be translated to the pending candidate translations based on …a predetermined translation probability prediction model ….select …..candidate translations that have the translation probabilities higher than other pending candidate translations. The translation sentence being “stitched” is taught by Voita as shown in previous citations. Picking the final translation as the candidate sentence of the highest scored pair is from Song. Voita also teaches scoring the examples, Pg 4, 3 Text Sets, Ln 16-17, which is part of this limitation, and is also referenced in the “comprising” of this limitation, in the rejections of dependent Claims 3 and 5).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita and Maruf with the scoring and selection of candidate translation sentences of Song, as it evaluates translation quality of the candidate translations, therefore improving quality of candidate translations (Col 3, Ln 33-37).
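
For illustration only, the selection procedure recited in Claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from Voita, Maruf, or Song: the names `rerank_paragraph` and `toy_score`, the sample sentences, and the repeated-word scoring rule are all hypothetical, and the candidates are assumed to be listed best-first by the first translation model.

```python
from typing import Callable, List


def toy_score(text: str) -> int:
    """Hypothetical stand-in scorer: counts repeated words as a crude proxy
    for lexical consistency. A real system would use a trained model."""
    words = text.lower().replace(".", "").split()
    return len(words) - len(set(words))


def rerank_paragraph(candidates: List[List[str]],
                     score: Callable[[str], float]) -> List[str]:
    """candidates[i] holds the candidate translations of source sentence i."""
    # Initially-selected optimal translation: the first candidate (assumption).
    initial_best = [options[0] for options in candidates]
    final_best: List[str] = []  # finally-selected optimal translations
    for i, options in enumerate(candidates):
        # Upper neighbour uses its final choice; lower neighbour its initial one.
        upper = final_best[i - 1] if i > 0 else ""
        lower = initial_best[i + 1] if i + 1 < len(candidates) else ""

        def stitched(target: str) -> str:
            return " ".join(part for part in (upper, target, lower) if part)

        # Keep the candidate whose stitched sentence scores highest.
        final_best.append(max(options, key=lambda t: score(stitched(t))))
    return final_best


candidates = [
    ["The river bank was steep."],
    ["He sat on the shore.", "He sat on the bank."],
    ["The bank was quiet."],
]
result = rerank_paragraph(candidates, toy_score)
```

With the toy scorer, the second sentence is reordered so that the candidate consistent with its neighbours is selected over the candidate the first model ranked highest.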

Regarding Claim 2:
The combination of Voita, Maruf and Song teaches the method for reordering results of a translation model according to claim 1, and Voita teaches wherein stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence and the reference optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence respectively, comprises: 
stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence and the reference optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence respectively, according to the sequence of the source sentences (Pg 6, Fig 4, shows stitching context sentences, as the context is input in CADec along with First-pass Translation. Pg 6, Para 3, Ln 3-6, cj are several preceding sentences along with their translations. Using lower adjacent sentences as context is taught by the second combination with Maruf, in the rejection of Claim 1).

Regarding Claim 3:
The combination of Voita, Maruf and Song teaches the method for reordering results of a translation model according to claim 1, and Voita teaches wherein, obtaining an optimal stitched translation sentence based on the at least one stitched translation sentence comprises: stitching the target source sentence with the upper adjacent source sentence to obtain the stitched source sentence corresponding to the target source sentence(Pg 6, Fig 4, Shows Source context is input into CADec beside source sentence); 
combining the at least one stitched translation sentence with the stitched source sentence respectively to generate at least one parallel corpus pair(Pg 6, Fig 4, shows parallel source and translation); 
inputting the at least one parallel corpus pair into a second translation model respectively to generate a score for each of the parallel corpus pairs(Pg 6, Fig 4, shows inputting. Pg 4, 3 Text Sets, Ln 16-17, The system is asked to score each candidate example).
The combination of Voita, Maruf and Song does not teach stitching the target source sentence with the lower adjacent source sentence.
In the same field of Machine Translation, Maruf teaches stitching the target source sentence with the lower adjacent source sentence(Pg 1277, 4 Context Dependent, Para 2, Ln 1-8, generates target translation…conditions generation on… all other sentences of the document and their translations. This means that the source side next sentence is included in the context sentences used, as all of them are used. Also Para 3, Ln 1-4, the source and target document contexts as external memories, and attends to relevant parts of these external memories when generating the translation of a sentence. Voita teaches how context is attached and used with the other sentences, as in previous citations. Maruf teaches using source side, future sentences as context).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita, Maruf and Song with the source side future sentence context of Maruf, as it improves performance (Abstract, Ln 18-24).
The combination of Voita, Maruf and Song does not teach and taking the stitched translation sentence corresponding to the parallel corpus pair with the highest score as the optimal stitched translation sentence.
In the same field of Machine Translation, Song teaches and taking the stitched translation sentence corresponding to the parallel corpus pair with the highest score as the optimal stitched translation sentence(Abstract, Ln 5-15, generate translation probabilities from the text to be translated to the pending candidate translations based on …a predetermined translation probability prediction model ….select a predetermined number of pending candidate translations that have the translation probabilities higher than other pending candidate translations. Translation sentence being stitched is taught by Voita in Claim 1).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita, Maruf and Song with the scoring and selection of candidate translation sentences of Song, as it evaluates translation quality of the candidate translations, therefore improving quality of candidate translations (Col 3, Ln 33-37).
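
For illustration only, the pairing and selection steps recited in Claim 3 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed second translation model or any cited reference's method: `pick_optimal_stitched` and `toy_pair_score` are hypothetical names, and the token-count comparison is only a placeholder for a trained model's score.

```python
from typing import Callable, List, Tuple


def toy_pair_score(source: str, translation: str) -> int:
    """Hypothetical placeholder for the second translation model: penalises
    any difference in whitespace-token counts between source and translation."""
    return -abs(len(source.split()) - len(translation.split()))


def pick_optimal_stitched(
    stitched_source: str,
    stitched_translations: List[str],
    pair_score: Callable[[str, str], float],
) -> Tuple[str, List[float]]:
    # Combine each stitched translation with the stitched source sentence
    # to form one parallel corpus pair per candidate.
    pairs = [(stitched_source, t) for t in stitched_translations]
    # Score every pair, then keep the translation of the highest-scoring pair.
    scores = [pair_score(s, t) for s, t in pairs]
    best_index = max(range(len(pairs)), key=scores.__getitem__)
    return pairs[best_index][1], scores


stitched_source = "Der Hund schläft. Er ist müde. Es ist spät."
stitched_translations = [
    "The dog sleeps. It tired. It is late.",
    "The dog sleeps. He is tired. It is late.",
]
best, scores = pick_optimal_stitched(
    stitched_source, stitched_translations, toy_pair_score
)
```

The sketch keeps scoring separate from selection so that any scorer, for example a model's translation probability, could be substituted for the placeholder.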

Regarding Claim 4:
The combination of Voita, Maruf and Song teaches the method for reordering results of a translation model according to claim 3, and Voita teaches wherein, the second translation model comprises an encoding layer and a decoding layer(Pg 6, Context-aware decoder(CADec), Para 2, Ln 1-8, CADec is composed of a stack of N = 6 identical layers and is similar to the decoder of the original Transformer. It has … and attention to encoder outputs…base models encoder);
inputting the at least one parallel corpus pair into the second translation model respectively to generate a score for each of the parallel corpus pairs comprises: inputting the stitched source sentence into the encoding layer to generate a corresponding encoding vector(Pg 6, Context-aware decoder(CADec), Para 2, Ln 6-10, We use the states from the last layer of the base model’s encoder of the current source sentence and all context sentences as input to the first multi-head attention.  Fig 4 also shows Base encoder, pointing to source side sentences); 
generating a corresponding reference decoding vector according to each of the stitched translation sentences(Pg 6, Context-aware decoder(CADec), Para 2, Ln 10-14, For the second multi-head attention we input both last states of the base decoder and the target-side token embedding layer; this is done for translations of the source and also all context sentences. Fig 4 also shows Base decoder, pointing to translation side sentences); 
and inputting the encoding vector and the reference decoding vector into the decoding layer(Pg 6, Fig 4, shows both being input into CADec. Pg 6, Context-aware decoder(CADec), Para 2, Ln 1-8, CADec is composed of a stack of N = 6 identical layers and is similar to the decoder of the original Transformer), 
to obtain a confidence level of each of the stitched translation sentences and taking the confidence level of each of the stitched translation sentences as the score for each of parallel corpus pairs(Pg 4, 3 Text Sets, Ln 16-17, The system is asked to score each candidate example. Ln 7-12, example (sequence of sentences and their reference translation from the data) and several contrastive translations which differ from the true one only in the considered aspect. All contrastive translations we use are correct plausible translations at a sentence level. Shows what an example is and shows there are examples using possible correct translations).

Regarding Claim 5:
The combination of Voita, Maruf and Song teaches the method for reordering results of a translation model according to claim 1, and Voita teaches wherein, obtaining an optimal stitched translation sentence based on the at least one stitched translation sentence comprises: inputting the at least one stitched translation sentence into a language model respectively, to generate a score corresponding to each of the stitched translation sentences(Pg 4, 3 Text Sets, Ln 16-17, The system is asked to score each candidate example. Ln 7-12, example (sequence of sentences and their reference translation from the data)); 
The combination of Voita, Maruf and Song does not teach and taking the stitched translation sentence with the highest score as the optimal stitched translation sentence.
In the same field of Machine Translation, Song teaches and taking the stitched translation sentence with the highest score as the optimal stitched translation sentence (Abstract, Ln 5-15, generate translation probabilities from the text to be translated to the pending candidate translations based on …a predetermined translation probability prediction model ….select a predetermined number of pending candidate translations that have the translation probabilities higher than other pending candidate translations. The translation sentence being “stitched” is taught by Voita as shown in previous citations. Picking the final translation as the candidate sentence of the highest scored pair is from Song).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita, Maruf and Song with the scoring and selection of candidate translation sentences of Song, as it evaluates translation quality of the candidate translations, therefore improving quality of candidate translations(Col 3, Ln 33-37).
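
For illustration only, the limitation discussed for Claim 5 (scoring each stitched translation sentence with a language model and keeping the highest-scoring one) can be sketched as below. This is a minimal sketch under stated assumptions, not the method of the application or of Voita, Maruf or Song: the add-one-smoothed unigram scorer, the function names `train_unigram_lm` and `pick_best`, and the sample sentences are all hypothetical stand-ins for whatever language model is actually used.

```python
import math
from collections import Counter


def train_unigram_lm(corpus_sentences):
    """Build a toy add-one-smoothed unigram scorer (hypothetical stand-in for a real language model)."""
    counts = Counter(tok for s in corpus_sentences for tok in s.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves probability mass for unseen tokens

    def score(sentence):
        toks = sentence.lower().split()
        if not toks:
            return float("-inf")
        logp = sum(math.log((counts[t] + 1) / (total + vocab)) for t in toks)
        return logp / len(toks)  # length-normalised average log-probability

    return score


def pick_best(stitched_candidates, score):
    """Return the stitched candidate with the highest score."""
    return max(stitched_candidates, key=score)


# Hypothetical data: two stitched candidates that differ in one token.
score = train_unigram_lm(["she opened the door", "she smiled at him"])
candidates = [
    "she opened the door . she smiled",
    "she opened the door . it smiled",
]
best = pick_best(candidates, score)
```

In this toy setup the first candidate is selected because "she" is more frequent than the unseen "it" in the sample corpus; a real system would substitute a trained model for the unigram scorer.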

Regarding Claim 6:
An apparatus for reordering results of a translation model, comprising: 
translating source sentences in a source paragraph by using at least one first translation model to obtain at least one translation sentence corresponding to each of the source sentences(Pg 5, 4.2 MODEL, Ln 1-2, We introduce a two-pass framework: first, the sentence is translated with a context-agnostic model); 
determining a target source sentence, and an upper adjacent source sentence according to the sequence of the source sentences in the source paragraph(Pg 6, Fig 4, shows Source sentence and Context 1 sentence), 
taking a finally-selected optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence as a reference optimal translation sentence corresponding to the upper adjacent source sentence(Pg 6, Fig 4, Translation of ctx1. This is a final translation, as it is not labeled a First-pass Translation like the other sentence. First pass and second pass translations are discussed in Pg 5-6, 4.2 Model, Para 1, all lines), 
stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence respectively, to obtain at least one stitched translation sentence(Pg 6, Fig 4, shows that First-pass Translation, and Translation of ctx1, are input into CADec together).
Voita does not explicitly teach translating each of source sentences in a source paragraph….; sequentially determining….
In the same field of Machine Translation, Maruf teaches each of source sentences in a source paragraph….; sequentially determining (Pg 1277, Para 3, Decoding, Ln 5-12, block coordinate descent optimisation algorithm… we initialise the translation of each sentence using the base neural MT model … We then repeatedly visit each sentence in the document, and update its translation using our document-context dependent NMT model …while the translations of other sentences are kept fixed).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify Voita with the initial translation and iterative updating of translated sentences of Maruf, as it helps solve the optimization problem in translating a document(Pg 1277, Para 3, Decoding, Ln 1-6).
The combination of Voita and Maruf does not teach a translation sentence acquiring module, configured for…. a reference optimal translation sentence determining module, configured for,…..a stitched translation sentence acquiring module, configured for…. and a finally-selected optimal translation sentence determining module, configured for.
In the same field of Machine translation, Song teaches a translation sentence acquiring module, configured for…. a reference optimal translation sentence determining module, configured for,…..a stitched translation sentence acquiring module, configured for…. and a finally-selected optimal translation sentence determining module, configured for(Col 10, Ln 24-31, processors; and memory… instructions are processed by the one or more processors).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita and Maruf with the computer components of Song, as the computer gives the necessary structure for the system to operate(Col 10, Ln 1-2).
The combination of Voita, Maruf and Song does not teach determining a lower adjacent source sentence of the target source sentence
and according to the results of the at least one first translation model, taking an initially-selected optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence as a reference optimal translation sentence corresponding to the lower adjacent source sentence;
stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence.
In the same field of Machine Translation, Maruf teaches determining a lower adjacent source sentence of the target source sentence(Pg 1277, 4 Context Dependent…, Para 2, Ln 1-8, generates target translation…conditions generation on… all other sentences of the document and their translations. This means that the translation and source side next sentence is included in the context sentences used)
and according to the results of the at least one first translation model, taking an initially-selected optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence as a reference optimal translation sentence corresponding to the lower adjacent source sentence(Pg 1277, Para 3, Decoding, Ln 5-12, block coordinate descent optimisation algorithm… we initialise the translation of each sentence using the base neural MT model);
stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence(Pg 1277, 4 Context Dependent…, Para 2, Ln 1-8, generates target translation…conditions generation on… all other sentences of the document and their translations. This means that the translation side next sentence is included in the context sentences used, as all of them are used. Also Para 3, Ln 1-4, the source and target document contexts as external memories, and attends to relevant parts of these external memories when generating the translation of a sentence. Voita teaches how context is attached and used with the other sentences, as in previous citations. Maruf teaches using translation side, future sentences as context).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita, Maruf and Song with the source and translation side future sentence context of Maruf, as it improves performance(Abstract, Ln 18-24).
The combination of Voita, Maruf and Song does not teach obtaining an optimal stitched translation sentence based on the at least one stitched translation sentence, and taking a target translation sentence corresponding to the optimal stitched translation sentence as a finally-selected optimal translation sentence of the target source sentence.
In the same field of Machine Translation, Song teaches obtaining an optimal stitched translation sentence based on the at least one stitched translation sentence, and taking a target translation sentence corresponding to the optimal stitched translation sentence as a finally-selected optimal translation sentence of the target source sentence(Abstract, Ln 5-15, generate translation probabilities from the text to be translated to the pending candidate translations based on …a predetermined translation probability prediction model ….select …..candidate translations that have the translation probabilities higher than other pending candidate translations. The translation sentence being “stitched” is taught by Voita as shown in previous citations. Picking the final translation as the candidate sentence of the highest scored pair is from Song. Voita also teaches scoring the examples, Pg 4, 3 Text Sets, Ln 16-17, which is also referenced in the “comprising” of this limitation in Claim 8).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita, Maruf and Song with the scoring and selection of candidate translation sentences of Song, as it evaluates translation quality of the candidate translations, therefore improving quality of candidate translations(Col 3, Ln 33-37).
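
For illustration only, the sequence of limitations addressed for Claim 6 can be sketched as follows: each target sentence's candidates are stitched with the finally-selected translation of the upper adjacent sentence and the initially-selected translation of the lower adjacent sentence, the stitched sentences are scored, and the best candidate is kept. This is a sketch under stated assumptions rather than the implementation of the application or of any cited reference: the names `rerank_paragraph` and `toy_score`, the choice of the first candidate as the initially-selected translation, the space separator, and the repeated-token scorer are all hypothetical.

```python
def rerank_paragraph(candidates_per_sentence, score, sep=" "):
    """Select one translation per source sentence, in source order.

    candidates_per_sentence[i] holds the first-pass candidates for source sentence i.
    """
    # Assumption: the initially-selected translation is the first candidate in each list.
    initial = [cands[0] for cands in candidates_per_sentence]
    final = []
    for i, cands in enumerate(candidates_per_sentence):
        upper = final[i - 1] if i > 0 else ""  # finally-selected, upper neighbour
        lower = initial[i + 1] if i + 1 < len(initial) else ""  # first-pass, lower neighbour

        def stitched(candidate):
            # Join the non-empty parts in source order.
            return sep.join(p for p in (upper, candidate, lower) if p)

        final.append(max(cands, key=lambda c: score(stitched(c))))
    return final


def toy_score(text):
    """Hypothetical scorer: counts repeated tokens as a crude cohesion signal."""
    tokens = text.split()
    return len(tokens) - len(set(tokens))


# Hypothetical data: only the middle sentence has competing candidates.
candidates = [["the cat sat"], ["it purred", "the cat purred"], ["the cat slept"]]
result = rerank_paragraph(candidates, toy_score)
```

The scorer is passed in as a parameter so that a translation-model confidence or a language-model score could be substituted without changing the stitching loop.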

Regarding Claim 7:
The combination of Voita, Maruf and Song teaches the apparatus for reordering results of a translation model according to claim 6, and Voita teaches wherein, the stitched translation sentence acquiring module is specifically configured for(This module is taught with the combination of Song’s computer components in Claim 6): 
according to the sequence of the source sentences, stitching at least one target translation sentence corresponding to the target source sentence with the reference optimal translation sentence corresponding to the upper adjacent source sentence of the target source sentence and the reference optimal translation sentence corresponding to the lower adjacent source sentence of the target source sentence respectively(Pg 6, Fig 4, shows stitching context sentences, as the context is input in CADec along with First-pass Translation. Pg 6, Para 3, Ln 3-6, cj are several preceding sentences along with their translations. Using lower adjacent sentences as context is taught by the second combination with Maruf, in the rejection to Claim 6).

Regarding Claim 8:
The combination of Voita, Maruf and Song teaches the apparatus for reordering results of a translation model according to claim 6, and Voita teaches wherein, the finally-selected optimal translation sentence determining module is specifically configured for(This module is taught with the combination of Song’s computer components in Claim 6):
stitching the target source sentence with the upper adjacent source sentence to obtain the stitched source sentence corresponding to the target source sentence(Pg 6, Fig 4, Shows Source context is input into CADec beside source sentence); 
combining the at least one stitched translation sentence with the stitched source sentence to generate at least one parallel corpus pair(Pg 6, Fig 4, shows parallel source and translation); 
inputting the at least one parallel corpus pair into a second translation model respectively to generate a score for each of the parallel corpus pairs (Pg 6, Fig 4, shows inputting. Pg 4, 3 Text Sets, Ln 16-17, The system is asked to score each candidate example); 
The combination of Voita, Maruf and Song does not teach stitching the target source sentence with the lower adjacent source sentence.
In the same field of Machine Translation, Maruf teaches stitching the target source sentence with the lower adjacent source sentence(Pg 1277, 4 Context Dependent, Para 2, Ln 1-8, generates target translation…conditions generation on… all other sentences of the document and their translations. This means that the source side next sentence is included in the context sentences used, as all of them are used. Also Para 3, Ln 1-4, the source and target document contexts as external memories, and attends to relevant parts of these external memories when generating the translation of a sentence. Voita teaches how context is attached and used with the other sentences, as in previous citations. Maruf teaches using source side, future sentences as context).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita, Maruf and Song with the source side future sentence context of Maruf, as it improves performance(Abstract, Ln 18-24).
The combination of Voita, Maruf and Song does not teach and taking the stitched translation sentence corresponding to the parallel corpus pair with the highest score as the optimal stitched translation sentence.
In the same field of Machine Translation, Song teaches and taking the stitched translation sentence corresponding to the parallel corpus pair with the highest score as the optimal stitched translation sentence(Abstract, Ln 5-15, generate translation probabilities from the text to be translated to the pending candidate translations based on …a predetermined translation probability prediction model ….select a predetermined number of pending candidate translations that have the translation probabilities higher than other pending candidate translations. Translation sentence being stitched is taught by Voita in Claim 6).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita, Maruf and Song with the scoring and selection of candidate translation sentences of Song, as it evaluates translation quality of the candidate translations, therefore improving quality of candidate translations(Col 3, Ln 33-37).
	
Regarding Claim 9:
The combination of Voita, Maruf and Song teaches the method for reordering results of a translation model according to claim 1(As shown in rejection to Claim 1), but does not teach a computing device, comprising a memory, a processor and computer instructions stored on the memory and capable of running on the processor, wherein the instructions are executed by the processor to implement steps of.
In the same field of Machine Translation, Song teaches a computing device, comprising a memory, a processor and computer instructions stored on the memory and capable of running on the processor, wherein the instructions are executed by the processor to implement steps of(Col 10, Ln 24-31, processors; and memory… instructions are processed by the one or more processors).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita, Maruf and Song with the computer components of Song, as the computer gives the necessary structure for the system to operate(Col 10, Ln 1-2).

Regarding Claim 10:
The combination of Voita, Maruf and Song teaches the method for reordering results of a translation model according to claim 1(As shown in rejection to Claim 1), but does not teach a computer-readable non-transitory storage medium with computer instructions stored thereon, wherein the instructions are executed by the processor to implement steps of.
In the same field of Machine Translation, Song teaches a computer-readable non-transitory storage medium with computer instructions stored thereon, wherein the instructions are executed by the processor to implement steps of(Col 10, Ln 24-31, instructions are processed by the one or more processors. Col 32, Ln 63-65, RAM).
It would have been obvious for one skilled in the art, at the effective time of filing, to modify the combination of Voita, Maruf and Song with the computer components of Song, as the computer gives the necessary structure for the system to operate(Col 10, Ln 1-2).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Yamagishi et al. “Improving Context-aware Neural Machine Translation with Target-side Context”
Machine Translation using context of other sentences.
Scherrer et al. “Analysing concatenation approaches to document-level NMT in two different domains”
Comparing different Machine Translation with context methods.
Maruf et al. “Contextual Neural Model for Translating Bilingual Multi-Speaker Conversations”
Machine Translation with context and multiple speakers.
Maruf et al. “Selective Attention for Context-aware Neural Machine Translation”
Machine Translation with context of other sentences.
Bawden et al. “Evaluating Discourse Phenomena in Neural Machine Translation”
Machine Translation with context of other sentences.
Miculicich et al. “Document-Level Neural Machine Translation with Hierarchical Attention Networks”
Machine Translation with context of other sentences.
Voita et al. “Context-Aware Neural Machine Translation Learns Anaphora Resolution”
Machine Translation with context of other sentences.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER G MARLOW whose telephone number is (571)272-4536. The examiner can normally be reached Monday - Thursday 10:00 am - 8:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached on (571)272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ALEXANDER G MARLOW/Assistant Examiner, Art Unit 2658

/RICHEMOND DORVIL/Supervisory Patent Examiner, Art Unit 2658