Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 08/22/2022 has been entered.
 
Amendments
Claims 1-8, 10-11, 19-24, and 27 are amended. Claims 1-11 and 19-27 are pending and have been considered.

Claim Objections
Claims 1, 4, 6, 9-11, 19-20, and 25-26 are objected to because of the following informalities:  
In Claim 1, line 3 should recite “respective probability scores” to agree with the last 4 lines.
Regarding Claim 1, lines 9-11, the phrase “at the plurality of positions in the sequence” appears to be redundant with “for each respective one of said plurality of positions” and generally makes the claim unclear. Claim 1, lines 11-12, recites: “at least one of the positions has at least one of the plurality of candidate sentences with multiple imputed words”. The meaning of the term “has” is unclear to the Examiner. For purposes of examination, Examiner interprets the claim to mean that at least one of the plurality of candidate sentences corresponds to at least one of the positions, and that the at least one of the plurality of candidate sentences has multiple imputed words representing multiple concepts.
Regarding Claim 7, lines 2-3, Examiner does not understand what “a function from the respective position embedding” means. Examiner interprets this limitation to mean “a function of the respective position embedding”.
In Claim 4, lines 2-3 should recite “before performing the simultaneous beam search” to improve the clarity of the claim.
Regarding Claim 9, lines 3-4, it is unclear if generating the rejoin token stops the beam search for every path, including those paths the rejoin token is not attached to.
In Claim 10, lines 3-4 recite “the initial classifier representing a probability”. Examiner does not understand how a classifier itself represents a probability.
In Claim 11, the end of line 3 recites “the the”.
Claims 19-20 are objected to for the corresponding limitation of claim 1.
Regarding Claim 25, line 3, Examiner does not understand what “a function from the respective position embedding” means. Examiner interprets this limitation to mean “a function of the respective position embedding”.
The last line of claim 25 recites “the positional embedding.” In Claim 24, it is possible to generate multiple positional embeddings, and claim 25 incorporates claim 24. Therefore, it is unclear which of the multiple positional embeddings is being referred to by the limitation “the positional embedding”.
Regarding Claim 26, lines 4-5, it is unclear if generating the rejoin token stops the beam search for every path, including those paths the rejoin token is not attached to. Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-11 and 19-27 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
In Claim 1, line 12, the limitation “multiple words representing multiple concepts” lacks support in the specification. At best, specification [092] states, “For at least one such point, preferably there are different candidates generated with at least one candidate having a single imputed word and at least one candidate having multiple imputed missing words.” The specification does not disclose that the multiple imputed missing words represent a single concept or multiple concepts. Claims 2-11 are rejected for failing to cure the deficiencies of claim 1 upon which they depend.
In Claim 19, in the third line from the top of p. 5, the limitation “multiple words representing multiple concepts” lacks support in the specification. At best, specification [092] states, “For at least one such point, preferably there are different candidates generated with at least one candidate having a single imputed word and at least one candidate having multiple imputed missing words.” The specification does not disclose that the multiple imputed missing words represent a single concept or multiple concepts. 
Regarding claim 20, the claim is directed to an apparatus that implements the same features as the product of claim 19, and is therefore rejected for at least the same reasons therein. Claims 21-27 are rejected for failing to cure the deficiencies of claim 20 upon which they depend.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 4, 6-8, 11, and 23-25 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 4 is indefinite because Examiner cannot determine the relationship between the path, the position, and the initial classifier. Claims 1, 2 and 4 do not mention creating a path, so it is unclear how a path can be pruned before one has been created. For purposes of examination, Examiner interprets the claim to mean pruning away at least one path based on a result of an initial classifier being lower than a predetermined threshold for the at least one of the plurality of positions.
Each of Claim 6, line 5, Claim 11, line 2, and Claim 24, line 5 recites the term “some”. This is a relative term which renders the claim indefinite. The term “some” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. For purposes of examination, Examiner interprets the term “some” to mean a predetermined number of input elements. Claim 7 is rejected for failing to cure the deficiency of claim 6 upon which it depends. Claim 25 is rejected for failing to cure the deficiency of claim 24 upon which it depends.
Claim 8 is indefinite because the phrase “compared to the input data” does not make sense to the examiner. The candidate replacement sentence is either grammatically correct or incorrect, and it is either semantically correct or incorrect. The claim does not indicate it has a degree of correctness that is comparable to the input data. It is unclear which aspect of the input data is being compared. For purposes of examination, Examiner herein examines Claim 8 without considering the phrase “compared to the input data”. 
Claim 23 recites pruning away at least one path before the first search step. However, Claim 22 recites generating, in a first search step, one or more paths. It does not make sense to prune away a path before one has been generated. For purposes of examination, Examiner interprets the claim to mean pruning away at least one path based on a result of an initial classifier being lower than a predetermined threshold for the at least one of the plurality of positions.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-11 and 19-27 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
CLAIM 1
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
(1) dividing a portion of input data into a sequence of input elements, each element in the input elements comprising a word or a gap between words; 
(2) simultaneously identifying a plurality of positions in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of positions comprising a gap between a pair of adjacent words; 
(3) for each respective one of said plurality of positions, generating a plurality of candidate sentences that have one or more words imputed at the plurality of positions in the sequence, wherein at least one of the positions has at least one of the plurality of candidate sentences with multiple imputed words representing multiple concepts; 
(4) generating… a probability score for each of the plurality of candidate sentences.
Limitations 1-3 are mental processes of judgements which can reasonably be performed in one’s mind with the aid of pencil and paper. Limitation 4 is a mathematical calculation of computing probability scores. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
training a first artificial neural network based on a training data set comprising data sequences and respective scores;
using the first artificial neural network
outputting, based on the generated probability scores, a selection of one or more of the plurality of candidate sentences.
Training and using a first neural network are generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Outputting results is an insignificant extra-solution activity because it is well-known, as discussed in MPEP 2106.05(g). Adding insignificant extra-solution activity is not sufficient to integrate the additional elements into a practical application. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. Training and using a first neural network are generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Outputting results is an insignificant extra-solution activity because it is well-known, as discussed in MPEP 2106.05(g). Outputting information is well-known in the art, as disclosed by Wical (US Patent 6,460,034, see PTO-892 filed 01/03/2022) at C. 9, L. 30-32: “A screen module, such as screen module 230, which processes information for display on a computer output display, is well known in the art.” The claim is not patent eligible.
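For illustration only, limitations (1) through (3) of claim 1 as recited above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the applicant's disclosed implementation: the function names `to_elements`, `gap_positions`, and `impute`, the whitespace tokenization, and the example filler words are all hypothetical and do not appear in the claims or the specification.

```python
def to_elements(text):
    """Divide input text into a sequence of word elements and gap elements.

    Assumption: words are separated by whitespace, and a gap element is
    placed between each pair of adjacent words.
    """
    elements = []
    for i, word in enumerate(text.split()):
        if i > 0:
            elements.append(("gap", None))
        elements.append(("word", word))
    return elements


def gap_positions(elements):
    """Return the indices of every gap, i.e. every candidate imputation position."""
    return [i for i, (kind, _) in enumerate(elements) if kind == "gap"]


def impute(elements, gap_index, filler):
    """Build one candidate sentence by inserting the words in `filler`
    (a tuple of one or more words) at the gap with index `gap_index`."""
    words = []
    for i, (kind, value) in enumerate(elements):
        if kind == "word":
            words.append(value)
        elif i == gap_index:
            words.extend(filler)
    return " ".join(words)


elements = to_elements("cat sat mat")
positions = gap_positions(elements)
# One candidate with a single imputed word, one with multiple imputed words.
candidates = [impute(elements, 3, f) for f in [("on",), ("on", "the")]]
```

The two fillers show the distinction drawn in specification [092] between a candidate with a single imputed word and a candidate with multiple imputed words.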

CLAIM 2 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:  performing a simultaneous beam search for the missing or erroneous data from the plurality of positions in the sequence. This limitation is a mental process of a judgement which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.

CLAIM 3 incorporates the rejection of claim 2.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 2 are incorporated. The claim recites the following limitations: 
(1) the simultaneous beam search comprises generating, in a first search step, one or more paths for each of the plurality of positions, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at its respective position
(2) [generating] an associated probability score of the candidate element.
The first limitation is a mental process of a judgement which can reasonably be performed in one’s mind with the aid of pencil and paper. The second limitation is a mathematical calculation. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.

CLAIM 4 incorporates the rejection of claim 2.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 2 are incorporated. The claim recites the following limitations:  wherein at least one path from at least one of the plurality of positions in the sequence is pruned away, before the simultaneous beam search, based on a result… being lower than a predetermined threshold for the at least one of the plurality of positions.
Pruning a path based on a result and determining that the result is lower than a predetermined threshold are both mental processes of judgements which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
an initial classifier
An initial classifier is generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. An initial classifier is generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.
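For illustration only, the pruning limitation of claim 4, as interpreted by Examiner for purposes of examination, can be sketched as a threshold comparison. This is a minimal sketch under stated assumptions: the function name `prune_positions`, the dictionary of per-position classifier results, and the example values are hypothetical and are not taken from the claims or the specification.

```python
def prune_positions(classifier_results, threshold):
    """Keep only positions whose initial-classifier result meets the threshold.

    classifier_results: dict mapping a position index to a hypothetical
    classifier result in [0, 1]. Positions whose result is lower than the
    predetermined threshold are pruned away before any search is performed.
    """
    return {
        position: result
        for position, result in classifier_results.items()
        if result >= threshold
    }


# Hypothetical results for two gap positions; position 3 falls below 0.1.
kept = prune_positions({1: 0.9, 3: 0.05}, threshold=0.1)
```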

CLAIM 5 incorporates the rejection of claim 2.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 2 are incorporated. The claim recites the following limitations:
(1) the simultaneous beam search explores a plurality of imputation candidates for the plurality of positions in parallel and
(2) prunes away one or more of the plurality of candidate sentences based on their respective probability scores.
Exploring a plurality of imputation candidates with a beam search and pruning away one or more of them are mental processes of judgements which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.
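For illustration only, one step of a beam search that explores candidates for several positions together and prunes by probability score, as recited in claim 5, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names `beam_step` and `toy_expand`, the representation of a path as a (position, words, score) tuple, and the probabilities in the toy table are hypothetical. In the claimed method the scores would come from the trained first artificial neural network, which is not modeled here.

```python
import heapq


def beam_step(paths, expand, beam_width):
    """Extend every path by one candidate word, then prune to the best paths.

    paths: list of (position, words, score) tuples, one or more per position.
    expand: hypothetical callable returning (word, probability) pairs for a path.
    beam_width: number of highest-scoring paths kept across all positions.
    """
    expanded = []
    for position, words, score in paths:
        for word, probability in expand(position, words):
            expanded.append((position, words + (word,), score * probability))
    # Pruning: discard everything outside the top `beam_width` scores.
    return heapq.nlargest(beam_width, expanded, key=lambda path: path[2])


def toy_expand(position, words):
    """Stand-in scorer with fixed, made-up probabilities per position."""
    table = {1: [("big", 0.6), ("old", 0.3)], 3: [("on", 0.8), ("at", 0.1)]}
    return table[position]


start = [(1, (), 1.0), (3, (), 1.0)]
survivors = beam_step(start, toy_expand, beam_width=2)
```

Both starting paths are expanded in the same step, so candidates for positions 1 and 3 compete for the same beam.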

CLAIM 6 incorporates the rejection of claim 3.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 3 are incorporated. The claim recites the following limitations:
(1) for each respective one of said positions: prior to the first search step, generating a respective positional embedding for the respective position, the positional embedding being a vector generated… as a function of some or all of the input elements before and/or after the respective position in the sequence, and as a function of the respective position in the sequence; 
(2) wherein the vector provides context about the respective position in the sequence,
(3) wherein the respective position includes a beginning of sequence (BOS) position or an end of sequence (EOS) position.
Limitation 1 is a mathematical calculation of generating a positional embedding. Limitations 2 and 3 are mental processes of judgements of providing context and including BOS/EOS positions, which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
a second neural network
A second neural network is generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. A second neural network is generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 7 incorporates the rejection of claim 6.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 6 are incorporated. The claim recites the following limitations: a function from the respective positional embedding. This limitation is a mathematical computation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
generating a classifier for each of the plurality of positions, wherein the classifier is used as
Generating a classifier and using the classifier are generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. Generating a classifier and using the classifier are generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 8 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:  the candidate replacement sentence is grammatically correct or semantically correct, or both, compared to the input data. This limitation further limits the abstract ideas of claim 1. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.

CLAIM 9 incorporates the rejection of claim 3.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 3 are incorporated. The claim recites the following limitations:
(1) generating, in a second search step, a rejoin token to be added to the candidate element of at least one of the one or more paths generated in the first search step, 
(2) the rejoin token representing stopping the simultaneous beam search along the at least one of the one or more paths.
Limitations 1 and 2 are mental processes of judgements of creating a token and representing a stopping condition, which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.
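For illustration only, the rejoin token of claim 9 can be sketched as a marker that ends the search along the path to which it is attached. This is a minimal sketch under stated assumptions: the token string `<rejoin>`, the function name `split_finished`, and the representation of a path as a tuple of words are hypothetical. The sketch adopts one possible reading, in which only the path carrying the token stops and other paths continue; the claim language does not settle this point.

```python
REJOIN = "<rejoin>"  # hypothetical marker; the claims do not specify a token form


def split_finished(paths):
    """Separate paths that end in the rejoin token from paths still being searched.

    paths: list of tuples of words. A path whose last element is REJOIN is
    treated as finished, and the search does not extend it further.
    """
    finished = [path for path in paths if path and path[-1] == REJOIN]
    active = [path for path in paths if not path or path[-1] != REJOIN]
    return finished, active


finished, active = split_finished([("on", REJOIN), ("on", "the")])
```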

CLAIM 10 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
(1) for each of said positions, the probability score generated… for each of the plurality of candidate sentences is also a function of…
(2) representing a probability that the respective position has a missing or erroneous element.
Limitation 1 is a mathematical calculation of generating a probability score based on a function. Limitation 2 is a mental process of a judgement of representing a probability which can be reasonably performed in one’s mind with the aid of pencil and paper.  Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
by the trained first artificial neural network 
an initial classifier for a respective position
The trained first artificial neural network and an initial classifier are generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The trained first artificial neural network and an initial classifier are generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 11 incorporates the rejection of claim 10.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 10 are incorporated. The claim recites the following limitations: …generated… as a function of some or all of the input elements before and/or after the respective position in the sequence, and as a function of the respective position in the sequence. This limitation is a mathematical computation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
the initial classifier…  by a second neural network
The initial classifier and a second neural network are generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The initial classifier and a second neural network are generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 19
Step 1: The claim recites a product, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
(1) dividing a portion of input data into a sequence of input elements, each element in the input elements comprising a word or a gap between words; 
(2) simultaneously identifying a plurality of positions in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of positions comprising a gap between a pair of adjacent words; 
(3) for each respective one of said plurality of positions, generating a plurality of candidate sentences that have one or more words imputed at the plurality of positions in the sequence, wherein for at least one of the positions, a first of the plurality of candidate sentences has a single imputed word and a second of the plurality of candidate sentences has multiple imputed words representing multiple concepts; 
(4) generating… a probability score for each of the plurality of candidate sentences.
Limitations 1-3 are mental processes of judgements which can reasonably be performed in one’s mind with the aid of pencil and paper. Limitation 4 is a mathematical calculation of computing probability scores. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
A non-transitory computer-readable medium storing instructions that upon execution by a processor perform operations of automatically
training a first artificial neural network based on a training data set comprising data sequences and respective scores; 
using the trained first artificial neural network
outputting, based on the generated probability scores, a selection of one or more of the plurality of candidate sentences.
A non-transitory computer-readable medium storing instructions; a processor; and training and using a first neural network are generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Outputting results is an insignificant extra-solution activity because it is well-known, as discussed in MPEP 2106.05(g). Adding insignificant extra-solution activity is not sufficient to integrate the additional elements into a practical application. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. A non-transitory computer-readable medium storing instructions; a processor; and training and using a first neural network are generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Outputting results is an insignificant extra-solution activity because it is well-known, as discussed in MPEP 2106.05(g). Outputting information is well-known in the art, as disclosed by Wical (US Patent 6,460,034, see PTO-892 filed 01/03/2022) at C. 9, L. 30-32: “A screen module, such as screen module 230, which processes information for display on a computer output display, is well known in the art.” The claim is not patent eligible.

Regarding claim 20, the claim is directed to an apparatus that implements the same features as the product of claim 19, and is therefore rejected for at least the same reasons therein. Claim 20 recites the additional elements of one or more central processing units. These additional elements are generally linking the abstract ideas to the technological environment of machine learning, as discussed in MPEP 2106.05(h). The additional elements do not integrate the abstract idea into a practical application, and they are not sufficient to amount to significantly more than the judicial exception.

CLAIM 21 incorporates the rejection of claim 20.
Step 1: The claim recites an apparatus, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 20 are incorporated. The claim recites the following limitations:  performing a simultaneous beam search for the missing or erroneous data from the plurality of positions in the sequence. This limitation is a mental process of a judgement which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.

CLAIM 22 incorporates the rejection of claim 21.
Step 1: The claim recites an apparatus, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 21 are incorporated. The claim recites the following limitations: 
(1) the simultaneous beam search comprises generating, in a first search step, one or more paths for each of the plurality of positions, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at its respective position
(2) [generating] an associated probability score of the candidate element.
The first limitation is a mental process of a judgement which can reasonably be performed in one’s mind with the aid of pencil and paper. The second limitation is a mathematical calculation. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.

CLAIM 23 incorporates the rejection of claim 22.
Step 1: The claim recites an apparatus, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 22 are incorporated. The claim recites the following limitations:  wherein at least one path from at least one of the plurality of positions in the sequence is pruned away, before the first search step, based on a result… being lower than a predetermined threshold for the at least one of the plurality of positions.
	Pruning a path based on a result and determining the result is lower than a predetermined threshold are mental processes of judgements which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
an initial classifier
An initial classifier is generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. An initial classifier is generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 24 incorporates the rejection of claim 22.
Step 1: The claim recites an apparatus, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 22 are incorporated. The claim recites the following limitations:  
	(1) for each respective one of said positions: prior to the first search step, generating a respective positional embedding for the respective position, the positional embedding being a vector generated… as a function of some or all of the input elements before and/or after the respective position in the sequence, and as a function of the respective position in the sequence; 
(2) wherein the vector provides context about the respective position in the sequence,
(3) wherein the respective position includes a beginning of sequence (BOS) position or an end of sequence (EOS) position.
Limitation 1 is a mathematical calculation of generating a positional embedding. Limitations 2 and 3 are mental processes of judgements of providing context and including BOS/EOS positions, which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
a second neural network
A second neural network is generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. A second neural network is generally linking the abstract ideas to the particular technology environment of machine learning, as discussed in MPEP 2106.05(h). The claim is not patent eligible.

CLAIM 25 incorporates the rejection of claim 24.
Step 1: The claim recites an apparatus, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 24 are incorporated. The claim recites the following limitations:  dynamically computing a rejoin embedding using a function from the positional embedding. This limitation is a mathematical computation. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.

CLAIM 26 incorporates the rejection of claim 22.
Step 1: The claim recites an apparatus, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 22 are incorporated. The claim recites the following limitations:
(1) generating, in a second search step, a rejoin token to be added to the candidate element of at least one of the one or more paths generated in the first search step, 
(2) the rejoin token representing stopping the simultaneous beam search along the at least one of the one or more paths.
Limitations 1 and 2 are mental processes of a judgement of creating a token and representing a stopping condition, which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. The claim is not patent eligible.

CLAIM 27 incorporates the rejection of claim 20.
Step 1: The claim recites an apparatus, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 20 are incorporated. The claim recites the following limitations:  the generating the plurality of candidate sentences that have the one or more words imputed at the plurality of positions is performed. This limitation is a mental process of a judgement which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:  an instant messaging service that receives the input data from a first user and automatically forwards one of the one or more of the plurality of candidate sentences to a second user. An instant messaging service amounts to generally linking the abstract ideas to the technological environment of machine learning, as discussed in MPEP 2106.05(h). Receiving input data from a first user and automatically forwarding one of the one or more of the plurality of candidate sentences to a second user amounts to mere data-gathering, an insignificant extra-solution activity, as discussed in MPEP 2106.05(g). Adding insignificant extra-solution activity is not sufficient to integrate the additional elements into a practical application. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exceptions. An instant messaging service amounts to generally linking the abstract ideas to the technological environment of machine learning, as discussed in MPEP 2106.05(h). Receiving input data from a first user and automatically forwarding one of the one or more of the plurality of candidate sentences to a second user amounts to mere data-gathering, an insignificant extra-solution activity, as discussed in MPEP 2106.05(g). Receiving input data from a first user amounts to receiving data over a network, and automatically forwarding one of the one or more of the plurality of candidate sentences to a second user amounts to transmitting data over a network, as discussed in MPEP 2106.05(d)(II), example (i). The claim is not patent eligible.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-11 and 19-26 are rejected under 35 U.S.C. 103 as being unpatentable over Dahlmeier et al. (“A Beam-Search Decoder for Grammatical Error Correction”) in view of Hopkins et al. (“Tuning as Ranking”) and Sun et al. (“Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning”, cited in the PTO-892 filed 01/03/2022).

	Regarding CLAIM 1, Dahlmeier teaches: A computer-implemented method comprising automatically: 
training a first artificial neural network based on a training data set comprising data sequences… ; (First artificial neural network: P. 571, col. 1, § 3.4. A “metric score for each hypothesis” is taught at p. 571, col. 2, line 4. Also, P. 569, col. 1 teaches: “The weights of the decoder model are discriminatively trained on a development set of error-annotated sentences.”)
dividing a portion of input data into a sequence of input elements, each element in the input elements comprising a word or a gap between words; (The beam search taught by P. 571, col. 2, § 3.5, lines 1-14 and Fig. 1 and its caption on p. 572 shows evidence of dividing the input sentence into words and spaces)
simultaneously identifying a plurality of positions in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of positions comprising a gap between a pair of adjacent words; (P. 570, col. 1, § 3.1. In Fig. 1 on p. 572, in the first row, a first gap between “In” and “other” and a second gap between “other” and “they” are identified. The word “the” is imputed in the first gap, and the word “hand” is imputed in the second gap. The beam search of Fig. 1 is further discussed at P. 569, col. 2, § 3, lines 1-13.)
for each respective one of said plurality of positions, generating a plurality of candidate sentences that have one or more words imputed at the plurality of positions in the sequence, (Dahlmeier’s “proposers” generate candidate sentences. See P. 569, col. 2, § 3, lines 1-13; P. 570, col. 1, lines 1-2 and § 3.1; and Fig. 1 (p. 572), the second row of sentences. Fig. 1 does not explicitly show that the beam continues from the imputed word “hand” in row 2, but P. 572, col. 2 states “From all hypotheses in the pool, we select the top k hypotheses and add them to the beam for the next search iteration” and “hand” has the next highest score in row 1 after “the”.)
generating, using the trained first artificial neural network, a probability score for each of the plurality of candidate sentences; and (P. 571, col. 1, § 3.4, lines 1-7. In Fig. 1, each hypothesis has a score. Note the decoder score is different from the scores of the hypothesis computed by the expert models.)
outputting, based on the generated probability scores, a selection of one or more of the plurality of candidate sentences. (§ 3.5 on pp. 571-2 teaches finding the top k hypotheses. Fig. 1 teaches the highest scoring hypothesis found is “On the other hand, they might be right.”)
Dahlmeier in § 3.4 teaches Pairwise Ranking Optimization (PRO) to optimize the decoder which requires a sentence-level score for each hypothesis. However, Dahlmeier does not explicitly teach: a training data set comprising respective scores; 
wherein at least one of the positions has at least one of the plurality of candidate sentences with multiple imputed words representing multiple concepts;
	But Hopkins teaches: a training data set comprising respective scores; (P. 1355, col. 2 teaches: “We create a labeled training instance for this problem by computing difference vector x(i, j) - x(i, j'), and labeling it as a positive or negative instance based on whether, respectively, the first or second vector is superior according to gold function.” Positive or negative labels are scores.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Hopkins’s positive and negative instances into Dahlmeier’s PRO. A motivation for the combination is that ranking policies generalize well to unseen data. (Hopkins, p. 1355, col. 1, § 4.1)
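For illustration only, the labeled training instance quoted from Hopkins above can be sketched as follows. The function name, the feature vectors, and the gold scores are hypothetical and are not taken from Hopkins, Dahlmeier, or the application; treating ties as negative instances is likewise an assumption of this sketch.

```python
from typing import List, Tuple


def pairwise_instance(
    x_j: List[float],
    x_j_prime: List[float],
    gold_j: float,
    gold_j_prime: float,
) -> Tuple[List[float], int]:
    """Build one labeled instance from two candidate feature vectors.

    The instance is the difference vector x(i, j) - x(i, j'), labeled +1 when
    the first candidate is superior under the gold function and -1 otherwise.
    Ties fall into the -1 case here (an assumption of this sketch).
    """
    diff = [a - b for a, b in zip(x_j, x_j_prime)]
    label = 1 if gold_j > gold_j_prime else -1
    return diff, label


# Hypothetical feature vectors and gold scores for two candidates j and j'.
diff, label = pairwise_instance([0.5, 2.0], [0.25, 3.0], gold_j=0.8, gold_j_prime=0.6)
```

Under these assumed inputs the first candidate has the higher gold score, so the difference vector is labeled as a positive instance.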
Dahlmeier teaches a plurality of candidate sentences with a single imputed word (P. 570, § 3.1: “A change corresponds to a correction of a single word or phrase.”). However, neither Dahlmeier nor Hopkins explicitly teaches: wherein at least one of the positions has at least one sentence with multiple imputed words representing multiple concepts;
	But Sun teaches: wherein at least one of the positions has at least one sentence with multiple imputed words representing multiple concepts; (P. 2, col. 1 discloses: “As a challenging testbed for our method, we propose a fill-in-the-blank image captioning task.” P. 2, col. 2 discloses: “Figure 1(b) shows an example result of our algorithm – “A man on skis is teaching a child how to ski” – which smoothly fits within its context while still describing the image content.” The underlined portion is the imputed text. The phrases “skis”, “is teaching” and “a child” are different concepts.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Sun’s method of imputing multiple words representing different concepts into Dahlmeier’s beam search. A motivation for the combination is that some gaps require multiple words to fit smoothly within their context. (Sun, P. 2, col. 2 discloses: “Figure 1(b) shows an example result of our algorithm – “A man on skis is teaching a child how to ski” – which smoothly fits within its context while still describing the image content.”)

Regarding CLAIM 2, the combination of Dahlmeier, Hopkins, and Sun teaches: The method of claim 1, 
Dahlmeier teaches: further comprising performing a simultaneous beam search for the missing or erroneous data from the plurality of positions in the sequence. (P. 571, col. 2, § 3.5, lines 1-24 and P. 572, col. 2, lines 6-8; and P. 572, Fig. 1 and corresponding caption.)

	Regarding CLAIM 3, the combination of Dahlmeier, Hopkins, and Sun teaches: The method of claim 2, 
Dahlmeier teaches: wherein the simultaneous beam search comprises generating, in a first search step, one or more paths for each of the plurality of positions, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at its respective position, and an associated probability score of the candidate element. (P. 571, col. 2, § 3.5, lines 1-24 and P. 572, col. 2, lines 6-8; and P. 572, Fig. 1 and corresponding caption. The associated probability score is further taught by P. 571, § 3.4, lines 1-7.)

Regarding CLAIM 4, the combination of Dahlmeier, Hopkins, and Sun teaches: The method of claim 2, 
Dahlmeier teaches: wherein at least one path from at least one of the plurality of positions in the sequence is pruned away, before the simultaneous beam search, based on a result of an initial classifier being lower than a predetermined threshold for the at least one of the plurality of positions. (For purposes of examination, Examiner interprets the claim to mean pruning away at least one path based on a result of an initial classifier being lower than a predetermined threshold for the at least one of the plurality of positions. Taught by P. 571, col. 2, § 3.5, lines 11-24 and P. 572, col. 2, lines 6-8, where the result of an initial classifier is a hypothesis score from the decoder.)

Regarding CLAIM 5, the combination of Dahlmeier, Hopkins, and Sun teaches: The method of claim 2, 
Dahlmeier teaches: wherein the simultaneous beam search explores a plurality of imputation candidates for the plurality of positions in parallel and prunes away one or more of the plurality of candidate sentences based on their respective probability scores. (P. 571, col. 2, § 3.5, lines 11-24 and P. 572, col. 2, lines 6-8, where a respective probability score is a hypothesis score from the decoder.)
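For illustration only, a beam search that explores candidates for several gaps in parallel and keeps only the top k hypotheses per iteration can be sketched as follows. This is not Dahlmeier’s decoder: the proposer, the toy bigram scorer, the vocabulary, and all names are hypothetical assumptions chosen to keep the sketch self-contained.

```python
import heapq
from typing import Callable, List, Tuple

Sentence = Tuple[str, ...]


def propose(sentence: Sentence, vocab: List[str]) -> List[Sentence]:
    """Hypothetical proposer: insert each vocabulary word into each gap
    between a pair of adjacent words."""
    candidates = []
    for gap in range(1, len(sentence)):
        for word in vocab:
            candidates.append(sentence[:gap] + (word,) + sentence[gap:])
    return candidates


def beam_search(
    start: Sentence,
    vocab: List[str],
    score: Callable[[Sentence], float],
    k: int = 2,
    steps: int = 2,
) -> Sentence:
    """Expand every hypothesis in the beam at every gap, then keep the top k."""
    beam = [start]
    best = start
    for _ in range(steps):
        pool = [h for s in beam for h in propose(s, vocab)]
        if not pool:
            break
        # Hypotheses outside the top k are pruned away based on their scores.
        beam = heapq.nlargest(k, pool, key=score)
        if score(beam[0]) > score(best):
            best = beam[0]
    return best


# Toy scorer (an assumption): count adjacent word pairs from a fixed set.
GOOD_BIGRAMS = {("on", "the"), ("the", "other"), ("other", "hand"), ("hand", "they")}


def toy_score(sentence: Sentence) -> float:
    return float(sum(pair in GOOD_BIGRAMS for pair in zip(sentence, sentence[1:])))


result = beam_search(("on", "other", "they"), ["the", "hand"], toy_score)
```

In this toy run, both gaps receive candidate insertions in the first iteration, and the two highest-scoring hypotheses are carried into the second iteration while the remainder are discarded.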

Regarding CLAIM 6, the combination of Dahlmeier, Hopkins, and Sun teaches: The method of claim 3, comprising
However, Dahlmeier and Hopkins do not explicitly teach: for each respective one of said positions: prior to the first search step, generating a respective positional embedding for the respective position, the positional embedding being a vector generated by a second neural network as a function of some or all of the input elements before and/or after the respective position in the sequence, and as a function of the respective position in the sequence; 
wherein the vector provides context about the respective position in the sequence, wherein the respective position includes a beginning of sequence (BOS) position or an end of sequence (EOS) position.
But Sun teaches: for each respective one of said positions: prior to the first search step, generating a respective positional embedding for the respective position, the positional embedding being a vector generated by a second neural network as a function of some or all of the input elements before and/or after the respective position in the sequence, and as a function of the respective position in the sequence; (P. 3, col. 1, “Notation”, lines 1-6 teaches notations and that the input sequence and output sequence have the same length. “Unidirectional RNNs (URNNs)” teaches a unidirectional RNN compresses the history of the input sequence into a hidden state vector, of which an output embedding is a function. P. 3, col. 2, “RNNs for decoding” teaches using the encoded representation X which is an embedding. In this mapping, the encoded representation X is the output y of the URNN.)
wherein the vector provides context about the respective position in the sequence, (P. 3, col. 1, “Unidirectional RNN (URNNs)” shows the hidden vector has a time step t corresponding to the input time step t.)
wherein the respective position includes a beginning of sequence (BOS) position or an end of sequence (EOS) position. (The BRI of this limitation is that the first input x_1 is a beginning of sequence position and the last input x_t is an end of sequence position.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Sun’s Unidirectional RNN as an encoder which takes words as inputs. A motivation for the combination is to compress the history of inputs into a hidden state vector to create a dense vector embedding. (Sun, P. 3, col. 1: “Unidirectional RNN (URNNs) model the probability of yt given the history of inputs x1, . . . , xt by “compressing” the history into a hidden state vector ht”. P. 3, col. 2: “For machine translation tasks, X may represent an encoding of some source language sequence”.)

Regarding CLAIM 7, the combination of Dahlmeier, Hopkins, and Sun teaches: The method of claim 6, 
Dahlmeier and Hopkins do not explicitly teach: further comprising generating a classifier for each of the plurality of positions, wherein the classifier is used as a function from the respective positional embedding. 
	But Sun teaches: further comprising generating a classifier for each of the plurality of positions, wherein the classifier is used as a function from the respective positional embedding. (Interpreted as generating a classification. P. 3, col. 2, “RNNs for decoding” teaches using a decoder (classifier) which is a function of some encoded representation X to generate the conditional probability of an output at a time step. The mapping of claim 6 explains how X was generated from the input sequence.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Sun’s decoder to predict the output sequence given an encoded representation. A motivation for the combination is to perform machine translation. (Sun, P. 3, col. 2, “RNNs for decoding”)

Regarding CLAIM 8, the combination of Dahlmeier, Hopkins, and Sun teaches: The method of claim 1,
Dahlmeier teaches: wherein the candidate replacement sentence is grammatically correct or semantically correct, or both, compared to the input data. (Interpreted as the candidate replacement sentence being grammatically correct or semantically correct. P. 570, §3.2, first sentence.) 

Regarding CLAIM 9, the combination of Dahlmeier, Hopkins, and Sun teaches: The method of claim 3, 
Dahlmeier teaches: further comprising generating, in a second search step, a rejoin token to be added to the candidate element of at least one of the one or more paths generated in the first search step, the rejoin token representing stopping the simultaneous beam search along the at least one of the one or more paths. (The BRI of a rejoin token includes stopping a beam search along one of the paths. Beams whose scores are below the top k hypotheses are stopped, as taught on P. 572, col. 1, lines 6-8.) 

Regarding CLAIM 10, the combination of Dahlmeier, Hopkins, and Sun teaches: The method of claim 1, 
Dahlmeier teaches: wherein for each of said positions, the probability score generated by the trained first artificial neural network for each of the plurality of candidate sentences is also a function of an initial classifier for a respective position, the initial classifier representing a probability that the respective position has a missing or erroneous element. (P. 570, § 3.2 teaches: “The second type of experts is based on linear classifiers and is specialized for particular error categories.” §§ 3.2-3.3 on pp. 570-571 teach the expert models generating scores. Dahlmeier’s decoder, which corresponds to the trained first artificial neural network as claimed, is a function of the expert models.)

Regarding CLAIM 11, the combination of Dahlmeier, Hopkins, and Sun teaches: The method of claim 10, 
Dahlmeier teaches: wherein the initial classifier is generated… as a function of some or all of the input elements before and/or after the respective position in the sequence, and as a function of the position of the respective position in the sequence. (P. 570, § 3.2 teaches: “We employ two types of expert models. The first type of expert model is a standard N-gram language model. The language model expert is not specialized for any particular type of error.” A standard N-gram language model is a function of words before and/or after the word being analyzed.)
However, neither Dahlmeier nor Hopkins explicitly teaches: the classifier is generated by a second neural network … as a function of the position of the respective position in the sequence.
But Sun teaches: the classifier is generated by a second neural network … as a function of the position of the respective position in the sequence. (Sun teaches a bidirectional RNN which is a function of respective token positions. See P. 3, col. 2, “Bidirectional RNNs” and “BiRNNs as URNNs” and Fig. 2b on p. 4)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have generated Dahlmeier’s initial classifier with Sun’s bidirectional RNN. A motivation for the combination is that a bidirectional RNN computes the conditional probability of an output token given all other tokens. (Sun, p. 3, col. 2, “It is straightforward to show that the conditional probability of yt given all other tokens”)

	Regarding CLAIM 19, Dahlmeier teaches: A non-transitory computer-readable medium storing instructions that upon execution by a processor perform operations of automatically: (The experiments and results sections are evidence of a non-transitory computer-readable medium and a processor)
training a first artificial neural network based on a training data set comprising data sequences… ; (First artificial neural network: P. 571, col. 1, § 3.4. A “metric score for each hypothesis” is taught at p. 571, col. 2, line 4. Also, P. 569, col. 1 teaches: “The weights of the decoder model are discriminatively trained on a development set of error-annotated sentences.”)
dividing a portion of input data into a sequence of input elements, each element in the input elements comprising a word or a gap between words; (The beam search taught by P. 571, col. 2, § 3.5, lines 1-14 and Fig. 1 and its caption on p. 572 shows evidence of dividing the input sentence into words and spaces)
simultaneously identifying a plurality of positions in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of positions comprising a gap between a pair of adjacent words; (P. 570, col. 1, § 3.1. In Fig. 1 on p. 572, in the first row, a first gap between “In” and “other” and a second gap between “other” and “they” are identified. The word “the” is imputed in the first gap, and the word “hand” is imputed in the second gap. The beam search of Fig. 1 is further discussed at P. 569, col. 2, § 3, lines 1-13.)
for each respective one of said plurality of positions, generating a plurality of candidate sentences that have one or more words imputed at the plurality of positions in the sequence, (Dahlmeier’s “proposers” generate candidate sentences. See P. 569, col. 2, § 3, lines 1-13; P. 570, col. 1, lines 1-2 and § 3.1; and Fig. 1 (p. 572), the second row of sentences. Fig. 1 does not explicitly show that the beam continues from the imputed word “hand” in row 2, but P. 572, col. 2 states “From all hypotheses in the pool, we select the top k hypotheses and add them to the beam for the next search iteration” and “hand” has the next highest score in row 1 after “the”.)
wherein for at least one of the positions, a first of the plurality of candidate sentences has a single imputed word and… (P. 570, col. 1, § 3.1. In Fig. 1 on p. 572, in the first row, a first gap between “In” and “other” and a second gap between “other” and “they” are identified. The word “the” is imputed in the first gap, and the word “hand” is imputed in the second gap. The beam search of Fig. 1 is further discussed at P. 569, col. 2, § 3, lines 1-13.)
generating, using the trained first artificial neural network, a probability score for each of the plurality of candidate sentences; and (P. 571, col. 1, § 3.4, lines 1-7. In Fig. 1, each hypothesis has a score. Note the decoder score is different from the scores of the hypothesis computed by the expert models.)
outputting, based on the generated probability scores, a selection of one or more of the plurality of candidate sentences. (§ 3.5 on pp. 571-2 teaches finding the top k hypotheses. Fig. 1 teaches the highest scoring hypothesis found is “On the other hand, they might be right.”)
Dahlmeier in § 3.4 teaches Pairwise Ranking Optimization (PRO) to optimize the decoder which requires a sentence-level score for each hypothesis. However, Dahlmeier does not explicitly teach: a training data set comprising respective scores; 
a second of the plurality of candidate sentences has multiple imputed words representing multiple concepts; 
But Hopkins teaches: a training data set comprising respective scores; (P. 1355, col. 2 teaches: “We create a labeled training instance for this problem by computing difference vector x(i, j) - x(i, j'), and labeling it as a positive or negative instance based on whether, respectively, the first or second vector is superior according to gold function.” Positive or negative labels are scores.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Hopkins’s positive and negative instances into Dahlmeier’s PRO. A motivation for the combination is that ranking policies generalize well to unseen data. (Hopkins, p. 1355, col. 1, § 4.1)
Dahlmeier teaches a plurality of candidate sentences with a single imputed word (P. 570, § 3.1: “A change corresponds to a correction of a single word or phrase.”). However, neither Dahlmeier nor Hopkins explicitly teaches: a second sentence has multiple imputed words representing multiple concepts; 
But Sun teaches: a second sentence has multiple imputed words representing multiple concepts; (P. 2, col. 1 discloses: “As a challenging testbed for our method, we propose a fill-in-the-blank image captioning task.” P. 2, col. 2 discloses: “Figure 1(b) shows an example result of our algorithm – “A man on skis is teaching a child how to ski” – which smoothly fits within its context while still describing the image content.” The underlined portion is the imputed text. The phrases “skis”, “is teaching” and “a child” are different concepts.) 
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Sun’s method of imputing multiple words representing different concepts into Dahlmeier’s beam search. A motivation for the combination is that some gaps require multiple words to fit smoothly within their context. (Sun, P. 2, col. 2 discloses: “Figure 1(b) shows an example result of our algorithm – “A man on skis is teaching a child how to ski” – which smoothly fits within its context while still describing the image content.”)

Regarding claim 20, the claim is directed to an apparatus that implements the same features as the product of claim 19, and is therefore rejected for at least the same reasons therein. Claim 20 recites the feature of one or more central processing units. Dahlmeier’s experiments and results are evidence of one or more central processing units.

Regarding CLAIM 21, the combination of Dahlmeier, Hopkins, and Sun teaches: The computer apparatus of claim 20, wherein the central processing units are further programmed to perform operations of:
Dahlmeier teaches: automatically performing a simultaneous beam search for the missing or erroneous data from the plurality of positions in the sequence. (P. 571, col. 2, § 3.5, lines 1-24 and P. 572, col. 2, lines 6-8; and P. 572, Fig. 1 and corresponding caption.)

Regarding CLAIM 22, the combination of Dahlmeier, Hopkins, and Sun teaches: The computer apparatus of claim 21, 
Dahlmeier teaches: wherein the simultaneous beam search comprises generating, in a first search step, one or more paths for each of the plurality of positions, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at its respective position, and an associated probability score of the candidate element. (P. 571, col. 2, § 3.5, lines 1-24 and P. 572, col. 2, lines 6-8; and P. 572, Fig. 1 and corresponding caption. The associated probability score is further taught by P. 571, § 3.4, lines 1-7.)
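For illustration only, the claimed first search step (one or more paths per position, each path pairing a candidate element with a score) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Dahlmeier: the names `Path` and `first_search_step`, the candidate table, and all scores are hypothetical; only the words “the” and “hand” echo the Fig. 1 example discussed above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Path:
    position: int   # index of the gap in the input sequence
    element: str    # candidate element proposed for that gap
    score: float    # score associated with the candidate element


def first_search_step(candidates: Dict[int, List[Tuple[str, float]]],
                      beam_width: int) -> Dict[int, List[Path]]:
    """Keep the beam_width best-scoring candidates at every position."""
    paths = {}
    for position, options in candidates.items():
        ranked = sorted(options, key=lambda o: o[1], reverse=True)
        paths[position] = [Path(position, word, score)
                           for word, score in ranked[:beam_width]]
    return paths


# Toy candidate table (scores invented for illustration).
toy = {
    1: [("a", 0.2), ("the", 0.7), ("an", 0.1)],
    2: [("hand", 0.6), ("side", 0.3)],
}
step = first_search_step(toy, beam_width=2)
```

Every position is expanded in the same call, which is the sense in which the sketch treats the search as simultaneous across positions.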

Regarding CLAIM 23, the combination of Dahlmeier, Hopkins, and Sun teaches: The computer apparatus of claim 22, 
Dahlmeier teaches: wherein at least one path from at least one of the plurality of positions in the sequence is pruned away, before the first search step, based on a result of an initial classifier being lower than a predetermined threshold for the at least one of the plurality of positions. (For purposes of examination, Examiner interprets the claim to mean pruning away at least one path based on a result of an initial classifier being lower than a predetermined threshold for the at least one of the plurality of positions. Taught by P. 571, col. 2, § 3.5, lines 11-24 and P. 572, col. 2, lines 6-8, where the result of an initial classifier is a hypothesis score from the decoder.)
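For illustration only, the threshold-based pruning as interpreted above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `prune_positions`, the per-position classifier scores, and the threshold value are hypothetical and do not come from the claims or from Dahlmeier.

```python
from typing import Dict, List


def prune_positions(classifier_scores: Dict[int, float],
                    threshold: float) -> List[int]:
    """Return the positions that survive pruning.

    A position is pruned away when its initial-classifier result is lower
    than the predetermined threshold; the rest are kept for the search.
    """
    return sorted(p for p, s in classifier_scores.items() if s >= threshold)


# Toy scores (invented): position 0 falls below the threshold and is pruned.
kept = prune_positions({0: 0.05, 1: 0.80, 2: 0.40}, threshold=0.30)
```

Only the surviving positions would then be passed to the first search step.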

Regarding CLAIM 24, the combination of Dahlmeier, Hopkins, and Sun teaches: The computer apparatus of claim 22, wherein the central processing units are further programmed to perform
However, Dahlmeier and Hopkins do not explicitly teach: for each respective one of said positions: prior to the first search step, generating a respective positional embedding for the respective position, the positional embedding being a vector generated by a second neural network as a function of some or all of the input elements before and/or after the respective position in the sequence, and as a function of the respective position in the sequence; 
wherein the vector provides context about the respective position in the sequence, wherein the respective position includes a beginning of sequence (BOS) position or an end of sequence (EOS) position.
But Sun teaches: for each respective one of said positions: prior to the first search step, generating a respective positional embedding for the respective position, the positional embedding being a vector generated by a second neural network as a function of some or all of the input elements before and/or after the respective position in the sequence, and as a function of the respective position in the sequence; (P. 3, col. 1, “Notation”, lines 1-6 teaches notations and that the input sequence and output sequence have the same length. “Unidirectional RNNs (URNNs)” teaches a unidirectional RNN compresses the history of the input sequence into a hidden state vector, of which an output embedding is a function. P. 3, col. 2, “RNNs for decoding” teaches using the encoded representation X which is an embedding. In this mapping, the encoded representation X is the output y of the URNN.)
wherein the vector provides context about the respective position in the sequence, (P. 3, col. 1, “Unidirectional RNN (URNNs)” shows the hidden vector has a time step t corresponding to the input time step t.)
wherein the respective position includes a beginning of sequence (BOS) position or an end of sequence (EOS) position. (The BRI of this limitation is that the first input x_1 is a beginning of sequence position and the last input x_t is an end of sequence position.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Sun’s Unidirectional RNN as an encoder which takes words as inputs and . A motivation for the combination is to compress the history of inputs into a hidden state vector to create a dense vector embedding. (Sun, P. 3, col. 1: “Unidirectional RNN (URNNs) model the probability of yt given the history of inputs x1, . . . , xt by “compressing” the history into a hidden state vector ht”. P. 3, col. 2: “For machine translation tasks, X may represent an encoding of some source language sequence”.)

Regarding CLAIM 25, the combination of Dahlmeier, Hopkins, and Sun teaches: The computer apparatus of claim 24, wherein the central processing units are further programmed to perform operations of
	Dahlmeier teaches: dynamically computing a rejoin embedding (The BRI of a rejoin embedding includes stopping a beam search along one of the paths. Beams whose scores are below the top k hypotheses are stopped, as taught on P. 572, col. 1, lines 6-8.)
	However, neither Dahlmeier nor Hopkins explicitly teaches: using a function from the positional embedding.
	But Sun teaches: using a function from the positional embedding. (P. 3, col. 1, “Unidirectional RNNs (URNNs)” teaches a unidirectional RNN compresses the history of the input sequence into a hidden state vector, of which an output embedding is a function. P. 3, col. 2, “RNNs for decoding” teaches using the encoded representation X which is an embedding. In this mapping, the encoded representation X is the output y of the URNN.)

Regarding CLAIM 26, the combination of Dahlmeier, Hopkins, and Sun teaches: The computer apparatus of claim 22, wherein the central processing units are further programmed to perform operations of
Dahlmeier teaches: generating, in a second search step, a rejoin token to be added to the candidate element of at least one of the one or more paths generated in the first search step, the rejoin token representing stopping the simultaneous beam search along the at least one of the one or more paths. (The BRI of a rejoin token includes stopping a beam search along one of the paths. Beams whose scores are below the top k hypotheses are stopped, as taught on P. 572, col. 1, lines 6-8.)

Claim 27 is rejected under 35 U.S.C. 103 as being unpatentable over Dahlmeier et al. (“A Beam-Search Decoder for Grammatical Error Correction”) in view of Hopkins et al. (“Tuning as Ranking”), Sun et al. (“Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning”, cited in the PTO-892 filed 01/03/2022), and Sharifi et al. (US 20180039608 A1).

Regarding CLAIM 27, the combination of Dahlmeier, Hopkins, and Sun teaches: The computer apparatus of claim 20,
Dahlmeier teaches: wherein generating the plurality of candidate sentences that have the one or more words imputed at the plurality of positions is performed (§ 3.5 on pp. 571-572, and Fig. 1)
However, neither Dahlmeier, Hopkins, nor Sun explicitly teaches: by an instant messaging service that receives the input data from a first user and automatically forwards one of the one or more of the plurality of candidate sentences to a second user.
But Sharifi teaches: by an instant messaging service that receives the input data from a first user and automatically forwards one of the one or more of the plurality of candidate sentences to a second user. (¶ [0004], where “automatically forwards” includes displaying. ¶ [0033] teaches a language model.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have implemented Dahlmeier/Hopkins/Sun’s system as a language model in Sharifi’s messaging service. A motivation for the combination is to correct previously received instant messages. (Sharifi, Abstract)

Response to Arguments
Applicant's arguments filed 07/25/2022 have been fully considered but they are not persuasive. 
Claim Rejections Under 35 U.S.C. 101 (Remarks pp. 8-13): Applicant argues that claims 1-11 and 19-27 are patent eligible under 35 U.S.C. 101. Examiner respectfully disagrees. Regarding Claim 1, the amended limitation of training a first artificial neural network based on a training data set comprising data sequences and respective scores is merely an additional element generally linking the abstract ideas of claim 1 to the technological environment of machine learning. The machine learning aspects of training and claim 1 are recited at a high level. They do not integrate the judicial exceptions into a practical application, and they are not sufficient to amount to significantly more than the judicial exceptions. The limitations of claim 8 further limit the abstract ideas of claim 1.
Regarding applicant’s argument in the bottom paragraph on p. 9, Applicant has not persuasively explained how the limitations numbered 1 to 3 are similar to detecting suspicious activity by using network monitors and analyzing network packets. The limitations numbered 1 to 3 contain both abstract ideas and additional elements. Only the abstract ideas should be analyzed in Step 2A, Prong 1.
Applicant argues at the top of p. 10 that Claim 1 is similar to Example 39. Examiner respectfully disagrees. Claim 1 differs from Example 39 because the machine learning aspects of claim 1 are additional elements that do not integrate the judicial exceptions into a practical application, and they are not sufficient to amount to significantly more than the judicial exceptions.
Applicant argues at the bottom of p. 10 that a real-time text-based communication session such as an IM chat session is a practical application. Examiner respectfully disagrees with Applicant’s argument. First, the claims do not explicitly recite a real-time text-based communication session such as an IM chat session. The closest claim to this embodiment is claim 27 which recites “an instant messaging service”. An instant messaging service is an additional element generally linking the abstract ideas to the technological environment of machine learning. The instant messaging service does not improve the machine learning aspects of claim 1. Examiner does not understand Applicant’s argument on p. 11. Examiner did not determine that any limitation was conventional in Step 2A, Prong 2 in either the Final Rejection filed 05/24/22 or the present office action. The limitations starting with “training”, “simultaneously identifying”, and “generating” contain both abstract ideas and additional elements. Only the additional elements should be analyzed in Step 2A Prong 2.
Examiner finds Applicant’s arguments on pp. 12-13 regarding Step 2B unpersuasive. Only the limitation starting with “outputting” was determined to be well-understood, routine, conventional activity. Outputting is not an inventive concept. The limitations starting with “training”, “simultaneously identifying”, and “generating” contain both abstract ideas and additional elements. Only the additional elements should be analyzed in Step 2B, so Examiner does not understand why the abstract ideas are being discussed in this section.

Claim Rejections Under 35 U.S.C. 103 (Remarks pp. 13-17): Applicant’s arguments with respect to claim(s) 1-11 and 19-27 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher H. Jablon whose telephone number is (571)270-7648. The examiner can normally be reached Monday - Friday, 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar, can be reached at (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/A.H.J./Examiner, Art Unit 2127

/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127