Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Amendments
Claims 1-20 are pending and have been considered. Claims 1-4, 7-9, 11-12, 14, and 18-20 have been amended.
Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in the United Kingdom of Great Britain and Northern Ireland on March 20, 2018. A certified copy of the GB1804433.9 application as required by 37 CFR 1.55 was received on 04/16/2021 and was entered into the application folder.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 19 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claim does not fall within at least one of the four categories of patent eligible subject matter because it is directed towards transitory signals per se. Neither the claim nor the specification precludes the storage from being transitory signals per se. Applicant may amend claim 19, line 1 to recite “non-transitory computer readable storage” to overcome this rejection.
Claims 1-3, 5, 10, and 14-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Claim 1 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitations: 
(1) dividing a portion of input data into a sequence of smaller input elements, each element in the input elements comprising a word or a gap between words;
(2) identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed; 
(3) for each respective one of said points: - in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, and 
(4) an associated probability score of the candidate element, the probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and 
(5) - in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, the selection being based on the associated probability scores, and 
(6) generating a respective set of one or more extended paths from each respective one of the selected set of paths, each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and 
(7) an associated probability score for the combination, this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective path; and 
(8) performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and 
	Limitations 1-3, 5-6, and 8 are mental processes (see bolded terms). Limitations 4 and 7 are mathematical processes but for the first neural network (see bolded terms). Accordingly, the claim recites an abstract idea. 
The claim recites the additional element: by a first neural network. This additional element is not integrated into a practical application because it is not an improvement to a computer or other technology. It is merely the use of a known technology to achieve the abstract idea of calculating a probability.
The claim recites a second additional element: outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. This additional element is not integrated into a practical application because it is merely outputting results, which is insignificant extra-solution activity under Step 2A Prong Two. See MPEP 2106.05(g). Adding insignificant extra-solution activity is not sufficient to integrate the additional elements into a practical application.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of a neural network is not an improvement to the functioning of a computer or to any other technology under MPEP 2106.05(a); and it is a field of use of using a neural network to perform a calculation under MPEP 2106.05(h). 
The additional element of outputting result(s) is well-understood, routine, and conventional under MPEP 2106.05(d). MPEP 2106.05(d)(II) states that the courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity, including (i) receiving or transmitting data over a network. Outputting results is similar enough to receiving or transmitting data such that it is considered well-understood, routine, and conventional activity under Step 2B. The claim is not patent eligible.
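For orientation only, the search recited in limitations (3) through (8) for a single point can be sketched as follows. This is a minimal illustration under stated assumptions and is not taken from the application: `search_point`, `toy_score`, `vocab`, `steps`, and `keep` are hypothetical names, and `toy_score` is a lookup table standing in for the claimed first neural network.

```python
from typing import Callable, List, Tuple

# A path is (candidate elements, probability score).
Path = Tuple[Tuple[str, ...], float]

def toy_score(context: List[str], elems: Tuple[str, ...]) -> float:
    # Hypothetical stand-in for the first neural network; ignores context.
    table = {("the",): 0.6, ("cat",): 0.3, ("the", "cat"): 0.9}
    return table.get(elems, 0.1)

def search_point(
    context: List[str],
    vocab: List[str],
    score: Callable[[List[str], Tuple[str, ...]], float],
    steps: int = 2,
    keep: int = 1,
) -> Path:
    # First search step: one path per single candidate element.
    frontier: List[Path] = [((w,), score(context, (w,))) for w in vocab]
    all_paths: List[Path] = list(frontier)
    for _ in range(steps - 1):
        # Select the highest-scoring paths to extend.
        selected = sorted(frontier, key=lambda p: p[1], reverse=True)[:keep]
        # Extend each selected path by one candidate element; the new score
        # depends on the context and on the score of the path being extended.
        frontier = [
            (elems + (w,), prev * score(context, elems + (w,)))
            for elems, prev in selected
            for w in vocab
        ]
        all_paths.extend(frontier)
    # Compare paths from different search steps and output the best one.
    return max(all_paths, key=lambda p: p[1])

best = search_point(["saw", "run"], ["the", "cat"], toy_score)
```

In this toy run the single-element path from the first search step outscores every extended path, which shows why the comparison spans paths from different search steps.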

Claim 2 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation: 
wherein in each of the successive search steps for each point, the set of one or more paths to extend is selected from the immediately preceding search step. 
This limitation is a mental process of selecting. Accordingly, the claim recites an abstract idea. 
This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 3 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitations:
(1) following each of one, some or all of said search steps for each point, pruning away lower scoring ones of the paths based on the probability scores, thus leaving only one or some of the paths remaining; 
(2) wherein for each of said plurality of points, in each of the successive search steps, said set of paths to be extended are the paths remaining after any pruning.
	Limitation 1 is a mental process of deciding to prune lower scoring paths, interpreted as deciding not to extend lower scoring paths. Limitation 2 is a mental process of deciding to extend higher scoring paths. Accordingly, the claim recites an abstract idea. This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
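As an illustration of the pruning recited in claim 3, the following minimal sketch keeps only the highest-scoring paths at each point, so that only the surviving paths are available for extension in the next search step. The function name `prune_per_point`, the `keep` parameter, and the dictionary keyed by point index are assumptions made for this example and do not appear in the application.

```python
from typing import Dict, List, Tuple

# A path is (candidate elements, probability score).
Path = Tuple[Tuple[str, ...], float]

def prune_per_point(
    paths_by_point: Dict[int, List[Path]], keep: int
) -> Dict[int, List[Path]]:
    # For every point, discard all but the `keep` highest-scoring paths.
    return {
        point: sorted(paths, key=lambda p: p[1], reverse=True)[:keep]
        for point, paths in paths_by_point.items()
    }

remaining = prune_per_point(
    {0: [(("a",), 0.2), (("b",), 0.7), (("c",), 0.5)], 3: [(("x",), 0.9)]},
    keep=2,
)
```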

Claim 5 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation:
wherein each successive search step does not proceed for any of the points until the immediately preceding search step has been performed for all of the points.
This limitation is a mental process merely of deciding not to proceed until the conditions are met. Accordingly, the claim recites an abstract idea. This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 10 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation:
wherein for each of said points, the probability score generated… for each path in the first search step is also a function of an initial classifier for the respective position, the classifier representing a probability that the respective point has a missing or erroneous element.
This limitation is a mathematical process, but for the recitation of the first neural network, of computing a probability score as a function of an initial classifier representing a probability. Accordingly, the claim recites an abstract idea. 
This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 14 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitations:
for each of said points: prior to the first search step, including an end-of-sequence element in the input sequence at the end of the sequence to represent the end of the portion of input data, and/or 
including a start-of-sequence element in the input sequence at the start of the sequence to represent the start of the portion of input data; 
wherein the input elements… include the end-of-sequence element and/or the start-of-sequence element.
	These limitations are mental processes of deciding to include an end-of-sequence element and a start-of-sequence element in the sequences and input elements. Accordingly, the claim recites an abstract idea. 
The claim recites the additional elements: the first, second, third and/or fourth neural network. These additional elements are not integrated into a practical application because they are not an improvement to a computer or other technology. They are merely the use of a known technology to achieve the abstract idea of calculating a probability.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of at least one neural network is not an improvement to the functioning of a computer or to any other technology under MPEP 2106.05(a); and it is a field of use of using neural networks to perform a calculation under MPEP 2106.05(h). The additional element does not amount to significantly more than the judicial exception because there is no nexus between the neural network and the judicial exceptions. The claim is not patent eligible.


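Claim 15 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitations:
wherein in one, some or all of the search steps for each of some or all of the points, the generating of the paths comprises generating a respective set of multiple paths for each respective one of at least some said points, 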
the multiple paths for the respective point each comprising a different candidate element and associated probability score…
These limitations are mental processes of deciding to generate multiple paths per point and deciding the multiple paths should comprise a different candidate element and associated probability score. Accordingly, the claim recites an abstract idea.
The claim recites the additional element: based on the first neural network. This additional element is not integrated into a practical application because it is not an improvement to a computer or other technology. It is merely the use of a known technology to achieve the abstract idea of calculating a probability.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of a neural network is not an improvement to the functioning of a computer or to any other technology under MPEP 2106.05(a); and it is a field of use of using a neural network to perform a calculation under MPEP 2106.05(h). The claim is not patent eligible.

Claim 16 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation: 
wherein amongst the multiple paths for each respective point having multiple paths in the current search step, the candidate elements for one of the paths includes a rejoin-sequence element 
This limitation is a mental process of deciding to stop the search for the respective point rejoining the candidate element or elements from the preceding search steps to the input sequence. Accordingly, the claim recites an abstract idea. 
This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 17 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation:
wherein said points are gaps between the input elements where missing data is potentially to be imputed.
This limitation is a mental process of marking the gaps between input elements as said points. Accordingly, the claim recites an abstract idea. 
This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 18 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation:
wherein the portion of input data comprises a portion of text, and the input elements from the received text are words or characters.

This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 19, when amended to read “non-transitory computer readable storage”, recites a product, one of the four statutory categories of patentable subject matter. The claim recites the following limitations:  
(1) dividing a portion of input data into a sequence of smaller input elements, each element in the input elements comprising a word or a gap between words;
(2) identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed; 
(3) for each respective one of said points: - in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, and 
(4) an associated probability score of the candidate element, the probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and 
(5) - in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, the selection being based on the associated probability scores, and 
(6) generating a respective set of one or more extended paths from each respective one of the selected set of paths, each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and 
(7) an associated probability score for the combination, this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective path; and 
(8) performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and 
	Limitations 1-3, 5-6, and 8 are mental processes (see bolded terms). Limitations 4 and 7 are mathematical processes but for the first neural network (see bolded terms). Accordingly, the claim recites an abstract idea. 
The claim recites the additional elements: a computer program, code, computer readable storage, a computer apparatus. These elements are not integrated into a practical application because they are mere instructions to apply an exception using a computer under MPEP 2106.05(f).
The claim recites the additional element of a first neural network. This additional element is not integrated into a practical application because it is not an improvement to a computer or other technology. It is merely the use of a known technology to achieve the abstract idea of calculating a probability.
The claim recites another additional element: outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. This additional element is not integrated into a practical application because it is merely outputting results, which is insignificant extra-solution activity under Step 2A Prong Two. See MPEP 2106.05(g). Adding insignificant extra-solution activity is not sufficient to integrate the additional elements into a practical application.

The additional element of outputting result(s) is well-understood, routine, and conventional under MPEP 2106.05(d). MPEP 2106.05(d)(II) states that the courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity, including (i) receiving or transmitting data over a network. Outputting results is similar enough to receiving or transmitting data such that it is considered well-understood, routine, and conventional activity under Step 2B. The claim is not patent eligible.

Claim 20 recites an apparatus, one of the four statutory categories of patentable subject matter. The claim recites the following limitations:  
(1) dividing a portion of input data into a sequence of smaller input elements, each element in the input elements comprising a word or gap between words;
(2) identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed; 
(3) for each respective one of said points: - in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, and 
(4) an associated probability score of the candidate element, the probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and 
(5) - in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, the selection being based on the associated probability scores, and 
(6) generating a respective set of one or more extended paths from each respective one of the selected set of paths, each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and 
(7) an associated probability score for the combination, this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective path; and 
(8) performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and 
Limitations 1-3, 5-6, and 8 are mental processes (see bolded terms). Limitations 4 and 7 are mathematical processes but for the first neural network (see bolded terms). Accordingly, the claim recites an abstract idea. 
The claim recites the following additional elements:
A computer apparatus 
One or more processing units
by a first neural network. 
A computer apparatus and processing unit(s) are not integrated into a practical application because they are mere instructions to apply an exception using a computer under MPEP 2106.05(f). A neural network is not integrated into a practical application because it is not an improvement to a computer or other technology. It is merely the use of a known technology to achieve the abstract idea of calculating a probability.

The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. 
The additional element of a computer apparatus is mere instructions to apply an exception using a computer under MPEP 2106.05(f). 
The additional element of a neural network is not an improvement to the functioning of a computer or to any other technology under MPEP 2106.05(a); and it is a field of use of using a neural network to perform a calculation under MPEP 2106.05(h). 
The additional element of outputting result(s) is well-understood, routine, and conventional under MPEP 2106.05(d). MPEP 2106.05(d)(II) states that the courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity, including (i) receiving or transmitting data over a network. Outputting results is similar enough to receiving or transmitting data such that it is considered well-understood, routine, and conventional activity under Step 2B. The claim is not patent eligible.

Claim Rejections - 35 USC § 103
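In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.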
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-5, 10, and 12-20 are rejected under 35 U.S.C. 103 as being unpatentable over Dahlmeier et al. (WO 2013191662 A1), in view of Aoyama et al. (US 20100211377 A1) and Lee et al. (U.S. 2018/0336183 filed May 22, 2017).
Regarding Claim 1, Dahlmeier teaches: A computer-implemented method comprising automatically: 
each element in the input elements comprising a word or a gap between words; (The broadest reasonable interpretation of an element amounts to a group of one or more words, including a sentence. An input element is interpreted as an input sentence. Dahlmeier Abstract teaches: “According to one aspect, there is provided a method for correcting grammatical errors of an input sentence”)
identifying… at which missing or erroneous data is potentially to be imputed; (Abstract: “there is provided a method for correcting grammatical errors of an input sentence”)
for each respective one of said…: 
- in a first search step, generating a respective set of one or more paths for the respective…, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective… (Dahlmeier ¶ 28, where “path comprises a candidate element” is interpreted as a new hypothesis) and an associated probability score of the candidate element, (Dahlmeier ¶ 39: “the decoder combines the features associated with each hypothesis into an overall hypothesis score”, where according to Dahlmeier ¶15, “The term "score" may mean a value, between 0.0 and 1.0, that measures a probability of the grammatical correctness of a hypothesis.”) the probability score being generated by a first neural network (Dahlmeier ¶ 34: “Supervised classifiers can be used for particular grammatical errors by letting the classifier predict the correct word for a particular sentence context”  The decoder in ¶ 39-40 is interpreted as being a neural network which accepts a weight vector and a feature map comprising the results of the classifier and scores them. ¶ [0040]: “The hypotheses are evaluated by the expert models and scored using the decoder model.”)
- in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, the selection being based on the associated probability scores, (Dahlmeier ¶ 12: “the input sentence may be a sentence generated from the hypothesis of a previous iteration, i.e. the original sentence having undergone one or more successive iterations of grammatical error correction.” In the claim limitation, preceding path is interpreted as a previous hypothesis, and preceding search step is interpreted as a proposer step in a previous iteration.)
performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. (Dahlmeier ¶ 24: “In step 106, each of the new hypotheses is analysed to compute a score for each of the plurality of new hypotheses. In step 108, the scores of the plurality of new hypotheses are compared. In step 110, an output sentence from the new hypotheses with the highest score is generated.”)

Dahlmeier teaches dividing a sentence into words and punctuation. However, Dahlmeier does not explicitly teach dividing text into sentences. Thus, Dahlmeier does not explicitly teach:
dividing a portion of input data into a sequence of smaller input elements
a plurality of points in the sequence
as a function of some or all of the input elements before and/or after the respective… in the sequence, and
dividing a portion of input data into a sequence of smaller input elements (Aoyama teaches step S104 in Fig. 4 and ¶ [0049]: divide text in block into sentences. A portion of input data is taught by the text, a sequence of smaller input elements is taught by a sequence of sentences.)
a plurality of points in the sequence (The BRI of a “point” as it relates to NLP is a portion of the text, like a sentence. A plurality of points in the sequence is interpreted as a plurality of sentences in Aoyama’s text as in step S104 in Fig. 4 and ¶ [0049].)
	Aoyama is in the same field of endeavor as the claimed invention, namely natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have divided input text into sentences, as taught by Aoyama, with a motivation to correct grammatical errors of the sentences according to Dahlmeier’s methods.
The Dahlmeier/Aoyama combination does not explicitly teach: as a function of some or all of the input elements before and/or after the respective… in the sequence, and
	But Lee teaches this limitation in ¶ [0104]: “Rules specifying sentences as also [having] sequential dependency, where a following sentence depends on the preceding sentence, may also be utilized.” The broadest reasonable interpretation of the limitation is: as a function of the input element (i.e., the preceding sentence) before the respective point in the sequence (i.e., the sentence being embedded).
	Lee is in the same field of endeavor as the claimed invention, namely natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Lee’s system into the combination of Dahlmeier and Aoyama’s system by generating the combination’s particular sentence embedding/vector as a function of the preceding sentence, with a motivation to compare semantic similarity independent of string similarity (Lee ¶ [0002]: “Word embedding is used by NLP systems as one mechanism for reasoning over natural language sentences. Without word embedding, an NLP system operates on strings of characters, similar groups of words can be considered differently by the NLP system. For example, “The President of the United States visited New York City last week” and “Mr. Trump came to NYC on May 4” have high semantic similarity, but low string similarity.”)

Regarding claim 2, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 1, 
Further, Dahlmeier teaches: wherein in each of the successive search steps for each point, the set of one or more paths to extend is selected from the immediately preceding search step. (Examiner interprets Dahlmeier ¶ 12 to include the immediate previous iteration: “the input sentence may be a sentence generated from the hypothesis of a previous iteration, i.e. the original sentence having undergone one or more successive iterations of grammatical error correction.”)

Regarding claim 3, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 2, 
Further, Dahlmeier teaches (where Aoyama teaches “point(s)”): comprising: following each of one, some or all of said search steps for each point, pruning away lower scoring ones of the paths based on the probability scores, thus leaving only one or some of the paths remaining; wherein for each of said plurality of points, in each of the successive search steps, said set of paths to be extended are the paths remaining after any pruning. (Dahlmeier ¶ 41, where paths are interpreted as hypotheses: “the search space may be pruned by only accepting the most promising hypotheses to the pool of hypotheses for future consideration. Thus, the plurality of new hypotheses, when performing grammatical error correction of an input sentence, may be based on hypotheses with the highest score range from a previous iteration. If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)

Regarding claim 4, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 3, 
Further, Dahlmeier teaches (where Aoyama teaches “point(s)”): wherein the method comprises: following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and (Dahlmeier ¶ 41: “If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)
applying another neural network to each entry to generate a new probability score for each candidate result in the candidate pool; (The broadest reasonable interpretation of “another neural network” includes the same decoder performing this limitation in Dahlmeier ¶ 39.)
wherein said comparing comprises comparing the new probability scores in the candidate pool, and said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores; (Dahlmeier ¶ 39-41 teaches that the decoder scores each hypothesis and selects the best one, which involves comparing scores)
wherein said skimming comprises, for each current one of the search steps, after the current search step is completed across all the points in the sequence, skimming off the element or combination of elements from each of only a selected subset of the paths generated in the current search step into the candidate pool as candidate results, wherein the subset is selected as those paths having greater than a threshold probability score, or those in a highest portion according to the probability score; and 

(The last two limitations are taught by Dahlmeier ¶ 41: “Therefore, the search space may be pruned by only accepting the most promising hypotheses to the pool of hypotheses for future consideration. Thus, the plurality of new hypotheses, when performing grammatical error correction of an input sentence, may be based on hypotheses with the highest score range from a previous iteration. If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)

Regarding claim 5, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 1, 
Dahlmeier teaches (where Aoyama teaches “point(s)”): wherein each successive search step does not proceed for any of the points until the immediately preceding search step has been performed for all of the points. (The method of Dahlmeier ¶ 12 broadly reads on this claim limitation. Dahlmeier ¶ 12: “the input sentence may be a sentence generated from the hypothesis of a previous iteration, i.e. the original sentence having undergone one or more successive iterations of grammatical error correction.”)

Regarding claim 10, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 1, 
Dahlmeier teaches (where Aoyama teaches “point(s)”): wherein for each of said points, the probability score generated by the first neural network for each path in the first search step is also a function of an initial classifier for the respective position, the classifier representing a probability that the respective point has a missing or erroneous element. (Dahlmeier ¶ 39 teaches that the score generated by the first neural network for each hypothesis (path) is a function of a classifier. Dahlmeier ¶ 35 lists classifiers for typical grammatical errors.)

Regarding claim 12, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 1, 
Dahlmeier teaches (where Aoyama teaches “point(s)”): wherein the method comprises: following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and (Dahlmeier ¶ 41: “If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)
applying another neural network to each entry to generate a new probability score for each candidate result in the candidate pool; (The broadest reasonable interpretation of “another neural network” includes the same decoder performing this limitation in Dahlmeier ¶ 39.)
wherein said comparing comprises comparing the new probability scores in the candidate pool, and said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores. (Dahlmeier ¶ 39-41 teaches that the decoder scores each hypothesis and selects the best one, which involves comparing scores)

Regarding claim 13, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 12, 
Dahlmeier teaches: wherein said skimming comprises, for each current one of the search steps, after the current search step is completed across all the points in the sequence, skimming off the… (Dahlmeier ¶ 41: “Therefore, the search space may be pruned by only accepting the most promising hypotheses to the pool of hypotheses for future consideration. Thus, the plurality of new hypotheses, when performing grammatical error correction of an input sentence, may be based on hypotheses with the highest score range from a previous iteration. If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)

Regarding claim 14, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 1 
Dahlmeier teaches: of which at least one neural network is a function (The decoder of Dahlmeier ¶ 39)
However, the Dahlmeier/Aoyama combination does not explicitly teach: comprising, for each of said points: prior to the first search step, including an end-of-sequence element in the input sequence at the end of the sequence to represent the end of the portion of input data, and/or including a start-of-sequence element in the input sequence at the start of the sequence to represent the start of the portion of input data; wherein the input elements of which… is a function include the end-of-sequence element and/or the start-of-sequence element.
But Aoyama teaches: comprising, for each of said points: (Each of said points is interpreted as each sentence in the text. The broadest reasonable interpretation of this limitation is determining whether the sentence should be preceded by a start-of-sequence element if it is the first sentence or succeeded by an end-of-sequence element if it is the last sentence.)
(Aoyama ¶ [0048]: “Next, the content analysis section 10 divides the entire text to be processed into blocks (step 102). Here, the block may be a text string consisting of multiple sentences, as well as a paragraph shown in FIG. 2 and FIG. 3. The division into blocks may be made by delimiting each block between </p> and <p> in the case of an HTML document, for example.” The HTML delimiter <p> is a start-of-sequence element and the HTML delimiter </p> is an end-of-sequence element. )
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have delimited the start and end of Dahlmeier’s sequence with HTML delimiters <p> and </p> as taught by Aoyama, with a motivation to divide the content into paragraphs and extract region/culture-specific data. (Aoyama Abstract: “In a verification support apparatus, a content analysis section analyzes a content to divide the content into paragraphs, extract region/culture-specific data, and store the analysis results in an analysis result storage section.”)
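For illustration only, the block division quoted from Aoyama ¶ [0048], together with the interpretation of <p> and </p> as start-of-sequence and end-of-sequence elements, can be sketched as follows. This is a minimal sketch under stated assumptions and is not Aoyama's implementation: the function names `split_blocks` and `add_markers` are hypothetical, the paragraphs are assumed to be well-formed and non-nested, and whitespace tokenization is an assumption made for the example.

```python
import re
from typing import List


def split_blocks(html: str) -> List[str]:
    """Return the text of each <p>...</p> block (assumes well-formed, non-nested paragraphs)."""
    return [b.strip() for b in re.findall(r"<p>(.*?)</p>", html, flags=re.DOTALL)]


def add_markers(block: str) -> List[str]:
    """Tokenize a block on whitespace and delimit it with the HTML paragraph tags,
    which are treated here as start-of-sequence and end-of-sequence elements."""
    return ["<p>"] + block.split() + ["</p>"]
```

Under these assumptions, each block returned by `split_blocks` is one portion of input data, and `add_markers` yields the sequence with its start and end explicitly delimited.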

Regarding claim 15, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 1, 
Dahlmeier teaches (where Aoyama teaches “point(s)”): wherein in one, some or all of the search steps for each of some or all of the points, the generating of the paths comprises generating a respective set of multiple paths for each respective one of at least some said points, (generating multiple hypotheses is interpreted as generating a set of multiple paths. Dahlmeier ¶ 27: “Proposers take a hypothesis and output a set of new hypotheses, where each new hypothesis is the result of making an incremental change to the current hypothesis. Accordingly, proposers generate a plurality of new hypothesis from a current hypothesis.”)
the multiple paths for the respective point each comprising a different candidate element and associated probability score based on the first neural network. (Dahlmeier ¶ 27: “Experts subsequently score these hypotheses on particular aspects of grammaticality. Accordingly, experts analyse each of the new hypothesis to compute a score for each of the plurality of new hypotheses.”)
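For illustration only, the proposer and expert roles quoted from Dahlmeier ¶ 27 can be sketched as follows. This is a minimal sketch and not Dahlmeier's implementation: the preposition list, the function names `propose` and `score_all`, and the shape of the `expert` callable are all hypothetical and were chosen only to show one incremental change per new hypothesis followed by scoring.

```python
from typing import Callable, List, Tuple

# Toy confusion set, assumed for illustration only.
PREPOSITIONS = ["at", "in", "on"]


def propose(hypothesis: List[str]) -> List[List[str]]:
    """Return every new hypothesis that differs from the input by one preposition swap."""
    out: List[List[str]] = []
    for i, tok in enumerate(hypothesis):
        if tok in PREPOSITIONS:
            for alt in PREPOSITIONS:
                if alt != tok:
                    out.append(hypothesis[:i] + [alt] + hypothesis[i + 1:])
    return out


def score_all(
    hypotheses: List[List[str]], expert: Callable[[List[str]], float]
) -> List[Tuple[List[str], float]]:
    """Pair each new hypothesis with the score assigned by a (hypothetical) expert."""
    return [(h, expert(h)) for h in hypotheses]
```

Under these assumptions, each list returned by `propose` corresponds to one path with a different candidate element, and `score_all` attaches the associated score to each.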

Regarding claim 16, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 15, 
Dahlmeier teaches (where Aoyama teaches “point(s)”): wherein amongst the multiple paths for each respective point having multiple paths in the current search step, the candidate elements for one of the paths includes a rejoin-sequence element representing stopping the search for the respective point and rejoining the candidate element or elements from the preceding search steps to the input sequence. (Dahlmeier ¶ 28 teaches making a single correction of a phrase in the current hypothesis. The broadest reasonable interpretation of a phrase in light of the specification includes a sentence. This is interpreted as a rejoin-sequence element as recited by claim 16.)


Regarding claim 17, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 1, wherein said points are gaps between the input elements where missing data is potentially to be imputed. (Input elements are sentences. The broadest reasonable interpretation of a gap between the input elements where missing data is potentially to be imputed is a particular sentence between two surrounding sentences which is missing grammatically correct words and punctuation. Dahlmeier’s Abstract teaches “there is provided a method for correcting grammatical errors of an input sentence”. Aoyama teaches that sentences are in a paragraph.)

Regarding claim 18, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 1,
Aoyama teaches: wherein the portion of input data comprises a portion of text, (Aoyama teaches step S104 in Fig. 4 and ¶ [0049]: divide text in block into sentences.)
Dahlmeier teaches: and the input elements from the received text are words or characters. (Dahlmeier Abstract teaches the input elements are sentences, which, under the broadest reasonable interpretation in light of the specification, are made of words and characters.) 

Regarding claim 19, Dahlmeier teaches: A computer program comprising code embodied on computer readable storage and configured so as when run on a computer apparatus to perform operations of automatically: (Dahlmeier ¶ 7: “computer program code”)
each element in the input elements comprising a word or a gap between words; (The broadest reasonable interpretation of an element amounts to a group of one or more words, including a sentence. An input element is interpreted as an input sentence. Dahlmeier Abstract teaches: “According to one aspect, there is provided a method for correcting grammatical errors of an input sentence”)
identifying… at which missing or erroneous data is potentially to be imputed; (Identifying the sentences with missing or erroneous data (Abstract))
for each respective one of said…: 
- in a first search step, generating a respective set of one or more paths for the respective…, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective… (Dahlmeier ¶ 28, where “path comprises a candidate element” is interpreted as a new hypothesis) and an associated probability score of the candidate element, (Dahlmeier ¶ 39: “the decoder combines the features associated with each hypothesis into an overall hypothesis score”, where according to Dahlmeier ¶15, “The term "score" may mean a value, between 0.0 and 1.0, that measures a probability of the grammatical correctness of a hypothesis.”) the probability score being generated by a first neural network (Dahlmeier ¶ 34: “Supervised classifiers can be used for particular grammatical errors by letting the classifier predict the correct word for a particular sentence context”  The decoder in ¶ 39 is interpreted as being a neural network which accepts a weight vector and a feature map comprising the results of the classifier and scores them. ¶ [0040]: “The hypotheses are evaluated by the expert models and scored using the decoder model.”)
- in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, the selection being based on the associated probability scores, and generating a respective set of one or more extended paths from each respective one of the selected set of paths, each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and an associated probability score for the combination, this probability score being generated by the first neural network as a function of some or all of the input elements before and/or after the respective… in the sequence, and as a function of the probability score for the respective path; and (Dahlmeier ¶ 12: “the input sentence may be a sentence generated from the hypothesis of a previous iteration, i.e. the original sentence having undergone one or more successive iterations of grammatical error correction.” In the claim limitation, preceding path is interpreted as a previous hypothesis, and preceding search step is interpreted as a proposer step in a previous iteration.)
performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. (Dahlmeier ¶ 24: “In step 106, each of the new hypotheses is analysed to compute a score for each of the plurality of new hypotheses. In step 108, the scores of the plurality of new hypotheses are compared. In step 110, an output sentence from the new hypotheses with the highest score is generated.”)

Dahlmeier teaches dividing a sentence into words and punctuation. However, Dahlmeier does not explicitly teach dividing text into sentences. Thus, Dahlmeier does not explicitly teach: dividing a portion of input data into a sequence of smaller input elements
a plurality of points in the sequence
as a function of some or all of the input elements before and/or after the respective… in the sequence, and

	However, Aoyama teaches: dividing a portion of input data into a sequence of smaller input elements (Aoyama teaches step S104 in Fig. 4 and ¶ [0049]: divide text in block into sentences. A portion of input data is taught by the text, a sequence of smaller input elements is taught by a sequence of sentences.)
a plurality of points in the sequence (The BRI of a “point” as it relates to NLP is a portion of the text, like a sentence. A plurality of points in the sequence is interpreted as a plurality of sentences in Aoyama’s text as in step S104 in Fig. 4 and ¶ [0049].)
	Aoyama is in the same field of endeavor as the claimed invention, namely natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have divided input text into sentences, as taught by Aoyama, with a motivation to correct grammatical errors of the sentences according to Dahlmeier’s methods.
The Dahlmeier/Aoyama combination does not explicitly teach: as a function of some or all of the input elements before and/or after the respective… in the sequence, and
	But Lee teaches this limitation in ¶ [0104]: “Rules specifying sentences as also [having] sequential dependency, where a following sentence depends on the preceding sentence, may also be utilized.” The broadest reasonable interpretation of the limitation is: as a function of the input element (i.e., the preceding sentence) before the respective point in the sequence (i.e., the sentence being embedded).
	Lee is in the same field of endeavor as the claimed invention, namely natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Lee’s system into the combination of Dahlmeier and Aoyama’s system by generating the combination’s particular sentence embedding/vector as a function of the preceding sentence, with a motivation to compare semantic similarity independent of string similarity (Lee ¶ [0002]: “Word embedding is used by NLP systems as one mechanism for reasoning over natural language sentences. Without word embedding, an NLP system operates on strings of characters, similar groups of words can be considered differently by the NLP system. For example, “The President of the United States visited New York City last week” and “Mr. Trump came to NYC on May 4” have high semantic similarity, but low string similarity.”)

Regarding claim 20, Dahlmeier teaches: Computer apparatus comprising one or more processing units, the processing units programmed to perform operations of automatically: (Dahlmeier ¶ 7: “computer system”, “at least one processor”)
each element in the input elements comprising a word or a gap between words; (The broadest reasonable interpretation of an element amounts to a group of one or more words, including a sentence. An input element is interpreted as an input sentence. Dahlmeier Abstract teaches: “According to one aspect, there is provided a method for correcting grammatical errors of an input sentence”)
identifying… at which missing or erroneous data is potentially to be imputed; (Identifying the sentences with missing or erroneous data (Abstract))
for each respective one of said…: 
- in a first search step, generating a respective set of one or more paths for the respective…, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective… (Dahlmeier ¶ 28, where “path comprises a candidate element” is interpreted as a new hypothesis) and an associated probability score of the candidate element, (Dahlmeier ¶ 39: “the decoder combines the features associated with each hypothesis into an overall hypothesis score”, where according to Dahlmeier ¶15, “The term "score" may mean a value, between 0.0 and 1.0, that measures a probability of the grammatical correctness of a hypothesis.”) the probability score being generated by a first neural network (Dahlmeier ¶ 34: “Supervised classifiers can be used for particular grammatical errors by letting the classifier predict the correct word for a particular sentence context” The decoder in ¶ 39 is interpreted as being a neural network which accepts a weight vector and a feature map comprising the results of the classifier and scores them. ¶ [0040]: “The hypotheses are evaluated by the expert models and scored using the decoder model.”)
- in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, the selection being based on the associated probability scores, and generating a respective set of one or more extended paths from each respective one of the selected set of paths, each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and an associated probability score for the combination, this probability score being generated by the first neural network as a function of some or all of the input elements before and/or after the respective… in the sequence, and as a function of the probability score for the respective path; and (Dahlmeier ¶ 12: “the input sentence may be a sentence generated from the hypothesis of a previous iteration, i.e. the original sentence having undergone one or more successive iterations of grammatical error correction.” In the claim limitation, preceding path is interpreted as a previous hypothesis, and preceding search step is interpreted as a proposer step in a previous iteration.)
performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. (Dahlmeier ¶ 24: “In step 106, each of the new hypotheses is analysed to compute a score for each of the plurality of new hypotheses. In step 108, the scores of the plurality of new hypotheses are compared. In step 110, an output sentence from the new hypotheses with the highest score is generated.”)

Dahlmeier teaches dividing a sentence into words and punctuation. However, Dahlmeier does not explicitly teach dividing text into sentences. Thus, Dahlmeier does not explicitly teach: dividing a portion of input data into a sequence of smaller input elements
a plurality of points in the sequence
as a function of some or all of the input elements before and/or after the respective… in the sequence, and

	However, Aoyama teaches: dividing a portion of input data into a sequence of smaller input elements (Aoyama teaches step S104 in Fig. 4 and ¶ [0049]: divide text in block into sentences. A portion of input data is taught by the text, a sequence of smaller input elements is taught by a sequence of sentences.)
a plurality of points in the sequence (The BRI of a “point” as it relates to NLP is a portion of the text, like a sentence. A plurality of points in the sequence is interpreted as a plurality of sentences in Aoyama’s text as in step S104 in Fig. 4 and ¶ [0049].)
	Aoyama is in the same field of endeavor as the claimed invention, namely natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have divided input text into sentences, as taught by Aoyama, with a motivation to correct grammatical errors of the sentences according to Dahlmeier’s methods.
The Dahlmeier/Aoyama combination does not explicitly teach: as a function of some or all of the input elements before and/or after the respective… in the sequence, and
	But Lee teaches this limitation in ¶ [0104]: “Rules specifying sentences as also [having] sequential dependency, where a following sentence depends on the preceding sentence, may also be utilized.” The broadest reasonable interpretation of the limitation is: as a function of the input element (i.e., the preceding sentence) before the respective point in the sequence (i.e., the sentence being embedded).
	Lee is in the same field of endeavor as the claimed invention, namely natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Lee’s system into the combination of Dahlmeier and Aoyama’s system by generating the combination’s particular sentence embedding/vector as a function of the preceding sentence, with a motivation to compare semantic similarity independent of string similarity (Lee ¶ [0002]: “Word embedding is used by NLP systems as one mechanism for reasoning over natural language sentences. Without word embedding, an NLP system operates on strings of characters, similar groups of words can be considered differently by the NLP system. For example, “The President of the United States visited New York City last week” and “Mr. Trump came to NYC on May 4” have high semantic similarity, but low string similarity.”)

Claims 6-9 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Dahlmeier, in view of Aoyama and Lee, and further in view of Kumar et al. (U.S. 9,911,413 published March 6, 2018).

Regarding claim 6, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 1 comprising, for each respective one of said points: (each point is interpreted as a sentence)
Dahlmeier teaches: prior to the first search step, generating a respective embedding for the respective point, the embedding being a vector generated… (The embedding is being interpreted as a classifier encoder as taught by Dahlmeier ¶ 34: “Supervised classifiers can be used for particular grammatical errors by letting the classifier predict the correct word for a particular sentence context. The sentence context is encoded in a set of features which forms the input X; the possible correction choices form the classes Y.”)
wherein in the first search step for each of said points, the candidate elements of the respective one or more paths are generated based on a decoder state that is a function of the respective embedding. (The decoder receives the feature map $f_e$ from classifiers - Dahlmeier ¶ 39, equation 4)
Lee teaches: as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the position of the respective point in the sequence (Lee ¶ [0104]: “Rules specifying sentences as also [having] sequential dependency, where a following sentence depends on the preceding sentence, may also be utilized.” The broadest reasonable interpretation of the limitation is: as a function of the input element (i.e., the preceding sentence) before the respective point in the sequence (i.e., the sentence being embedded), and as a function of the position of the respective point (i.e., the position of the embedded sentence is preceded by another sentence).)
However, the combination of Dahlmeier, Aoyama, and Lee does not explicitly teach: [generated] by a second neural network 
But Kumar teaches: [generated] by a second neural network (Kumar col. 3, lns. 44-56 and Fig. 2 teaches a natural language classifier 140 generated by an artificial neural network which receives a linguistic representation as input values and outputs a probability distribution.)
Kumar is in the same field of endeavor as Dahlmeier, namely, training artificial neural network classifiers for natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Kumar’s system into the combination of Dahlmeier, Aoyama, and Lee’s system by implementing the embedding classifier with a neural network, with a motivation to train the classifier by reinforcement learning (Kumar col. 3, ln. 44).

Regarding claim 7, the combination of Dahlmeier, Aoyama, Kumar, and Lee teaches: The method of claim 6, 
Further, Dahlmeier teaches: wherein the method further comprises, for each of said points: between each successive subsequent search step and the preceding search step, at least for the selected set of preceding paths, updating the decoder state as a function of the candidate elements in the respective preceding path; (The decoder state is interpreted as variable ‘s’ in Dahlmeier Eq. 4. The decoder state is updated between iterations in the Beam-Search Decoding Algorithm Pseudo Code at the line $s_{best} \leftarrow s_h$ on page 14.)
wherein in each of the subsequent search steps, for each of the extended paths, the additional element of the respective path is generated based on the updated decoder state for the respective path. (Dahlmeier ¶ 28: “The proposer modules (also referred to as "proposers") generate new hypotheses from the current hypothesis.”)

Regarding claim 8, the combination of Dahlmeier, Aoyama, Kumar, and Lee teaches: The method of claim 6, further comprising, for each respective one of said points, the probability score generated by the first neural network for each path in the first search step also being a function of an initial classifier for the respective position, the classifier representing a probability that the respective point has a missing or erroneous element; and (Dahlmeier ¶ 34 teaches a supervised classifier. In ¶ 36, the classifier scores the original preposition “at” at 0.1 out of 1.0. Under the broadest reasonable interpretation, the classifier scores assigned to the prepositions represent a probability that the respective point (i.e., sentence) has an erroneous element.)
wherein the classifier is generated… as a function of the respective embedding. (the embedding is being interpreted as a classifier encoding data as taught by Dahlmeier ¶ 34: “classifiers”, “encoded”)
Although Dahlmeier teaches that the classifier may be a discriminative supervised classifier (¶ 5, 31) Dahlmeier does not explicitly teach: [wherein the classifier is generated] by a third neural network.
Kumar teaches: [wherein the classifier is generated] by a third neural network. (Kumar col. 3, lns. 44-56 teaches a set of natural language classifiers 141 generated by an artificial neural network which receives a linguistic representation as input values and outputs a probability distribution. The third neural network is interpreted as being another classifier neural network.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Kumar’s system into the combination of Dahlmeier, Aoyama, Kumar, and Lee’s system by implementing the classifier with a neural network, with a motivation to train the classifier by reinforcement learning (Kumar col. 3, ln. 44).

Regarding claim 9, the combination of Dahlmeier, Aoyama, Kumar, and Lee teaches: The method of claim 8, 
Further, Dahlmeier teaches: wherein the method comprises: following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and (Dahlmeier ¶ 41: “If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)
applying a fourth neural network to each entry to generate a new probability score for each candidate result in the candidate pool; (Interpreted as the same decoder from Dahlmeier ¶ 39)
wherein said comparing comprises comparing the new probability scores in the candidate pool, and said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores; and (Dahlmeier ¶ 39-41 teaches that the decoder scores each hypothesis and selects the best one, which involves comparing scores)
wherein some or all of the first, second, third or fourth neural networks are subgraphs of a same wider network, and are trained together. (Only the first neural network is interpreted as a subgraph of a same wider neural network.)

Regarding claim 11, the combination of Dahlmeier, Aoyama, and Lee teaches: The method of claim 10, wherein the classifier
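Lee teaches: as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the position of the respective point in the sequence. (Lee ¶ [0104]: “Rules specifying sentences as also [having] sequential dependency, where a following sentence depends on the preceding sentence, may also be utilized.” The broadest reasonable interpretation of the limitation is: as a function of the input element (i.e., the preceding sentence) before the respective point in the sequence (i.e., the sentence being embedded), and as a function of the position of the respective point (i.e., the position of the embedded sentence is preceded by another sentence).)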
However, the combination of Dahlmeier, Aoyama, and Lee does not explicitly teach: is generated by another neural network
But Kumar teaches: [generated] by another neural network (Kumar col. 3, lns. 44-56 and Fig. 2 teach a natural language classifier 140 generated by an artificial neural network which receives a linguistic representation as input values and outputs a probability distribution.)
Kumar is in the same field of endeavor as Dahlmeier, namely, training artificial neural network classifiers for natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Kumar’s system into the combination of Dahlmeier, Aoyama, and Lee’s system by implementing the embedding classifier with a neural network, with a motivation to train the classifier by reinforcement learning (Kumar col. 3, ln. 44).

Response to Arguments
The objections and rejections mentioned in this section were issued in the office action filed 03/02/2021, and the applicant’s Remarks were filed 06/15/2021.

Objections to the Claims: The objection to Claim 3 has been withdrawn due to the claim amendments. 

Claim Interpretation under 35 U.S.C. 112(f): Claim 20 is no longer being interpreted under 35 U.S.C. 112(f) due to the claim amendments describing the computer apparatus as comprising processing units.

Claim Rejections under 35 U.S.C. 112(b): The 35 U.S.C. 112(b) antecedent basis rejections of Claims 3-4, 8-9, 11-12, 14 and 18 have been withdrawn due to the claim amendments in each claim.

Claim Rejections under 35 U.S.C. 101:
Applicant argues on p. 11: “Applicant respectfully submits that when viewed as a whole, the eligibility of the rejected claims is self-evident” and that “‘Such claims do not need to proceed through the full analysis herein as their eligibility will be self-evident.’” Respectfully, the 2019 PEG analysis must be followed and the analysis must proceed to Step 2A Prong 1 because Examiner has identified judicial exceptions in Claims 1-3, 5, 10, and 14-20. Applicant argues on p. 12: “Applicant submits the present application provides functionality which couldn’t be done before”. This argument is not persuasive because all functionalities in Claims 1-3, 5, 10, and 14-20 are mental processes or mathematical calculations.
A. Analysis under Step 2A Prong 1 (Remarks pp. 12-15)
Under the broadest reasonable interpretation, Claims 1, 19, and 20 recite abstract ideas. Abstract ideas of the type mental process are dividing a portion of input data, identifying a plurality of points, generating a respective set of one or more paths for the respective point, selecting a set of one or more paths, generating a respective set of one or more extended paths, and performing a comparison. Abstract ideas of the type mathematical calculation are generating probability scores. 
Applicant argues that the present Application solves two technical problems faced by imputation techniques. The argument is moot because determining whether the claim solves a technical problem is not part of the inquiry in Step 2A Prong 1, as described in MPEP 2106.04 (II.)(A.)(1.).
B. Analysis under Step 2A Prong 2 (Remarks pp. 15-18)
Applicant argues that Claims 1, 19, and 20 are eligible at least because the claims have practical applications described in specification paragraphs [0009], [0012], and [0014]. Examiner reminds 
Applicant argues that each claim as a whole integrates the alleged abstract idea into a practical application. The additional elements are a first neural network and outputting result(s). The claims are focused on improvements to the abstract idea, not on the neural network or outputting results, which fails Step 2A Prong 2. This office action explains that claim 1 is rejected under 35 U.S.C. 101 because a neural network is not an improvement to the functioning of a computer or to any other technology under MPEP 2106.05(a), and because using a neural network to perform a calculation merely limits the abstract idea to a field of use under MPEP 2106.05(h). Outputting results is mere data gathering, which is an extra-solution activity under MPEP 2106.05(g).
Applicant cites portions from the specification describing how the invention improves processing. These improvements do not contribute to the inquiry in Step 2A Prong 2 because the improvements are absent from the claims.
C. Analysis under Step 2B (Remarks pp. 18-20)
Applicant argues, citing portions of the specification, that the claim 1 limitations amount to significantly more than the judicial exception because the claim offers a significant improvement over the art. Analysis in Step 2B should focus on the text of the claims, not the specification. Even considering the portions from specification paragraph [0014] cited by Applicant, all the steps are mental processes or mathematical calculations. The rejection under 35 U.S.C. 101 is maintained.

Claim Rejections under 35 U.S.C. 102/103
Applicant argues that Dahlmeier fails to teach or disclose each and every element of amended independent claim 1 and dependent claims 2-5, 10, and 12-18. During the May 13, 2021 interview, the 
Upon further review of the Claim 1 amendment, Examiner has changed the claim mapping as necessitated by the amendments. The limitation “each element in the input elements comprising a word or a gap between words” is taught by Dahlmeier. The broadest reasonable interpretation of the limitation is that an element is a sentence comprising a word. A gap between words is interpreted as a missing sentence. The limitation “dividing a portion of input data into a sequence of smaller input elements” is now mapped to Aoyama et al., which teaches splitting a text block into sentences.
Regarding the 103 claim rejections, Applicant argues that Dahlmeier in view of Kumar fails to teach or disclose each and every element of dependent claims 6, 8-9 and 11. Examiner has changed the claim mappings as necessitated by the amendments. The rejections over Kumar are maintained.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Jung et al. (US 20150317315) teaches, at the end of ¶ [0037], correcting an incomplete sentence with missing words.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher Jablon whose telephone number is (571) 270-7648.  The examiner can normally be reached on Monday - Friday, 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached on (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.







/ERIC NILSSON/Primary Examiner, Art Unit 2122