Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending and have been considered.
Priority
Acknowledgment is made of applicant's claim for foreign priority based on an application filed in the United Kingdom of Great Britain and Northern Ireland on March 20, 2018. It is noted, however, that applicant has not filed a certified copy of the GB1804433.9 application as required by 37 CFR 1.55.
Claim Objections
Claim 3 is objected to because of the following informalities: Claim 3 recites: “said set of preceding paths to be expended are the paths remaining after any pruning.” Examiner is uncertain whether “expended” should read “expanded”. Appropriate correction is required.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an 

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “computer apparatus” in claim 20 line 1.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

This application includes one or more claim limitations that use the word “means” or “step” but are nonetheless not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph because the claim limitation(s) recite(s) sufficient structure, materials, or acts to entirely perform the recited function.  Such claim limitation(s) is/are: “step” in claims 1-10, 12-16 and 19-20.
Because this/these claim limitation(s) is/are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are not being interpreted to cover only the corresponding structure, material, or acts described in the specification as performing the claimed function, and equivalents thereof.
If applicant intends to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to remove the structure, materials, or acts that performs the claimed function; or (2) present a sufficient showing that the claim limitation(s) does/do not recite sufficient structure, materials, or acts to perform the claimed function.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 3, 8-9, 11-12, 14, and 18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 3 recites the limitation “said set of preceding paths to be expended” in the last line. There is insufficient antecedent basis for this limitation in the claims. The limitation is being interpreted as “said set of preceding paths to be extended”. 
Claim 4 recites the limitation “a fourth neural network” in line 7. There is insufficient antecedent basis for this limitation in the claims because second and third neural networks were not recited. The 
Claim 8, lines 1-8 recite the same limitations as claim 6 without further limiting them. It is unclear how these claim limitations should be interpreted. Examiner is interpreting claim 8 line 1 as “The method of claim 1, comprising…”.
Claim 9 recites the limitation "the same wider network" in the last line.  There is insufficient antecedent basis for this limitation in the claim. The limitation is being interpreted as “a same wider network”. Claim 9 is also rejected for failing to cure the deficiencies of claim 8 upon which it depends.
Claim 11 recites the limitation “a third neural network” in line 1. There is insufficient antecedent basis for this limitation in the claims because a second neural network was not recited. The limitation is being interpreted as “a neural network”. 
Claim 12 recites the limitation “a fourth neural network” in line 7. There is insufficient antecedent basis for this limitation in the claims because second and third neural networks were not recited. The limitation is being interpreted as “a neural network”.
Claim 14 recites the limitation “the… second, third, and/or fourth neural network” in the second-to-last line. There is insufficient antecedent basis in claims 1 and 14 for “the second neural network”, “the third neural network”, and “the fourth neural network”. This limitation is being interpreted as “a second, a third, and/or a fourth neural network”. 
Claim 18 recites the limitation “the elements from the received text”. There is insufficient antecedent basis for this limitation. This limitation is being interpreted as “the smaller input elements from the portion of text are words or characters”.
Applicant can overcome these 35 U.S.C. 112(b) rejections by amending the claims to match Examiner’s interpretations. Examiner respectfully asks Applicant to review the claims for any more potential limitations lacking antecedent basis.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-3, 5, 10, and 14-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Claim 1 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitations: 
(1) dividing a portion of input data into a sequence of smaller input elements;
(2) identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed; 
(3) for each respective one of said points: - in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, and 
(4) an associated probability score, the probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and 
(5) - in each of a plurality of subsequent successive search steps, selecting a set of one or more of the preceding paths from one or more of the preceding search steps to extend, the selection being based on the associated probability scores, and 
(6) generating a respective set of one or more extended paths from each respective one of the selected set of preceding paths, each extended path comprising the candidate element or elements from the respective preceding path combined with an additional candidate element, and 
(7) an associated probability score for the combination, this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective preceding path; and 
(8) performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and 
	Limitations 1-3, 5-6, and 8 are mental processes (see bolded terms). Limitations 4 and 7 are mathematical processes but for the first neural network (see bolded terms). Accordingly, the claim recites an abstract idea. 
The claim recites the additional element: by a first neural network. This additional element is not integrated into a practical application because it is not an improvement to a computer or other technology. It is merely the use of a known technology to achieve the abstract idea of calculating a probability.
The claim recites a second additional element: based thereon outputting a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. This additional element is not integrated into a practical application because it is merely outputting results, which is insignificant extra-solution activity under Step 2A Prong Two. See MPEP 2106.05(g). Adding insignificant extra-solution activity is not sufficient to integrate the additional elements into a practical application.
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of a neural network is not an improvement to the functioning of a computer or to any other technology under MPEP 2106.05(a); and it is a field of use of using a neural network to perform a calculation under MPEP 2106.05(h). 
The additional element of outputting result(s) is well-understood, routine, and conventional under MPEP 2106.05(d). MPEP 2106.05(d)(II) states an example of activity that the courts have 

Claim 2 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation: 
wherein in each of the successive search steps for each point, the set of one or more preceding paths to extend is selected from the immediately preceding search step. 
This limitation is a mental process of selecting. Accordingly, the claim recites an abstract idea. 
This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 3 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitations:
(1) following each of one, some or all of said search steps for each point, pruning away lower scoring ones of the paths based on the probability scores, thus leaving only one or some of the paths remaining; 
(2) wherein for each of said plurality of points, in each of the successive search steps, said set of preceding paths to be expended [interpreted as “extended”] are the paths remaining after any pruning.
	Limitation 1 is a mental process of deciding to prune lower scoring paths, interpreted as deciding not to extend lower scoring paths. Limitation 2 is a mental process of deciding to extend higher 

Claim 5 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation:
wherein each successive search step does not proceed for any of the points until the immediately preceding search step has been performed for all of the points.
This limitation is a mental process merely of deciding not to proceed until the conditions are met. Accordingly, the claim recites an abstract idea. This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 10 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation:
wherein for each of said points, the probability score generated… for each path in the first search step is also a function of an initial classifier for the respective position, the classifier representing a probability that the respective point has a missing or erroneous element.
This limitation is a mathematical process, but for the recitation of the first neural network, of computing a probability score as a function of an initial classifier representing a probability. Accordingly, the claim recites an abstract idea. 


Claim 14 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitations:
for each of said points: prior to the first search step, including an end-of-sequence element in the input sequence at the end of the sequence to represent the end of the portion of input data, and/or 
including a start-of-sequence element in the input sequence at the start of the sequence to represent the start of the portion of input data; 
wherein the input elements… include the end-of-sequence element and/or the start-of-sequence element.
	These limitations are mental processes of deciding to include an end-of-sequence element and a start-of-sequence element in the sequences and input elements. Accordingly, the claim recites an abstract idea. 
The claim recites the additional elements: the first, second, third and/or fourth neural network. These additional elements are not integrated into a practical application because they are not an improvement to a computer or other technology. They are merely the use of a known technology to achieve the abstract idea of calculating a probability.
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional elements of first, second, third, and/or fourth neural networks are not an improvement to the functioning of a computer or to any other technology under MPEP 2106.05(a); and they are a field of use of using neural networks to perform a calculation under MPEP 2106.05(h). The additional elements individually and in combination do not 

Claim 15 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitations:
wherein in one, some or all of the search steps for each of some or all of the points, the generating of the paths comprises generating a respective set of multiple paths for each respective one of at least some said points, 
the multiple paths for the respective point each comprising a different candidate element and associated probability score…
These limitations are mental processes of deciding to generate multiple paths per point and deciding the multiple paths should comprise a different candidate element and associated probability score. Accordingly, the claim recites an abstract idea.
The claim recites the additional element: based on the first neural network. This additional element is not integrated into a practical application because it is not an improvement to a computer or other technology. It is merely the use of a known technology to achieve the abstract idea of calculating a probability.
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of a neural network is not an improvement to the functioning of a computer or to any other technology under MPEP 2106.05(a); and it is a field of use of using a neural network to perform a calculation under MPEP 2106.05(h). The claim is not patent eligible.

Claim 16 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation: 
wherein amongst the multiple paths for each respective point having multiple paths in the current search step, the candidate elements for one of the paths includes a rejoin-sequence element representing stopping the search for the respective point and rejoining the candidate element or elements from the preceding search steps to the input sequence.
This limitation is a mental process of deciding to stop the search for the respective point and rejoin the candidate element or elements from the preceding search steps to the input sequence. Accordingly, the claim recites an abstract idea. 
This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
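For illustration only, the rejoin-sequence element recited in claim 16 can be pictured with the short sketch below. It is not taken from the application or from any cited reference; the marker name `REJOIN` and the helper names `is_finished` and `splice` are hypothetical, and the sketch assumes that a path is a tuple of candidate elements whose final entry may be the marker.

```python
from typing import List, Tuple

# Hypothetical marker standing in for the claimed "rejoin-sequence element".
REJOIN = "<rejoin>"


def is_finished(path: Tuple[str, ...]) -> bool:
    """A path stops being extended once its last candidate is the marker."""
    return bool(path) and path[-1] == REJOIN


def splice(before: List[str], path: Tuple[str, ...], after: List[str]) -> List[str]:
    """Rejoin the path's candidate elements to the input sequence.

    The marker itself is dropped; only real candidate elements are inserted
    between the elements before and after the point.
    """
    candidates = [element for element in path if element != REJOIN]
    return before + candidates + after
```

Under these assumptions, `splice(["I", "saw"], ("the", "big", REJOIN), ["dog"])` yields the sequence `["I", "saw", "the", "big", "dog"]`, and `is_finished` reports that the path needs no further search steps.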

Claim 17 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation:
wherein said points are gaps between the input elements where missing data is potentially to be imputed.
This limitation is a mental process of marking the gaps between input elements as said points. Accordingly, the claim recites an abstract idea. 
This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 18 recites a method, one of the four statutory categories of patentable subject matter. The claim recites the following limitation:
wherein the portion of input data comprises a portion of text, and the elements from the received text are words or characters.
This limitation modifies limitation (1) of claim 1, which was identified as being a mental process, in a way that doesn’t affect the analysis of it being a mental process. Dividing text into words or characters is a mental process. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application because the claim does not include any additional elements. The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 19 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claim(s) does/do not fall within at least one of the four categories of patent eligible subject matter because it is directed towards transitory signals per se. Neither the claims nor the specification preclude the storage from being transitory signals per se. 
To overcome this rejection, Applicant should amend claim 19, line 1 to read “non-transitory computer readable storage”. Applicant should note that amending claim 19 as such would not overcome a 35 U.S.C. 101 rejection for being directed towards a judicial exception of abstract idea, for the same reasons independent claims 1 and 20 are rejected.

Claim 20 recites an apparatus, one of the four statutory categories of patentable subject matter. The claim recites the following limitations:  
(1) dividing a portion of input data into a sequence of smaller input elements;
(2) identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed; 
(3) for each respective one of said points: - in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, and 
(4) an associated probability score, the probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and 
(5) - in each of a plurality of subsequent successive search steps, selecting a set of one or more of the preceding paths from one or more of the preceding search steps to extend, the selection being based on the associated probability scores, and 
(6) generating a respective set of one or more extended paths from each respective one of the selected set of preceding paths, each extended path comprising the candidate element or elements from the respective preceding path combined with an additional candidate element, and 
(7) an associated probability score for the combination, this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective preceding path; and 
(8) performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and 
Limitations 1-3, 5-6, and 8 are mental processes (see bolded terms). Limitations 4 and 7 are mathematical processes but for the first neural network (see bolded terms). Accordingly, the claim recites an abstract idea. 
The claim recites the following additional elements:
A computer apparatus 
by a first neural network. 

The claim recites another additional element: based thereon outputting a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. This additional element is not integrated into a practical application because it is merely outputting results, which is insignificant extra-solution activity under Step 2A Prong Two. See MPEP 2106.05(g). Adding insignificant extra-solution activity is not sufficient to integrate the additional elements into a practical application.
The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception. 
The additional element of a computer apparatus is mere instructions to apply an exception using a computer under MPEP 2106.05(f). 
The additional element of a neural network is not an improvement to the functioning of a computer or to any other technology under MPEP 2106.05(a); and it is a field of use of using a neural network to perform a calculation under MPEP 2106.05(h). 
The additional element of outputting result(s) is well-understood, routine, and conventional under MPEP 2106.05(d). MPEP 2106.05(d)(II) states that the courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity, including (i) receiving or transmitting data over a network. Outputting results is 
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-5, 10, and 12-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by WIPO application publication number WO 2013191662 A1 to Dahlmeier et al., hereinafter Dahlmeier.
Regarding claim 1, Dahlmeier teaches: A computer-implemented method comprising automatically: 
dividing a portion of input data into a sequence of smaller input elements; (Dahlmeier ¶ 28: “The proposer modules (also referred to as "proposers") generate new hypotheses from the current hypothesis. Because the space of all possible hypotheses is exponential, each proposer only makes a small incremental change to the current hypothesis in each step. Each change may correspond to a single correction of a word or phrase in the current hypothesis.” The proposer may change words or phrases in the current hypothesis, hence it may divide a portion of input data (the sentence) into a sequence of smaller input elements (phrases, words, characters, spaces, etc.).)
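identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed; (Dahlmeier ¶ 28, where a point is interpreted as a word, character, space, etc.)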
for each respective one of said points: 
- in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, (Dahlmeier ¶ 28, where “path comprises a candidate element” is interpreted as a new hypothesis) and an associated probability score (Dahlmeier ¶ 39: “the decoder combines the features associated with each hypothesis into an overall hypothesis score”, where according to Dahlmeier ¶ 15, “The term "score" may mean a value, between 0.0 and 1.0, that measures a probability of the grammatical correctness of a hypothesis.”), the probability score being generated by a neural network as a function of some or all of the input elements before and/or after the respective point in the sequence, and (Dahlmeier ¶ 34: “Supervised classifiers can be used for particular grammatical errors by letting the classifier predict the correct word for a particular sentence context… For example, for preposition errors, a classifier can be trained to predict the correct preposition, given a feature representation of the surrounding context, e.g., the words to the left and right of the preposition.” The decoder in ¶ 39 is interpreted as being a neural network which accepts a weight vector and a feature map comprising the results of the classifier.)
- in each of a plurality of subsequent successive search steps, selecting a set of one or more of the preceding paths from one or more of the preceding search steps to extend, the selection being based on the associated probability scores, and generating a respective set of one or more extended paths from each respective one of the selected set of preceding paths, each extended path comprising the candidate element or elements from the respective preceding path combined with an additional candidate element, and an associated probability score for the combination, this probability score being generated by the neural network as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective preceding path; and (Dahlmeier ¶ 12: “the input sentence may be a sentence generated from the hypothesis of a previous iteration, i.e. the original sentence having undergone one or more successive iterations of grammatical error correction.” In the claim limitation, preceding path is interpreted as a previous hypothesis, and preceding search step is interpreted as a proposer step in a previous iteration.)
performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and based thereon outputting a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. (Dahlmeier ¶ 24: “In step 106, each of the new hypotheses is analysed to compute a score for each of the plurality of new hypotheses. In step 108, the scores of the plurality of new hypotheses are compared. In step 110, an output sentence from the new hypotheses with the highest score is generated.”)
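
For illustration only, the search recited in claim 1 can be sketched for a single point as follows. The sketch is neither the applicant's implementation nor Dahlmeier's decoder. It assumes that the extended score is the product of the preceding score and the new candidate's probability, which the claim language does not require, and every name in it (`Scorer`, `search_point`, `best_result`, `toy_scorer`, `steps`, `width`) is hypothetical. The scoring callable merely stands in for the recited neural network.

```python
from typing import Callable, Dict, List, Tuple

# A path: (candidate elements so far, probability score).
Path = Tuple[Tuple[str, ...], float]

# Hypothetical stand-in for the recited neural network: given the elements
# before and after the point and the candidates chosen so far, it returns a
# probability for each possible next candidate element.
Scorer = Callable[[List[str], List[str], Tuple[str, ...]], Dict[str, float]]


def search_point(before: List[str], after: List[str], scorer: Scorer,
                 steps: int = 3, width: int = 2) -> List[Path]:
    """Return every path generated for one point, across all search steps."""
    # First search step: one single-element path per candidate.
    frontier: List[Path] = [((cand,), p)
                            for cand, p in scorer(before, after, ()).items()]
    all_paths = list(frontier)
    for _ in range(steps - 1):
        # Select the highest-scoring preceding paths to extend.
        selected = sorted(frontier, key=lambda path: path[1], reverse=True)[:width]
        frontier = []
        for elements, prev_score in selected:
            for cand, p in scorer(before, after, elements).items():
                # Assumed combination rule: preceding score times new probability.
                frontier.append((elements + (cand,), prev_score * p))
        all_paths.extend(frontier)
    return all_paths


def best_result(paths: List[Path]) -> Path:
    """Compare paths from different search steps and return the top-scoring one."""
    return max(paths, key=lambda path: path[1])


def toy_scorer(before: List[str], after: List[str],
               prefix: Tuple[str, ...]) -> Dict[str, float]:
    # Fixed toy probabilities; a real model would actually use the context.
    table = {
        (): {"the": 0.6, "a": 0.3},
        ("the",): {"big": 0.5, "red": 0.2},
        ("a",): {"big": 0.9, "red": 0.1},
    }
    return table.get(prefix, {})
```

With the toy scorer, `search_point(["I", "saw"], ["dog"], toy_scorer, steps=2)` produces two one-element paths and four two-element paths, and `best_result` selects the one-element path `("the",)`, showing a comparison in which a path from an earlier search step outscores every path from a later one.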

Examiner’s Note: The term “preceding path(s)” in claim 1, lines 13, 16, 17, and 22 and the term “preceding search steps” in claim 1 line 13 are potentially confusing. Applicant should amend the claims to clarify these terms, such as by replacing them with “said path(s)” and “said search steps”.

Regarding claim 2, Dahlmeier teaches: The method of claim 1, wherein in each of the successive search steps for each point, the set of one or more preceding paths to extend is selected from the immediately preceding search step. (Examiner interprets Dahlmeier ¶ 12 to include the immediate previous iteration: “the input sentence may be a sentence generated from the hypothesis of a previous iteration, i.e. the original sentence having undergone one or more successive iterations of grammatical error correction.”)

Regarding claim 3, Dahlmeier teaches: The method of claim 2, comprising: following each of one, some or all of said search steps for each point, pruning away lower scoring ones of the paths based on the probability scores, thus leaving only one or some of the paths remaining; wherein for each of said plurality of points, in each of the successive search steps, said set of preceding paths to be expended [extended] are the paths remaining after any pruning. (Dahlmeier ¶ 41, where paths are interpreted as hypotheses: “the search space may be pruned by only accepting the most promising hypotheses to the pool of hypotheses for future consideration. Thus, the plurality of new hypotheses, when performing grammatical error correction of an input sentence, may be based on hypotheses with the highest score range from a previous iteration. If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)
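
Examiner’s Note (illustrative only): The pruning that the rejection of claim 3 maps onto Dahlmeier ¶ 41 can be sketched as below. This is a minimal sketch and is not code from Dahlmeier or from the application. The function name `prune_hypotheses`, the parameter `beam_width`, and all scores other than the 0.1 score for “at” discussed in Dahlmeier ¶ 36 are hypothetical and chosen only for illustration.

```python
def prune_hypotheses(scored, beam_width):
    """Keep only the beam_width highest-scoring (hypothesis, score) pairs.

    Hypothetical helper: 'scored' is a list of (sentence, score) tuples,
    where each score is assumed to lie between 0.0 and 1.0.
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return ranked[:beam_width]


# Example hypotheses for the input sentence of Dahlmeier ¶ 36.
# Only the 0.1 score is taken from the reference; the others are assumed.
candidates = [
    ("He leaves at the morning.", 0.1),
    ("He leaves in the morning.", 0.8),
    ("He leaves on the morning.", 0.3),
]

# Only the surviving hypotheses would be extended in the next iteration.
survivors = prune_hypotheses(candidates, beam_width=2)
```

Under these assumptions, the lowest-scoring hypothesis is dropped and the two remaining hypotheses form the set of paths available for extension in the next search step.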

Regarding claim 4, Dahlmeier teaches: The method of claim 3, wherein the method comprises: 
following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and (Dahlmeier ¶ 41: “If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)
applying a fourth neural network to each entry to generate a new probability score for each candidate result in the candidate pool; (As stated in the 112(b) rejection, “a fourth neural network” is interpreted as “a neural network” which corresponds to the decoder performing this limitation in Dahlmeier ¶ 39.)
(Dahlmeier ¶ 39-41 teaches that the decoder scores each hypothesis and selects the best one, which involves comparing scores.)
wherein said skimming comprises, for each current one of the search steps, after the current search step is completed across all the points in the sequence, skimming off the element or combination of elements from each of only a selected subset of the paths generated in the current search step into the candidate pool as candidate results, wherein the subset is selected as those paths having greater than a threshold probability score, or those in a highest portion according to the probability score; and 
wherein the selected subset is selected only from amongst the paths remaining after the pruning in the current search step. 
(The last two limitations are taught by Dahlmeier ¶ 41: “Therefore, the search space may be pruned by only accepting the most promising hypotheses to the pool of hypotheses for future consideration. Thus, the plurality of new hypotheses, when performing grammatical error correction of an input sentence, may be based on hypotheses with the highest score range from a previous iteration. If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)

Regarding claim 5, Dahlmeier teaches: The method of claim 1, wherein each successive search step does not proceed for any of the points until the immediately preceding search step has been performed for all of the points. (The method of Dahlmeier ¶ 12 broadly reads on this claim limitation. Dahlmeier ¶ 12: “the input sentence may be a sentence generated from the hypothesis of a previous iteration, i.e. the original sentence having undergone one or more successive iterations of grammatical error correction.”)

Regarding claim 10, Dahlmeier teaches: The method of claim 1, wherein for each of said points, the probability score generated by the first neural network for each path in the first search step is also a function of an initial classifier for the respective position, the classifier representing a probability that the respective point has a missing or erroneous element. (Dahlmeier ¶ 39 teaches that the score generated by the first neural network for each hypothesis (path) is a function of a classifier. Dahlmeier ¶ 35 lists classifiers for typical grammatical errors.)

Regarding claim 12, Dahlmeier teaches: The method of claim 1, wherein the method comprises: 
following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and (Dahlmeier ¶ 41: “If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)
applying a fourth neural network to each entry to generate a new probability score for each candidate result in the candidate pool; (As stated in the 112(b) rejection, “a fourth neural network” is interpreted as “a neural network” which corresponds to the decoder performing this limitation in Dahlmeier ¶ 39.)
wherein said comparing comprises comparing the new probability scores in the candidate pool, and said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores. (Dahlmeier ¶ 39-41 teaches that the decoder scores each hypothesis and selects the best one, which involves comparing scores)

Regarding claim 13, Dahlmeier teaches: The method of claim 12, wherein said skimming comprises, for each current one of the search steps, after the current search step is completed across all the points in the sequence, skimming off the element or combination of elements from each of only a selected subset of the paths generated in the current search step into the candidate pool as candidate results, wherein the subset is selected as those paths having greater than a threshold probability score, or those in a highest portion according to the probability score. (Dahlmeier ¶ 41: “Therefore, the search space may be pruned by only accepting the most promising hypotheses to the pool of hypotheses for future consideration. Thus, the plurality of new hypotheses, when performing grammatical error correction of an input sentence, may be based on hypotheses with the highest score range from a previous iteration. If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)

Regarding claim 14, Dahlmeier teaches: The method of claim 1 comprising, for each of said points: prior to the first search step, including an end-of-sequence element in the input sequence at the end of the sequence to represent the end of the portion of input data, and/or including a start-of-sequence element in the input sequence at the start of the sequence to represent the start of the portion of input data; (Input sequence is interpreted as input sentence as in Dahlmeier ¶ 12. Dahlmeier ¶ 36 teaches the input sequence “He leaves at the morning.” The first letter “H” is interpreted as a start-of-sequence element and the sentence period (“.”) is interpreted as the end-of-sequence element.)
wherein the input elements of which the first, second, third and/or fourth neural network is a function include the end-of-sequence element and/or the start-of-sequence element. (The decoder taught by Dahlmeier is a function of the start-of-sequence element and the end-of-sequence element.)

Regarding claim 15, Dahlmeier teaches: The method of claim 1, wherein in one, some or all of the search steps for each of some or all of the points, the generating of the paths comprises generating a respective set of multiple paths for each respective one of at least some said points, (generating multiple hypotheses is interpreted as generating a set of multiple paths. Dahlmeier ¶ 27: “Proposers take a hypothesis and output a set of new hypotheses, where each new hypothesis is the result of making an incremental change to the current hypothesis. Accordingly, proposers generate a plurality of new hypothesis from a current hypothesis.”)
the multiple paths for the respective point each comprising a different candidate element and associated probability score based on the first neural network. (Dahlmeier ¶ 27: “Experts subsequently score these hypotheses on particular aspects of grammaticality. Accordingly, experts analyse each of the new hypothesis to compute a score for each of the plurality of new hypotheses.”)

Regarding claim 16, Dahlmeier teaches: The method of claim 15, wherein amongst the multiple paths for each respective point having multiple paths in the current search step, the candidate elements for one of the paths includes a rejoin-sequence element representing stopping the search for the respective point and rejoining the candidate element or elements from the preceding search steps to the input sequence. (Dahlmeier ¶ 28 teaches making a single correction of a word or phrase in the current hypothesis. This is interpreted as a rejoin-sequence element as recited by claim 16.)


Regarding claim 17, Dahlmeier teaches: The method of claim 1, wherein said points are gaps between the input elements where missing data is potentially to be imputed. (Dahlmeier ¶ 29 teaches that a new hypothesis may comprise punctuation marks inserted between words. Examiner interprets gaps as spaces between words, and missing data as the punctuation marks.)

Regarding claim 18, Dahlmeier teaches: The method of claim 1, wherein the portion of input data comprises a portion of text, and the elements from the received text are words or characters. (Dahlmeier ¶ 6 teaches that the input data is an input sentence (a portion of text) containing words or characters.)

Regarding claim 19, Dahlmeier teaches: A computer program comprising code embodied on computer readable storage and configured so as when run on a computer apparatus to perform operations of automatically: (Dahlmeier ¶ 7: “computer program code”)
dividing a portion of input data into a sequence of smaller input elements; (Dahlmeier ¶ 28: “The proposer modules (also referred to as "proposers") generate new hypotheses from the current hypothesis. Because the space of all possible hypotheses is exponential, each proposer only makes a small incremental change to the current hypothesis in each step. Each change may correspond to a single correction of a word or phrase in the current hypothesis.” The proposer may change words or phrases in the current hypothesis, hence it may divide a portion of input data (the sentence) into a sequence of smaller input elements (phrases, words, characters, spaces, etc.).)
identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed; (¶ 28, where point is interpreted as words, characters, spaces, etc.)
for each respective one of said points: - in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, (Dahlmeier ¶ 28, where “path comprises a candidate element” is interpreted as a new hypothesis) and an associated probability score (Dahlmeier ¶ 39: “the decoder combines the features associated with each hypothesis into an overall hypothesis score”, where according to Dahlmeier ¶ 15, “The term "score" may mean a value, between 0.0 and 1.0, that measures a probability of the grammatical correctness of a hypothesis.”), the probability score being generated by a first neural network as a function of some or all of the input elements before and/or after the respective point in the sequence, and (Dahlmeier ¶ 34: “Supervised classifiers can be used for particular grammatical errors by letting the classifier predict the correct word for a particular sentence context…  For example, for preposition errors, a classifier can be trained to predict the correct preposition, given a feature representation of the surrounding context, e.g., the words to the left and right of the preposition.” The decoder in ¶ 39 is interpreted as being a neural network which accepts a weight vector and a feature map comprising the results of the classifier.)
- in each of a plurality of subsequent successive search steps, selecting a set of one or more of the preceding paths from one or more of the preceding search steps to extend, the selection being based on the associated probability scores, and generating a respective set of one or more extended paths from each respective one of the selected set of preceding paths, each extended path comprising the candidate element or elements from the respective preceding path combined with an additional candidate element, and an associated probability score for the combination, this probability score being generated by the first neural network as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective preceding path; and (Dahlmeier ¶ 12: “the input sentence may be a sentence generated from the hypothesis of a previous iteration, i.e. the original sentence having undergone one or more successive iterations of grammatical error correction.” In the claim limitation, preceding path is interpreted as a previous hypothesis, and preceding search step is interpreted as a proposer step in a previous iteration.)
performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and based thereon outputting a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. (Dahlmeier ¶ 24: “In step 106, each of the new hypotheses is analysed to compute a score for each of the plurality of new hypotheses. In step 108, the scores of the plurality of new hypotheses are compared. In step 110, an output sentence from the new hypotheses with the highest score is generated.”)

Regarding claim 20, Dahlmeier teaches: Computer apparatus programmed to perform operations of automatically: (Dahlmeier ¶ 7: “computer system”)
dividing a portion of input data into a sequence of smaller input elements; (Dahlmeier ¶ 28: “The proposer modules (also referred to as "proposers") generate new hypotheses from the current hypothesis. Because the space of all possible hypotheses is exponential, each proposer only makes a small incremental change to the current hypothesis in each step. Each change may correspond to a single correction of a word or phrase in the current hypothesis.” The proposer may change words or phrases in the current hypothesis, hence it may divide a portion of input data (the sentence) into a sequence of smaller input elements (phrases, words, characters, spaces, etc.).)
identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed; (¶ 28, where point is interpreted as words, characters, spaces, etc.)
for each respective one of said points: - in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, (Dahlmeier ¶ 28, where “path comprises a candidate element” is interpreted as a new hypothesis) and an associated probability score (Dahlmeier ¶ 39: “the decoder combines the features associated with each hypothesis into an overall hypothesis score”, where according to Dahlmeier ¶ 15, “The term "score" may mean a value, between 0.0 and 1.0, that measures a probability of the grammatical correctness of a hypothesis.”), the probability score being generated by a first neural network as a function of some or all of the input elements before and/or after the respective point in the sequence, and (Dahlmeier ¶ 34: “Supervised classifiers can be used for particular grammatical errors by letting the classifier predict the correct word for a particular sentence context…  For example, for preposition errors, a classifier can be trained to predict the correct preposition, given a feature representation of the surrounding context, e.g., the words to the left and right of the preposition.” The decoder in ¶ 39 is interpreted as being a neural network which accepts a weight vector and a feature map comprising the results of the classifier.)
- in each of a plurality of subsequent successive search steps, selecting a set of one or more of the preceding paths from one or more of the preceding search steps to extend, the selection being based on the associated probability scores, and generating a respective set of one or more extended paths from each respective one of the selected set of preceding paths, each extended path comprising the candidate element or elements from the respective preceding path combined with an additional candidate element, and an associated probability score for the combination, this probability score being generated by the first neural network as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective preceding path; and (Dahlmeier ¶ 12: “the input sentence may be a sentence generated from the hypothesis of a previous iteration, i.e. the original sentence having undergone one or more successive iterations of grammatical error correction.” In the claim limitation, preceding path is interpreted as a previous hypothesis, and preceding search step is interpreted as a proposer step in a previous iteration.)
performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and based thereon outputting a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. (Dahlmeier ¶ 24: “In step 106, each of the new hypotheses is analysed to compute a score for each of the plurality of new hypotheses. In step 108, the scores of the plurality of new hypotheses are compared. In step 110, an output sentence from the new hypotheses with the highest score is generated.”)

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 6-9 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Dahlmeier in view of U.S. Patent No. 9,911,413 to Kumar et al. (published March 6, 2018), hereinafter Kumar.

Regarding claim 6, Dahlmeier teaches: The method of claim 1 comprising, for each respective one of said points: 
prior to the first search step, generating a respective embedding for the respective point, the embedding being a vector generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the position of the respective point in the sequence; (the embedding is being interpreted as a classifier encoder as taught by Dahlmeier ¶ 34: “classifiers”, “encoded”, “the words to the left and right”)
wherein in the first search step for each of said points, the candidate elements of the respective one or more paths are generated based on a decoder state that is a function of the respective embedding. (The decoder receives the feature map $f_e$ from classifiers - Dahlmeier ¶ 39, equation 4)
Although Dahlmeier teaches that the classifier may be a discriminative supervised classifier (¶ 5, 31), Dahlmeier does not explicitly teach: [the embedding being a vector generated] by a second neural network.
Kumar teaches: [the embedding being a vector generated] by a second neural network (Kumar col. 3, lns. 44-56 teaches a natural language classifier generated by an artificial neural network which receives a linguistic representation as input values and outputs a probability distribution.)
Kumar is in the same field of endeavor as Dahlmeier, namely, training artificial neural network classifiers for natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Kumar’s system into Dahlmeier’s system by implementing the embedding classifier with a neural network, with a motivation to train the classifier by reinforcement learning (Kumar col. 3, ln. 44).


Regarding claim 7, Dahlmeier in view of Kumar teaches: The method of claim 6, 
Further, Dahlmeier teaches: wherein the method further comprises, for each of said points: between each successive subsequent search step and the preceding search step, at least for the selected set of preceding paths, updating the decoder state as a function of the candidate elements in the respective preceding path; (The decoder state is interpreted as variable ‘s’ in Dahlmeier Eq. 4. The decoder state is updated between iterations in the Beam-Search Decoding Algorithm Pseudo Code at the line $s_{best} \leftarrow s_h$ on page 14.)
wherein in each of the subsequent search steps, for each of the extended paths, the additional element of the respective path is generated based on the updated decoder state for the respective path. (Dahlmeier ¶ 28: “The proposer modules (also referred to as "proposers") generate new hypotheses from the current hypothesis.”)
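
Examiner’s Note (illustrative only): The best-score update that the rejection of claim 7 identifies in Dahlmeier’s Beam-Search Decoding Algorithm Pseudo Code can be sketched as below. This is a minimal sketch under stated assumptions and does not reproduce Dahlmeier’s pseudo code. It assumes that the update is applied only when a hypothesis score exceeds the best score found so far, consistent with Dahlmeier ¶ 41. The function name `update_best` and the example scores are hypothetical.

```python
def update_best(best, hypothesis, score):
    """Return the new best (hypothesis, score) pair.

    Hypothetical helper: 'best' is None before any hypothesis has been
    scored, otherwise a (hypothesis, score) tuple. The stored score plays
    the role of s_best and 'score' plays the role of s_h.
    """
    if best is None or score > best[1]:
        # Assumed condition for the update s_best <- s_h.
        return (hypothesis, score)
    return best


# Hypothetical scored hypotheses produced over successive iterations.
scored_hypotheses = [
    ("He leaves at the morning.", 0.1),
    ("He leaves in the morning.", 0.8),
    ("He leaves on the morning.", 0.3),
]

best = None
for hypothesis, s_h in scored_hypotheses:
    best = update_best(best, hypothesis, s_h)
```

Under these assumptions, the stored best score changes only when a later hypothesis scores higher, so the value carried between iterations depends on the hypotheses generated in the preceding iterations.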

Regarding claim 8, Dahlmeier teaches: The method of claim 6 (Interpreted as “The method of claim 1”), comprising, for each respective one of said points: 
prior to the first search step, generating a respective embedding for the respective point, the embedding being a vector generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the position of the respective point in the sequence; (the embedding is being interpreted as a classifier encoder as taught by Dahlmeier ¶ 34: “classifiers”, “encoded”, “the words to the left and right”)
wherein in the first search step for each of said points, the candidate elements of the respective one or more paths are generated based on a decoder state that is a function of the respective embedding; (The decoder receives the feature map $f_e$ from classifiers - Dahlmeier ¶ 39, equation 4)
(Dahlmeier ¶ 34 teaches a supervised classifier. In ¶ 36, the classifier scores the original element “at” at 0.1 out of 1.0.) 
wherein the classifier is generated… as a function of the respective embedding. (the embedding is being interpreted as a classifier encoding data as taught by Dahlmeier ¶ 34: “classifiers”, “encoded”)
Although Dahlmeier teaches that the classifier may be a discriminative supervised classifier (¶ 5, 31), Dahlmeier does not explicitly teach: [the embedding being a vector generated] by a second neural network; [wherein the classifier is generated] by a third neural network.
Kumar teaches: [the embedding being a vector generated] by a second neural network; [wherein the classifier is generated] by a third neural network. (Kumar col. 3, lns. 44-56 teaches a natural language classifier generated by an artificial neural network which receives a linguistic representation as input values and outputs a probability distribution. The second and third neural networks are interpreted as being the same classifier neural network.)
Kumar is in the same field of endeavor as Dahlmeier, namely, training artificial neural network classifiers for natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Kumar’s system into Dahlmeier’s system by implementing the embedding and classifier with a neural network, with a motivation to train the classifier by reinforcement learning (Kumar col. 3, ln. 44).

Regarding claim 9, Dahlmeier in view of Kumar teaches: The method of claim 8, 
Further, Dahlmeier teaches: wherein the method comprises: following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and (Dahlmeier ¶ 41: “If a hypothesis has a higher score compared to the best hypothesis found so far in previous iterations, it is added to the pool.”)
applying a fourth neural network to each entry to generate a new probability score for each candidate result in the candidate pool; (As stated in the 112(b) rejection, “a fourth neural network” is interpreted as “a neural network” which corresponds to the decoder performing this limitation in Dahlmeier ¶ 39.)
wherein said comparing comprises comparing the new probability scores in the candidate pool, and said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores; and (Dahlmeier ¶ 39-41 teaches that the decoder scores each hypothesis and selects the best one, which involves comparing scores)
wherein some or all of the first, second, third or fourth neural networks are subgraphs of the same wider network, and are trained together. (The first neural network is interpreted as the only one that is a subgraph of the same wider neural network.)

Regarding claim 11, Dahlmeier teaches: The method of claim 10, wherein the classifier is generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the position of the respective point in the sequence. (Dahlmeier ¶ 34: “Supervised classifiers can be used for particular grammatical errors by letting the classifier predict the correct word for a particular sentence context…  For example, for preposition errors, a classifier can be trained to predict the correct preposition, given a feature representation of the surrounding context, e.g., the words to the left and right of the preposition.” The decoder in ¶ 39 is interpreted as being a neural network which accepts a weight vector and a feature map comprising the results of the classifier.)
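Dahlmeier does not explicitly teach: [wherein the classifier is generated] by a third neural network (interpreted as “by a neural network”).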
Kumar teaches: [wherein the classifier is generated] by a third neural network (Kumar col. 3, lns. 44-56 teaches a natural language classifier generated by an artificial neural network which receives a linguistic representation as input values and outputs a probability distribution.)
Kumar is in the same field of endeavor as Dahlmeier, namely, training artificial neural network classifiers for natural language processing. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Kumar’s system into Dahlmeier’s system by implementing the classifier with a neural network, with a motivation to train the classifier by reinforcement learning (Kumar col. 3, ln. 44).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Non-patent literature “Sentence Correction Based on Large-scale Language Modelling” to Wen teaches imputing missing words.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher H. Jablon whose telephone number is (571) 270-7648.  The examiner can normally be reached on Monday - Friday, 8:30 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.







/ASHER H. JABLON/Examiner, Art Unit 2122                                                                                                                                                                                                        
/ERIC NILSSON/Primary Examiner, Art Unit 2122