Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Amendments
Claims 1, 14, 16, and 18-20 are amended. Claims 1-20 are pending and have been considered.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/13/2021 has been entered.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Claim Objections
Claims 7, 16, 18, and 20 are objected to because of the following informalities:
Claim 7, the last paragraph should recite: “the additional candidate element”.
Claim 16, line 4, remove the hyphen in “preceeding-search”
Claim 18, lines 2-3 recite a redundant limitation of training the first neural network that was previously recited by claim 1, line 2. Claim 18, lines 2-3 should recite “and wherein comprises
A computer apparatus”. 

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

RELATIVE TERMS:
The term "lower" in claim 3, line 2 is a relative term which renders the claim indefinite.  The term "lower" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.  “Lower” is a term of degree as discussed in MPEP 2173.05(b)(I.). For examining purposes, Examiner interprets the claim as if lines 2-3 had recited: “pruning away paths scoring below a first threshold based on the probability scores”.

Claim 4 is rejected for failing to cure the limitations of claim 3 upon which it depends. Additionally, claim 4, four lines from the end, should recite “paths having probability scores greater than a second threshold” to distinguish this threshold from the one recited by claim 3.

The term "wider" in claim 9, last line is a relative term which renders the claim indefinite.  The term "wider" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.  “Wider” is a term of degree as discussed in MPEP 2173.05(b)(I.) and it is unclear whether “wider” modifies one, some, or all of the first, second, third, and fourth neural networks. For examining purposes, Examiner interprets the claim as if it had recited “a same [[wider]] network comprising the first, second, third, and fourth neural networks”.

LACK OF ANTECEDENT BASIS:
Claims 1, 19, and 20 recite the limitation "or elements" in the second bullet, line 5 in each claim.  There is insufficient antecedent basis for this limitation in the claims. Only a single candidate element per path has been recited by each claim at the first bullet point, line 2: “wherein each path comprises a candidate element”. For purposes of examination, Examiner interprets claims 1, 19, and 20 as if they had not recited “or elements.”
Claims 2-18 are rejected for failing to cure the limitations of claim 1 upon which they depend.

The limitation "the input sequence" is recited twice by claim 14 in lines 2 and 4 and once by claim 16 in line 4.  There is insufficient antecedent basis for this limitation in the claim. For examining purposes, Examiner interprets the first recitation in claim 14 as “an [[the]] input sequence” and the same for claim 16.

UNCLEAR TERMS:
The terms “skimming” and “skimmed-off” are recited by claim 4 in ¶ 2 line 1 and ¶ 5 line 2, by claim 9 in line 2, by claim 12 in line 2, and by claim 13 in line 3. It is unclear how skimming differs from pruning because both reduce a list of candidates. For purposes of examination, Examiner interprets skimming off elements as retaining elements after pruning a path, and skimmed-off elements as elements remaining after pruning.

Claim 13 is rejected for failing to cure the deficiencies of claim 12 upon which it depends.

The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA  35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA  35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.


Claim 17 is rejected under 35 U.S.C. 112(d) or pre-AIA  35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends. The following table compares claims 1 and 17.
Claim 1 (L. 3-7)
Claim 17
dividing a portion of input data into a sequence of input elements, each element in the input 
identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of points comprising a gap between a pair of adjacent words;
.


Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 19 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. Claim 19 discloses “a device storing code” which is not specifically disclosed in the specification. The specification does not specifically disclose what is included and excluded from the limitation of the claimed device. Also, instant specification paragraph 75 discloses that programs are stored on a computer-readable storage of a client device 102. It is not specifically disclosed as to what is included or excluded as part of the computer-readable storage. Therefore, it is unclear if the claimed device is a statutory element or if it can be a non-statutory element. Under the broadest reasonable interpretation, it can be both. Therefore, claim 19 is non-statutory. 

Claim 20 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claim does not fall within at least one of the four categories of patent eligible subject matter because it is directed to software per se. See MPEP § 2106.03, subsection I. Neither the claim nor the specification precludes the one or more “processing units” from being software per se. Applicant should change each recitation of “processing unit” to “processor”, which has support in specification ¶ [0073] and ¶ [0088].
For the sake of compact prosecution, Examiner will complete the inquiry for claims 19-20 as if they had fallen into a statutory category. 

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 

CLAIM 1
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The claim recites the following limitations:
(1) dividing a portion of input data into a sequence of input elements, each element in the input elements comprising a word or a gap between words;
(2) identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of points comprising a gap between a pair of adjacent words; 
(3) for each respective one of said points: - in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, and 
(4) … generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and 
(5) - in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, the selection being based on the associated probability scores, and 
(6) generating a respective set of one or more extended paths from each respective one of the selected set of paths, each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and 
(7) an associated probability score for the combination, this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective path; and 
(8) performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps
	Limitations 1-3, 5-6, and 8 are mental processes (see bolded terms) which can reasonably be performed in one’s mind with the aid of pencil and paper. Limitations 4 and 7 are mathematical calculations of computing probability scores. Accordingly, the claim recites an abstract idea. 
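For illustration only, the search recited in limitations 3-8 can be sketched as below. This is a minimal sketch under stated assumptions and is not Applicant's disclosed implementation: the function names, the toy scoring table standing in for the first neural network, and the fixed number of paths selected per step are all hypothetical.

```python
from typing import List, Sequence, Tuple

# A path is (candidate elements, probability score).
Path = Tuple[Tuple[str, ...], float]


def toy_score(words: Sequence[str], point: int, candidate: str) -> float:
    """Hypothetical stand-in for the first neural network.

    Scores a candidate as a function of the input elements around the
    point (here: only the word immediately before the gap).
    """
    base = {"the": 0.6, "a": 0.3, "big": 0.1}.get(candidate, 0.0)
    return base if words[point - 1] == "saw" else base * 0.5


def search_point(words, point, vocab, score=toy_score,
                 steps=2, beam=2) -> List[Path]:
    """Return every path generated for one gap, across all search steps."""
    # First search step: one single-element path per candidate.
    frontier = [((c,), score(words, point, c)) for c in vocab]
    all_paths = list(frontier)
    for _ in range(steps - 1):
        # Select the highest-scoring paths of the previous step to extend.
        selected = sorted(frontier, key=lambda p: p[1], reverse=True)[:beam]
        # The extended score depends on the context and the parent's score.
        frontier = [(elems + (c,), s * score(words, point, c))
                    for elems, s in selected for c in vocab]
        all_paths.extend(frontier)
    return all_paths


def impute(text, vocab, score=toy_score, top_k=1):
    """Compare paths from all points and all search steps; output the best."""
    words = text.split()
    results = []
    # Point i is the gap between words[i - 1] and words[i].
    for point in range(1, len(words)):
        for elems, s in search_point(words, point, vocab, score):
            results.append((s, point, elems))
    results.sort(key=lambda r: r[0], reverse=True)
    return results[:top_k]
```

With these toy scores, impute("I saw cat", ["the", "a", "big"]) ranks inserting "the" at the gap between "saw" and "cat" first.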
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
training a first neural network
[score being generated by] the first neural network
outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths.

Outputting results is an insignificant extra-solution activity because it is well-known. See MPEP 2106.05(g): 
“When determining whether an additional element is insignificant extra-solution activity, examiners may consider the following: (1) Whether the extra-solution limitation is well known”. 
Adding insignificant extra-solution activity is not sufficient to integrate the additional elements into a practical application.
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. 
Outputting information is well-known in the art, as disclosed by Wical (US Patent 6,460,034, published 2002) at C. 9, L. 30-32: “A screen module, such as screen module 230, which processes information for display on a computer output display, is well known in the art.”
The claim is not patent eligible.

CLAIM 2 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
selected from the immediately preceding search step. 
	The limitation is a mental process of selecting which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements to impose meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

CLAIM 3 incorporates the rejection of claim 2.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 2 are incorporated. The claim recites the following limitations:
(1) following each of one, some or all of said search steps for each point, pruning away lower scoring ones of the paths based on the probability scores, thus leaving only one or some of the paths remaining; 
(2) wherein for each of said plurality of points, in each of the successive search steps, said set of paths to be extended are the paths remaining after any pruning.
Limitation 1 is a mental process of deciding to prune lower scoring paths, interpreted as deciding not to extend lower scoring paths. Limitation 2 is a mental process of deciding to extend higher scoring paths. Accordingly, the claim recites an abstract idea.
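For illustration only, pruning as interpreted above (discarding paths that score below a first threshold, so that only the remaining paths are extended) can be sketched as below. The function name and the threshold value are hypothetical and are not taken from the claims or the specification.

```python
def prune(paths, first_threshold):
    """Keep only paths whose probability score is not below the threshold.

    Each path is (candidate elements, probability score). The returned
    paths are the ones that remain available to be extended.
    """
    return [(elems, s) for elems, s in paths if s >= first_threshold]


paths = [(("the",), 0.6), (("a",), 0.3), (("big",), 0.1)]
remaining = prune(paths, first_threshold=0.25)
```

Here remaining holds the "the" and "a" paths, and the "big" path is not extended further.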
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements to impose meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

CLAIM 4 incorporates the rejection of claim 3.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 3 are incorporated. The claim recites the following limitations:
(1) following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, 
(2) the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and 
(3) generate a new probability score for each candidate result in the candidate pool; 
(4) wherein said comparing comprises comparing the new probability scores in the candidate pool, and 
(5) said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores; 
(6) wherein said skimming comprises, for each current one of the search steps, after the current search step is completed across all the points in the sequence, skimming off the element or combination of elements from each of only a selected subset of the paths generated in the current search step into the candidate pool as candidate results, 
(7) wherein the subset is selected as those paths having greater than a threshold probability score, or those in a highest portion according to the probability score; and 
(8) wherein the selected subset is selected only from amongst the paths remaining after the pruning in the current search step.
Limitations 1-2 and 4-7 are mental processes (see bolded terms) which can reasonably be performed in one’s mind with the aid of pencil and paper. Limitation 3 is a mathematical calculation of computing a probability score. Accordingly, the claim recites an abstract idea.
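For illustration only, the skimming, re-scoring, and selection recited above can be sketched as below, using the interpretation that skimmed-off elements are taken from the paths remaining after pruning. The function names, the second threshold value, and the lookup table standing in for the other neural network are hypothetical.

```python
def skim(remaining_paths, second_threshold):
    """Copy the elements of sufficiently probable surviving paths into a pool."""
    return [elems for elems, s in remaining_paths if s > second_threshold]


def rescore_and_select(pool, rescore, top_k=1):
    """Give each candidate result a new score and keep the highest-scoring."""
    scored = [(rescore(elems), elems) for elems in pool]
    scored.sort(key=lambda r: r[0], reverse=True)
    return scored[:top_k]


# Paths remaining after pruning in the current search step.
remaining = [(("the",), 0.6), (("a",), 0.3), (("the", "big"), 0.06)]
pool = skim(remaining, second_threshold=0.2)

# Hypothetical stand-in for the other neural network that re-scores the pool.
new_scores = {("the",): 0.4, ("a",): 0.7}
selection = rescore_and_select(pool, lambda elems: new_scores[elems])
```

The sketch keeps two separate thresholds, one for pruning and one for skimming, which matches the distinction suggested in the rejection of claim 4 under 35 U.S.C. 112(b).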
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
applying another neural network to each entry
The BRI of “applying another neural network” includes training a neural network. Both of “another neural network” and “applying” are generally linking the abstract ideas to the particular technology environment of machine learning, and they are not improvements to machine learning technology. Therefore, they are not meaningful limitations. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. 
The claim is not patent eligible.

CLAIM 5 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitation:
wherein each successive search step does not proceed for any of the points until the immediately preceding search step has been performed for all of the points.
This limitation is a mental process merely of deciding not to proceed until the conditions are met. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements to impose meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

CLAIM 6 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
(1) for each respective one of said points: prior to the first search step, generating a respective embedding for the respective point, the embedding being a vector generated … as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function a position of the respective point in the sequence; 
(2) wherein in the first search step for each of said points, the candidate elements of the respective one or more paths are generated based on a… state that is a function of the respective embedding.

Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
a second neural network 
decoder
A second neural network and a decoder are generally linking the abstract ideas to the particular technology environment of machine learning, and they are not an improvement to machine learning technology. Therefore, they are not meaningful limitations. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 7 incorporates the rejection of claim 6.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 6 are incorporated. The claim recites the following limitations:
(1) for each of said points: between each successive subsequent search step and the preceding search step, at least for the selected set of paths, updating the… state as a function of the candidate elements in the respective path; 
(2) … generated based on the updated… state for the respective path.
Limitation 1 is a mathematical calculation of computing a state as a function of candidate elements. Limitation 2 is a mental process of generating an additional element based on the updated state, which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
decoder
A decoder is generally linking the abstract ideas to the particular technology environment of machine learning, and it is not an improvement to machine learning technology. Therefore, it is not a meaningful limitation. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 8 incorporates the rejection of claim 6.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 6 are incorporated. The claim recites the following limitations:
(1) … generated… for each path in the first search step also being a function… for the respective position, 
(2) representing a probability that the respective point has a missing or erroneous element; and 
(3) generated… as a function of the respective embedding.
	Limitations 1 and 3 are mathematical calculations. Limitation 2 is a mental process of evaluating which can be reasonably performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
the first neural network
of an initial classifier
a third neural network
Each of the first neural network, an initial classifier, and a third neural network are generally linking the abstract ideas to the particular technology environment of machine learning, and they are not an improvement to machine learning technology. Therefore, they are not meaningful limitations. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 9 incorporates the rejection of claim 8.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 8 are incorporated. The claim recites the following limitations:
(1) following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, 
(2) the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and 
(3) generate a new probability score for each candidate result in the candidate pool; 
(4) wherein said comparing comprises comparing the new probability scores in the candidate pool, and 
(5) said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores; and
	Limitations 1-2 and 4-5 are mental processes (see bolded terms) which can reasonably be performed in one’s mind with the aid of pencil and paper. Limitation 3 is a mathematical calculation of computing a probability score. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
applying a fourth neural network to each entry
wherein some or all of the first, second, third or fourth neural networks are subgraphs of a same wider network
wherein some or all of the first, second, third or fourth neural networks are trained together.
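The BRI of “applying a fourth neural network” includes training a neural network. A first, second, third, and fourth neural networks, a same wider network, and applying a fourth neural network are generally linking the abstract ideas to the particular technology environment of machine learning, and they are not improvements to machine learning technology. Therefore, they are not meaningful limitations. See MPEP 2106.05(e) and (h).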
The broadest reasonable interpretation of training four neural networks together includes training a neural network having at least four combinations of nodes. Training four neural networks together is generally linking the abstract ideas to the particular technology environment of machine learning, and it is not an improvement to machine learning technology. Therefore, it is not a meaningful limitation. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. 
The claim is not patent eligible.

CLAIM 10 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
(1) wherein for each of said points, the probability score generated… for each path in the first search step is also a function… for the respective position, 
(2) representing a probability that the respective point has a missing or erroneous element.

Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
the first neural network
an initial classifier
The first neural network and an initial classifier are generally linking the abstract ideas to the particular technology environment of machine learning, and they are not improvements to machine learning technology. Therefore, they are not meaningful limitations. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 11 incorporates the rejection of claim 10.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 10 are incorporated. The claim recites the following limitation:
generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the position of the respective point in the sequence.

Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
classifier
another neural network
A classifier and another neural network are generally linking the abstract ideas to the particular technology environment of machine learning, and they are not an improvement to machine learning technology. Therefore, they are not meaningful limitations. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 12 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
(1) following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, 
(2) the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and 
(3) generate a new probability score for each candidate result in the candidate pool; 
(4) wherein said comparing comprises comparing the new probability scores in the candidate pool, and 
(5) said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores.
Limitations 1-2 and 4-5 are mental processes (see bolded terms) which can reasonably be performed in one’s mind with the aid of pencil and paper. Limitation 3 is a mathematical calculation of computing a probability score. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional element:
applying another neural network to each entry
Another neural network is generally linking the abstract ideas to the particular technology environment of machine learning, and it is not an improvement to machine learning technology. Therefore, it is not a meaningful limitation. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 13 incorporates the rejection of claim 12.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 12 are incorporated. The claim recites the following limitations:
wherein said skimming comprises, for each current one of the search steps, after the current search step is completed across all the points in the sequence, skimming off the element or combination of elements from each of only a selected subset of the paths generated in the current search step into the candidate pool as candidate results, 
wherein the subset is selected as those paths having greater than a threshold probability score, or those in a highest portion according to the probability score.
These limitations are mental processes (see bolded terms) which can reasonably be performed in one’s mind with the aid of pencil and paper. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements to impose meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

CLAIM 14 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitations:
(1) for each of said points: prior to the first search step, including an end-of-sequence element in the input sequence at an end of the sequence to represent an end of the portion of input data, and/or
(2) including a start-of-sequence element in the input sequence at a start of the sequence to 
include the end-of-sequence element and/or the start-of-sequence element.
These limitations are mental processes of deciding to include an end-of-sequence element and a start-of-sequence element in the sequences and input elements. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional element:
at least one neural network
At least one neural network is generally linking the abstract ideas to the particular technology environment of machine learning, and it is not an improvement to machine learning technology. Therefore, it is not a meaningful limitation. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 15 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitation:
wherein in one, some or all of the search steps for each of some or all of the points, the generating of the paths comprises generating a respective set of multiple paths for each respective one 
This limitation is a mental process of generating paths. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional element:
the first neural network.
The first neural network is generally linking the abstract ideas to the particular technology environment of machine learning, and it is not an improvement to machine learning technology. Therefore, it is not a meaningful limitation. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 16 incorporates the rejection of claim 15.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 15 are incorporated. The claim recites the following limitation:
wherein amongst the multiple paths for each respective point having multiple paths in a current search step, the candidate elements for one of the paths includes a rejoin-sequence element stopping the search for the respective point and rejoining the candidate element or elements from the preceding-search steps to the input sequence.
This limitation is a mental process of deciding to stop the search for the respective point and rejoining the candidate element or elements from the preceding search steps to the input sequence. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements to impose meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

CLAIM 17 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitation:
wherein said points are gaps between the input elements where missing data is potentially to be imputed.
This limitation is a mental process of marking the gaps between input elements as said points. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim does not recite additional elements to impose meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

CLAIM 18 incorporates the rejection of claim 1.
Step 1: The claim recites a method, one of the four categories of eligible subject matter.
Step 2A Prong 1: The judicial exceptions of claim 1 are incorporated. The claim recites the following limitation:
wherein the portion of input data comprises a portion of text, and the input elements from the text are words or characters.
This limitation modifies limitation (1) of claim 1, which was identified as being a mental process, in a way that doesn’t affect the analysis of it being a mental process. Dividing text into words or characters is a mental process. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: The judicial exceptions are not integrated into a practical application. The claim recites the following additional elements:
training the first neural network
supervised training with a training data set comprising input data points and predetermined desired output data points
a reinforcement approach
an unsupervised approach
Each additional element is generally linking the abstract ideas to the particular technology environment of machine learning, and is not an improvement to machine learning technology. Therefore, they are not meaningful limitations. See MPEP 2106.05(e) and (h).
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not recite any additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. The claim is not patent eligible.

CLAIM 19
Step 1: The claim recites a signal per se. For sake of compact prosecution, Examiner will complete the inquiry for claim 19 as if it had fallen into a statutory category.
Step 2A Prong 1: The claim recites the following limitations:
(1) dividing a portion of input data into a sequence of input elements, each element in the input elements comprising a word or a gap between words; 
(2) identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of points comprising a gap between a pair of adjacent words; 
(3) for each respective one of said points: - in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, and 
(4) an associated probability score of the candidate element, the probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and 
(5) - in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, the selection being based on the associated probability scores, and 
(6) generating a respective set of one or more extended paths from each respective one of the selected set of paths, each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and 
(7) an associated probability score for the combination, this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective path; and 
(8) performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps
Limitations 1-3, 5-6, and 8 are mental processes (see bolded terms) which can reasonably be performed in one’s mind with the aid of pencil and paper. Limitations 4 and 7 are mathematical calculations of computing probability scores. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The claim recites the following additional elements:
A device storing code configured to perform operations of automatically
training a first neural network;
[score being generated by] the first neural network
outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths.
The device storing code, a first neural network, and training a first neural network are generally linking the abstract ideas to the particular technology environment of machine learning, and they are not improvements to machine learning technology. Therefore, they are not meaningful limitations. See MPEP 2106.05(e) and (h). 
Outputting results is an insignificant extra-solution activity because it is well-known. See MPEP 2106.05(g): 
“When determining whether an additional element is insignificant extra-solution activity, examiners may consider the following: (1) Whether the extra-solution limitation is well known”. 

Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. 
Outputting information is well-known in the art, as disclosed by Wical (US Patent 6,460,034, published 2002) at C. 9, L. 30-32: “A screen module, such as screen module 230, which processes information for display on a computer output display, is well known in the art.”
The claim is not patent eligible.

CLAIM 20
Step 1: The claim recites software per se. For sake of compact prosecution, Examiner will complete the inquiry for claim 20 as if it had fallen into a statutory category.
Step 2A Prong 1: The claim recites the following limitations:
(1) dividing a portion of input data into a sequence of input elements, each element in the input elements comprising a word or a gap between words; 
(2) identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of points comprising a gap between a pair of adjacent words; 
(3) for each respective one of said points: - in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, and 
(4) an associated probability score of the candidate element, the probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and 
(5) - in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, the selection being based on the associated probability scores, and 
(6) generating a respective set of one or more extended paths from each respective one of the selected set of paths, each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and 
(7) an associated probability score for the combination, this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective path; and 
(8) performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps
Limitations 1-3, 5-6, and 8 are mental processes (see bolded terms) which can reasonably be performed in one’s mind with the aid of pencil and paper. Limitations 4 and 7 are mathematical calculations of computing probability scores. Accordingly, the claim recites an abstract idea. 
Step 2A Prong 2: The claim recites the following additional elements:
one or more processing units
training a first neural network;
[score being generated by] the first neural network
outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths.

Outputting results is an insignificant extra-solution activity because it is well-known. See MPEP 2106.05(g): 
“When determining whether an additional element is insignificant extra-solution activity, examiners may consider the following: (1) Whether the extra-solution limitation is well known”. 
Adding insignificant extra-solution activity is not sufficient to integrate the additional elements into a practical application.
Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception for the reasons given in Step 2A Prong 2. 
Outputting information is well-known in the art, as disclosed by Wical (US Patent 6,460,034, published 2002) at C. 9, L. 30-32: “A screen module, such as screen module 230, which processes information for display on a computer output display, is well known in the art.”
The claim is not patent eligible.

Claim Rejections - 35 USC § 103
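In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.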
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-4, 10, and 12-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lefebure et al. (US 20190108257 A1) in view of Sutskever et al. (“Sequence to Sequence Learning with Neural Networks”).

	Regarding CLAIM 1, Lefebure teaches: A computer-implemented method comprising automatically: 
dividing a portion of input data into a sequence of input elements, each element in the input elements comprising a word or a gap between words; (Lefebure at ¶ [0082] teaches producing a sequence of tokens from audio: “A speech recognition module 11 receives speech audio from a person issuing a spoken expression. The speech recognition module 11 produces a sequence of tokens.” Lefebure at ¶ [0083], lines 1-3 states: “Some embodiments receive token sequences from a person by means other than speech audio, such as typing on a keyboard.” A 5-token sequence “the weather pin Hawaii today” is given in Fig. 17A, as disclosed in ¶ [0140], first sentence.)
identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of points comprising a gap between a pair of adjacent words; (A single point in the sequence at which erroneous data is potentially to be imputed includes the erroneous token “pin” in Fig. 17A, as disclosed in ¶ [0140], first sentence. The broadest reasonable interpretation of a gap between a pair of adjacent words, in light of instant specification ¶ [045] and [101], includes the token “pin” between the pair of tokens “weather” and “Hawaii.” A plurality of points is taught at ¶ [0135], lines 5-10: “Some such embodiments first create multiple alternative rewrites, each by making a single edit at a position with a probability score below a threshold, and then proceed to make further rewrites by editing at second positions within the first set of rewritten token sequences to create further rewrites.”)
for each respective one of said points: (Lefebure teaches the token sequence rewrite flowchart for a token replacement in Fig. 21 is an extension of Fig. 19 (see ¶ [0149] second sentence), which is an extension of Fig. 14 (see ¶ [0145], second sentence). Since Fig. 17A-D are based on Fig. 14, they also apply to Figs. 19 and 21.)
- in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, (Fig. 17D shows two paths for replacing the erroneous token “pin” with either “in” or “for”. See ¶ [0142].) 
and an associated probability score of the candidate element, (An associated probability score is interpreted as a score based on probabilities, which includes the rewrite scores in Fig. 17D. The rewrite scores are described in ¶ [0141] – [0142].)
the probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and (¶ [0137]: “The list of forward most probable tokens for each position comprises, for the previous N tokens, a set of most probable tokens and their probabilities. The list of backward most probable tokens for each position comprises, for the following N tokens, a set of most probable tokens and their probabilities. For choosing a token for insertion or replacement, the token replacement module 45 searches for tokens present in both lists (the intersection of the lists) and chooses a new token based on the probability of the token in each list.”)
- in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, (¶ [0162], lines 1-4 states: “Some embodiments, in order to correct multiple wrong words or errors causing multiple missing words, feed rewritten token sequences through additional iterations of the rewriting flow….”)
 the selection being based on the associated probability scores, and (¶ [0162], lines 4-7: “…Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score.” Also, ¶ [0149], lines 5-10 state: “A choosing module 218 uses the rewrite scores to choose a best rewritten token sequence among the multiple alternative rewrites. Some alternatives may have different edits at the same position. Some alternatives may have edits at different positions.”)
generating a respective set of one or more extended paths from each respective one of the selected set of paths, (¶ [0162], lines 8-9: “Selection may work by attempting a breadth-first search tree algorithm” and ¶ [0161], lines 3-7: “Some embodiments produce lists of possible rewrites of input token sequences. That is possible either by making different edits at the same suspicious position, making edits at different suspicious positions, or both”.)
each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and an associated probability score for the combination, (¶ [0162] states: “Some embodiments, in order to correct multiple wrong words or errors causing multiple extra words, missing words, or repeated words, feed rewritten token sequences through additional iterations of the rewriting flow. Iteration can proceed either by creating simple lists of rewrites or by selecting rewrites for reprocessing based on either their rewrite score, their grammar parse score, or a combination of both. Selection may work by attempting a depth-first, breadth-first, or best-score-first search tree algorithm” (Examiner has added underlines for emphasis). Here, Lefebure teaches using a breadth-first search tree algorithm to iteratively insert missing words into a token sequence, where each sequence for rewriting is chosen based on its score.)
this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective path; and (The token replacement module 215 from Fig. 21 generates a rewrite score in the same manner as the token replacement module 195 from Fig. 19. In 
performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and (Broadly interpreted as comparing the rewritten token sequences. ¶ [0149], lines 5-8: “A choosing module 218 uses the rewrite scores to choose a best rewritten token sequence among the multiple alternative rewrites.”)
outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. (Fig. 21 shows the output of choosing module 218 is the best rewritten token sequence. ¶ [0161], lines 1-2 states: “Some embodiments, upon finding a successful rewrite, complete their processing and provide the successful rewritten token sequence as output.” The iterations of generating new token sequences and rewrite scores are also shown by Fig. 1. ¶ [0163], lines 1-4 state: “FIG. 1 shows a process of iteratively rewriting new token sequences. Some embodiments, after the rewriting stage 12 reprocess the new token sequences according to their rewrite scores.”)
	However, Lefebure does not explicitly teach: training a first neural network; and by the first neural network
But Sutskever teaches: scores generated by the first neural network (Sutskever teaches a long short-term memory (LSTM)-based neural network on p. 3, above equation 1: “The goal of the LSTM is to estimate the conditional probability” of the output sequence given the input sequence. In equation 1, the right side shows a probability of an output state $y_t$ is conditioned upon $v$, the fixed-dimensional representation of the input sequence given the last hidden state of the LSTM, and upon $y_1, \ldots, y_{t-1}$, the 
training a first neural network; (Sutskever teaches training the LSTM on p. 4, § 3.2, ¶ 1: “We trained it by maximizing the log probability of a correct translation T given the source sentence S… where S is in the training set.” Training is also taught at p. 5, § 3.4, with details explained in the bullet points.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Sutskever’s LSTM model shown in Fig. 1 for the statistical language models (SLM) in Lefebure, specifically as the forward SLM 42 and backward SLM 44 from Figs. 14, 19, and 21. A motivation for the combination is that Sutskever’s LSTM model is a probabilistic model that predicts an output sequence given an input sequence as shown in Sutskever’s Equation 1.
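For illustration only, the forward/backward token selection quoted above from Lefebure ¶ [0137] can be sketched as follows. This is a minimal sketch and not code from either reference: the function name, the token lists, and the probability values are hypothetical, and combining the two probabilities by multiplication is an assumption (Lefebure describes several scoring methods).

```python
def choose_replacement(forward, backward):
    """Pick a token present in both lists (their intersection).

    forward / backward: dict mapping token -> probability, standing in for
    the forward and backward "most probable token" lists (hypothetical data).
    Returns (token, combined_score), or None if the lists share no token.
    """
    common = sorted(set(forward) & set(backward))
    if not common:
        return None
    # Assumption: combine the two directions by multiplying probabilities.
    scored = {tok: forward[tok] * backward[tok] for tok in common}
    best = max(common, key=lambda tok: scored[tok])
    return best, scored[best]


# Hypothetical probabilities for one suspicious position.
forward = {"in": 0.5, "for": 0.25, "on": 0.125}
backward = {"in": 0.25, "for": 0.125, "at": 0.5}
result = choose_replacement(forward, backward)
```

In this sketch only "in" and "for" appear in both lists, so only those two tokens are candidates for the replacement.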

	Regarding CLAIM 2, the combination of Lefebure and Sutskever teaches: The method of claim 1, 
	Lefebure teaches selecting rewrites may work by attempting a breadth-first search tree algorithm, but Lefebure does not explicitly teach: wherein in each of the successive search steps for each point, the set of one or more paths to extend is selected from the immediately preceding search step.
	However, Sutskever teaches: wherein in each of the successive search steps for each point, the set of one or more paths to extend is selected from the immediately preceding search step. (Sutskever, p. 4, § 3.2 discloses performing a beam search during decoding, which is a type of breadth-first search algorithm. In the paragraph under equation 2, lines 3-4 state: “At each timestep we extend each partial hypothesis in the beam with every possible word in the vocabulary.”)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have extended Lefebure’s path from the immediately preceding search step, as taught by Sutskever. (Sutskever, p. 4, under equation 2, lines 3-4: “At each timestep we extend each partial hypothesis in the beam with every possible word in the vocabulary.”)
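For illustration only, the beam search behavior quoted above (extending each partial hypothesis in the beam with every vocabulary word, so that each step builds on the immediately preceding step) can be sketched as follows. This is a simplified sketch under stated assumptions: the scoring function, vocabulary, and beam width are hypothetical, and end-of-sentence handling is omitted.

```python
import math


def beam_search(step_log_prob, vocab, beam_width, steps):
    """Simplified left-to-right beam search.

    step_log_prob(prefix, word) -> log probability of `word` given `prefix`
    (a hypothetical stand-in for a trained model's output).
    Returns the surviving (hypothesis, cumulative_log_prob) pairs.
    """
    beam = [((), 0.0)]  # start from a single empty hypothesis
    for _ in range(steps):
        extended = []
        # Extend only the hypotheses kept from the immediately preceding step.
        for prefix, score in beam:
            for word in vocab:
                extended.append((prefix + (word,), score + step_log_prob(prefix, word)))
        # Keep the highest-scoring hypotheses; the rest are discarded.
        extended.sort(key=lambda item: item[1], reverse=True)
        beam = extended[:beam_width]
    return beam


def toy_log_prob(prefix, word):
    # Hypothetical context-independent distribution, for illustration only.
    table = {"a": 0.5, "b": 0.25, "c": 0.25}
    return math.log(table[word])


beam = beam_search(toy_log_prob, ["a", "b", "c"], beam_width=2, steps=2)
```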

Regarding CLAIM 3, the combination of Lefebure and Sutskever teaches: The method of claim 2, comprising: 
Further, Lefebure teaches: following each of one, some or all of said search steps for each point, pruning away lower scoring ones of the paths based on the probability scores, thus leaving only one or some of the paths remaining; (In choosing a path with a higher rewrite score, the lower-scoring paths are effectively pruned away. Regarding Fig. 17D, Lefebure ¶ [0142] states, starting at line 4: “The adding method finds a higher rewrite score for the second token, but the multiplying and sum of logs methods find a higher rewrite score for the first token. For some embodiments, the method of multiplying or adding logs is preferable. It has the effect of favoring tokens that are more reasonable in both directions, whereas the method of adding probabilities favors tokens that are sensible in one direction but nonsense in the other direction.”)
wherein for each of said plurality of points, in each of the successive search steps, said set of paths to be extended are the paths remaining after any pruning. (Lefebure ¶ [0162], lines 4-7: “…Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score.”  Paths that are not reprocessed are effectively pruned.)
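For illustration only, the difference between the adding, multiplying, and sum-of-logs scoring methods quoted above from Lefebure ¶ [0142] can be sketched as follows. The probability values are hypothetical and are not taken from Fig. 17D; they are chosen so that one candidate is moderately likely in both directions and the other is likely in one direction only.

```python
import math


def rewrite_scores(p_forward, p_backward):
    """Combine forward and backward probabilities three ways."""
    return {
        "add": p_forward + p_backward,
        "multiply": p_forward * p_backward,
        "sum_of_logs": math.log(p_forward) + math.log(p_backward),
    }


# Hypothetical (forward, backward) probabilities for two candidate tokens.
candidates = {
    "first": (0.25, 0.25),     # reasonable in both directions
    "second": (0.5, 0.0625),   # sensible forward, unlikely backward
}

# For each method, record which candidate receives the higher score.
winners = {}
for method in ("add", "multiply", "sum_of_logs"):
    winners[method] = max(
        candidates, key=lambda tok: rewrite_scores(*candidates[tok])[method]
    )
```

With these assumed values, adding favors the second candidate, while multiplying and summing logs both favor the first, which matches the qualitative behavior described in the quoted passage.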

Regarding CLAIM 4, the combination of Lefebure and Sutskever teaches: The method of claim 3, 
Further, Lefebure teaches: wherein the method comprises: following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, (Under the broadest reasonable interpretation, in light of the specification, “skimming off” elements is interpreted as performing pruning in any iteration after the one in claim 3 and retaining some elements. The retained elements are the “skimmed-off” elements recited later in the claim. Lefebure ¶ [0162], lines 4-7: “…Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score.” Also ¶ [0084], last four lines: “Some embodiments feed new token sequences back into rewriting module 12 and reprocess them in an attempt to produce even better rewritten token sequences. This is useful if, for example, token sequences have multiple errors.” In choosing a path with a higher rewrite score, the lower-scoring paths are effectively pruned away. See Lefebure ¶ [0142], starting at line 4.)
the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and (Candidate result in a candidate pool is interpreted as the best rewritten token sequence after pruning. See Lefebure ¶ [0142], starting at line 4.)
generate a new probability score for each candidate result in the candidate pool; (A new probability score is generated in the next iteration, as disclosed in Lefebure ¶ [0084], last four lines. For generating a new probability score, see Lefebure ¶ [0142], starting at line 4.)
wherein said comparing comprises comparing the new probability scores in the candidate pool, and (See Lefebure ¶ [0142], starting at line 4. Sum of probabilities, product of probabilities and sum of log probabilities are compared.)
said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores; (Highest probability score is interpreted as most preferable. ¶ [0142], last two sentences: “For some embodiments, the method of multiplying or adding logs is preferable. It has the effect of favoring tokens that are more reasonable in both directions, whereas the method of adding probabilities favors tokens that are sensible in one direction but nonsense in the other direction.”)
wherein said skimming comprises, for each current one of the search steps, after the current search step is completed across all the points in the sequence, skimming off the element or combination of elements from each of only a selected subset of the paths generated in the current search step into the candidate pool as candidate results, (Under the broadest reasonable interpretation, in light of the specification, skimming elements is interpreted as pruning in a subsequent iteration. Lefebure ¶ [0162], lines 4-7: “…Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score.” In choosing a path with a higher rewrite score, the lower-scoring paths are effectively pruned away.)
wherein the subset is selected as those paths having greater than a threshold probability score, or those in a highest portion according to the probability score; and (Lefebure ¶ [0142], lines 4-5 and 10-12 states, “The adding method finds a higher rewrite score for the second token… [T]he method of adding probabilities favors tokens that are sensible in one direction but nonsense in the other direction.”)
wherein the selected subset is selected only from amongst the paths remaining after the pruning in the current search step. (“skimming off” elements is interpreted as pruning in a subsequent iteration. Lefebure ¶ [0162], lines 4-7: “…Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score.”)
	However, Lefebure does not explicitly teach: applying another neural network to each entry 
	But Sutskever teaches: applying another neural network to each entry (The broadest reasonable interpretation of this limitation, in light of the specification, is generating a probability score again using Sutskever’s neural network. Sutskever teaches a long short-term memory (LSTM)-based neural network on p. 3, above equation 1: “The goal of the LSTM is to estimate the conditional probability” of the output sequence given the input sequence. In equation 1, the right side shows a probability of an output state $y_t$ that is conditioned upon $v$, the fixed-dimensional representation of the input, and upon $y_1, \ldots, y_{t-1}$, the previous outputs. The last paragraph on p. 3, § 2 states two different LSTMs were used, one for the input sequence and another for the output sequence. Fig. 1 on p. 2 shows the model.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Sutskever’s LSTM model shown in Fig. 1 for the statistical language models (SLM) in Lefebure, specifically as the forward SLM 42 and backward SLM 44 from Figs. 14, 19, and 21. A motivation for the combination is that Sutskever’s LSTM model is a probabilistic model that predicts an output sequence given an input sequence as shown in Sutskever’s Equation 1.
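For reference only, the factorization characterized above as Sutskever’s Equation 1 can be written out as follows. The symbols $x_1, \ldots, x_T$ (input sequence) and $T'$ (output length) follow Sutskever’s notation and do not appear elsewhere in this action; $v$ and $y_t$ are as described in the mapping above.

$$p(y_1, \ldots, y_{T'} \mid x_1, \ldots, x_T) = \prod_{t=1}^{T'} p(y_t \mid v, y_1, \ldots, y_{t-1})$$

Each factor on the right side is the conditional probability relied upon above as the claimed probability score.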

Regarding CLAIM 10, the combination of Lefebure and Sutskever teaches: The method of claim 1, 
Further, Lefebure teaches (limitations re-ordered for clarity): the classifier representing a probability that the respective point has a missing or erroneous element. (Lefebure ¶ [0126] states: “FIG. 12B for English words show expressions with a wrong token error (e.g., “pin” rather than “in” in FIG. 12B). All tokens following the wrong token have a high probability in the backward direction and all tokens before the wrong token have a high probability in the forward direction. The wrong token has a low probability in each of the backward and forward directions” (edited by Examiner).)
wherein for each of said points, the probability score generated… for each path in the first search step is also a function of an initial classifier for the respective position, (This limitation is broadly interpreted as the probability score depending on the classifier’s output. Lefebure ¶ [0130] states: “Accordingly, a token that has a low probability in each of the backward and forward directions is suspicious. It indicates a position in which a deletion or replacement edit is likely appropriate.” Also, Lefebure ¶ [0136], lines 1-6 state: “Various embodiments create a rewrite by performing an edit at the position with a lowest combination of backward and forward probabilities on the same token” (edited 
However, Lefebure does not explicitly teach: by the first neural network
	But Sutskever teaches: by the first neural network (Sutskever teaches a long short-term memory (LSTM)-based neural network on p. 3, above equation 1: “The goal of the LSTM is to estimate the conditional probability” of the output sequence given the input sequence. In equation 1, the right side shows a probability of an output state $y_t$ that is conditioned upon $v$, the fixed-dimensional representation of the input sequence given the last hidden state of the LSTM, and upon $y_1, \ldots, y_{t-1}$, the previous outputs. The last paragraph on p. 3, § 2 states two different LSTMs were used, one for the input sequence and another for the output sequence. Fig. 1 on p. 2 shows the model.)
Claim 10 includes the neural network of claim 1. In claim 1, the neural network generates probability scores.

Regarding CLAIM 12, the combination of Lefebure and Sutskever teaches: The method of claim 1, 
Further, Lefebure teaches: wherein the method comprises: following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, (Under the broadest reasonable interpretation, in light of the specification, “skimming off” elements is interpreted as performing pruning in any iteration and retaining some elements. The retained elements are the “skimmed-off” elements recited later in the claim. Lefebure ¶ [0162], lines 4-7: “…Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score.” Also ¶ [0084], last four lines: “Some embodiments feed new token sequences back into rewriting module 12 and reprocess them in an attempt to produce even better rewritten token sequences. This is useful if, for example, 
the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and (Candidate result in a candidate pool is interpreted as the best rewritten token sequence after pruning. See Lefebure ¶ [0142], starting at line 4.)
to generate a new probability score for each candidate result in the candidate pool; (A new probability score is generated in the next iteration, as disclosed in Lefebure ¶ [0084], last four lines. For generating a new probability score, see Lefebure ¶ [0142], starting at line 4.)
wherein said comparing comprises comparing the new probability scores in the candidate pool, and (See Lefebure ¶ [0142], starting at line 4. Sum of probabilities, product of probabilities, and sum of log probabilities are compared.)
said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores. (Highest probability score is interpreted as most preferable. ¶ [0142], last two sentences: “For some embodiments, the method of multiplying or adding logs is preferable. It has the effect of favoring tokens that are more reasonable in both directions, whereas the method of adding probabilities favors tokens that are sensible in one direction but nonsense in the other direction.”)
	However, Lefebure does not explicitly teach: applying another neural network to each entry
But Sutskever teaches: applying another neural network to each entry (The broadest reasonable interpretation of this limitation, in light of the specification, is generating a probability score again using Sutskever’s neural network. Sutskever teaches a long short-term memory (LSTM)-based neural network on p. 3, above equation 1: “The goal of the LSTM is to estimate the conditional probability” of the output sequence given the input sequence. In equation 1, the right side shows a probability of an output state $y_t$ that is conditioned upon $v$, the fixed-dimensional representation of the input, and upon $y_1, \ldots, y_{t-1}$, the previous outputs. The last paragraph on p. 3, § 2 states two different LSTMs were used, one for the input sequence and another for the output sequence. Fig. 1 on p. 2 shows the model.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Sutskever’s LSTM model shown in Fig. 1 for the statistical language models (SLM) in Lefebure, specifically as the forward SLM 42 and backward SLM 44 from Figs. 14, 19, and 21. A motivation for the combination is that Sutskever’s LSTM model is a probabilistic model that predicts an output sequence given an input sequence as shown in Sutskever’s Equation 1.
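As an illustrative sketch only, the three ways of combining forward and backward probabilities referenced above from Lefebure ¶ [0142] (adding, multiplying, and adding logs) can be compared as follows. The function name `combine_scores` and the probability values are hypothetical; they are not taken from Lefebure’s figures or from the instant application.

```python
import math


def combine_scores(forward_p, backward_p):
    """Combine one token's forward and backward probabilities three ways."""
    return {
        "sum": forward_p + backward_p,
        "product": forward_p * backward_p,
        "log_sum": math.log(forward_p) + math.log(backward_p),
    }


# Hypothetical candidate tokens (assumed values for illustration).
candidates = {
    "balanced": combine_scores(0.4, 0.4),   # reasonable in both directions
    "lopsided": combine_scores(0.9, 0.01),  # sensible in one direction only
}


def best_by(method):
    """Return the name of the candidate with the highest score under `method`."""
    return max(candidates, key=lambda name: candidates[name][method])


best_by_sum = best_by("sum")
best_by_product = best_by("product")
best_by_log_sum = best_by("log_sum")
```

With these assumed values, adding favors the token that is sensible in only one direction, while multiplying or adding logs favors the token that is reasonable in both directions, consistent with the passage quoted from ¶ [0142].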

Regarding CLAIM 13, the combination of Lefebure and Sutskever teaches: The method of claim 12, 
Further, Lefebure teaches: wherein said skimming comprises, for each current one of the search steps, after the current search step is completed across all the points in the sequence, skimming off the element or combination of elements from each of only a selected subset of the paths generated in the current search step into the candidate pool as candidate results, (Under the broadest reasonable interpretation, in light of the specification, skimming elements is interpreted as pruning in a subsequent iteration. Lefebure ¶ [0162], lines 4-7: “…Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score.” In choosing a path with a higher rewrite score, the lower-scoring paths are effectively pruned away.)
wherein the subset is selected as those paths having greater than a threshold probability score, or those in a highest portion according to the probability score. (Lefebure ¶ [0142], lines 4-5 and 10-12 states, “The adding method finds a higher rewrite score for the second token… [T]he method of adding probabilities favors tokens that are sensible in one direction but nonsense in the other direction.”)
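As an illustrative sketch only, the two alternatives recited in this limitation (paths above a threshold score, or paths in a highest portion by score) can be expressed as follows. The function names, the path representation as `(element, score)` pairs, and all numeric values are assumptions made for illustration and are not drawn from Lefebure, Sutskever, or the instant specification.

```python
import math


def select_by_threshold(paths, threshold):
    """Keep paths whose score is strictly greater than the threshold."""
    return [p for p in paths if p[1] > threshold]


def select_top_portion(paths, fraction):
    """Keep the highest-scoring fraction of paths (at least one path)."""
    keep = max(1, math.ceil(len(paths) * fraction))
    return sorted(paths, key=lambda p: p[1], reverse=True)[:keep]


# Hypothetical paths: (candidate element, probability score).
paths = [("in", 0.62), ("for", 0.25), ("on", 0.08), ("pin", 0.01)]

above_threshold = select_by_threshold(paths, 0.2)
top_half = select_top_portion(paths, 0.5)
```

In both cases the lower-scoring paths are dropped and only the retained paths would pass to the candidate pool, which is the sense in which the retained elements are treated as “skimmed off” in the mapping above.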

Regarding CLAIM 14, the combination of Lefebure and Sutskever teaches: The method of claim 1 comprising, 
Further, Lefebure teaches: for each of said points: prior to the first search step, including an end-of-sequence element in the input sequence at an end of the sequence to represent an end of the portion of input data, and/or including a start-of-sequence element in the input sequence at a start of the sequence to represent a start of the portion of input data; (¶ [0109], lines 8-9 teaches: “The symbols <s> and </s> indicate the beginning and end of a token sequence”)
wherein the input elements… include the end-of-sequence element and/or the start-of-sequence element. (See ¶ [0109], lines 8-9. Also, in Fig. 9 and 10, the start- and end-of-sequence elements are shown as sequence tokens.) 
However, Lefebure does not explicitly teach: of which at least one neural network is a function
But Sutskever teaches: of which at least one neural network is a function (Sutskever p. 3, below equation 1, states: “Note that we require that each sentence ends with a special end-of-sentence symbol “<EOS>”, which enables the model to define a distribution over sequences of all possible lengths. The overall scheme is outlined in figure 1, where the shown LSTM computes the representation of “A”, “B”, “C”, “<EOS>”.” Figure 1 shows <EOS> as an input element.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for Sutskever’s LSTM to rely on Lefebure’s end-of-sequence element because it already relies on Sutskever’s own end-of-sequence element. (Sutskever p. 3, below equation 1)
Note that the claim limitation does not require a start-of-sequence element, so the limitations have been met.

Regarding CLAIM 15, the combination of Lefebure and Sutskever teaches: The method of claim 1, 
Further, Lefebure teaches: wherein in one, some or all of the search steps for each of some or all of the points, the generating of the paths comprises generating a respective set of multiple paths for each respective one of at least some said points, the multiple paths for the respective point each comprising a different candidate element and associated probability score (Fig. 17D shows two paths for replacing the erroneous token “pin” with either “in” or “for”. See ¶ [0142]. An associated probability score is interpreted as a score based on probabilities, which includes the rewrite scores in Fig. 17D. The rewrite scores are described in ¶ [0141] – [0142]. A plurality of points is taught by the last sentence of ¶ [0149]: “Some alternatives may have edits at different positions.”)
However, Lefebure does not explicitly teach: score based on the first neural network.
But Sutskever teaches: score based on the first neural network. (Sutskever teaches a long short-term memory (LSTM)-based neural network on p. 3, above equation 1: “The goal of the LSTM is to estimate the conditional probability” of the output sequence given the input sequence. In equation 1, the right side shows a probability of an output state $y_t$ that is conditioned upon $v$, the fixed-dimensional representation of the input sequence given the last hidden state of the LSTM, and upon $y_1, \ldots, y_{t-1}$, the previous outputs. The last paragraph on p. 3, § 2 states two different LSTMs were used, one for the input sequence and another for the output sequence. Fig. 1 on p. 2 shows the model.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Sutskever’s LSTM model shown in Fig. 1 for the statistical language models (SLM) in Lefebure, specifically as the forward SLM 42 and backward SLM 44 from Figs. 14, 19, and 21, that predicts words with associated probability scores. A motivation for the combination is that Sutskever’s LSTM model is a probabilistic model that predicts an output sequence given an input sequence as shown in Sutskever’s Equation 1.

Regarding CLAIM 16, the combination of Lefebure and Sutskever teaches: The method of claim 15, 
Further Lefebure teaches: wherein amongst the multiple paths for each respective point having multiple paths in a current search step, the candidate elements for one of the paths includes a rejoin-sequence element representing stopping the search for the respective point and rejoining the candidate element or elements from the preceding-search steps to the input sequence. (The broadest reasonable interpretation of a “rejoin-sequence element,” in light of instant specification ¶ [124] and [127], includes the act of selecting the best-scoring path in Lefebure ¶ [0149], lines 5-8: “A choosing module 218 uses the rewrite scores to choose a best rewritten token sequence among the multiple alternative rewrites.”)

Regarding CLAIM 17, the combination of Lefebure and Sutskever teaches: The method of claim 1, 
Further Lefebure teaches: wherein said points are gaps between the input elements where missing data is potentially to be imputed. (A single point in the sequence at which erroneous data is potentially to be imputed includes the erroneous token “pin” in Fig. 17A, as disclosed in Lefebure ¶ [0140], first sentence. The broadest reasonable interpretation of a gap between a pair of adjacent words, in light of instant specification ¶ [045] and [101], includes the token “pin” between the pair of tokens “weather” and “Hawaii.”)

Regarding CLAIM 18, the combination of Lefebure and Sutskever teaches: The method of claim 1, 
Further, Lefebure teaches: wherein the portion of input data comprises a portion of text, and the input elements from the text are words or characters; and (Fig. 17A shows a 5-token sequence reading “the weather pin Hawaii today”. Fig. 17A is described at Lefebure ¶ [0140], first sentence.)
However, Lefebure does not explicitly teach: further comprising training the first neural network, the training comprising at least one of: supervised training with a training data set comprising input data points and predetermined desired output data points;  
a reinforcement approach; 
or an unsupervised approach.
But Sutskever teaches: further comprising training the first neural network, the training comprising at least one of: supervised training with a training data set comprising input data points and predetermined desired output data points; (Sutskever discloses training the LSTM model by supervised learning at p. 4, § 3.2, ¶ 1: “We trained it by maximizing the log probability of a correct translation T given the source sentence S… where S is in the training set.” Supervised training is also taught at Sutskever p. 5, § 3.4, with details explained in the bullet points.)
a reinforcement approach; or an unsupervised approach. (Examiner is not required to map to these limitations because they are listed as alternatives to “supervised training”.)
	Claim 18 includes the trained neural network of claim 1. In claim 1, the neural network first undergoes training and then generates probability scores. 

Regarding CLAIM 19, Lefebure teaches: A device storing code configured to perform operations of automatically: (Lefebure ¶ [0178], lines 9-11 states: “Server 304 uses a computer processor to execute code stored on a non-transitory computer readable medium.”)
	dividing a portion of input data into a sequence of input elements, each element in the input elements comprising a word or a gap between words; (Lefebure at ¶ [0082] teaches producing a 
identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of points comprising a gap between a pair of adjacent words; (A single point in the sequence at which erroneous data is potentially to be imputed includes the erroneous token “pin” in Fig. 17A, as disclosed in ¶ [0140], first sentence. The broadest reasonable interpretation of a gap between a pair of adjacent words, in light of instant specification ¶ [045] and [101], includes the token “pin” between the pair of tokens “weather” and “Hawaii.” A plurality of points is taught at ¶ [0135], lines 5-10: “Some such embodiments first create multiple alternative rewrites, each by making a single edit at a position with a probability score below a threshold, and then proceed to make further rewrites by editing at second positions within the first set of rewritten token sequences to create further rewrites.”)
for each respective one of said points: (Lefebure teaches the token sequence rewrite flowchart for a token replacement in Fig. 21 is an extension of Fig. 19 (see ¶ [0149] second sentence), which is an extension of Fig. 14 (see ¶ [0145], second sentence). Since Fig. 17A-D are based on Fig. 14, they also apply to Figs. 19 and 21.)
- in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, (Fig. 17D shows two paths for replacing the erroneous token “pin” with either “in” or “for”. See ¶ [0142].)
and an associated probability score of the candidate element, (An associated probability score is interpreted as a score based on probabilities, which includes the rewrite scores in Fig. 17D. The rewrite scores are described in ¶ [0141] – [0142].)
the probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and (¶ [0137]: “The list of forward most probable tokens for each position comprises, for the previous N tokens, a set of most probable tokens and their probabilities. The list of backward most probable tokens for each position comprises, for the following N tokens, a set of most probable tokens and their probabilities. For choosing a token for insertion or replacement, the token replacement module 45 searches for tokens present in both lists (the intersection of the lists) and chooses a new token based on the probability of the token in each list.”)
- in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, (¶ [0162], lines 1-4 states: “Some embodiments, in order to correct multiple wrong words or errors causing multiple missing words, feed rewritten token sequences through additional iterations of the rewriting flow….”)
the selection being based on the associated probability scores, and (¶ [0162], lines 4-7: “…Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score.” Also, ¶ [0149], lines 5-10 state: “A choosing module 218 uses the rewrite scores to choose a best rewritten token sequence among the multiple alternative rewrites. Some alternatives may have different edits at the same position. Some alternatives may have edits at different positions.”)
generating a respective set of one or more extended paths from each respective one of the selected set of paths, (¶ [0162], lines 8-9: “Selection may work by attempting a breadth-first search tree algorithm” and ¶ [0161], lines 3-7: “Some embodiments produce lists of possible rewrites of input token 
each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and an associated probability score for the combination, (¶ [0162] states: “Some embodiments, in order to correct multiple wrong words or errors causing multiple extra words, missing words, or repeated words, feed rewritten token sequences through additional iterations of the rewriting flow. Iteration can proceed either by creating simple lists of rewrites or by selecting rewrites for reprocessing based on either their rewrite score, their grammar parse score, or a combination of both. Selection may work by attempting a depth-first, breadth-first, or best-score-first search tree algorithm” (Examiner has added underlines for emphasis). Here, Lefebure teaches using a breadth-first search tree algorithm to iteratively insert missing words into a token sequence, where each sequence for rewriting is chosen based on its score.)
this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective path; and (The token replacement module 215 from Fig. 21 generates a rewrite score by the same manner the token replacement module 195 from Fig. 19 generates a rewrite score. In ¶ [0145], line 5 states, regarding Fig. 19: “In some embodiments, the rewrite score is the product of the probability of each token in each of the forward and backward direction. In some embodiments, the rewrite score is the product of the probability of each token in each of the forward and backward direction, each token weighted by an acoustic score.”)
performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and (Broadly interpreted as comparing the rewritten token sequences. ¶ [0149], lines 5-8: “A choosing module 218 uses the rewrite scores to choose a best rewritten token sequence among the multiple alternative rewrites.”)
outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. (Fig. 21 shows the output of choosing module 218 is the best rewritten token sequence. ¶ [0161], lines 1-2 states: “Some embodiments, upon finding a successful rewrite, complete their processing and provide the successful rewritten token sequence as output.” The iterations of generating new token sequences and rewrite scores are also shown by Fig. 1. In ¶ [0163], lines 1-4 state: “FIG. 1 shows a process of iteratively rewriting new token sequences. Some embodiments, after the rewriting stage 12 reprocess the new token sequences according to their rewrite scores.”)
However, Lefebure does not explicitly teach: training a first neural network; and by the first neural network
But Sutskever teaches: scores generated by the first neural network (Sutskever teaches a long short-term memory (LSTM)-based neural network on p. 3, above equation 1: “The goal of the LSTM is to estimate the conditional probability” of the output sequence given the input sequence. In equation 1, the right side shows a probability of an output state $y_t$ that is conditioned upon $v$, the fixed-dimensional representation of the input sequence given the last hidden state of the LSTM, and upon $y_1, \ldots, y_{t-1}$, the previous outputs. The last paragraph on p. 3, § 2 states two different LSTMs were used, one for the input sequence and another for the output sequence. Fig. 1 on p. 2 shows the model.)
training a first neural network; (Sutskever teaches training the LSTM on p. 4, § 3.2, ¶ 1: “We trained it by maximizing the log probability of a correct translation T given the source sentence S… where S is in the training set.” Training is also taught at p. 5, § 3.4, with details explained in the bullet points.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Sutskever’s LSTM model shown in Fig. 1 for the statistical language models (SLM) in Lefebure, specifically as the forward SLM 42 and backward SLM 44 from Figs. 14, 19, 

Regarding CLAIM 20, Lefebure teaches: Computer apparatus comprising one or more processing units, the processing units programmed to perform operations of automatically: (¶ [0178], lines 9-14 states: “Server 304 uses a computer processor to execute code stored on a non-transitory computer readable medium. By processing the instructions, the computer processor performs automatic speech recognition 305 on the expression audio to produce a token sequence hypothesis.”)
dividing a portion of input data into a sequence of input elements, each element in the input elements comprising a word or a gap between words; (Lefebure at ¶ [0082] teaches producing a sequence of tokens from audio: “A speech recognition module 11 receives speech audio from a person issuing a spoken expression. The speech recognition module 11 produces a sequence of tokens.” Also, Lefebure at ¶ [0083], lines 1-3 states: “Some embodiments receive token sequences from a person by means other than speech audio, such as typing on a keyboard.” A 5-token sequence “the weather pin Hawaii today” is given in Fig. 17A, as disclosed in ¶ [0140], first sentence.)
identifying a plurality of points in the sequence at which missing or erroneous data is potentially to be imputed, each of the plurality of points comprising a gap between a pair of adjacent words; (A single point in the sequence at which erroneous data is potentially to be imputed includes the erroneous token “pin” in Fig. 17A, as disclosed in ¶ [0140], first sentence. The broadest reasonable interpretation of a gap between a pair of adjacent words, in light of instant specification ¶ [045] and [101], includes the token “pin” between the pair of tokens “weather” and “Hawaii.” A plurality of points is taught at ¶ [0135], lines 5-10: “Some such embodiments first create multiple alternative rewrites, each by making a single edit at a position with a probability score below a threshold, and then proceed 
for each respective one of said points: (Lefebure teaches the token sequence rewrite flowchart for a token replacement in Fig. 21 is an extension of Fig. 19 (see ¶ [0149], second sentence), which is an extension of Fig. 14 (see ¶ [0145], second sentence). Since Figs. 17A-D are based on Fig. 14, they also apply to Figs. 19 and 21.)
- in a first search step, generating a respective set of one or more paths for the respective point, wherein each path comprises a candidate element to potentially replace the missing or erroneous data at the respective point, (Fig. 17D shows two paths for replacing the erroneous token “pin” with either “in” or “for”. See ¶ [0142].)
and an associated probability score of the candidate element, (An associated probability score is interpreted as a score based on probabilities, which includes the rewrite scores in Fig. 17D. The rewrite scores are described in ¶ [0141] – [0142].)
the probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and (¶ [0137]: “The list of forward most probable tokens for each position comprises, for the previous N tokens, a set of most probable tokens and their probabilities. The list of backward most probable tokens for each position comprises, for the following N tokens, a set of most probable tokens and their probabilities. For choosing a token for insertion or replacement, the token replacement module 45 searches for tokens present in both lists (the intersection of the lists) and chooses a new token based on the probability of the token in each list.”)
- in each of a plurality of subsequent successive search steps, selecting a set of one or more of the paths from one or more of the search steps to extend, (¶ [0162], lines 1-4 states: “Some 
the selection being based on the associated probability scores, and (¶ [0162], lines 4-7: “…Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score.” Also, ¶ [0149], lines 5-10 state: “A choosing module 218 uses the rewrite scores to choose a best rewritten token sequence among the multiple alternative rewrites. Some alternatives may have different edits at the same position. Some alternatives may have edits at different positions.”)
generating a respective set of one or more extended paths from each respective one of the selected set of paths, (¶ [0162], lines 8-9: “Selection may work by attempting a breadth-first search tree algorithm” and ¶ [0161], lines 3-7: “Some embodiments produce lists of possible rewrites of input token sequences. That is possible either by making different edits at the same suspicious position, making edits at different suspicious positions, or both”.)
each extended path comprising the candidate element or elements from the respective path combined with an additional candidate element, and an associated probability score for the combination, (¶ [0162] states: “Some embodiments, in order to correct multiple wrong words or errors causing multiple extra words, missing words, or repeated words, feed rewritten token sequences through additional iterations of the rewriting flow. Iteration can proceed either by creating simple lists of rewrites or by selecting rewrites for reprocessing based on either their rewrite score, their grammar parse score, or a combination of both. Selection may work by attempting a depth-first, breadth-first, or best-score-first search tree algorithm” (Examiner has added underlines for emphasis). Here, Lefebure teaches using a breadth-first search tree algorithm to iteratively insert missing words into a token sequence, where each sequence for rewriting is chosen based on its score.)
this probability score being generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of the probability score for the respective path; and (The token replacement module 215 from Fig. 21 generates a rewrite score in the same manner as the token replacement module 195 from Fig. 19. Regarding Fig. 19, ¶ [0145], line 5 states: “In some embodiments, the rewrite score is the product of the probability of each token in each of the forward and backward direction. In some embodiments, the rewrite score is the product of the probability of each token in each of the forward and backward direction, each token weighted by an acoustic score.”)
performing a comparison between at least some of the paths including comparing between paths from different ones of the search steps, and (Broadly interpreted as comparing the rewritten token sequences. ¶ [0149], lines 5-8: “A choosing module 218 uses the rewrite scores to choose a best rewritten token sequence among the multiple alternative rewrites.”)
outputting, based on the comparison, a selection of one or more results wherein each result comprises the respective element or combination of elements of a respective one of the compared paths. (Fig. 21 shows the output of choosing module 218 is the best rewritten token sequence. ¶ [0161], lines 1-2 states: “Some embodiments, upon finding a successful rewrite, complete their processing and provide the successful rewritten token sequence as output.” The iterations of generating new token sequences and rewrite scores are also shown by Fig. 1. ¶ [0163], lines 1-4 state: “FIG. 1 shows a process of iteratively rewriting new token sequences. Some embodiments, after the rewriting stage 12 reprocess the new token sequences according to their rewrite scores.”)
However, Lefebure does not explicitly teach: training a first neural network; and by the first neural network
But Sutskever teaches: scores generated by the first neural network (Sutskever teaches a long short-term memory (LSTM)-based neural network on p. 3, above equation 1: “The goal of the LSTM is to estimate the conditional probability” of the output sequence given the input sequence. In equation 1, the right side shows that the probability of an output state $y_t$ is conditioned upon $v$, the fixed-dimensional representation of the input sequence given the last hidden state of the LSTM, and upon $y_1, \ldots, y_{t-1}$, the previous outputs. The last paragraph on p. 3, § 2 states two different LSTMs were used, one for the input sequence and another for the output sequence. Fig. 1 on p. 2 shows the model.)
training a first neural network; (Sutskever teaches training the LSTM on p. 4, § 3.2, ¶ 1: “We trained it by maximizing the log probability of a correct translation T given the source sentence S… where S is in the training set.” Training is also taught at p. 5, § 3.4, with details explained in the bullet points.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Sutskever’s LSTM model shown in Fig. 1 for the statistical language models (SLM) in Lefebure, specifically as the forward SLM 42 and backward SLM 44 from Figs. 14, 19, and 21. A motivation for the combination is that Sutskever’s LSTM model is a probabilistic model that predicts an output sequence given an input sequence as shown in Sutskever’s Equation 1.
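For illustration only, the token selection quoted above from Lefebure ¶ [0137] (intersection of the forward and backward lists) and the product-style rewrite score quoted from ¶ [0145] might be sketched as follows. The function name, the dictionary layout, and every probability value are hypothetical and are not taken from Lefebure or Sutskever; the sketch assumes each list is already available as a token-to-probability mapping.

```python
def choose_replacement(forward, backward):
    """Pick the token present in both lists with the highest product of probabilities.

    forward / backward: hypothetical dicts mapping token -> probability
    for one position, in the forward and backward direction respectively.
    """
    shared = set(forward) & set(backward)  # tokens present in both lists
    if not shared:
        return None
    scores = {tok: forward[tok] * backward[tok] for tok in shared}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical probabilities for the position of "pin" in
# "the weather pin Hawaii today"; the numbers are invented for the example.
forward = {"in": 0.4, "for": 0.3, "is": 0.2}
backward = {"in": 0.5, "for": 0.2, "at": 0.1}
result = choose_replacement(forward, backward)
```

Under these assumed values only "in" and "for" appear in both lists, and "in" receives the higher product.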

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Lefebure et al. (US 20190108257 A1) in view of Sutskever et al. (“Sequence to Sequence Learning with Neural Networks”), and further in view of Van Rest et al. (US 20170060958 A1).

	Regarding CLAIM 5, the combination of Lefebure and Sutskever teaches: The method of claim 1, 
Lefebure teaches that selecting rewrites may work by attempting a breadth-first search tree algorithm, but neither Lefebure nor Sutskever explicitly teaches: wherein each successive search step does not proceed for any of the points until the immediately preceding search step has been performed for all of the points.
	However, Van Rest teaches: wherein each successive search step does not proceed for any of the points until the immediately preceding search step has been performed for all of the points.  (Van 
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have searched all of Lefebure’s points at one “horizon” or tree level before proceeding to the next level using Van Rest’s breadth-first search. A motivation for the combination is that performing a breadth-first search ensures that Lefebure’s token paths are visited one level at a time. (Van Rest ¶ [0061]: “Per a breadth-first search, the search radius defines an expanding horizon of unvisited vertices. At each iteration, only vertices that lie on the horizon are visited.”)
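A minimal sketch of the level-at-a-time behavior described in the quoted passage of Van Rest ¶ [0061] is given below, assuming paths are simple lists of tokens. The names `expand_level_by_level` and `extend` and the placeholder tokens are hypothetical and appear in none of the cited references.

```python
def expand_level_by_level(initial_paths, extend, num_steps):
    """Breadth-first expansion: every path on the current horizon is
    extended before any path on the next level is considered."""
    levels = [list(initial_paths)]
    for _ in range(num_steps):
        horizon = levels[-1]
        next_level = []
        for path in horizon:  # finish the whole level first
            next_level.extend(extend(path))
        levels.append(next_level)
    return levels


def extend(path):
    # Hypothetical extension rule: append one of two placeholder tokens.
    return [path + ["a"], path + ["b"]]


levels = expand_level_by_level([["in"], ["for"]], extend, 2)
```

Each entry of `levels` holds all paths of one search step, so no step begins until the preceding step has been completed for every starting path.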

Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Lefebure et al. (US 20190108257 A1) in view of Sutskever et al. (“Sequence to Sequence Learning with Neural Networks”) and Mikolov et al. (“Efficient Estimation of Word Representations in Vector Space”).

Regarding CLAIM 6, the combination of Lefebure and Sutskever teaches: The method of claim 1 comprising, 
Further, Lefebure teaches: wherein in the first search step for each of said points, the candidate elements of the respective one or more paths are generated (Lefebure Fig. 17D shows two paths for replacing the erroneous token “pin” with either “in” or “for” as described in ¶ [0142].)
for each respective one of said points: prior to the first search step, generating a respective embedding for the respective point, the embedding being a vector generated by a second neural network as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of a position of the respective point in the sequence; 
generated based on a decoder state that is a function of the respective embedding.
But Sutskever teaches: generated based on a decoder state that is a function of the respective embedding. (Sutskever teaches “word embeddings” on p. 5, § 3.4, line 2, which are understood to be vectorized input tokens “A,” “B,” and “C” in Fig. 1. The decoder state is a function of the respective embedding because all the embeddings are passed into the decoder, as shown in Fig. 1 reproduced below with annotations. Additionally, p. 3, § 2, in the paragraph under equation 1, Sutskever states: “The overall scheme is outlined in figure 1, where the shown LSTM computes the representation of “A”, “B”, “C”, “<EOS>” and then uses this representation to compute the probability of “W”, “X”, “Y”, “Z”, “<EOS>”.” 

[Image: Sutskever Fig. 1, reproduced with annotations]

Lastly, on p. 3, § 2, in the paragraph above equation 1, Sutskever states:

[Image: excerpt from Sutskever, p. 3, § 2, paragraph above equation 1]

	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention for Sutskever’s decoder to have generated Lefebure’s predicted words as a (Sutskever, p. 3, in the paragraph above equation 1)
Although Sutskever teaches “word embeddings” on p. 5, § 3.4, line 2, neither Lefebure nor Sutskever explicitly teaches: for each respective one of said points: prior to the first search step, generating a respective embedding for the respective point, the embedding being a vector generated by a second neural network as a function of some or all of the input elements before and/or after the respective point in the sequence, and as a function of a position of the respective point in the sequence; 
	But Mikolov teaches: for each respective one of said points: prior to the first search step, generating a respective embedding for the respective point, the embedding being a vector generated by a second neural network as a function of some or all of the input elements before and/or after the respective point in the sequence, (Mikolov p. 4, § 3.1, describes a Continuous Bag-of-Words model (CBOW) which generates a vector embedding for a given token based on tokens before and after it. On p. 3, Fig. 1, left side, Mikolov shows that CBOW is a neural network with an input layer, a hidden projection layer, and an output layer.)
and as a function of a position of the respective point in the sequence; (The broadest reasonable interpretation of this limitation is a position of a given token with respect to the tokens before and after it. According to Mikolov p. 4, § 3.1, CBOW generates a vector embedding for a given token based on tokens before it and tokens after it.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have generated an embedding vector for each token in Lefebure’s input using Mikolov’s Continuous Bag-of-Words model, with a motivation to convert words into vectors in order to perform mathematical operations on the word vectors. (Mikolov, p. 1, § 1.1: “The main goal of this paper is to introduce techniques that can be used for learning high-quality word vectors from huge data sets with billions of words, and with millions of words in the vocabulary.”)
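As a rough illustration of a vector computed from the tokens before and after a given position, the following sketch averages the vectors of the surrounding tokens, in the manner of the CBOW projection layer. It omits training and the output layer entirely. The function name, the window size, and the two-dimensional vector values are hypothetical and are not taken from Mikolov.

```python
def context_embedding(tokens, position, vectors, window=2):
    """Average the vectors of up to `window` tokens on each side of `position`."""
    before = tokens[max(0, position - window):position]
    after = tokens[position + 1:position + 1 + window]
    context = [vectors[t] for t in before + after if t in vectors]
    if not context:
        return None
    dim = len(context[0])
    return [sum(v[i] for v in context) / len(context) for i in range(dim)]


# Hypothetical two-dimensional word vectors, invented for the example.
vectors = {
    "the": [1.0, 0.0],
    "weather": [0.0, 1.0],
    "Hawaii": [1.0, 1.0],
    "today": [0.0, 0.0],
}
tokens = ["the", "weather", "pin", "Hawaii", "today"]
emb = context_embedding(tokens, 2, vectors)  # position of "pin"
```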

Regarding CLAIM 7, the combination of Lefebure, Sutskever, and Mikolov teaches: The method of claim 6, 
Further, Lefebure teaches: wherein the method further comprises, for each of said points: between each successive subsequent search step and the preceding search step, at least for the selected set of paths, (Interpreted as iteratively rewriting the token sequence. Lefebure ¶ [0162] states: “Some embodiments, in order to correct multiple wrong words or errors causing multiple missing words, feed rewritten token sequences through additional iterations of the rewriting flow. Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score. Selection may work by attempting a breadth-first search tree algorithm” (edited by Examiner).)
wherein in each of the subsequent search steps, for each of the extended paths, the additional element of the respective path is generated (Lefebure ¶ [0162] states: “Some embodiments, in order to correct multiple wrong words or errors causing multiple missing words, feed rewritten token sequences through additional iterations of the rewriting flow. Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score. Selection may work by attempting a breadth-first search tree algorithm” (edited by Examiner).)
However, neither Lefebure nor Mikolov explicitly teaches: updating the decoder state as a function of the candidate elements in the respective path;
generated based on the updated decoder state for the respective path.
But Sutskever teaches: updating the decoder state as a function of the candidate elements in the respective path; (A candidate element is interpreted as an output of the decoder. Sutskever Fig. 1 teaches that outputs of the decoder (“W,” “X,” “Y,” and “Z”) are fed back into the decoder input, thereby updating the decoder’s state. This is taught mathematically on p. 3 by the right side of equation 1, where the probability of outputting $y_t$ is conditioned upon the output at the previous time step $y_{t-1}$.)
based on the updated decoder state for the respective path. (A candidate element is interpreted as an output of the decoder. Sutskever Fig. 1 teaches that outputs of the decoder (“W,” “X,” “Y,” and “Z”) are fed back into the decoder input, thereby updating the decoder’s state. This is taught mathematically on p. 3 by the right side of equation 1, where the probability of outputting $y_t$ is conditioned upon the output at the previous time step $y_{t-1}$.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have fed the decoder output back into the combination’s decoder, as in the LSTM of Sutskever. A motivation for the combination is to include the LSTM’s ability to successfully learn on data with long range temporal dependencies. (Sutskever p. 2, end of ¶ 2: “The LSTM’s ability to successfully learn on data with long range temporal dependencies makes it a natural choice for this application due to the considerable time lag between the inputs and their corresponding outputs (fig. 1).”)
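For reference, equation 1 of Sutskever, as relied upon above, can be written as follows. Here $x_1, \ldots, x_T$ denotes the input sequence, $y_1, \ldots, y_{T'}$ the output sequence, and $v$ the fixed-dimensional representation of the input sequence; the notation is reconstructed from the description above and should be checked against the reference itself.

```latex
p(y_1, \ldots, y_{T'} \mid x_1, \ldots, x_T)
  = \prod_{t=1}^{T'} p(y_t \mid v, y_1, \ldots, y_{t-1})
```

Each factor on the right side conditions $y_t$ on the previous outputs, including $y_{t-1}$.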

Claims 8 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Lefebure et al. (US 20190108257 A1) in view of Sutskever et al. (“Sequence to Sequence Learning with Neural Networks”), Mikolov et al. (“Efficient Estimation of Word Representations in Vector Space”), and Rainwater (US 20180060727 A1).

Regarding CLAIM 8, the combination of Lefebure, Sutskever, and Mikolov teaches: The method of claim 6, 
Further, Lefebure teaches (limitations re-ordered for clarity): further comprising: the classifier representing a probability that the respective point has a missing or erroneous element; and (Lefebure The wrong token has a low probability in each of the backward and forward directions” (edited by Examiner).)
for each respective one of said points, the probability score generated… for each path in the first search step also being a function of an initial classifier for the respective position, (This limitation is broadly interpreted as the probability score depending on the classifier’s output. Lefebure ¶ [0130] states: “Accordingly, a token that has a low probability in each of the backward and forward directions is suspicious. It indicates a position in which a deletion or replacement edit is likely appropriate.” Also, ¶ [0136], lines 1-6 state: “Various embodiments create a rewrite by performing an edit at the position with a lowest combination of backward and forward probabilities on the same token” (edited by Examiner). The probability score generated in Fig. 17D for the 5-token sequence in Fig. 17A is a result of the classifier finding “pin” to be suspicious in Fig. 12B.)
wherein the classifier is generated (¶ [0130] states: “Accordingly, a token that has a low probability in each of the backward and forward directions is suspicious. It indicates a position in which a deletion or replacement edit is likely appropriate.”)
However, Lefebure does not explicitly teach: classifier is generated by a third neural network as a function of the respective embedding.
But Mikolov teaches: as a function of the respective embedding. (Mikolov teaches an embedding at p. 4, § 3.1, which describes a Continuous Bag-of-Words model (CBOW) that generates a vector embedding for a given token based on tokens before and after it.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Mikolov’s CBOW model to generate vectors corresponding to the (Mikolov, p. 1, § 1.1: “The main goal of this paper is to introduce techniques that can be used for learning high-quality word vectors from huge data sets with billions of words, and with millions of words in the vocabulary.”)
However, neither Lefebure nor Mikolov explicitly teaches: scores generated by the first neural network 
Neither Lefebure, Sutskever, nor Mikolov explicitly teaches: by a third neural network
But Sutskever teaches: scores generated by the first neural network (Sutskever teaches a long short-term memory (LSTM)-based neural network on p. 3, above equation 1: “The goal of the LSTM is to estimate the conditional probability” of the output sequence given the input sequence. In equation 1, the right side shows that the probability of an output state $y_t$ is conditioned upon $v$, the fixed-dimensional representation of the input sequence given the last hidden state of the LSTM, and upon $y_1, \ldots, y_{t-1}$, the previous outputs. The last paragraph on p. 3, § 2 states two different LSTMs were used, one for the input sequence and another for the output sequence. Fig. 1 on p. 2 shows the model.)
Claim 8 includes the first neural network of claim 1. In claim 1, the first neural network generates probability scores.
However, neither Lefebure, Sutskever, nor Mikolov explicitly teaches: by a third neural network.
	But Rainwater teaches: by a third neural network. (Rainwater teaches this limitation at ¶ [0077], lines 4-15: “In an example classification architecture, sequential data that is to be classified can be encoded as a fixed dimensional vector and provided to a classifier. The classifier may use a neural network that has been trained to classify sequential data into a plurality of categories. For example, the neural network of the classifier may be trained to detect… grammatical errors in natural languages (e.g., English or French)… In these examples, the classifier may classify sequential data into sequential data with errors and those without.” Rainwater discloses classifier 740 in Fig. 7 and ¶ [0079] line 1.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Rainwater’s neural network classifier to classify Lefebure’s forward and backward token sequences. Rainwater’s classifier may classify the token sequences in Fig. 12B as being either correct (i.e., high probability) or incorrect (i.e., low probability). A motivation for the combination is that the classifier detects grammatical errors in natural languages. (Rainwater ¶ [0077], lines 7-12: “The classifier may use a neural network that has been trained to classify sequential data into a plurality of categories. For example, the neural network of the classifier may be trained to detect… grammatical errors in natural languages (e.g., English or French)”.)

Regarding CLAIM 9, the combination of Lefebure, Sutskever, Mikolov, and Rainwater teaches: The method of claim 8, 
Further, Lefebure teaches: wherein the method comprises: following each of some or all of the search steps, skimming off the element or combination of elements from each of some or all of the paths generated from across some or all of the points in the sequence into a candidate pool, (Under the broadest reasonable interpretation, in light of the specification, “skimming off” elements is interpreted as performing pruning while retaining some elements. The retained elements are the “skimmed-off” elements recited later in the claim. Lefebure ¶ [0162], lines 4-7: “…Iteration can proceed by selecting rewrites for reprocessing based on their rewrite score.” Also ¶ [0084], last four lines: “Some embodiments feed new token sequences back into rewriting module 12 and reprocess them in an attempt to produce even better rewritten token sequences. This is useful if, for example, token sequences have multiple errors.” In choosing a path with a higher rewrite score, the lower-scoring paths are effectively pruned away. See Lefebure ¶ [0142], starting at line 4.)
the element or combination of elements from each of the skimmed-off paths forming a respective candidate result in the candidate pool; and (Candidate result in a candidate pool is interpreted as the best rewritten token sequence after pruning. See ¶ [0142], starting at line 4.)
generate a new probability score for each candidate result in the candidate pool; (A new probability score is generated in the next iteration, as disclosed in ¶ [0084], last four lines. For generating a new probability score, see ¶ [0142], starting at line 4.)
wherein said comparing comprises comparing the new probability scores in the candidate pool, and (See ¶ [0142], starting at line 4. Addition of probabilities, multiplication of probabilities and addition of log probabilities are compared.)
said selection comprises a selection of one or more of the candidate results having the highest of the new probability scores; and (Highest probability score is interpreted as most preferable. ¶ [0142], last two sentences: “For some embodiments, the method of multiplying or adding logs is preferable. It has the effect of favoring tokens that are more reasonable in both directions, whereas the method of adding probabilities favors tokens that are sensible in one direction but nonsense in the other direction.”)
However, Lefebure does not explicitly teach: applying a fourth neural network to each entry and wherein some or all of the first, second, third or fourth neural networks are subgraphs of a same wider network, and are trained together.
But Sutskever teaches: applying a fourth neural network to each entry (The broadest reasonable interpretation of this limitation, in light of the specification, is generating a probability score again using Sutskever’s neural network. Sutskever teaches a long short-term memory (LSTM)-based neural network on p. 3, above equation 1: “The goal of the LSTM is to estimate the conditional probability” of the output sequence given the input sequence. In equation 1, the right side shows that the probability of an output state $y_t$ is conditioned upon $v$, the fixed-dimensional representation of the input sequence given the last hidden state of the LSTM, and upon $y_1, \ldots, y_{t-1}$, the previous outputs. The last paragraph on p. 3, § 2 states two different LSTMs were used, one for the input sequence and another for the output sequence. Fig. 1 on p. 2 shows the model.)
wherein some or all of the first, second, third or fourth neural networks are subgraphs of a same wider network, and are trained together (The first and fourth neural networks are interpreted as being Sutskever’s LSTM neural network which is applied in both claims 1 and 9. Under the broadest reasonable interpretation of “subgraphs” given by instant specification ¶ [112] – “i.e. there are connections there between and they are trained together” – the first and fourth networks are subgraphs because they have common connections. Examiner is not required to cite art for the second and third neural network because they are alternatives to the first and fourth networks.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have applied Sutskever’s LSTM neural network as a “fourth neural network” for the statistical language models (SLM) in Lefebure, specifically as Lefebure’s forward SLM 42 and backward SLM 44 from Figs. 14, 19, and 21, and it would have been obvious to have trained Sutskever’s LSTM neural network. A motivation for the combination is that Sutskever’s LSTM model is a probabilistic model that predicts an output sequence given an input sequence as shown in Sutskever’s Equation 1.
	Rainwater also teaches an encoder-decoder network, shown in Fig. 2 and described at ¶ [0017].
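	For reference, the factorization referred to above as Sutskever’s Equation 1 can be sketched as follows. The symbols $x_1, \ldots, x_T$ (input sequence), $T'$ (output length), and $v$ (fixed-dimensional representation of the input) follow the notation of the Sutskever paper and are not terms of the instant claims.

\[
p(y_1, \ldots, y_{T'} \mid x_1, \ldots, x_T) = \prod_{t=1}^{T'} p(y_t \mid v, y_1, \ldots, y_{t-1})
\]

	Each factor on the right-hand side is the conditional probability of one output $y_t$ given $v$ and the previous outputs, which is the sense in which the model is characterized as probabilistic.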

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Lefebure et al. (US 20190108257 A1) in view of Sutskever et al. (“Sequence to Sequence Learning with Neural Networks”) and Rainwater (US 20180060727 A1).

Regarding CLAIM 11, the combination of Lefebure and Sutskever teaches: The method of claim 10,
wherein the classifier is generated… as a function of some or all of the input elements before and/or after the respective point in the sequence, and (¶ [0130] states: “Accordingly, a token that has a low probability in each of the backward and forward directions is suspicious. It indicates a position in which a deletion or replacement edit is likely appropriate.”)
as a function of the position of the respective point in the sequence. (The broadest reasonable interpretation includes a classifier that is a function of the tokens before and/or after the token at the point. This is taught by ¶ [0130].)
	However, neither Lefebure nor Sutskever explicitly teaches: the classifier is generated by another neural network.
	But Rainwater teaches: the classifier is generated by another neural network (Rainwater teaches this limitation at ¶ [0077], lines 4-15: “In an example classification architecture, sequential data that is to be classified can be encoded as a fixed dimensional vector and provided to a classifier. The classifier may use a neural network that has been trained to classify sequential data into a plurality of categories. For example, the neural network of the classifier may be trained to detect… grammatical errors in natural languages (e.g., English or French)… In these examples, the classifier may classify sequential data into two categories—sequential data with errors and those without.” Rainwater discloses classifier 740 in Fig. 7 and ¶ [0079] line 1.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Rainwater’s neural network classifier to classify Lefebure’s forward and backward token sequences, such as the one in Fig. 12B, as either correct (i.e., high probability) or incorrect (i.e., low probability). A motivation for the combination is that the classifier detects grammatical errors in natural languages. (Rainwater ¶ [0077], lines 7-12: “The classifier may use a neural network that has been trained to classify sequential data into a plurality of categories. For example, the neural network of the classifier may be trained to detect… grammatical errors in natural languages (e.g., English or French)”.)
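	The low-probability test quoted from Lefebure ¶ [0130] can be illustrated with a minimal sketch. The function name, the list-of-probabilities input format, and the 0.05 threshold are assumptions made for illustration only; none of them is taken from Lefebure, Sutskever, or Rainwater.

```python
from typing import List, Sequence


def flag_suspicious_positions(
    forward_probs: Sequence[float],
    backward_probs: Sequence[float],
    threshold: float = 0.05,  # hypothetical cutoff, not from any cited reference
) -> List[int]:
    """Return positions whose token probability is low in BOTH directions.

    forward_probs[i]  : probability of token i given the tokens before it
    backward_probs[i] : probability of token i given the tokens after it
    """
    if len(forward_probs) != len(backward_probs):
        raise ValueError("forward and backward sequences must have equal length")
    return [
        i
        for i, (fwd, bwd) in enumerate(zip(forward_probs, backward_probs))
        if fwd < threshold and bwd < threshold
    ]


# Hypothetical per-token probabilities for a four-token sequence.
suspicious = flag_suspicious_positions(
    [0.90, 0.01, 0.50, 0.02],
    [0.80, 0.02, 0.01, 0.60],
)
```

	In this sketch only a token that scores below the threshold in both directions is flagged, which mirrors the quoted statement that such a token marks a position where a deletion or replacement edit is likely appropriate. A token that is improbable in only one direction is not flagged.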

Response to Arguments
The following is a response to the claims and remarks filed 10/13/2021, the advisory action filed 10/07/2021, and the interview conducted 08/28/2021.

Claim Rejection Under 35 USC § 101 (Remarks pp. 11-12): 
Signals Per Se: Applicant’s arguments have been fully considered but they are not persuasive. Instant claim 19 recites, “A device storing code configured to perform operations”. The claim recites a non-statutory element.

Abstract Idea Without Significantly More: Applicant's arguments have been fully considered but they are not persuasive. Claims 1, 19, and 20 recite the additional element of training a neural network, and claim 18 limits the training to supervised learning, unsupervised learning, and reinforcement learning. Training a neural network is well-understood, routine, conventional activity. Please see the 35 U.S.C. 101 rejections in this office action for reference.

Claim Rejection Under 35 USC § 103 (Remarks pp. 13-18): Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Sun et al. (“Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning”) teaches imputing missing text between adjacent words in Fig. 1.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher H. Jablon whose telephone number is (571)270-7648. The examiner can normally be reached Monday - Friday, 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar, can be reached at (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.







/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127