DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This correspondence is responsive to the Application filed on November 18, 2020. Claims 1-20 are pending in the case, with claims 1, 8 and 15 in independent form.

Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). 

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 5, 12 and 19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claims 5, 12 and 19 recite the limitation “in the fine-tuning process, only the loss of the position of the wrongly written character is calculated.” It is not clear what “the loss of the position of the wrongly written character” means, or how only the loss of the position of the 

Claims 6, 13 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claims 6, 13 and 20 recite the limitation “fusing context information of the position of the character to be processed for decoding, and selecting the optimal candidate from the M candidates.” It is not clear what “fusing” the context information of the position of the character to be processed for decoding means, or what that context information is being fused together with. Applicant may cancel claims 6, 13 and 20 or amend the claims to particularly point out and distinctly claim the subject matter which the inventors regard as the invention. For examination purposes, claims 6, 13 and 20 are interpreted as considering context information of the position of the character to be processed for decoding, and selecting the optimal candidate from the M candidates.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

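Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.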


Rejection
Claim(s) 1-2, 6, 8-9, 13, 15-16 and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more in the claim(s) than the abstract idea itself. These Ineligible Claim(s) are directed to an abstract idea of a kind that has been found ineligible under the judicial exception in Supreme Court cases including Alice Corp. v. CLS Bank International, 573 U.S. 208, 134 S. Ct. 2347, 110 USPQ2d 1976 (2014) [hereinafter “Alice Corp.”] and Mayo Collaborative Services v. Prometheus Laboratories, Inc., 566 U.S. 66, 71, 101 USPQ2d 1961, 1965 (2012) [hereinafter “Mayo”]. The Ineligible Claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception, as addressed below.
Summary of Analysis (See Detailed Analysis of the Claims, below)
This summary is provided for the convenience of the reader and to provide a quick overview of the analysis.  Please see the Detailed Analysis of the Claims, set forth below.  In the case of inconsistency between the following summary and the Detailed Analysis of the Claims, the latter should be relied upon to explain the rejection.
Step 1 (Statutory Subject Matter)
Claim(s) 1-20 recite statutory subject matter, subject to further review under the Alice/Mayo judicial exception test.
Step 2A, Prong 1 (Abstract Idea)
Claim(s) 3-5, 10-12 and 17-19 do not recite an abstract idea in the enumerated categories.
Claim(s) 1-2, 6, 8-9, 13, 15-16 and 20 recite an abstract idea in the enumerated categories. More specifically, claims 1-2, 6, 8-9, 13, 15-16 and 20 recite an abstract idea in the enumerated category of Mental Processes. The human mind is capable and well suited to perform a method of correcting character errors, including mentally processing and acquiring a score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character being processed, mentally selecting the top K characters as candidates, where K is a positive integer greater than one, and mentally selecting the optimal candidate from the K candidates and replacing the character with the candidate if the optimal character is different than the character being processed.
Step 2A, Prong 2 (Practical Application)
Claim(s) 1-2, 6, 8-9, 13, 15-16 and 20 do not recite a practical application of the abstract idea identified in Prong 1.
Step 2B (Significantly More)
Claim(s) 1-2, 6, 8-9, 13, 15-16 and 20 do not recite significantly more than the abstract idea identified in Step 2A, Prong 1, or recite a practical application under Step 2A, Prong 2.
Detailed Analysis of the Claims.
The claims are reproduced immediately below to explain the detailed analysis the Examiner undertook to determine the eligibility of the claim(s) under 35 U.S.C. §101.  
The 35 U.S.C. § 101 analysis involves several steps and sub-steps.  Step 1 is detailed in MPEP §2106.03.  Step 2A, Prongs 1 and 2 are detailed in the 2019 Guidance, which is incorporated into the most recent revision of MPEP §2106.04.  Step 2B is detailed in MPEP §2106.05.  Steps 2A and 2B represent steps 1 and 2 of the Alice/Mayo Test.
Eligibility Step 1:  The Four Categories of Statutory Subject Matter (MPEP 2106.03)
Applied to the present application, under Step 1 of the Guidance analysis, the Claim(s) belong to the statutory class(es) of a process (method Claim(s) 1-7), a machine (system/apparatus Claim(s) 8-14), and an article of manufacture (non-transitory computer-readable media Claim(s) 15-20).  This first step confirms that the invention identifies as one or more statutory classes, before determining in the following steps whether a judicial exception applies.
Eligibility Step 2A:  Whether a Claim is Directed to a Judicial Exception (2019 Guidance)
The determination of whether the claim is directed to a Judicial Exception is conducted as follows: Step 2A is divided into prong 1 and prong 2 analysis according to the procedure as set forth in the 2019 Guidance (superseding MPEP §2106.04).  If the claim(s) are found ineligible at either of the prongs under Step 2A, the claims are further considered at Step 2B, as detailed in MPEP §2106.05.

Those portions of the claim appearing in bold font have been identified for analysis as an abstract idea under Step 2A, Prong 1 of the 2019 Guidance.  These bold font phrases are followed by one or more footnotes to either explain which of the categories of abstract idea enumerated by the 2019 Guidance that was applied or to state that the phrase does not fit one of the enumerated categories.  If no abstract idea was identified at Step 2A, prong 1, the analysis finds the claim(s) eligible and the analysis concludes.  However, if an abstract idea was identified at Step 2A, prong 1, then the claims are further considered at Step 2A, prong 2.
If Step 2A, prong 2, is reached, those portions of the claim appearing in regular and underlined font were first considered for whether the abstract idea has been incorporated into a practical application. The recitations identified by regular and underlined font are followed by footnotes explaining the Examiner’s analysis under Step 2A, prong 2, of the 2019 Guidance. If a practical application was identified at Step 2A, prong 2, the analysis finds the claim(s) eligible and the analysis concludes.
Eligibility Step 2B:  Whether a Claim Amounts to Significantly More
If no practical application was identified at Step 2A, prong 2, then the claims are further considered at Step 2B to determine whether the recitations indicated by regular and underlined font represent significantly more than the abstract idea and thus constitute an inventive concept (either individually or as an ordered combination) under Step 2B of the Alice/Mayo test. These identified phrases will be followed by a footnote.
Those portions of the claim in regular font represent preamble material previously addressed or are otherwise insignificant to the interpretation of the claim.
In the following analysis, each claim has been reviewed as an ordered combination by detailed analysis of each limitation of the claim to identify recitation of abstract ideas and the further limitations that might provide significantly more than the abstract idea itself.  The ordered combination is considered as part of the process outlined in MPEP §2106 and 2019 Guidance.
Claim Markup:
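1. A method for correcting character errors, comprising: for a character to be processed, acquiring the score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character to be processed; selecting top K characters as candidates of the character to be processed, K being a positive integer greater than one; and selecting an optimal candidate from the K candidates, and replacing the character to be processed with the optimal candidate if the optimal candidate is different from the character to be processed  [Mental Processes1- The human mind is capable and well suited to perform a method of correcting character errors, including mentally processing and acquiring a score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character being processed, mentally selecting the top K characters as candidates, where K is a positive integer greater than one, and mentally selecting the optimal candidate from the K candidates and replacing the character with the candidate if the optimal character is different than the character being processed.].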
2. The method according to claim 1, further comprising: using N characters in a text to be processed as the characters to be processed, N being a positive integer and having a maximum value equal to the number of characters comprised in the text to be processed  [Mental Processes1- The human mind is capable and well-suited to use N characters in a text to be processed, N being a positive integer with a maximum value equal to the number of characters in the text.].  

6. The method according to claim 1, wherein the selecting an optimal candidate from the K candidates comprises: ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer greater than one and less than K; and fusing context information of the position of the character to be processed  [Mental Processes1] for decoding  [Extra-solution Activity2]  [Extra-Solution Activity3] , and selecting the optimal candidate from the M candidates.  

8. An electronic device, comprising: at least one processor; and a memory communicatively connected with the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform [Apply It4] [Generic Computer5] a method for correcting character errors, wherein the method comprises: 
for a character to be processed, acquiring the score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character to be processed; 
selecting top K characters as candidates of the character to be processed, K being a positive integer greater than one; and selecting an optimal candidate from the K candidates, and replacing the character to be processed with the optimal candidate if the optimal candidate is different from the character to be processed  [Mental Processes1- The human mind is capable and well suited to perform a method of correcting character errors, including mentally processing and acquiring a score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character being processed, mentally selecting the top K characters as candidates, where K is a positive integer greater than one, and mentally selecting the optimal candidate from the K candidates and replacing the character with the candidate if the optimal character is different than the character being processed.].

15. A non-transitory computer-readable storage medium storing computer instructions therein, wherein the computer instructions are used to cause the computer to perform  [Apply It4]  [Mere Instructions6]  a method for correcting character errors, wherein the method comprises: for a character to be processed, acquiring the score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character to be processed; selecting top K characters as candidates of the character to be processed, K being a positive integer greater than one; and selecting an optimal candidate from the K candidates, and replacing the character to be processed with the optimal candidate if the optimal candidate is different from the character to be processed  [Mental Processes1- The human mind is capable and well suited to perform a method of correcting character errors, including mentally processing and acquiring a score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character being processed, mentally selecting the top K characters as candidates, where K is a positive integer greater than one, and mentally selecting the optimal candidate from the K candidates and replacing the character with the candidate if the optimal character is different than the character being processed.].
The remaining claims 9, 13, 16 and 20 are rejected for comparable reasons to those set forth above.
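For orientation only, the steps recited in claim 1 as reproduced above can be sketched in a few lines. This is a minimal illustration and not the applicant's implementation: the bigram table `BIGRAMS`, the scorer `toy_score`, and the function `correct_character` are hypothetical names, and "optimal" is assumed here to mean simply the highest-scoring of the K candidates.

```python
from typing import Callable, Iterable

# Hypothetical toy bigram counts standing in for a real scoring model.
BIGRAMS = {("c", "a"): 5, ("a", "t"): 5, ("c", "o"): 2, ("o", "t"): 1}


def toy_score(text: str, pos: int, ch: str) -> float:
    """Score how reasonable `ch` is at `pos`, using its left and right neighbors."""
    left = BIGRAMS.get((text[pos - 1], ch), 0) if pos > 0 else 0
    right = BIGRAMS.get((ch, text[pos + 1]), 0) if pos + 1 < len(text) else 0
    return left + right


def correct_character(
    text: str,
    pos: int,
    vocab: Iterable[str],
    score_fn: Callable[[str, int, str], float],
    k: int = 2,
) -> str:
    """Score every vocabulary character at `pos`, keep the top K, replace if different."""
    if k <= 1:
        raise ValueError("K must be a positive integer greater than one")
    scores = {ch: score_fn(text, pos, ch) for ch in vocab}
    candidates = sorted(scores, key=scores.get, reverse=True)[:k]
    best = candidates[0]  # assumption: the optimal candidate is the top-scoring one
    if best != text[pos]:
        return text[:pos] + best + text[pos + 1:]
    return text


print(correct_character("cxt", 1, "aox", toy_score))  # prints "cat"
```

The replacement is conditional: when the best candidate already equals the character being processed, the text is returned unchanged.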


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1 and 2 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yu et al., “Chinese Spelling Error Detection and Correction Based on Language Model, Pronunciation, and Shape,” Proceedings of the Third CIPS-SIGHAN Joint Conference on Chinese Language Processing, October 1, 2014, pages 220-223, XP055835553, Stroudsburg, PA, USA, DOI:10.3115/v1/W14-6835, retrieved from Internet URL: https://aclanthology.org/W14-6835.pdf (Cited on Information Disclosure Statement), hereinafter Yu.

Regarding claim 1, Yu teaches:
A method for correcting character errors (i.e., Spelling check… If the probability is higher than a predefined threshold, then we replace the original character, or we consider the original character as correct and take no action. Yu, Abstract, Sections 2.2, 2.3 Spelling Error Correction, pages 220, 221-223), comprising: 
for a character to be processed, acquiring the score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character to be processed (i.e., Yu discloses calculating the score of each character using a language model (for a character to be processed, acquiring the score of each character in a pre-constructed vocabulary), while the score is less than some threshold (score reasonability), the character and its location (the score being a score of the reasonability of the character in the vocabulary (vocabulary language model) at the position of the character to be processed) are sent to step 2. In step 2, we need to filter the characters generated in acquiring the score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character to be processed (testing, scoring each character in the vocabulary for the reasonability of the character in the vocabulary at the position of the character to be processed). Here, the character will be left for calculating its score by the language model. Yu, Abstract, Section 2.3, page 221); 
selecting top K characters as candidates of the character to be processed, K being a positive integer greater than one; and 
selecting an optimal candidate from the K candidates, and replacing the character to be processed with the optimal candidate if the optimal candidate is different from the character to be processed (i.e., Secondly, each character in the candidate set will be tested whether it can form a legal word with its neighbors. Here, the character which can construct a legal word with its neighbors will be left for calculating its score by the language model. After filtering, the number of candidates has been reduced which will bring two benefits: most candidates that have been cut are irrelevant characters and less candidates makes the system be more efficient (selecting top K characters as candidates of the character to be processed (filtering and cutting out irrelevant characters and making less candidates), K being a positive integer greater than one). At last, the best candidate means one character gets the highest score under a forward-backward 5-gram language model and the score is higher than the threshold (selecting an optimal candidate from the K candidates). If existing, the original character finally will be recognized as an error character and it will be replaced by the best candidate (replacing the character to be processed with the optimal candidate if the optimal candidate is different from the character to be processed). We only use the language model and to choose the best candidate because we find that the language model can get a quite high accuracy if we can provide a suitable candidate set successfully. Yu, Section 2.3, page 221).

Regarding claim 2, which depends from claim 1 and further recites:
using N characters in a text to be processed as the characters to be processed, N being a positive integer and having a maximum value equal to the number of characters comprised in the text to be processed (i.e., Step 1, we calculate the score of each character in a sentence (using N characters in a text (each of the N characters in a text sentence) to be processed as the characters to be processed, N being a positive integer and having a maximum value equal to the number of characters comprised in the text (sentence) to be processed) by a forward-backward 5-gram language model. Yu, Abstract, Sections 2.2, 2.3, pages 221, 222).
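The two-step flow attributed to Yu in the passages quoted above (score every character in the sentence, send low-scoring characters to a second step, filter candidates by whether they form a legal word with a neighbor, then replace only if the best candidate clears a threshold) can be sketched as follows. This is a loose illustration under stated assumptions: a toy bigram table stands in for the forward-backward 5-gram language model, and `BIGRAMS`, `LEXICON`, `CONFUSION`, both thresholds, and every function name are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical toy data; Yu's reference describes a forward-backward 5-gram model instead.
BIGRAMS: Dict[Tuple[str, str], int] = {("c", "a"): 5, ("a", "t"): 5, ("c", "o"): 2, ("o", "t"): 1}
LEXICON: Set[str] = {"ca", "at"}                 # assumed two-character "legal words"
CONFUSION: Dict[str, List[str]] = {"x": ["a", "o"]}  # assumed candidate set per character


def toy_score(chars: Sequence[str], pos: int, ch: str) -> int:
    """Score `ch` at `pos` from its left and right neighbors."""
    left = BIGRAMS.get((chars[pos - 1], ch), 0) if pos > 0 else 0
    right = BIGRAMS.get((ch, chars[pos + 1]), 0) if pos + 1 < len(chars) else 0
    return left + right


def forms_word(chars: Sequence[str], pos: int, ch: str, lexicon: Set[str]) -> bool:
    """True if `ch` forms a lexicon word with its left or right neighbor."""
    left = chars[pos - 1] + ch if pos > 0 else None
    right = ch + chars[pos + 1] if pos + 1 < len(chars) else None
    return left in lexicon or right in lexicon


def correct_sentence(sentence: str, detect_threshold: int = 3, accept_threshold: int = 5) -> str:
    chars = list(sentence)
    # Step 1: every character is scored; low scorers are treated as suspects.
    suspects = [i for i in range(len(chars)) if toy_score(chars, i, chars[i]) < detect_threshold]
    # Step 2: filter candidates, then keep the best one only if it clears the threshold.
    for i in suspects:
        kept = [c for c in CONFUSION.get(chars[i], []) if forms_word(chars, i, c, LEXICON)]
        if not kept:
            continue
        best = max(kept, key=lambda c: toy_score(chars, i, c))
        if toy_score(chars, i, best) > accept_threshold:
            chars[i] = best
    return "".join(chars)


print(correct_sentence("cxt"))  # prints "cat"
```

Note that in this sketch the number of surviving candidates depends on the legal-word filter rather than on a fixed K.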

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
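A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.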

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Yu as applied to claim 1 above, and further in view of Applicant Admitted Prior Art.

Regarding claim 3, which depends from claim 1 and recites:
wherein the acquiring the score of each character in a pre-constructed vocabulary comprises: determining the score of each character in the vocabulary with a pre-trained language model.  
Yu teaches the method of claim 1. As similarly discussed above with respect to claim 1, Yu teaches calculating the score of each character using a language model (acquiring the score of each character in a pre-constructed vocabulary comprises: determining the score of each character in the vocabulary with a language model), while the score is less than some threshold, the character and its location are sent to step 2. In step 2, we need to filter the characters generated in step 1. We will judge the character whether it can construct a word. Otherwise, we will make the assumption that it may be a spelling error which means we are still not sure about it. Yu, Abstract, Sections 2.2, 2.3, page 221. 
Thus, Yu teaches determining the score of each character in the vocabulary with a language model. Yu does not explicitly disclose “pre-trained” language model.
However, Applicant’s present application discloses that, “How to pre-train the language model at the character granularity is a prior art.” See paragraph 26 of the originally filed specification.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model of Yu using the pre-trained language model of Applicant’s Admitted Prior Art, with a reasonable expectation of success, as it would have allowed a user to score the characters of the vocabulary using a pre-trained model. This would have provided the user with the advantage of saving time and computational resources, as the model would not be required to be trained at run-time.

Claims 4-5 are rejected under 35 U.S.C. 103 as being unpatentable over Yu and Applicant Admitted Prior Art as applied to claim 3 above, and further in view of Higgins et al. (Pub. No. US 2014/0046696 A1, published Feb. 13, 2014) hereinafter Higgins and Nadejde et al. (Pub. No. US 2021/0271810 A1, filed Mar. 2, 2020) hereinafter Nadejde.

Regarding claim 4, which depends from claim 3 and recites:
wherein a method for acquiring the language model comprises: acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain wrongly written characters; 
Yu in view of Applicant Admitted Prior Art, teaches the method of claim 3, including the acquired language model, input character text and output character text. Yu, Abstract, Sections 2.2, 2.3, page 221.  Yu in view of Applicant Admitted Prior Art does not specifically disclose acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain wrongly written characters.
However, Higgins teaches in the field related to clinical decision support and processing free text using machine-learning based approaches. Higgins, para 1, 4-5. Higgins, which is analogous to the claimed invention because Higgins is directed toward processing free text using machine-learning based approaches, teaches that, In accordance with this embodiment, one learning machine is pre-trained using a set of error-free clinical data in text format (unstructured data) as the training set (acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain wrongly written characters). Higgins, para 44.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model of Yu using the pre-trained language model of Applicant’s Admitted Prior Art and the feature for acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain wrongly written characters of Higgins, with a reasonable expectation of success, as it would have allowed a user to score the characters of the vocabulary using a pre-trained model and to more accurately pre-train the model using error-free text training data. This would have provided the user with the advantage of saving time and computational resources, as the model would not be required to be trained at run-time and would provide the user with correct training data for more accurately pre-training the model. Higgins, para 4-5, 44.
pre-training the language model at character granularity utilizing the first-class training data; 
As similarly discussed above with respect to claim 3, from which claim 4 depends, Yu teaches determining the score of each character in the vocabulary with a language model. Yu, Abstract, Sections 2.2, 2.3, page 221. Yu does not explicitly disclose pre-training the language model at the character granularity. However, Applicant’s present application discloses that, “How to pre-train the language model at the character granularity is a prior art.” See paragraph 26 of the originally filed specification. Thus, Yu in view of Applicant Admitted Prior Art teaches pre-training the language model at character granularity. Yu in view of Applicant Admitted Prior Art does not specifically disclose utilizing the first-class training data. 
However, as discussed above, Higgins teaches that, In accordance with this embodiment, one learning machine is pre-trained using a set of error-free clinical data in text format (unstructured data) as the training set (utilizing first-class training data). Higgins, para 44.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model of Yu using the pre-trained language model of Applicant’s Admitted Prior Art and the feature for acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain wrongly written characters of Higgins, with a reasonable expectation of success, as it would have allowed a user to score the characters of the vocabulary using a pre-trained model and to more accurately pre-train the model using error-free text training data. This would have provided the user with the advantage of saving time and computational resources, as the model would not be required to be trained at run-time and would provide the user with correct training data for more accurately pre-training the model. Higgins, para 4-5, 44.
acquiring second-class training data, any piece of the second-class training data comprising an input text containing wrongly written characters and an output text which corresponds to the input text and does not contain wrongly written characters; and fine-tuning the language model using the second-class training data.  
As discussed above, Yu in view of Applicant Admitted Prior Art and Higgins teaches the language model. Yu in view of Applicant Admitted Prior Art and Higgins does not specifically disclose acquiring second-class training data, any piece of the second-class training data comprising an input text containing wrongly written characters and an output text which corresponds to the input text and does not contain wrongly written characters; and fine-tuning the language model using the second-class training data.  
However, Nadejde teaches in the field related to computer software for grammatical error correction. Nadejde, para 1. Nadejde teaches that, After the initial training, parameter values for only a subset of the model parameters are fine-tuned using in-domain training data that includes uncorrected text sequences that are labeled with the native languages and proficiency levels of the sources of the respective uncorrected text sequences, as well as grammatically corrected versions of the native language and proficiency-labeled uncorrected source text sequences (acquiring second-class training data, any piece of the second-class training data comprising an input text containing wrongly written characters and an output text which corresponds to the input text and does not contain wrongly written characters, and fine-tuning the language model using the second-class training data). Although not required, in some implementations, uncorrected text sequence-corrected text sequence pairs in the dataset used for fine tuning may be labeled with corresponding error codes, which may indicate, for a particular text sequence, at least one type of error that is present in the text sequence and the location of the error within the text sequence. In an embodiment, the subset of the model parameters that are fine-tuned includes only the model parameter values for the encoder, for example the embedding and/or encoding layers of the adapted model, while model parameter values for other layers of the adapted model, such as the decoder, are not fine-tuned. Nadejde, Fig 4A, para 18, 52.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model of Yu using the pre-trained language model of Applicant’s Admitted Prior Art and the feature for acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain wrongly written characters and utilizing the first-class training data of Higgins and the 

Regarding claim 5, which depends from claim 4 and recites:
wherein in the fine-tuning process, only the loss of the position of the wrongly written character is calculated.  
Yu in view of Applicant Admitted Prior Art, Higgins and Nadejde teaches the method of claim 4, from which claim 5 depends, including the fine-tuning process and the character position. As similarly discussed above, Yu teaches calculating the score of each character using a language model, while the score is less than some threshold, the character and its location (only the loss of the position (location) of the wrongly written character is calculated) are sent to step 2. In step 2, we need to filter the characters generated in step 1. Yu, Abstract, Sections 2.2, 2.3, page 221. Yu does not specifically disclose the fine-tuning process. 
However, as similarly discussed above, Nadejde teaches that, After the initial training, parameter values for only a subset of the model parameters are fine-tuned (in the fine-tuning process, only a subset of parameter values is fine-tuned) using in-domain training data that includes uncorrected text sequences that are labeled with the native languages and proficiency levels of the sources of the respective uncorrected text sequences, as well as grammatically corrected versions of the native language and proficiency-labeled uncorrected source text sequences. Although not required, in some implementations, uncorrected text sequence-corrected text sequence pairs in the dataset used for fine tuning may be labeled with corresponding error codes, which may indicate, for a particular text sequence, at least one type of error that is present in the text sequence and the location of the error within the text sequence. Nadejde, Fig 4A, para 18, 52.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model and only the loss of the position of the wrongly written character is calculated of Yu using the pre-trained language model of Applicant’s Admitted Prior Art and the feature for acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain .
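As noted in the rejection under 35 U.S.C. 112(b) above, the meaning of the claim 5 limitation is not settled. Purely as an illustration of one possible reading, and not as the applicant's method or the teaching of any cited reference, the sketch below computes a cross-entropy loss only at positions where the input text and the corrected output text differ. The function name `error_position_loss` and the per-position probability dictionaries are hypothetical.

```python
import math
from typing import Dict, List


def error_position_loss(
    input_text: str,
    target_text: str,
    probs: List[Dict[str, float]],
) -> float:
    """Mean negative log-likelihood of the target character, restricted to
    positions where the input character is wrong (input != target).

    `probs[i]` is an assumed model output: a mapping from character to the
    predicted probability of that character at position i.
    """
    if not (len(input_text) == len(target_text) == len(probs)):
        raise ValueError("input, target and probs must have the same length")
    wrong = [i for i, (a, b) in enumerate(zip(input_text, target_text)) if a != b]
    if not wrong:
        return 0.0  # no wrongly written characters, so nothing contributes
    return sum(-math.log(probs[i][target_text[i]]) for i in wrong) / len(wrong)


# Toy example: only position 1 ("x" vs. "a") is wrong, so only it contributes,
# even though the model is also uncertain at position 2.
example_probs = [{"c": 0.9}, {"a": 0.5, "x": 0.5}, {"t": 0.1}]
loss = error_position_loss("cxt", "cat", example_probs)
```

Under this reading, correctly written positions contribute nothing to the loss, regardless of how confident the model is at those positions.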

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Yu as applied to claim 1 above, and further in view of Xiong et al., "Extended HMM and Ranking models for Chinese Spelling Correction," Proceedings of the Third CIPS-SIGHAN Joint Conference on Chinese Language Processing, pages 133–138, October 2014, retrieved at https://aclanthology.org/W14-6821.pdf, hereinafter Xiong; Zhang et al., "Spelling Error Correction with Soft-Masked BERT," Shaohua Zhang, Haoran Huang, Jicong Liu and Hang Li, 9 pages, May 15, 2020, retrieved at https://arxiv.org/pdf/2005.07421.pdf, hereinafter Zhang; and Nadejde.

Regarding claim 6, which depends from claim 1 and recites:
wherein the selecting an optimal candidate from the K candidates comprises: ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer greater than one and less than K; and 
Yu teaches the method of claim 1, including selecting an optimal candidate from the K candidates. Yu, Sections 2.2, 2.3, pages 220-222. As similarly discussed above, Yu teaches that, Secondly, each character in the candidate set will be tested whether it can form a legal word with its neighbors. Here, the character which can construct a legal word with its neighbors will be left for calculating its score by the language model. After filtering, the number of candidates has been reduced which will bring two benefits: most candidates that have been cut are irrelevant characters and less candidates makes the system be more efficient. At last, the best candidate means one character gets the highest score (selecting an optimal candidate from the K candidates, ranking the K candidates, and selecting the top M candidates after ranking (ranking the candidates by score, and selecting the top best M highest ranking score after ranking), M being a positive integer ) under a forward-backward 5-gram language model and the score is higher than the threshold (selecting an optimal candidate from the K candidates). If existing, the original character finally will be recognized as an error character and it will be replaced by the best candidate. We only use the language model and to choose the best candidate because we find that the 
Thus, Yu teaches selecting an optimal candidate from the K candidates, ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer and less than K. Yu does not specifically disclose selecting the top M highest-ranked candidates with M being a positive integer greater than one.
However, Xiong teaches in the field related to ranking models for Chinese spelling correction. Xiong, abstract. Xiong teaches that, 3.2 Ranking Candidates In the candidates generation phase, top-k best candidates for a sentence are generated, but the HMM-based framework does not have the flexibility to incorporate a wide variety of features useful for spelling correction, such as the online search results and CKIP Parser results, which can significantly improve the precision of spelling correction. Given the original sentence, our system first creates a list of candidate sentences. The candidates in the list will be re-ranked at this stage based on the confidence score generated by a ranker, herein by a SVM classifier. We choose the top-2 candidates (selecting the top best M highest ranking, M being a positive integer greater than one) in the re-ranked candidate list to make the final decision. Xiong, section 3.2, page 136.
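For illustration only, the re-ranking step quoted from Xiong can be sketched as follows. This is a minimal sketch, not Xiong's implementation: the function name `rerank_top_m` and the `confidence` callable are hypothetical, and the callable merely stands in for whatever ranker (an SVM classifier in Xiong) produces a confidence score per candidate.

```python
from typing import Callable, List, Sequence


def rerank_top_m(
    candidates: Sequence[str],
    confidence: Callable[[str], float],  # hypothetical stand-in for a trained ranker
    m: int = 2,
) -> List[str]:
    """Re-rank K candidates by confidence and keep the top M, with 1 < M < K."""
    k = len(candidates)
    if not 1 < m < k:
        raise ValueError("M must be greater than one and less than K")
    # Highest confidence first; sorted() is stable, so ties keep input order.
    return sorted(candidates, key=confidence, reverse=True)[:m]


# Toy confidence values, invented for this example only.
toy_confidence = {"cat": 0.2, "cut": 0.7, "cot": 0.9, "cit": 0.1}
top_two = rerank_top_m(list(toy_confidence), toy_confidence.get, m=2)
```

With these invented values, `top_two` holds the two highest-confidence candidates, corresponding to the "top-2 candidates" from which Xiong makes the final decision.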
 It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model and selecting an optimal candidate from the K candidates, ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer and less than K of Yu using the feature for selecting the top best M highest ranking, M 
fusing context information of the position of the character to be processed for decoding, and selecting the optimal candidate from the M candidates.  
Yu teaches the method of claim 1, including selecting an optimal candidate from the K candidates and the character to be processed. Yu, Sections 2.2, 2.3, pages 220-222. Yu in view of Xiong does not specifically disclose decoding.
However, Nadejde teaches encoding and decoding processing. Nadejde, para 34, 52, 60, 62, 65-67, 72, 74, 76, 78. The Examiner also notes that the claim merely recites processing for an intended purpose of decoding, and does not actually perform decoding.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for a character to be processed, determining the score of each character in the vocabulary with a language model and selecting an optimal candidate from the K candidates, ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer and less than K of Yu using the feature for selecting the top best M highest ranking, M being a positive integer greater than one of Xiong and decoding of Nadejde, with a reasonable expectation of success, in order to improve the precision of correction and to decode and output a grammatically and fluency-corrected version of the input digital text sequence. Xiong, section 3.2 Ranking Candidates, page 136; Nadejde, para 34.
Yu in view of Xiong and Nadejde does not specifically disclose “fusing context information of the position of the character.”
However, Zhang teaches in the field related to spelling error correction and language models. Zhang, abstract. Zhang teaches that, Soft-masked BERT is able to make more effective use of global context information than BERT-Finetune. With soft masking the likely errors are identified, and as a result the model can better leverage the power of BERT to make sensible reasoning for error correction by referring to not only local context but also global context (fusing context information of the position of the character to be processed). … BERT-Finetune cannot rectify the mistake, but Soft-Masked BERT can, because the error detection can only be accurately conducted with global context information. Zhang, section 3.7 Discussions, page 7.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for a character to be processed, for determining the score of each character in the vocabulary with a language model and selecting an optimal candidate from the K candidates, ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer and less than K of Yu using the feature for selecting the top best M highest ranking, M being a positive integer greater than one of Xiong and decoding of Nadejde and the fusing context information of the position of the character to be processed of Zhang, with a reasonable expectation of success, in order to improve the precision of correction and to decode and output a grammatically and fluency-corrected version of the input digital text sequence and to provide more .

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Yu and Xiong, Nadejde and Zhang as applied to claim 6 above, and further in view of Applicant Admitted Prior Art.

Regarding claim 7, which depends from claim 6 and recites:
wherein the ranking the K candidates comprises: acquiring predetermined features corresponding to any candidate, and scoring the candidate according to the predetermined features and a pre-trained candidate ranking model; and ranking the K candidates according to the corresponding scores from high to low.  
Yu in view of Xiong, Nadejde and Zhang teaches the method of claim 6 from which claim 7 depends, including ranking the K candidates. As similarly discussed above, Yu teaches ranking the K candidates according to the corresponding scores from high to low in that Yu discloses selecting the optimal candidate with the highest ranked corresponding score. Yu, sections 2.2, 2.3, page 221. As similarly discussed above, Yu does not specifically disclose the pre-trained model. 
However, Applicant’s present application discloses that, “How to pre-train the language model at the character granularity is a prior art.” See paragraph 26 of the originally filed specification.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model and ranking the K candidates of Yu using the pre-trained language model of Applicant’s Admitted Prior Art, with a reasonable expectation of success, as it would have allowed a user to score the characters of the vocabulary using a pre-trained model. This would have provided the user with the advantage of saving time and computational resources, as the model would not be required to be trained at run-time.
Yu in view of Applicant Admitted Prior Art does not specifically disclose acquiring predetermined features corresponding to any candidate, and scoring the candidate according to the predetermined features.
However, Xiong teaches that, 3.2 Ranking Candidates In the candidates generation phase, top-k best candidates for a sentence are generated, but the HMM-based framework does not have the flexibility to incorporate a wide variety of features useful for spelling correction, such as the online search results and CKIP Parser results, which can significantly improve the precision of spelling correction. Given the original sentence, our system first creates a list of candidate sentences. The candidates in the list will be re-ranked at this stage based on the confidence score generated by a ranker, herein by a SVM classifier. We choose the top-2 candidates in the re-ranked candidate list to make the final decision. 
We use a lot of features in the re-ranking phase (acquiring predetermined features corresponding to any candidate, and scoring the candidate according to the predetermined features). The features can be grouped into the following categories: 1) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for a character to be processed, for determining the score of each character in the vocabulary with a language model and selecting an optimal candidate from the K candidates, ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer and less than K of Yu using the pre-trained language model of Applicant’s Admitted Prior Art, and decoding of Nadejde and the feature for selecting the top best M highest ranking, M being a positive integer greater than one and acquiring predetermined features corresponding to any candidate, and scoring the candidate according to the predetermined features of Xiong and the fusing context information of the position of the character to be processed of Zhang, with a reasonable expectation of success, in order to improve the precision of correction and to decode and output a grammatically and fluency-corrected version of the input digital text sequence and to provide more accurate error detection using context information.

Claims 8-9 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Yu and Nadejde. 

Regarding Claim 8, Yu teaches:
An electronic device, comprising: at least one processor; and a memory communicatively connected with the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform a method for correcting character errors, wherein the method comprises:
Yu teaches a method for correcting character errors in that Yu discloses, Spelling check… If the probability is higher than a predefined threshold, then we replace the original character, or we consider the original character as correct and take no action. Yu, Abstract, Sections 2.2, 2.3 Spelling Error Correction, pages 220, 221-223.
Yu does not specifically disclose an electronic device, comprising: at least one processor; and a memory communicatively connected with the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform a method.
However, Nadejde teaches in the field related to computer software for grammatical error correction. Nadejde, para 1. Nadejde teaches an electronic device, comprising: at least one processor; and a memory communicatively connected with the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform a method. Nadejde, Fig 5, para 93-95, 96-103.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to perform the method for correcting character errors of Yu using the electronic device of Nadejde, with a reasonable expectation of success, as it would allow the user to perform the method using an electronic device. Nadejde, Fig 5, para 93-103.
for a character to be processed, acquiring the score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character to be processed (i.e., Yu discloses calculating the score of each character using a language model (for a character to be processed, acquiring the score of each character in a pre-constructed vocabulary), while the score is less than some threshold (score reasonability), the character and its location (the score being a score of the reasonability of the character in the vocabulary (vocabulary language model) at the position of the character to be processed) are sent to step 2. In step 2, we need to filter the characters generated in step 1. We will judge the character whether it can construct a word. Otherwise, we will make the assumption that it may be a spelling error which means we are still not sure about it. Yu, Abstract, Sections 2.2, 2.3, page 221. Secondly, each character in the candidate set will be tested whether it can form a legal word with its neighbors (acquiring the score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character to be processed (testing, scoring each character in the vocabulary for the reasonability of the character in the vocabulary at the position of the character to be processed)). Here, the character will be left for calculating its score by the language model. Yu, Abstract, Section 2.3, page 221); 
selecting top K characters as candidates of the character to be processed, K being a positive integer greater than one; and 
selecting an optimal candidate from the K candidates, and replacing the character to be processed with the optimal candidate if the optimal candidate is different from the character to be processed (i.e., Secondly, each character in the candidate set will be tested whether it can form a legal word with its neighbors. Here, the character which can construct a legal word with its neighbors will be left for calculating its score by the language model. After filtering, the number of candidates has been reduced which will bring two benefits: most candidates that have been cut are irrelevant characters and less candidates makes the system be more efficient (selecting top K characters as candidates of the character to be processed (filtering and cutting out irrelevant characters and making less candidates), K being a positive integer greater than one). At last, the best candidate means one character gets the highest score under a forward-backward 5-gram language model and the score is higher than the threshold (selecting an optimal candidate from the K candidates). If existing, the original character finally will be recognized as an error character and it will be replaced by the best candidate (replacing the character to be processed with the optimal candidate if the optimal candidate is different from the character to be processed). We only use the language model and to choose the best candidate because we find that the language model can get a quite high accuracy if we can provide a suitable candidate set successfully. Yu, Section 2.3, page 221).  
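For illustration only, the three recited steps (scoring each vocabulary character at a position, keeping the top K, and replacing the character when the optimal candidate differs) can be sketched as follows. This is a minimal sketch under stated assumptions: `Scorer`, `top_k_candidates`, `correct_character` and `toy_scorer` are hypothetical names, the scores are invented, and picking the single highest-scoring candidate is only a placeholder for the "selecting an optimal candidate" step. It is not the method of Yu, Nadejde, or the present application.

```python
from typing import Callable, Dict, List

# Hypothetical scorer type: given the text and a position, return a
# reasonability score for every character in the vocabulary.
Scorer = Callable[[str, int], Dict[str, float]]


def top_k_candidates(scores: Dict[str, float], k: int) -> List[str]:
    """Return the K highest-scoring vocabulary characters, best first."""
    if k <= 1:
        raise ValueError("K must be a positive integer greater than one")
    return sorted(scores, key=scores.get, reverse=True)[:k]


def correct_character(text: str, pos: int, scorer: Scorer, k: int = 3) -> str:
    """Replace text[pos] with the optimal candidate if the two differ."""
    candidates = top_k_candidates(scorer(text, pos), k)
    best = candidates[0]  # placeholder selection: highest score wins
    if best != text[pos]:
        return text[:pos] + best + text[pos + 1:]
    return text


def toy_scorer(text: str, pos: int) -> Dict[str, float]:
    """Invented scores over a four-character vocabulary, for the example only."""
    if text[:pos] == "c" and text[pos + 1:] == "t":
        return {"a": 0.9, "o": 0.6, "u": 0.5, "x": 0.01}
    return {ch: 0.0 for ch in "aoux"}


corrected = correct_character("cxt", 1, toy_scorer, k=3)
unchanged = correct_character("cat", 1, toy_scorer, k=3)
```

In this toy run the character at position 1 of "cxt" is replaced because the best candidate differs from it, while "cat" is left unchanged because the best candidate is the same as the original character.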

Regarding claim 9, which depends from claim 8 and further recites:
using N characters in a text to be processed as the characters to be processed, N being a positive integer and having a maximum value equal to the number of characters comprised in the text to be processed (i.e., Step 1, we calculate the score of each character in a sentence (using N characters in a text (each of the N characters in a text sentence) to be processed as the characters to be processed, N being a positive integer and having a maximum value equal to the number of characters comprised in the text (sentence) to be processed) by a forward-backward 5-gram language model. Yu, Abstract, Sections 2.2, 2.3, pages 221, 222).
 
Regarding claim 15, Yu teaches:
A non-transitory computer-readable storage medium storing computer instructions therein, wherein the computer instructions are used to cause the computer to perform a method for correcting character errors, wherein the method comprises: 

Yu does not specifically disclose a non-transitory computer-readable storage medium storing computer instructions therein, wherein the computer instructions are used to cause the computer to perform a method.
However, Nadejde teaches a non-transitory computer-readable storage medium storing computer instructions therein, wherein the computer instructions are used to cause the computer to perform a method, which is notoriously well known by those of ordinary skill in the art. Nadejde, Fig 5, para 98, 93-103.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to perform the method for correcting character errors of Yu using the non-transitory computer-readable storage medium storing computer instructions therein, wherein the computer instructions are used to cause the computer to perform a method of Nadejde, with a reasonable expectation of success, as it would allow the user to perform the method using a computer-readable storage medium storing computer instructions. Nadejde, Fig 5, para 93-103.
for a character to be processed, acquiring the score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character to be processed (i.e., Yu discloses calculating the score of each character using a language model (for a character to be processed, acquiring the score of each character in a pre-constructed vocabulary), while the score is less than some threshold (score reasonability), the character and its location (the score being a score of the reasonability of the character in the vocabulary (vocabulary language model) at the position of the character to be processed) are sent to step 2. In step 2, we need to filter the characters generated in step 1. We will judge the character whether it can construct a word. Otherwise, we will make the assumption that it may be a spelling error which means we are still not sure about it. Yu, Abstract, Sections 2.2, 2.3, page 221. Secondly, each character in the candidate set will be tested whether it can form a legal word with its neighbors (acquiring the score of each character in a pre-constructed vocabulary, the score being a score of the reasonability of the character in the vocabulary at the position of the character to be processed (testing, scoring each character in the vocabulary for the reasonability of the character in the vocabulary at the position of the character to be processed)). Here, the character will be left for calculating its score by the language model. Yu, Abstract, Section 2.3, page 221); 
selecting top K characters as candidates of the character to be processed, K being a positive integer greater than one; and 
selecting an optimal candidate from the K candidates, and replacing the character to be processed with the optimal candidate if the optimal candidate is different from the character to be processed (i.e., Secondly, each character in the candidate set will be tested whether it can form a legal word with its neighbors. Here, the character which can construct a legal word with its neighbors will be left for calculating its score by the language model. After filtering, the number of candidates has been reduced which will bring two benefits: most candidates that have been cut are irrelevant characters and less candidates makes the system be more efficient (selecting top K characters as candidates of the character to be processed (filtering and cutting out irrelevant characters and making less candidates), K being a positive integer greater than one). At last, the best candidate means one character gets the highest score under a forward-backward 5-gram language model and the score is higher than the threshold (selecting an optimal candidate from the K candidates). If existing, the original character finally will be recognized as an error character and it will be replaced by the best candidate (replacing the character to be processed with the optimal candidate if the optimal candidate is different from the character to be processed). We only use the language model and to choose the best candidate because we find that the language model can get a quite high accuracy if we can provide a suitable candidate set successfully. Yu, Section 2.3, page 221).  

Claim 16 recites a non-transitory computer-readable storage medium that parallels the electronic device of claim 9. Therefore, the analysis discussed above with respect to claim 9 also applies to claim 16. Accordingly, claim 16 is rejected based on substantially the same rationale as set forth above with respect to claim 9. 

Claims 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Yu and Nadejde as applied to claims 8 and 15 above, and further in view of Applicant Admitted Prior Art.

Regarding claim 10, which depends from claim 8 and recites:
wherein the acquiring the score of each character in a pre-constructed vocabulary comprises: determining the score of each character in the vocabulary with a pre-trained language model.  
Yu in view of Nadejde teaches the electronic device of claim 8. As similarly discussed above with respect to claim 1, Yu teaches calculating the score of each character using a language model (acquiring the score of each character in a pre-constructed vocabulary comprises: determining the score of each character in the vocabulary with a language model), while the score is less than some threshold, the character and its location are sent to step 2. In step 2, we need to filter the characters generated in step 1. We will judge the character whether it can construct a word. Otherwise, we will make the assumption that it may be a spelling error which means we are still not sure about it. Yu, Abstract, Sections 2.2, 2.3, page 221. 
Thus, Yu teaches determining the score of each character in the vocabulary with a language model. Yu does not explicitly disclose “pre-trained” language model.
However, Applicant’s present application discloses that, “How to pre-train the language model at the character granularity is a prior art.” See paragraph 26 of the originally filed specification.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model of Yu using the electronic device of Nadejde and the pre-trained language model 

Claim 17 recites a non-transitory computer-readable storage medium that parallels the electronic device of claim 10. Therefore, the analysis discussed above with respect to claim 10 also applies to claim 17. Accordingly, claim 17 is rejected based on substantially the same rationale as set forth above with respect to claim 10. 

Claims 11-12 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Yu, Nadejde and Applicant Admitted Prior Art as applied to claims 10 and 17 above, and further in view of Higgins.

Regarding claim 11, which depends from claim 10 and recites:
wherein a method for acquiring the language model comprises: acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain wrongly written characters; 
Yu in view of Nadejde and Applicant Admitted Prior Art, teaches the electronic device of claim 10, including the acquired language model, input character text and output character text. Yu, Abstract, Sections 2.2, 2.3, page 221.  Yu in view of Nadejde and Applicant Admitted Prior Art does not specifically disclose acquiring first-class 
However, Higgins teaches in the field related to clinical decision support and processing free text using machine-learning based approaches. Higgins, para 1, 4-5. Higgins, which is analogous to the claimed invention because Higgins is directed toward processing free text using machine-learning based approaches, teaches that, In accordance with this embodiment, one learning machine is pre-trained using a set of error-free clinical data in text format (unstructured data) as the training set (acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain wrongly written characters). Higgins, para 44.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model of Yu using the electronic device of Nadejde and the pre-trained language model of Applicant’s Admitted Prior Art and the feature for acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain wrongly written characters of Higgins, with a reasonable expectation of success, as it would have allowed a user to score the characters of the vocabulary using a pre-trained model and to more accurately pre-train the model using error-free text training data. This would have provided the user with the advantage of saving time and computational resources, as the model would not be 
pre-training the language model at character granularity utilizing the first-class training data; 
As similarly discussed above with respect to claim 10, from which claim 11 depends, Yu teaches determining the score of each character in the vocabulary with a language model. Yu, Abstract, Sections 2.2, 2.3, page 221. Yu does not explicitly disclose pre-training the language model at the character granularity. However, Applicant’s present application discloses that, “How to pre-train the language model at the character granularity is a prior art.” See paragraph 26 of the originally filed specification. Thus, Yu in view of Applicant Admitted Prior Art teaches pre-training the language model at character granularity. Yu in view of Nadejde and Applicant Admitted Prior Art does not specifically disclose utilizing the first-class training data. 
However, as discussed above, Higgins teaches that, In accordance with this embodiment, one learning machine is pre-trained using a set of error-free clinical data in text format (unstructured data) as the training set (utilizing first-class training data). Higgins, para 44.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model of Yu using electronic device of Nadejde and the pre-trained language model of Applicant’s Admitted Prior Art and the feature for acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which . This would have provided the user with the advantage of saving time and computational resources, as the model would not be required to be trained at run-time and would provide the user with correct training data for more accurately pre-training the model. Higgins, para 4-5, 44.
acquiring second-class training data, any piece of the second-class training data comprising an input text containing wrongly written characters and an output text which corresponds to the input text and does not contain wrongly written characters; and fine-tuning the language model using the second-class training data.  
As discussed above, Yu in view of Applicant Admitted Prior Art and Higgins teaches the language model. Yu in view of Applicant Admitted Prior Art and Higgins does not specifically disclose acquiring second-class training data, any piece of the second-class training data comprising an input text containing wrongly written characters and an output text which corresponds to the input text and does not contain wrongly written characters; and fine-tuning the language model using the second-class training data.  
However, Nadejde teaches that, After the initial training, parameter values for only a subset of the model parameters are fine-tuned using in-domain training data that includes uncorrected text sequences that are labeled with the native languages and proficiency levels of the sources of the respective uncorrected text sequences, as well as grammatically corrected versions of the native language and proficiency-labeled uncorrected source text sequences. Although not required, in some implementations, uncorrected text sequence-corrected text sequence pairs in the dataset used for fine tuning (acquiring second-class training data, any piece of the second-class training data comprising an input text containing wrongly written characters and an output text which corresponds to the input text and does not contain wrongly written characters, and fine-tuning the language model using the second-class training data) may be labeled with corresponding error codes, which may indicate, for a particular text sequence, at least one type of error that is present in the text sequence and the location of the error within the text sequence. In an embodiment, the subset of the model parameters that are fine-tuned includes only the model parameter values for the encoder, for example the embedding and/or encoding layers of the adapted model, while model parameter values for other layers of the adapted model, such as the decoder, are not fine-tuned. Nadejde, Fig 4A, para 18, 52.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model of Yu using the pre-trained language model of Applicant’s Admitted Prior Art and the feature for acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain wrongly written characters and utilizing the first-class training data of Higgins and the electronic device and feature for acquiring second-class training data, any piece of the 

Regarding claim 12, which depends from claim 11 and recites:
wherein in the fine-tuning process, only the loss of the position of the wrongly written character is calculated.  
Yu in view of Nadejde, Applicant Admitted Prior Art, and Higgins teaches the electronic device of claim 11, from which claim 12 depends, including the fine-tuning process and the character position.  As similarly discussed above, Yu teaches calculating the score of each character using a language model, while the score is less than some threshold, the character and its location (only the loss of the position (location) of the wrongly written character is calculated) are sent to step 2. In step 2, we need to filter the characters generated in step 1. We will judge the character whether it can construct a word. Yu, Abstract, Sections 2.2, 2.3, page 221. Yu does not specifically disclose the fine-tuning process. 
However, as similarly discussed above, Nadejde teaches that, After the initial training, parameter values for only a subset of the model parameters are fine-tuned (in the fine-tuning process, only a subset of parameter values is fine-tuned) using in-domain training data that includes uncorrected text sequences that are labeled with the native languages and proficiency levels of the sources of the respective uncorrected text sequences, as well as grammatically corrected versions of the native language and proficiency-labeled uncorrected source text sequences. Although not required, in some implementations, uncorrected text sequence-corrected text sequence pairs in the dataset used for fine tuning may be labeled with corresponding error codes, which may indicate, for a particular text sequence, at least one type of error that is present in the text sequence and the location of the error within the text sequence. Nadejde, Fig 4A, para 18, 52.
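For illustration only, one possible reading of a loss restricted to the positions of wrongly written characters is sketched below. This is an assumption made for the example, not an interpretation adopted in this action and not a teaching of Yu or Nadejde: the function `error_position_loss`, the equal-length source/target pairing, and the per-position probability dictionaries are all hypothetical.

```python
import math
from typing import Dict, List


def error_position_loss(
    source: str,
    target: str,
    probs: List[Dict[str, float]],  # hypothetical model output: P(char) per position
) -> float:
    """Mean negative log-likelihood over positions where source and target differ.

    Positions where the input character already matches the corrected text
    contribute nothing to the loss.
    """
    if not (len(source) == len(target) == len(probs)):
        raise ValueError("source, target and probs must have equal length")
    losses = [
        -math.log(probs[i][target[i]])
        for i in range(len(source))
        if source[i] != target[i]  # only wrongly written positions are scored
    ]
    return sum(losses) / len(losses) if losses else 0.0


# Invented probabilities for a three-character example.
toy_probs = [{"c": 0.9}, {"a": 0.5}, {"t": 0.9}]
masked_loss = error_position_loss("cxt", "cat", toy_probs)
no_error_loss = error_position_loss("cat", "cat", toy_probs)
```

In this toy case only position 1 differs between the uncorrected and corrected text, so the loss is the negative log of the probability assigned to the corrected character at that position, and a pair with no differing positions yields a loss of zero.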
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model and only the loss of the position of the wrongly written character is calculated of Yu using the pre-trained language model of Applicant’s Admitted Prior Art and the feature for acquiring first-class training data, any piece of the first-class training data comprising an input text and an output text which are the same and do not contain 

Claims 18-19 recite non-transitory computer-readable storage media that parallel the electronic devices of claims 11-12, respectively. Therefore, the analysis discussed above with respect to claims 11-12 also applies to claims 18-19, respectively. Accordingly, claims 18-19 are rejected based on substantially the same rationale as set forth above with respect to claims 11-12, respectively. 

Claims 13 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Yu in view of Nadejde as applied to claims 8 and 15 above, and further in view of Xiong et al., "Extended HMM and Ranking models for Chinese Spelling Correction," Proceedings of the Third CIPS-SIGHAN Joint Conference on Chinese Language Processing, pages 133–138, October 2014, retrieved at https://aclanthology.org/W14-6821.pdf, hereinafter Xiong, and Zhang et al., "Spelling Error Correction with Soft-Masked BERT," Shaohua Zhang, Haoran Huang, Jicong Liu and Hang Li, 9 pages, May 15, 2020, retrieved at https://arxiv.org/pdf/2005.07421.pdf, hereinafter Zhang.

Regarding claim 13, which depends from claim 8 and recites:
wherein the selecting an optimal candidate from the K candidates comprises: ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer greater than one and less than K; and 
Yu in view of Nadejde teaches the electronic device of claim 8, including selecting an optimal candidate from the K candidates. Yu, Sections 2.2, 2.3, pages 220-222. As similarly discussed above, Yu teaches that, Secondly, each character in the candidate set will be tested whether it can form a legal word with its neighbors. Here, the character which can construct a legal word with its neighbors will be left for calculating its score by the language model. After filtering, the number of candidates has been reduced which will bring two benefits: most candidates that have been cut are irrelevant characters and less candidates makes the system be more efficient. At last, the best candidate means one character gets the highest score (selecting an optimal candidate from the K candidates, ranking the K candidates, and selecting the top M candidates after ranking (ranking the candidates by score, and selecting the top best M highest ranking score after ranking), M being a positive integer) under a forward-backward 5-gram language model and the score is higher than the threshold (selecting an optimal candidate from the K candidates). If existing, the original character finally will be recognized as an error character and it will be replaced by the best candidate. We only use the language model and to choose the best candidate because we find that the language model can get a quite high accuracy if we can provide a suitable candidate set successfully. Yu, Section 2.3, page 221. 
Thus, Yu in view of Nadejde teaches selecting an optimal candidate from the K candidates, ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer and less than K. Yu in view of Nadejde does not specifically disclose selecting the top best M highest ranking candidates, M being a positive integer greater than one.
However, Xiong teaches in the field related to ranking models for Chinese spelling correction. Xiong, abstract. Xiong teaches that, 3.2 Ranking Candidates: In the candidates generation phase, top-k best candidates for a sentence are generated, but the HMM-based framework does not have the flexibility to incorporate a wide variety of features useful for spelling correction, such as the online search results and CKIP Parser results, which can significantly improve the precision of spelling correction. Given the original sentence, our system first creates a list of candidate sentences. The candidates in the list will be re-ranked at this stage based on the confidence score generated by a ranker, herein by a SVM classifier. We choose the top-2 candidates (selecting the top best M highest ranking, M being a positive integer greater than one) in the re-ranked candidate list to make the final decision. Xiong, section 3.2, page 136.
 It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model and selecting an optimal candidate from the K candidates, ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer and less than K of Yu using the electronic device of Nadejde and the feature for selecting the top best M highest ranking, M being a positive integer greater than one of Xiong, with a reasonable expectation of success, in order to improve the precision of correction. Xiong, section 3.2 Ranking Candidates, page 136.
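For illustration only, the combination discussed above (scoring and thresholding as described in Yu, with a top-M shortlist as described in Xiong) can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not code from any cited reference: the function names, the toy scores, and the threshold values are hypothetical, and the `score` callable merely stands in for a language-model score.

```python
from typing import Callable, List, Optional


def top_m_candidates(candidates: List[str],
                     score: Callable[[str], float],
                     m: int) -> List[str]:
    """Rank K candidates by score (high to low) and keep the top M, with 1 < M < K."""
    k = len(candidates)
    if not 1 < m < k:
        raise ValueError("M must be greater than one and less than K")
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[:m]


def best_above_threshold(candidates: List[str],
                         score: Callable[[str], float],
                         threshold: float) -> Optional[str]:
    """Return the highest-scoring candidate only if its score exceeds the threshold."""
    best = max(candidates, key=score)
    return best if score(best) > threshold else None


# Hypothetical scores standing in for language-model output (assumption).
toy_scores = {"a": 0.10, "b": 0.70, "c": 0.40, "d": 0.55}
shortlist = top_m_candidates(list(toy_scores), toy_scores.__getitem__, m=2)
choice = best_above_threshold(shortlist, toy_scores.__getitem__, threshold=0.5)
```

With these toy values, K is 4 and M is 2, so the shortlist holds the two highest-scoring candidates and the final choice is the one whose score exceeds the threshold; if no score exceeds the threshold, no replacement is made.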
fusing context information of the position of the character to be processed for decoding, and selecting the optimal candidate from the M candidates.  
Yu in view of Nadejde teaches the electronic device of claim 8, including selecting an optimal candidate from the K candidates and the character to be processed for decoding. Yu, Sections 2.2, 2.3, pages 220-222. Yu in view of Xiong does not specifically disclose decoding. 
However, Nadejde teaches encoding and decoding processing. Nadejde, para 34, 52, 60, 62, 65-67, 72, 74, 76, 78. The Examiner also notes that the claim merely recites processing for an intended purpose of decoding, and does not actually perform decoding.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction , ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer and less than K of Yu using the feature for selecting the top best M highest ranking, M being a positive integer greater than one of Xiong and the electronic device and decoding of Nadejde, with a reasonable expectation of success, in order to improve the precision of correction and to decode and output a grammatically and fluency-corrected version of the input digital text sequence. Xiong, section 3.2 Ranking Candidates, page 136. Nadejde, para 34.
Yu in view of Xiong and Nadejde does not specifically disclose “fusing context information of the position of the character.”
However, Zhang teaches in the field related to spelling error correction and language model. Zhang, abstract. Zhang teaches that, Soft-masked BERT is able to make more effective use of global context information than BERT-Finetune. With soft masking the likely errors are identified, and as a result the model can better leverage the power of BERT to make sensible reasoning for error correction by referring to not only local context but also global context (fusing context information of the position of the character to be processed). … BERT-Finetune cannot rectify the mistake, but Soft-Masked BERT can, because the error detection can only be accurately conducted with global context information. Zhang, section 3.7 Discussions, page 7.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for a character to be processed, for determining the score of each character in , ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer and less than K of Yu using the feature for selecting the top best M highest ranking, M being a positive integer greater than one of Xiong and the electronic device and decoding of Nadejde and the fusing context information of the position of the character to be processed of Zhang, with a reasonable expectation of success, in order to improve the precision of correction and to decode and output a grammatically and fluency-corrected version of the input digital text sequence and to provide more accurate error detection using context information. Xiong, section 3.2 Ranking Candidates, page 136. Nadejde, para 34. Zhang, section 3.7, page 7.
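For illustration only, the limitation as interpreted for examination purposes (considering context information of the position of the character to be processed, and selecting the optimal candidate from the M candidates) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it uses hypothetical neighbor-pair counts as the context information and is not the Soft-Masked BERT model of Zhang or any other cited implementation. All names and data below are illustrative.

```python
from typing import Dict, Sequence, Tuple

BigramCounts = Dict[Tuple[str, str], int]


def context_score(text: str, pos: int, candidate: str,
                  bigram_counts: BigramCounts) -> int:
    """Score a candidate by how often it co-occurs with its left and right neighbors."""
    left = text[pos - 1] if pos > 0 else ""
    right = text[pos + 1] if pos + 1 < len(text) else ""
    return (bigram_counts.get((left, candidate), 0)
            + bigram_counts.get((candidate, right), 0))


def select_with_context(text: str, pos: int, shortlist: Sequence[str],
                        bigram_counts: BigramCounts) -> str:
    """Pick, from the M shortlisted candidates, the one best supported by context."""
    return max(shortlist,
               key=lambda c: context_score(text, pos, c, bigram_counts))


# Hypothetical co-occurrence counts (assumption, not data from any reference).
counts = {("c", "a"): 5, ("a", "t"): 7,
          ("c", "o"): 4, ("o", "t"): 1,
          ("c", "u"): 3, ("u", "t"): 2}
selected = select_with_context("cxt", 1, ["a", "o", "u"], counts)
```

In this toy example the character at position 1 is the one to be processed, and the candidate with the largest combined left and right neighbor count is selected.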

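Claim 20 recites a non-transitory computer-readable storage medium that parallels the electronic device of claim 13. Therefore, the analysis discussed above with respect to claim 13 also applies to claim 20. Accordingly, claim 20 is rejected based on substantially the same rationale as set forth above with respect to claim 13. 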

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Yu in view of Nadejde, Xiong and Zhang as applied to claim 13 above, and further in view of Applicant Admitted Prior Art.

Regarding claim 14, which depends from claim 13 and recites:
wherein the ranking the K candidates comprises: acquiring predetermined features corresponding to any candidate, and scoring the candidate according to the predetermined features and a pre-trained candidate ranking model; and ranking the K candidates according to the corresponding scores from high to low.  
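For illustration only, the ranking step recited above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the fixed linear weights merely stand in for a pre-trained candidate ranking model, and the feature names and values are hypothetical. It is not the SVM ranker of Xiong or Applicant's model.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]


def score_candidate(features: Features, weights: Dict[str, float]) -> float:
    """Score one candidate as a weighted sum of its predetermined features."""
    return sum(weights.get(name, 0.0) * value
               for name, value in features.items())


def rank_candidates(candidate_features: Dict[str, Features],
                    weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Score every candidate and order the results from high to low."""
    scored = [(cand, score_candidate(feats, weights))
              for cand, feats in candidate_features.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights and features (assumptions for illustration).
weights = {"lm_score": 1.0, "shape_similarity": 0.5}
candidate_features = {
    "x": {"lm_score": 0.25, "shape_similarity": 1.0},
    "y": {"lm_score": 0.5, "shape_similarity": 0.75},
    "z": {"lm_score": 0.125, "shape_similarity": 0.25},
}
ranked = rank_candidates(candidate_features, weights)
```

Here each candidate's features are acquired first, a score is computed from those features and the stand-in model, and the K candidates are then ordered by score from high to low.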
Yu in view of Nadejde, Xiong and Zhang teaches the electronic device of claim 13 from which claim 14 depends, including ranking the K candidates. As similarly discussed above, Yu teaches ranking the K candidates according to the corresponding scores from high to low in that Yu discloses selecting the optimal candidate with the highest ranked corresponding score. Yu, sections 2.2, 2.3, page 221. As similarly discussed above, Yu does not specifically disclose the pre-trained model. 
However, Applicant’s present application discloses that, “How to pre-train the language model at the character granularity is a prior art.” See paragraph 26 of the originally filed specification.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for determining the score of each character in the vocabulary with a language model and ranking the K candidates of Yu using the pre-trained language model of Applicant’s Admitted Prior Art, with a reasonable expectation of success, as it would have allowed a user to score the characters of the vocabulary using a pre-trained model. This would have provided the user with the advantage of saving time and computational resources, as the model would not be required to be trained at run-time.
Yu in view of Nadejde and Applicant Admitted Prior Art does not specifically disclose acquiring predetermined features corresponding to any candidate, and scoring the candidate according to the predetermined features.
However, Xiong teaches that, 3.2 Ranking Candidates In the candidates generation phase, top-k best candidates for a sentence are generated, but the HMM-based framework does not have the flexibility to incorporate a wide variety of features useful for spelling correction, such as the online search results and CKIP Parser results, which can significantly improve the precision of spelling correction. Given the original sentence, our system first creates a list of candidate sentences. The candidates in the list will be re-ranked at this stage based on the confidence score generated by a ranker, herein by a SVM classifier. We choose the top-2 candidates in the re-ranked candidate list to make the final decision. 
We use a lot of features in the re-ranking phase (acquiring predetermined features corresponding to any candidate, and scoring the candidate according to the predetermined features). The features can be grouped into the following categories: 1) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement the character correction method for a character to be processed, for determining the score of each character in the vocabulary with a language model and selecting an optimal candidate from the K candidates, ranking the K candidates, and selecting the top M candidates after ranking, M being a positive integer and less than K of Yu using the electronic device and decoding of Nadejde and the pre-trained language model of Applicant’s Admitted Prior Art, and the feature for selecting the top best M highest ranking, M being a positive integer greater than one and acquiring predetermined features corresponding to any candidate, and scoring the candidate according to the predetermined features of Xiong and the fusing context information of the position of the character to be processed of Zhang, with a reasonable expectation of success, in order to improve the precision of correction and to decode and output a grammatically and fluency-corrected version of the input digital text sequence and to provide more accurate error detection using context information and to save time and computational resources, 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BARBARA M LEVEL whose telephone number is (303)297-4748. The examiner can normally be reached Monday through Friday 8:00 AM - 5:00 PM MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Scott T Baderman, can be reached at (571) 272-3644. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 





/BARBARA M LEVEL/Examiner, Art Unit 2144

1 This recitation has been evaluated under Step 2A, Prong 1, and the Examiner concludes:  Mental processes – concepts performed in the human mind (including an observation, evaluation, judgment, opinion).  MPEP 2106.04(a)(2)(III)  The courts consider a mental process (thinking) that "can be performed in the human mind, or by a human using a pen and paper" to be an abstract idea. CyberSource Corp. v. Retail Decisions, Inc., 654 F.3d 1366, 1372, 99 USPQ2d 1690, 1695 (Fed. Cir. 2011). As the Federal Circuit explained, "methods which can be performed mentally, or which are the equivalent of human mental work, are unpatentable abstract ideas the ‘basic tools of scientific and technological work’ that are open to all.’" 654 F.3d at 1371, 99 USPQ2d at 1694 (citing Gottschalk v. Benson, 409 U.S. 63, 175 USPQ 673 (1972)). See also Mayo Collaborative Servs. v. Prometheus Labs. Inc., 566 U.S. 66, 71, 101 USPQ2d 1961, 1965 ("‘[M]ental processes[] and abstract intellectual concepts are not patentable, as they are the basic tools of scientific and technological work’" (quoting Benson, 409 U.S. at 67, 175 USPQ at 675)); Parker v. Flook, 437 U.S. 584, 589, 198 USPQ 193, 197 (1978) (same)...Nor do the courts distinguish between claims that recite mental processes performed by humans and claims that recite mental processes performed on a computer.
2 This recitation has been evaluated under Step 2A, Prong 2, and the Examiner concludes:  The invention is not integrated into a practical application.   An additional element adds insignificant extra-solution activity to the judicial exception. See also, MPEP 2106.05(g).  MPEP 2106.04(d); 2019 Guidance, 84 FR 50 at 55.  See, 2019 Guidance, 84 FR 50, footnote 31.
3 This recitation has been evaluated under Step 2B, and the Examiner concludes:  Insignificant Extra-Solution Activity.  MPEP 2106.05(g).  The term "extra-solution activity" can be understood as activities incidental to the primary process or product that are merely a nominal or tangential addition to the claim. Extra-solution activity includes both pre-solution and post-solution activity.
4 This recitation has been evaluated under Step 2A, Prong 2, and the Examiner concludes:  The invention is not integrated into a practical application.   An additional element merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See also, MPEP 2106.05(f).  MPEP 2106.04(d); 2019 Guidance, 84 FR 50 at 55.  See, 2019 Guidance, 84 FR 50, footnote 30.
5 This recitation has been evaluated under Step 2B, and the Examiner concludes:  Generic Computer.  Alice Corp.  “We conclude that the method claims, which merely require generic computer implementation, fail to transform that abstract idea into a patent eligible invention,” Alice Corp., slip op. 13-298, at 10.
6 This recitation has been evaluated under Step 2B, and the Examiner concludes:  Mere Instructions To Apply An Exception.  MPEP 2106.05(f).  As explained by the Supreme Court, in order to make a claim directed to a judicial exception patent-eligible, the additional element or combination of elements must do "‘more than simply stat[e] the [judicial exception] while adding the words ‘apply it’". Alice Corp. v. CLS Bank, 573 U.S. 208, 221, 110 USPQ2d 1976, 1982-83 (2014) (quoting Mayo Collaborative Servs. v. Prometheus Labs., Inc., 566 U.S. 66, 72, 101 USPQ2d 1961, 1965).