DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The present application was filed on July 31, 2019.
Claims 1-20 are presented for examination and are pending.

Information Disclosure Statement
The information disclosure statement(s) (IDS) was/were submitted on July 31, 2019. The submission(s) are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement(s) are being considered by the examiner.

Drawings
The drawings filed on July 31, 2019 are accepted.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding Claim 1, 
Claim 1 recites: A system for machine-learnt field-specific standardization, the system comprising: one or more processors; and a non-transitory computer readable medium…, which is directed to a machine, one of the statutory categories.

Claim 1 recites the following limitations: 
tokenize raw values and corresponding standardized values into raw token sequences and corresponding standardized token sequences;
learn… standardization from token insertions and token substitutions that modify the raw token sequences to match the corresponding standardized token sequences;
tokenize an input value into an input token sequence;
determine… a probability of inserting an insertion token after an insertion markable token in the input token sequence;
determine whether the probability of inserting the insertion token satisfies a threshold;
insert the insertion token after the insertion markable token in the input token sequence, in response to a determination that the probability of inserting the insertion token satisfies the threshold;
determine… a probability of substituting a substitution token for a substitutable token in the input token sequence;
determine whether the probability of substituting the substitution token satisfies another threshold; and
substitute the substitution token for the substitutable token in the input token sequence, in response to a determination that the probability of substituting the substitution token satisfies the other threshold.
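
For illustration only, the determine-and-threshold limitations listed above can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: whitespace tokenization, the lookup tables insert_probs and substitute_probs, every function and parameter name, and the reading of "satisfies a threshold" as "greater than or equal to" are all hypothetical.

```python
def tokenize(value):
    # Assumption: tokens are separated by whitespace.
    return value.split()


def standardize(value, insert_probs, substitute_probs,
                insert_threshold=0.5, substitute_threshold=0.5):
    """Apply thresholded substitutions and insertions to one input value.

    insert_probs maps a token to (insertion_token, probability).
    substitute_probs maps a token to (substitution_token, probability).
    Both tables are hypothetical stand-ins for a learned model.
    """
    output = []
    for token in tokenize(value):
        sub_token, sub_prob = substitute_probs.get(token, (token, 0.0))
        ins_token, ins_prob = insert_probs.get(token, (None, 0.0))
        # Substitute only when the probability satisfies its threshold.
        output.append(sub_token if sub_prob >= substitute_threshold else token)
        # Insert after the token only when the probability satisfies its threshold.
        if ins_token is not None and ins_prob >= insert_threshold:
            output.append(ins_token)
    return " ".join(output)


substitute_probs = {"St": ("Street", 0.9), "Rd": ("Road", 0.2)}
insert_probs = {"St": (",", 0.8)}
print(standardize("1 Main St London", insert_probs, substitute_probs))
print(standardize("5 Park Rd", insert_probs, substitute_probs))
```

In this sketch the second value is left unchanged because the only candidate substitution has a probability below the threshold.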


The abstract idea of Claim 1 is not integrated into a practical application because the additional elements recited in Claim 1 are:
one or more processors; and 
a non-transitory computer readable medium storing a plurality of instructions, which when executed, cause the one or more processors to: 
by a machine-learning model
Instructions to apply the abstract idea on generic computer components (one or more processors; and a non-transitory computer readable medium storing a plurality of instructions, which when executed, cause the one or more processors to:) do not represent a practical application of the abstract idea (see MPEP 2106.05(f)). Further, generally linking the abstract idea to a particular technological environment or field of use (by a machine-learning model) cannot integrate the abstract idea into a practical application (see MPEP 2106.05(h)); these additional elements merely specify that the above mental process steps are performed in a particular technological environment. Therefore, Claim 1 is directed to an abstract idea.

Finally, the additional elements, taken alone or in combination, do not represent significantly more than the abstract idea itself. Generally linking the abstract idea to a field of use or technological environment (by a machine-learning model) does not provide an inventive concept (see MPEP 2106.05(h)) and using generic computer components (one or more processors; and a non-transitory computer readable medium storing a plurality of instructions, which when executed, cause the one or more processors to:) to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer which cannot provide an inventive concept. Therefore, Claim 1 is subject-matter ineligible. 

Regarding Claim 2, 
Claim 2 depends on claim 1, and only includes an additional element (…receive at least one input value) which amounts to insignificant extra-solution activity of data gathering. See MPEP 2106.05(g). Further, MPEP 2106.05(d)(II) notes the following: "The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity...i. Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 610, 118 USPQ2d 1744, 1745 (Fed. Cir. 2016) (using a telephone for image transmission); OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network);". Accordingly, the additional element does not integrate the abstract idea into a practical application because the recitation of insignificant extra-solution activity is well-understood, routine, and conventional. The claim thus remains subject-matter ineligible.

Regarding Claims 3-7, 
Claims 3-7 are dependent on claim 1, and only include additional limitations drawn to mental processes (Claim 3: wherein tokenizing the raw values and the corresponding standardized values into the raw token sequences and the corresponding standardized token sequences comprises aligning the raw token sequences with the corresponding standardized token sequences; Claim 4: wherein determining the probability of inserting the insertion token after the insertion markable token in the input token sequence is based on a count of instances that the insertion token is inserted after a class of the insertion markable token in any raw token sequence and a count of instances that any raw token sequence includes the class of the insertion markable token; Claim 5: wherein determining the probability of substituting the substitution token for the substitutable token in the input token sequence is based on a count of instances that the substitution token is substituted for a class of the substitutable token in any raw token sequence and a count of instances that any raw token sequence includes the class of the substitutable token; Claim 6: …rearrange tokens in the input token sequence; Claim 7: …join tokens in the input token sequence.) These claims do not recite any additional elements beyond those recited in claim 1, and as such do not recite any additional elements which could integrate the abstract idea into a practical application or be significantly more than the abstract idea. The claims thus remain subject-matter ineligible.

Regarding Claim 8,
Claim 8 is directed to A computer program… which is directed to an article of manufacture, one of the statutory categories. Claim 8 recites: A computer program product comprising computer-readable program code to be executed by one or more processors when retrieved from a non-transitory computer-readable medium the program code including instructions to, which executes a process similar to the processes executed by the system of claim 1. As performing a mental process or abstract idea on a generic computer component cannot integrate the abstract idea into a practical application and cannot provide an inventive concept, Claim 8 remains subject matter ineligible.


Regarding Claims 9-14, 
Claims 9-14 are dependent on claim 8 and recite limitations similar to the limitations recited in claims 2-7, and are therefore rejected with the same rationale applied to claims 2-7. These claims do not recite any additional elements beyond those recited in independent claim 8, and as such do not recite any additional elements which could integrate the abstract idea into a practical application or be significantly more than the abstract idea. The claims thus remain subject-matter ineligible.

Regarding Claim 15,
Claim 15 is directed to A method, which is directed to a process, one of the statutory categories. Claim 15 recites: A method for machine-learnt field-specific standardization, which performs operations similar to the processes executed by the system of claim 1. As performing a mental process or abstract idea with a method cannot integrate the abstract idea into a practical application and cannot provide an inventive concept, Claim 15 remains subject-matter ineligible.

Regarding Claims 16-19, 
Claims 16-19 are dependent on claim 15 and recite limitations similar to the limitations recited in claims 2-5, and are therefore rejected with the same rationale applied to claims 2-5. These claims do not recite any additional elements beyond those recited in independent claim 15, and as such do not recite any additional elements which could integrate the abstract idea into a practical application or be significantly more than the abstract idea. The claims thus remain subject-matter ineligible.

Regarding Claim 20, 
Claim 20 only includes additional limitations drawn to mental processes (rearranging tokens in the input token sequence; and joining tokens in the input token sequence). This claim does not recite any additional elements beyond those recited in claim 15, and as such does not recite any additional elements which could integrate the abstract idea into a practical application or be significantly more than the abstract idea. Claim 20 thus remains subject-matter ineligible.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Wu et al. (“Learning Data Transformation Rules through Examples: Preliminary Results”) in view of Lefebure et al. (US 20190108257 A1). 

Regarding Claim 1,
Wu teaches:  
A system for machine-learnt field-specific standardization, the system comprising: one or more processors; and
a non-transitory computer readable medium storing a plurality of instructions, which when executed, cause the one or more processors to: (Page 4, Section 3.2: “Our searchOneTime algorithm uses the UCT algorithm[5], an extension of the UCB algorithm[1], to balance the exploration and exploitation in a large search space.” and Page 6: “This paper introduced a data transformation approach. It learns data transformations from user-provided examples. We define a grammar to describe common user editing behaviors. Our approach then reduces the larger grammar space to a disjunction of subgrammar spaces using examples. We apply a UCT-based search algorithm to identify consistent transformation programs in these subgrammar spaces.” teaches using an algorithm based on a UCT algorithm for data transformation, this suggests a computer-based implementation)

tokenize raw values and corresponding standardized values into raw token sequences and corresponding standardized token sequences; (Fig. 1 and Page 3, Section 3.1: “We use the following algorithm to generate subgrammar spaces. Step 1: Tokenize the original and target strings and add special start and end tokens” teaches tokenizing the original string (raw values) and the target string (standardized values) into tokenized strings (sequences))

learn, by a machine-learning model, standardization from token insertions and token substitutions that modify the raw token sequences to match the corresponding standardized token sequences; (Page 3, Section 3.1: “The search space for transformation programs is (ins|mov|del)∗. Without loss of generality we refactor this space into (ins)∗(mov)∗(del)∗. The insert phase (ins)∗, inserts tokens that appear in the target token sequence but were not part of the original token sequence. The move phase (mov)∗ reorders the tokens so that they appear in the same ordering as in the target sequence. The delete phase (del)∗ removes tokens that do not appear in the target token sequence.” teaches performing token insertion and token deletion (a combination of token insertion and token deletion); Page 2, Fig. 1 and Page 3, Section 3.1: “In step 3, we generate a set of possible edit operation sequences from each alignment. For the pair “1 Lombard Street,London” and “London,1 Lombard Street”, the alignment algorithm determines that all tokens in the target token sequence can be mapped to the original token sequence. Consequently, only move operations are needed.” teaches performing token modifications to the original (raw) token sequence to match the target (standardized) token sequence; Page 3, Section 3.1: “In step 4, in order to generate data transformations that cover all examples, we cluster the edit operations derived from the same transformation program. We cluster together the edit sequences with the same length and edit operator type of different examples.” and Page 5, Section 3.3: “We rank the transformation programs using the confidence of the logistic regression classifier that classifies the results as regular or irregular. Unless there is a clear winner, we show users the top K results.” teaches performing clustering and performing logistic regression (machine learning models) to cluster transformations and rank transformation programs)

tokenize an input value into an input token sequence; (Page 3, Section 3.1: “We use the following algorithm to generate subgrammar spaces. Step 1: Tokenize the original and target strings and add special start and end tokens” and “In step 1, our approach tokenizes all the examples. For example “1 Lombard Street,London” is tokenized as START() NUMTYP(1) BNKTYP( ) WRDTYP(Lombard) BNKTYP( ) WRDTYP(Street) SYBTYP(,) WRDTYP(London) END(). NUMTYP, BNKTYP, WRDTYP and SYBTYP represent different token types (number, blank, word or symbol). The START() and END() identify the start and end of the original token sequence.” teaches tokenizing the original string (input value) into a sequence of tokens)
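
For illustration only, the typed tokenization quoted from Wu can be sketched as follows. The token type names and the START()/END() markers come from the quoted passage; the regular expression, the function names, and the choice to treat each whitespace character and each symbol as its own token are assumptions, and this is not presented as Wu's implementation.

```python
import re

# One named group per token type named in the quoted passage.
# Assumption: ASCII letters form words; any other non-space,
# non-digit character is a symbol.
TOKEN_PATTERN = re.compile(
    r"(?P<NUMTYP>\d+)|(?P<WRDTYP>[A-Za-z]+)|(?P<BNKTYP>\s)|(?P<SYBTYP>.)",
    re.DOTALL,
)


def tokenize(value):
    """Split a string into (type, text) pairs wrapped in START and END."""
    tokens = [("START", "")]
    for match in TOKEN_PATTERN.finditer(value):
        # lastgroup is the name of the alternative that matched.
        tokens.append((match.lastgroup, match.group()))
    tokens.append(("END", ""))
    return tokens


def render(tokens):
    """Format tokens in the TYPE(text) notation used in the quotation."""
    return " ".join(f"{kind}({text})" for kind, text in tokens)


print(render(tokenize("1 Lombard Street,London")))
# START() NUMTYP(1) BNKTYP( ) WRDTYP(Lombard) BNKTYP( ) WRDTYP(Street) SYBTYP(,) WRDTYP(London) END()
```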

Wu does not appear to explicitly teach:  
determine, by the machine-learning model, a probability of inserting an insertion token after an insertion markable token in the input token sequence;
determine whether the probability of inserting the insertion token satisfies a threshold;
insert the insertion token after the insertion markable token in the input token sequence, in response to a determination that the probability of inserting the insertion token satisfies the threshold;
determine, by the machine-learning model, a probability of substituting a substitution token for a substitutable token in the input token sequence;
determine whether the probability of substituting the substitution token satisfies another threshold; and
substitute the substitution token for the substitutable token in the input token sequence, in response to a determination that the probability of substituting the substitution token satisfies the other threshold.

However, Lefebure teaches: 
determine, by the machine-learning model, a probability of inserting an insertion token after an insertion markable token in the input token sequence; (Para [0095]: “For token insertion, the position of the edit is between two tokens, the earlier having a low backward probability and the later having a low forward probability.” teaches determining the backward and forward probabilities of tokens for token insertion at the position of edit (insertion markable token); Para [0097]: “A forward token probability module 41 receives an input token sequence and uses a forward SLM 42 to produce a forward probability for each token in the sequence. A backward probability module 43 receives the input token sequence and uses a backward SLM 44 to produce a backward probability for each token in the sequence.” teaches that the forward and backward probabilities are determined by SLMs; Para [0078] – [0079]: “A statistical language model (SLM) captures the statistics of neighboring words in a given corpus of expressions. Applying a SLM to token sequence hypotheses significantly improves the accuracy of ASR systems. A forward SLM represents the conditional probability of the next token given one or a sequence of prior tokens. A backward SLM represents the conditional probability of an immediately prior token given one or a sequence of following tokens. Any given pair of tokens can have very different probabilities in each of the forward and backward direction.” and Para [0104]: “Choosing the most useful rewrites depends on having accurate SLMs. An SLM is most accurate if built from a corpus of expressions of the same type as the expressions to rewrite. For example, a corpus of expressions in Twitter™ tweets has very different SLM probabilities than a corpus of expressions from articles in the New York Times™ newspaper. Likewise, a corpus of expressions for a virtual assistant in general has different SLM probabilities than a corpus of expressions specific to a weather domain.” teaches that SLMs are trained machine learning models used to generate probability distributions)

determine whether the probability of inserting the insertion token satisfies a threshold; (Para [0095]: “For token insertion, the position of the edit is between two tokens, the earlier having a low backward probability and the later having a low forward probability.” teaches determining if a token should be inserted based on a low forward probability and low backward probability; Para [0100]: “Some embodiments determine a low probability by comparing probabilities to thresholds.” and Para [0135]: “Various embodiments create a rewrite by identifying positions at which backward probabilities are below a threshold, forward probabilities are below a threshold, or a combination of backward and forward probabilities are below a threshold.” teaches that the determination of low probabilities is based on threshold values; therefore, determining that the backward and forward probabilities are low corresponds to determining that these probabilities are below (i.e., satisfy) a threshold)

insert the insertion token after the insertion markable token in the input token sequence, in response to a determination that the probability of inserting the insertion token satisfies the threshold; (Para [0095]: “For token insertion, the position of the edit is between two tokens, the earlier having a low backward probability and the later having a low forward probability.” teaches determining if a token should be inserted based on a low forward probability and low backward probability; Para [0100]: “Some embodiments determine a low probability by comparing probabilities to thresholds.” and Para [0135]: “Various embodiments create a rewrite by identifying positions at which backward probabilities are below a threshold, forward probabilities are below a threshold, or a combination of backward and forward probabilities are below a threshold.” teaches that the determination of low probabilities is based on threshold values; therefore, determining that the backward and forward probabilities are low corresponds to determining that these probabilities are below (i.e., satisfy) a threshold; Fig. 28 teaches rewriting token sequence “is it going to” by inserting tokens “rain” and “<date>” to form the new token sequence)
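
For illustration only, the insertion-position test quoted from Lefebure Para [0095] (an earlier token with a low backward probability followed by a later token with a low forward probability) can be sketched as follows. The function name, the threshold values, and the per-token probabilities in the example are hypothetical and are not taken from the reference.

```python
def insertion_positions(forward_probs, backward_probs,
                        forward_threshold=0.1, backward_threshold=0.1):
    """Return indices before which a token could be inserted.

    forward_probs[i] and backward_probs[i] are the forward and backward
    probabilities of the token at index i. An insertion point lies
    between tokens i and i + 1 when token i has a low backward
    probability and token i + 1 has a low forward probability.
    """
    positions = []
    for i in range(len(forward_probs) - 1):
        earlier_is_low = backward_probs[i] < backward_threshold
        later_is_low = forward_probs[i + 1] < forward_threshold
        if earlier_is_low and later_is_low:
            positions.append(i + 1)
    return positions


# Hypothetical probabilities for the tokens of "what is weather today".
tokens = ["what", "is", "weather", "today"]
forward = [0.60, 0.50, 0.02, 0.40]
backward = [0.50, 0.03, 0.40, 0.60]

for index in reversed(insertion_positions(forward, backward)):
    tokens.insert(index, "the")  # "the" is a hypothetical insertion token
print(" ".join(tokens))
```

Choosing which token to insert is outside this sketch; the example simply inserts a fixed token at each detected position.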

determine, by the machine-learning model, a probability of substituting a substitution token for a substitutable token in the input token sequence; (Para [0095]: “For token replacement, the position of the edit is at the low probability token to be replaced.” and Para [0097]: “A forward token probability module 41 receives an input token sequence and uses a forward SLM 42 to produce a forward probability for each token in the sequence. A backward probability module 43 receives the input token sequence and uses a backward SLM 44 to produce a backward probability for each token in the sequence. An edit module 45 receives the forward and backward probability sequences, finds a token position with a low probability in both the forward and the backward direction, and replaces that token with another to produce a new rewritten token sequence. Some embodiments as in FIG. 4 perform token replacement conditionally, only if it at least one token has a sufficiently low probability.” teaches determining the backward and forward probabilities for token replacement (token substitution) at the position of edit (token that needs to be replaced, substitutable token) by using SLMs; Para [0078] – [0079]: “A statistical language model (SLM) captures the statistics of neighboring words in a given corpus of expressions. Applying a SLM to token sequence hypotheses significantly improves the accuracy of ASR systems. A forward SLM represents the conditional probability of the next token given one or a sequence of prior tokens. A backward SLM represents the conditional probability of an immediately prior token given one or a sequence of following tokens. Any given pair of tokens can have very different probabilities in each of the forward and backward direction.” and Para [0104]: “Choosing the most useful rewrites depends on having accurate SLMs. An SLM is most accurate if built from a corpus of expressions of the same type as the expressions to rewrite. For example, a corpus of expressions in Twitter™ tweets has very different SLM probabilities than a corpus of expressions from articles in the New York Times™ newspaper. Likewise, a corpus of expressions for a virtual assistant in general has different SLM probabilities than a corpus of expressions specific to a weather domain.” teaches that SLMs are trained machine learning models used to generate probability distributions)

determine whether the probability of substituting the substitution token satisfies another threshold; and (Para [0095]: “For token replacement, the position of the edit is at the low probability token to be replaced.” and Para [0097]: “A forward token probability module 41 receives an input token sequence and uses a forward SLM 42 to produce a forward probability for each token in the sequence. A backward probability module 43 receives the input token sequence and uses a backward SLM 44 to produce a backward probability for each token in the sequence. An edit module 45 receives the forward and backward probability sequences, finds a token position with a low probability in both the forward and the backward direction, and replaces that token with another to produce a new rewritten token sequence. Some embodiments as in FIG. 4 perform token replacement conditionally, only if it at least one token has a sufficiently low probability.” teaches performing token replacement if backward and forward probabilities are low; Para [0100]: “Some embodiments determine a low probability by comparing probabilities to thresholds.” and Para [0135]: “Various embodiments create a rewrite by identifying positions at which backward probabilities are below a threshold, forward probabilities are below a threshold, or a combination of backward and forward probabilities are below a threshold.” teaches that the determination of low probabilities is based on threshold values; therefore, determining that the backward and forward probabilities are low corresponds to determining that these probabilities are below (i.e., satisfy) a threshold)

substitute the substitution token for the substitutable token in the input token sequence, in response to a determination that the probability of substituting the substitution token satisfies the other threshold. (Para [0095]: “For token replacement, the position of the edit is at the low probability token to be replaced.” and Para [0097]: “A forward token probability module 41 receives an input token sequence and uses a forward SLM 42 to produce a forward probability for each token in the sequence. A backward probability module 43 receives the input token sequence and uses a backward SLM 44 to produce a backward probability for each token in the sequence. An edit module 45 receives the forward and backward probability sequences, finds a token position with a low probability in both the forward and the backward direction, and replaces that token with another to produce a new rewritten token sequence. Some embodiments as in FIG. 4 perform token replacement conditionally, only if it at least one token has a sufficiently low probability.” teaches performing token replacement if backward and forward probabilities are low; Para [0100]: “Some embodiments determine a low probability by comparing probabilities to thresholds.” and Para [0135]: “Various embodiments create a rewrite by identifying positions at which backward probabilities are below a threshold, forward probabilities are below a threshold, or a combination of backward and forward probabilities are below a threshold.” teaches that the determination of low probabilities is based on threshold values; therefore, determining that the backward and forward probabilities are low corresponds to determining that these probabilities are below (i.e., satisfy) a threshold; Figs. 15A-15C teach replacing the token “pin” in “weather pin Hawaii”)

Wu and Lefebure are analogous art because they are directed to performing modifications on an input token sequence.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Lefebure’s SLMs to obtain forward and backward probabilities to determine token modification with the input token sequences of Wu with a motivation to create a higher performance and more accurate natural language understanding system (Lefebure, Para [0018]).

Regarding Claim 2, 
The combination of Wu and Lefebure teaches: 
The system of claim 1, 
Wu further teaches: 
wherein the plurality of instructions, when executed, will further cause the one or more processors to receive at least one input value. (Page 3, Section 3.1: “We use the following algorithm to generate subgrammar spaces. Step 1: Tokenize the original and target strings and add special start and end tokens” and “In step 1, our approach tokenizes all the examples. For example “1 Lombard Street,London” is tokenized as START() NUMTYP(1) BNKTYP( ) WRDTYP(Lombard) BNKTYP( ) WRDTYP(Street) SYBTYP(,) WRDTYP(London) END(). NUMTYP, BNKTYP, WRDTYP and SYBTYP represent different token types (number, blank, word or symbol). The START() and END() identify the start and end of the original token sequence.” teaches receiving the input value “1 Lombard Street,London”)

Regarding Claim 3, 
The combination of Wu and Lefebure teaches: 
The system of claim 1, 
Wu further teaches: 
wherein tokenizing the raw values and the corresponding standardized values into the raw token sequences and the corresponding standardized token sequences comprises aligning the raw token sequences with the corresponding standardized token sequences. (Page 3, Section 3.1: “Step 2: Generate alignments between examples.” and “In step 2, our approach uses a simple alignment algorithm to identify the same tokens in the original and target token sequences (e.g., the token “London” in Figure 1). When tokens appear multiple times, the algorithm generates all possible alignments.” teaches aligning the original (raw) and target (standardized) token sequences)
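
For illustration only, the alignment step quoted from Wu (matching identical tokens and producing every possible alignment when a token repeats) can be sketched as follows. The function name, the one-to-one matching constraint, and the use of None for a target token with no match are assumptions; this is not presented as Wu's alignment algorithm.

```python
from itertools import product


def alignments(original, target):
    """Enumerate mappings from target positions to original positions.

    Each alignment is a list whose k-th entry is the index in original
    of the token matched to target[k], or None when target[k] does not
    occur in original. Assumption: an original position is used at most
    once within a single alignment.
    """
    candidates = []
    for token in target:
        matches = [i for i, other in enumerate(original) if other == token]
        candidates.append(matches or [None])

    results = []
    for combination in product(*candidates):
        used = [i for i in combination if i is not None]
        if len(used) == len(set(used)):
            results.append(list(combination))
    return results


original = ["1", " ", "Lombard", " ", "Street", ",", "London"]
target = ["London", ",", "1", " ", "Lombard", " ", "Street"]
for alignment in alignments(original, target):
    print(alignment)
```

Because the blank token appears twice in both sequences, the example yields two alignments that differ only in which blank is matched to which.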

Regarding Claim 4, 
The combination of Wu and Lefebure teaches: 
The system of claim 1, 
Lefebure further teaches: 
wherein determining the probability of inserting the insertion token after the insertion markable token in the input token sequence is based on a count of instances that the insertion token is inserted after a class of the insertion markable token in any raw token sequence and a count of instances that any raw token sequence includes the class of the insertion markable token. (Para [0095]: “For token insertion, the position of the edit is between two tokens, the earlier having a low backward probability and the later having a low forward probability.” teaches determining if a token should be inserted based on a low forward probability and low backward probability; Para [0109]: “FIG. 9 shows a forward SLM built from a corpus with a large number of expressions that each match a phrasing in the grammar of FIG. 7. Each row corresponds to a sequence of most recent tokens (supporting only 1-token sequences). Other embodiments use sequences of more than one token to more accurately predict a next token. Columns correspond to the token following the sequence. Cells indicate probabilities. The symbols <s> and </s> indicate the beginning and end of a token sequence. For example, the probability that the first word at the beginning of a sequence is “what” is 0.69, and the probability that the last word at the end of a sequence is a date entity is 0.67. The probability that the word “what” follows the word “the” is 0.01, indicating that it is infrequent for people to say, “the what”” teaches the forward probabilities for a token are determined based on a plurality of instances where the token was placed after a given token; Para [0110]: “FIG. 10 shows a backward SLM built from a corpus with a large number of expressions that each match a phrasing in the grammar of FIG. 7. Each row corresponds to a sequence of following tokens (supporting only 1-token sequences). Columns correspond to the token preceding the sequence. It shows, for example, that the probability of a token sequence ending with the word “weather” is 0.128. The probability of the word “will” following the word “what” is 0.51, indicating that it is relatively common for people to say, “what will”” teaches the backward probabilities for a token are determined based on a plurality of instances where the token was placed before a given token)

The combination of claim 1 has already incorporated the SLMs used to obtain forward and backward probabilities for token modification, therefore already incorporating the details of the count of instances required by claim 4. 
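
For illustration only, the way forward and backward probabilities can follow from counts of token instances in a corpus can be sketched with a bigram model. This is a minimal sketch under stated assumptions, not Lefebure's implementation: the function name, the whitespace tokenization, the restriction to 1-token contexts, and the two-expression corpus are all hypothetical.

```python
from collections import Counter


def build_bigram_slms(corpus):
    """Build forward and backward bigram probability tables from counts.

    forward[(prev, nxt)]  estimates P(nxt | prev)
        = count(prev followed by nxt) / count(prev as the earlier token)
    backward[(prev, nxt)] estimates P(prev | nxt)
        = count(prev followed by nxt) / count(nxt as the later token)
    """
    pair_counts = Counter()
    earlier_counts = Counter()
    later_counts = Counter()
    for expression in corpus:
        # <s> and </s> mark the beginning and end of a token sequence.
        tokens = ["<s>"] + expression.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            pair_counts[(prev, nxt)] += 1
            earlier_counts[prev] += 1
            later_counts[nxt] += 1
    forward = {pair: count / earlier_counts[pair[0]]
               for pair, count in pair_counts.items()}
    backward = {pair: count / later_counts[pair[1]]
                for pair, count in pair_counts.items()}
    return forward, backward


corpus = ["what is the weather", "what will the weather be"]
forward, backward = build_bigram_slms(corpus)
print(forward[("what", "is")])    # 1 of the 2 occurrences of "what" precede "is"
print(backward[("is", "the")])    # 1 of the 2 occurrences of "the" follow "is"
```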

Regarding Claim 5, 
The combination of Wu and Lefebure teaches: 
The system of claim 1, 
Lefebure further teaches: 
wherein determining the probability of substituting the substitution token for the substitutable token in the input token sequence is based on a count of instances that the substitution token is substituted for a class of the substitutable token in any raw token sequence and a count of instances that any raw token sequence includes the class of the substitutable token. (Para [0097]: “A forward token probability module 41 receives an input token sequence and uses a forward SLM 42 to produce a forward probability for each token in the sequence. A backward probability module 43 receives the input token sequence and uses a backward SLM 44 to produce a backward probability for each token in the sequence. An edit module 45 receives the forward and backward probability sequences, finds a token position with a low probability in both the forward and the backward direction, and replaces that token with another to produce a new rewritten token sequence. Some embodiments as in FIG. 4 perform token replacement conditionally, only if it at least one token has a sufficiently low probability.” teaches performing token replacement if backward and forward probabilities are low; Para [0109]: “FIG. 9 shows a forward SLM built from a corpus with a large number of expressions that each match a phrasing in the grammar of FIG. 7. Each row corresponds to a sequence of most recent tokens (supporting only 1-token sequences). Other embodiments use sequences of more than one token to more accurately predict a next token. Columns correspond to the token following the sequence. Cells indicate probabilities. The symbols <s> and </s> indicate the beginning and end of a token sequence. For example, the probability that the first word at the beginning of a sequence is “what” is 0.69, and the probability that the last word at the end of a sequence is a date entity is 0.67. The probability that the word “what” follows the word “the” is 0.01, indicating that it is infrequent for people to say, “the what”” teaches the forward probabilities for a token are determined based on a plurality of instances where the token was placed after a given token; Para [0110]: “FIG. 10 shows a backward SLM built from a corpus with a large number of expressions that each match a phrasing in the grammar of FIG. 7. Each row corresponds to a sequence of following tokens (supporting only 1-token sequences). Columns correspond to the token preceding the sequence. It shows, for example, that the probability of a token sequence ending with the word “weather” is 0.128. The probability of the word “will” following the word “what” is 0.51, indicating that it is relatively common for people to say, “what will”” teaches the backward probabilities for a token are determined based on a plurality of instances where the token was placed before a given token)

The combination of claim 1 has already incorporated the SLMs used to obtain forward and backward probabilities for token modification, therefore already incorporating the details of the count of instances required by claim 5. 
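
For illustration only, the count-based reading of an SLM relied on above can be sketched as follows. The sketch assumes a 1-token (bigram) model estimated by relative frequency, which is one common way to build such a table and is not stated to be Lefebure's actual construction. The function name build_bigram_slm and the three-sentence corpus are hypothetical.

```python
from collections import Counter

def build_bigram_slm(corpus, backward=False):
    """Return {context: {token: probability}} estimated from bigram counts.

    Assumption: each probability is count(context, token) / count(context).
    With backward=True the sequences are reversed, so the table gives the
    probability of the token that precedes a given context token.
    """
    pair_counts = Counter()
    context_counts = Counter()
    for tokens in corpus:
        seq = ["<s>"] + list(tokens) + ["</s>"]
        if backward:
            seq.reverse()
        for context, token in zip(seq, seq[1:]):
            pair_counts[(context, token)] += 1
            context_counts[context] += 1
    slm = {}
    for (context, token), n in pair_counts.items():
        slm.setdefault(context, {})[token] = n / context_counts[context]
    return slm

# Hypothetical toy corpus, not taken from either reference.
corpus = [
    ["what", "is", "the", "weather"],
    ["what", "will", "the", "weather", "be"],
    ["what", "is", "the", "time"],
]
forward = build_bigram_slm(corpus)
backward = build_bigram_slm(corpus, backward=True)
```

In this sketch every table entry is a ratio of two instance counts: how often a token appears next to a context token, and how often that context token appears at all.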

Regarding Claim 6, 
The combination of Wu and Lefebure teaches: 
The system of claim 1, 
Wu further teaches: 
wherein the plurality of instructions, when executed, will further cause the one or more processors to rearrange tokens in the input token sequence (Page 3, Section 3.1: “The move phase (mov)∗ reorders the tokens so that they appear in the same ordering as in the target sequence.” and “In step 3, we generate a set of possible edit operation sequences from each alignment. For the pair “1 Lombard Street,London” and “London,1 Lombard Street”, the alignment algorithm determines that all tokens in the target token sequence can be mapped to the original token sequence. Consequently, only move operations are needed. In the move phase, the alignment results would indicate the new position of the token NUM(1) is behind WRDTYP(London). A mov operation such as mov(7,9,0) would be generated, which means moving the subtoken sequence located within position 7 and 9 to position 0. In the delete phase, the algorithm deletes the START and END tokens.” teaches reordering the tokens in the input sequence by using a move operation)
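
For illustration only, a move operation of the kind quoted from Wu can be sketched as below. The helper move_tokens, its inclusive index convention, and the tokenization of the address are assumptions made for this sketch; they do not reproduce Wu's position numbering (Wu's sequence also carries START and END tokens, which are omitted here).

```python
def move_tokens(tokens, start, end, dest):
    """Move tokens[start..end] (inclusive) so the block begins at index
    dest of the sequence that remains after the block is taken out."""
    block = tokens[start:end + 1]
    rest = tokens[:start] + tokens[end + 1:]
    return rest[:dest] + block + rest[dest:]

# Hypothetical tokenization of "1 Lombard Street,London".
tokens = ["1", " ", "Lombard", " ", "Street", ",", "London"]

# Move "London" to the front, then move the comma directly behind it.
step1 = move_tokens(tokens, 6, 6, 0)
reordered = move_tokens(step1, 6, 6, 1)
```

Under these assumptions, only move operations are needed to turn the first address form into the second, consistent with the quoted passage.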

Regarding Claim 7, 
The combination of Wu and Lefebure teaches: 
The system of claim 1, 
Wu further teaches: 
wherein the plurality of instructions, when executed, will further cause the one or more processors to join tokens in the input token sequence. (Page 3, Section 3.1: “The search space for transformation programs is (ins|mov|del)∗. Without loss of generality we refactor this space into (ins)∗(mov)∗(del)∗. The insert phase (ins)∗, inserts tokens that appear in the target token sequence but were not part of the original token sequence. The move phase (mov)∗ reorders the tokens so that they appear in the same ordering as in the target sequence. The delete phase (del)∗ removes tokens that do not appear in the target token sequence.” teaches performing token insertion and deletion; Page 6, Table 1 teaches modifying the original token sequence “Brankova&nbsp;13” to “Brankova 13” by deleting tokens “&”, “nbsp”, and “;” to join tokens “Brankova” and “13” together (join tokens in input sequence))
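
For illustration only, the joining effect attributed to Wu's delete phase can be sketched as below. The helper delete_tokens, its inclusive index convention, and the five-token split of the example value are assumptions made for this sketch. The sketch shows only that deleting the intervening tokens leaves “Brankova” and “13” adjacent; it does not model how the blank in the Table 1 result is produced.

```python
def delete_tokens(tokens, start, end):
    """Return a copy of tokens with positions start..end (inclusive) removed."""
    return tokens[:start] + tokens[end + 1:]

# Hypothetical tokenization of the Table 1 example value.
tokens = ["Brankova", "&", "nbsp", ";", "13"]

# Removing the three entity tokens leaves the outer tokens adjacent.
joined = delete_tokens(tokens, 1, 3)
```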

Regarding Claim 8, 
Claim 8 recites A computer program product… which recites limitations that are similar to claim 1, thus is rejected with the same rationale applied against claim 1.

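Regarding Claim 9, 
Claim 9 recites The computer program product of claim 8… which recites limitations that are similar to claim 2, thus is rejected with the same rationale applied against claim 2.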

Regarding Claim 10, 
Claim 10 recites The computer program product of claim 8… which recites limitations that are similar to claim 3, thus is rejected with the same rationale applied against claim 3.

Regarding Claim 11, 
Claim 11 recites The computer program product of claim 8… which recites limitations that are similar to claim 4, thus is rejected with the same rationale applied against claim 4.

Regarding Claim 12, 
Claim 12 recites The computer program product of claim 8… which recites limitations that are similar to claim 5, thus is rejected with the same rationale applied against claim 5.

Regarding Claim 13, 
Claim 13 recites The computer program product of claim 8… which recites limitations that are similar to claim 6, thus is rejected with the same rationale applied against claim 6.

Regarding Claim 14, 
Claim 14 recites The computer program product of claim 8… which recites limitations that are similar to claim 7, thus is rejected with the same rationale applied against claim 7.

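Regarding Claim 15, 
Claim 15 recites A method… which recites limitations that are similar to claim 1, thus is rejected with the same rationale applied against claim 1.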

Regarding Claim 16, 
Claim 16 recites The method of claim 15… which recites limitations that are similar to claim 2, thus is rejected with the same rationale applied against claim 2.

Regarding Claim 17, 
Claim 17 recites The method of claim 15… which recites limitations that are similar to claim 3, thus is rejected with the same rationale applied against claim 3.

Regarding Claim 18, 
Claim 18 recites The method of claim 15… which recites limitations that are similar to claim 4, thus is rejected with the same rationale applied against claim 4.

Regarding Claim 19, 
Claim 19 recites The method of claim 15… which recites limitations that are similar to claim 5, thus is rejected with the same rationale applied against claim 5.

Regarding Claim 20, 
The combination of Wu and Lefebure teaches: 
The method of claim 15, 
Wu further teaches: 
rearranging tokens in the input token sequence; and (Page 3, Section 3.1: “The move phase (mov)∗ reorders the tokens so that they appear in the same ordering as in the target sequence.” and “In step 3, we generate a set of possible edit operation sequences from each alignment. For the pair “1 Lombard Street,London” and “London,1 Lombard Street”, the alignment algorithm determines that all tokens in the target token sequence can be mapped to the original token sequence. Consequently, only move operations are needed. In the move phase, the alignment results would indicate the new position of the token NUM(1) is behind WRDTYP(London). A mov operation such as mov(7,9,0) would be generated, which means moving the subtoken sequence located within position 7 and 9 to position 0. In the delete phase, the algorithm deletes the START and END tokens.” teaches reordering the tokens in the input sequence by using a move operation)
joining tokens in the input token sequence. (Page 3, Section 3.1: “The search space for transformation programs is (ins|mov|del)∗. Without loss of generality we refactor this space into (ins)∗(mov)∗(del)∗. The insert phase (ins)∗, inserts tokens that appear in the target token sequence but were not part of the original token sequence. The move phase (mov)∗ reorders the tokens so that they appear in the same ordering as in the target sequence. The delete phase (del)∗ removes tokens that do not appear in the target token sequence.” teaches performing token insertion and deletion; Page 6, Table 1 teaches modifying the original token sequence “Brankova&nbsp;13” to “Brankova 13” by deleting tokens “&”, “nbsp”, and “;” to join tokens “Brankova” and “13” together (join tokens in input sequence))

Conclusion
The prior art made of record and not relied upon is considered pertinent to the applicant’s disclosure: 
Saxena et al. (US 10839156 B1) teaches normalizing addresses using an LSTM model and conditional random field model. 
Dewitt et al. (US 20200356363 A1) teaches performing data normalization by using tokenization and machine learning models. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHOUN ABRAHAM, whose telephone number is (571) 272-8144. The examiner can normally be reached Mon - Fri 08:00-16:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/S.J.A./ Examiner, Art Unit 2125

/BRIAN M SMITH/ Primary Examiner, Art Unit 2122