DETAILED ACTION
This action is responsive to the Amendment filed on 10/21/2022. Claims 1-11, 13-20 are pending in the case. Claim 12 is canceled. Claims 1, 14 and 18 are the independent claims.
This office action is FINAL.
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
Applicant’s Response
In Applicant’s response dated 10/21/2022 (hereinafter Response), Applicant amended Claims 1-2, 9, 14, and 17-18; cancelled Claim 12; and argued against all objections and rejections previously set forth in the Office Action dated 07/22/2022.
Applicant’s amendments to claims 1-2, 9, 14, and 17-18 to further clarify the metes and bounds of the invention are acknowledged.
Response to Amendment/Arguments
Applicant’s cancelation of claim 12 renders the rejection of this claim under 35 USC 112(b) moot.
In response to Applicant's argument with respect to the previous rejection of claims 1-11, 13-20 under 35 USC 112(b) as being indefinite for reciting “fewer” in the limitation “formed…from a regularized character set having fewer characters than a natural language character set,” Examiner notes Applicant’s argument (see Response page 8) that Applicant is relying on the broadest interpretation of “fewer,” such that so long as the size of the regularized character set (the characters used for encoding a string) is strictly smaller than the size of the natural language character set (the characters used for the string which is encoded), the limitation is met. Accordingly, Examiner respectfully withdraws the rejection of all claims under 35 USC 112(b), particularly as Applicant has amended the independent claims to include at least one regularized character (a matching character) in the regularized character set.
Applicant’s request (see Response page 10) that Examiner withdraw the interpretation of claims 1-11 and 13-20 under 35 USC 112(f) is acknowledged; however, Applicant makes no statement on pages 9-10 which is persuasive that none of the terms of claims 14-17 should be interpreted under 35 USC 112(f).
If applicant does not intend to have these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
In response to Applicant's argument that independent claims 1, 14, and 18, as amended, are no longer anticipated by HOSSEINI (see Response pages 10-11), Examiner agrees.
Accordingly, new grounds of rejection under 35 USC 103 are provided below which are responsive to Applicant’s amendment to the independent claims.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations, as recited in claim 14, are:
Note: the generic placeholders have been bolded for Applicant’s convenience, while the functional language has been underlined; the remaining language is any words which may either be a structural modifier that identifies a known structure for performing the function or merely a naming convention for the generic placeholder.
a character embedding component configured to embed each character of the regularized input sequence and the regularized candidate sequence to produce regularized input character embeddings and regularized candidate character embeddings, respectively,
a word embedding component configured to embed the regularized input character embeddings and the regularized candidate character embeddings to produce an embedded input vector and an embedded candidate vector, respectively;
a scoring component configured to compute a similarity score based on the embedded input vector and the embedded candidate vector.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
The relevant portions of the disclosure as originally filed which are identified as providing the structure and/or description (though not necessarily the algorithms necessary to implement each of these computer-implemented limitations) are:
a character embedding (FIG 3 (325) [0045,0051]; FIG 4 (405), [0060]; FIG 5 (505)) component configured to embed… [0053], [0066] (no details, only result)
a word embedding component (FIG 3 (330) [0045,0051]; FIG 4 (410) [0060]; FIG 5 (510) BERT) configured to embed… [0054-0055], [0061], [0075]
a scoring component (FIG 3 (335) [0045,0051]; FIG 4 (415) [0060]; FIG 5 (510) BERT) configured to compute a similarity score… [0056], [0065], [0075]

The character encoder which is also recited in claim 14 does not meet all the requirements of the three-pronged analysis because it recites some structure (wherein the regularized input sequence and the regularized candidate sequence are formed by selecting characters from a regularized character set having fewer characters than a natural language character set of the input name and the candidate name). 
Examiner Note: It was noted in the previous action when character encoder was considered under 35 USC 112(f) that the instant application does not appear to provide a complete algorithm for the character encoder. An interview was held with Applicant’s representative to clarify the algorithm (see interview summary mailed 06/20/2022) at which time Applicant’s representative agreed with Examiner’s suggestion that US Patent Publication US 20180069875A1 to BEN EZRA, FIG 8 showed the results of the encoding.  Thus, it appears that the encoding algorithm described in the instant application is analogous to CIGAR encoding (see BEN EZRA [0092-0094]). As CIGAR encoding is a known technique from bioinformatics (particularly for encoding differences between genetic strings), no rejection under 35 USC 112(b) was previously made for this limitation. 
If applicant does not intend to have these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
The current interpretation under 112(f) precludes a rejection under 35 USC 101 of claims 14-17 as being directed to software per se. 
Claim Rejections – 35 USC 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-11, 13-19 are rejected under 35 U.S.C. 103 as being unpatentable over HOSSEINI et al. (DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching. Proceedings of the 2020 EMNLP (Systems Demonstrations), pages 62–69 November 16-20, 2020. (c) 2020 Association for Computational Linguistics, previously cited) in view of FAN et al. (Cigar Strings for Dummies. Blog entry post made Mar 28, 2017. Retrieved from [https://jef.works/blog/2017/03/28/CIGAR-strings-for-dummies/] on [11/01/2022]. 4 pages, newly cited), as evidenced by BEN EZRA et al. (Pub. No.: US 2018/0069875 A1, previously cited) and BAR-YOSSEF et al. (Pub. No.: US 2007/0085716 A1, previously cited).
Regarding claim 1, HOSSEINI teaches the method for natural language processing (abstract), comprising (relying on FIG 1, Section 2 starting page 63; and how the system was tested/compared in Section 3):
identifying an input name and a candidate name (FIG 1: Query mention, candidate mention, dashed lines showing input for candidate ranker), wherein the input name and the candidate name are words formed from a natural language character set (page 62 § 1 col 2 “We compare its performance in relation to other approaches on several realistic string matching scenarios, covering different languages, alphabets, and domains, and we evaluate the quality of the candidate ranker in a real-case setting”);
encoding the input name and the candidate name to produce a regularized input sequence and a regularized candidate sequence, respectively, wherein the regularized input sequence and the regularized candidate sequence are formed by selecting characters from a regularized character set having fewer characters than the natural language character set (during training, candidate pairs are converted to lower case and normalized (page 63 § 2.1 col 2); during testing of the candidate ranker, inputs are also converted to lower-case (see page 67 § 3.1.2 col 1); thus the regularized character set has fewer (only lower case) characters than the natural language character set (which has both upper and lower case characters));
embedding the regularized input sequence and the regularized candidate sequence to produce an embedded input vector and an embedded candidate vector, respectively (as can be seen in FIG 1, relying on the dashed lines for candidate ranker, after preprocessing (e.g. conversion to lower case), for each query and candidate pair, learned vector representations are first generated using a DeezyMatch model. These vectors are then used to rank candidates according to different metrics (e.g., L2-norm distance, cosine similarity and prediction scores));
computing a similarity score based on the embedded input vector and the embedded candidate vector (in FIG 1, this is the “prediction score” generated from the fully-connected layers by the pair classifier); and
indicating that the input name corresponds to the candidate name based on the similarity score (in FIG 1: rank results according to model prediction scores; (page 65 § 2.2) candidate ranker; generating a result using DeezyMatch programming interface after training, see e.g. § 2.3 on page 65; note discussion of results which allowed the comparison of HOSSEINI’s candidate ranker result with other systems on common data sets (discussion starts page 66 § 3.1.2) thus there must be some mechanism to view the indication of how well the input name corresponds with a candidate name (e.g. the similarity score)).
As noted above, HOSSEINI teaches a regularized character set which has fewer characters than the natural language character set, however HOSSEINI does not appear to expressly disclose wherein the regularized character set includes a matching character indicating a match between a character of the input name and a character of the candidate name.
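The lower-casing rationale relied upon above may be illustrated with a minimal sketch. The sketch is illustrative only and is not taken from HOSSEINI: the helper names `character_set` and `regularize` are hypothetical, and the two example strings are the OCR example quoted from HOSSEINI § 3.1.2 elsewhere in this action. It merely shows that the set of distinct characters appearing after lower-casing is smaller than the set appearing in the mixed-case originals.

```python
def character_set(names):
    """Collect the distinct characters used across a collection of names."""
    return {ch for name in names for ch in name}


def regularize(name):
    """Hypothetical stand-in for the lower-casing preprocessing step."""
    return name.lower()


names = ["Dorsetshire", "DORSETSIIIRR"]

natural_chars = character_set(names)
regular_chars = character_set(regularize(name) for name in names)

# The regularized set is strictly smaller because upper/lower pairs collapse.
print(len(natural_chars), len(regular_chars))  # prints: 14 8
```

The strict reduction shown here depends on the inputs containing both upper-case and lower-case forms of at least one letter.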
FAN teaches there is a known method for encoding genomic sequence alignments called ‘CIGAR’ (Compact Idiosyncratic Gapped Alignment Report) which uses a regularized character set (M match, N alignment gap, D deletion, I insertion) as operators when performing alignment of a reference (ref) string and a query string. Thus, FAN may be relied upon to teach a regularized character set includes a matching character indicating a match between a character of the input {query} and a character of the candidate {reference}.
BEN EZRA provides evidence that it was known to use CIGAR encodings when analyzing other kinds of strings (that is, to apply CIGAR to strings which are not genomic sequences) in order to determine a score indicating how closely the strings match, because:
BEN EZRA is similarly directed to comparing strings ((abstract) matching event sequences by converting reference and query event sequences to step-value lists and matching to identify at least one common pattern; see broad method in FIG 7 which is step S320 of FIG 3, which is step S230 in FIG 2; event sequences are stored in a database as normalized event structures of string values [0039], step-value lists are represented as character strings as explained below). 
Of particular interest is FIG 8 [0090], which illustrates the sequence alignment comparison of FIG 7. FIG 8 shows how reference sequence 810 and query sequence 820 are compared by [0091] encoding the sequences, based on a dictionary, to unichar strings representing the step-value lists 830 and 840. These string representations of the step-value lists are then [0092] aligned to create aligned strings 850 and 821, including gaps for marking the differences between the aligned strings. A matching indicator string 870 is [0093] created using the CIGAR format “(1,'M'); (1,'D'); (1,'M'); (1,'I'); (1,'M'),” where “(1,'M')” indicates that one (1) character matches, “(1,'D')” indicates that one (1) character is deleted from the reference string 850 (i.e., the aligned string of the reference sequence 810), and “(1,'I')” indicates that one (1) character is inserted into the reference string 850 (i.e., the aligned string of the reference sequence 810).
The resulting pattern (from reference to query) is subsequently used to [0080-0081] determine a match score, based on, but not limited to, matching characters, provided gaps, or both. Specifically, in an example embodiment, the match score may be computed as a function of a number of matching characters (unichars) and a number and length of gaps or other mismatches.
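The run-length alignment encoding and match score discussed above may be sketched as follows. This is a minimal illustration under stated assumptions and is not the implementation of FAN or BEN EZRA: the function names `cigar_ops` and `match_score` are hypothetical, the alignment is found with a plain edit-distance table, and the labels 'M' (match), 'X' (mismatch), 'D' (character only in the reference) and 'I' (character only in the query) are used for clarity. The score shown, the fraction of aligned positions that match, is only one possible function of the number of matches and gaps.

```python
from itertools import groupby


def cigar_ops(reference, query):
    """Return run-length (count, op) pairs aligning query against reference."""
    n, m = len(reference), len(query)
    # dist[i][j] = edit distance between reference[:i] and query[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == query[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # match or mismatch
                             dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1)         # insertion

    # Trace back from the bottom-right corner to recover one optimal alignment.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if reference[i - 1] == query[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                ops.append("M" if cost == 0 else "X")
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append("D")
            i -= 1
        else:
            ops.append("I")
            j -= 1
    ops.reverse()
    # Collapse consecutive identical operations into (count, op) pairs.
    return [(len(list(group)), op) for op, group in groupby(ops)]


def match_score(cigar):
    """Fraction of aligned positions that are matches (an assumed scoring rule)."""
    matched = sum(count for count, op in cigar if op == "M")
    total = sum(count for count, _ in cigar)
    return matched / total if total else 1.0


encoding = cigar_ops("abcd", "abd")
print(encoding)               # [(2, 'M'), (1, 'D'), (1, 'M')]
print(match_score(encoding))  # 0.75
```

Where several alignments are equally short, this sketch prefers a mismatch over an insertion/deletion pair, so its output for such inputs may differ from the example string shown in BEN EZRA FIG 8.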
BAR-YOSSEF provides evidence that encoded strings, as opposed to the original strings, can be used to improve string comparison generally.
BAR-YOSSEF is similarly directed to string comparison and matching, specifically for making estimations of a string-matching edit distance (edit distance may be computed based on insertion, deletion, and substitution operations required to obtain a second string from a first string, see BAR-YOSSEF [0005]). 
As stated in [0022] The embodiments of the invention provide a method of producing, for each string, a short sketch (e.g., signature or fingerprint), with the property that the edit distance between two strings can be inferred from looking only at their respective sketches. By applying these methods to large string collections (e.g., documents corpora or databases of known sequences), one can obtain faster and/or more accurate similarity detection systems. The embodiments of the invention are simple to implement in practice which represents a significant advantage over other schemes for edit distance.
From BAR-YOSSEF, it is clear that string matching can be improved by using “sketches” of strings, rather than the strings themselves, where consideration of the sketches of two strings can be used to infer an edit distance. 
FAN (as evidenced by BEN EZRA) clearly teaches an encoding method (CIGAR) which can be used to encode the alignment of a first string in view of a second string. Thus, alignment encoding may be considered a representation of the edit distance (matching, gaps, non-matching). By considering this encoding of the first string in view of the second string as a sketch of the first string and encoding of the second string in view of the first string as a sketch of the second string, the matching system of HOSSEINI may be improved by using the CIGAR encodings taught by FAN, with a reasonable expectation of success.
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of HOSSEINI and FAN before them, to have contemplated using the CIGAR encoding shown in FAN as the string encoding of HOSSEINI, with no other change necessary in the way HOSSEINI uses the encoded strings, with a reasonable expectation of success, the combination motivated by the teaching example of BEN EZRA using the CIGAR encoding described in FAN for the same ultimate purpose (trying to score the matches of strings) as the end result of the encoding and processing steps of HOSSEINI (trying to score the fuzzy or partial matches of strings) and the teaching in BAR-YOSSEF that string matching can be improved by using the sketches (i.e. encoded representations) of strings, rather than the strings themselves.
Regarding dependent claim 2, incorporating the rejection of claim 1, HOSSEINI in view of FAN, combined at least for the reasons discussed above, further teaches comparing the character of the input name and the character of the candidate name; and selecting a character from the regularized character set for the regularized input sequence based on the comparison (see FAN: this is precisely how CIGAR alignment works; identify character-by-character alignment (M) matches, (N) alignment gaps, (D) deletions, and/or (I) insertions).
Regarding dependent claim 3, incorporating the rejection of claim 2, HOSSEINI in view of FAN, combined at least for the reasons discussed above, further teaches the characters are selected from the regularized character set using a per position alignment method (see FAN: this is precisely how CIGAR alignment works; identify character-by-character alignment (M) matches, (N) alignment gaps, (D) deletions, and/or (I) insertions).
Regarding dependent claim 4, incorporating the rejection of claim 2, HOSSEINI in view of FAN, combined at least for the reasons discussed above, further teaches embedding each character of the regularized input sequence and the regularized candidate sequence to produce regularized input character embeddings and regularized candidate character embeddings, respectively, wherein the embedded input vector and the embedded candidate vector are generated based on the regularized input character embeddings and the regularized candidate character embeddings, respectively (relying on FIG 1 of HOSSEINI, see discussion claim 1; note also discussion claim 5 below).
Regarding dependent claim 5, incorporating the rejection from claim 1, HOSSEINI further teaches the embedded input vector and the embedded candidate vector are generated using a long short-term memory (LSTM) (FIG 1, embeddings are fed to RNN/GRU/LSTM; page 63 § 2.1 col 2 “Currently, DeezyMatch supports Elman Recurrent neural network (RNN) (Elman, 1990), Long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997) and Gated Recurrent Unit (GRU) (Cho et al., 2014) architectures”).
Regarding dependent claim 6, incorporating the rejection from claim 1, HOSSEINI further teaches the embedded input vector and the embedded candidate vector are generated using a transformer network (per instant application [0019], “transformer network” is an example of deep learning network; see HOSSEINI FIG 1, embeddings are fed to RNN/GRU/LSTM; page 63 § 2.1 col 2 “Currently, DeezyMatch supports Elman Recurrent neural network (RNN) (Elman, 1990), Long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997) and Gated Recurrent Unit (GRU) (Cho et al., 2014) architectures”).
Regarding dependent claim 7, incorporating the rejection from claim 1, HOSSEINI further suggests determining that the similarity score exceeds a threshold, wherein the indication is based on the determination (it is noted that the instant application fails to provide any examples of “a threshold”, merely how it is used, thus it is sufficient that HOSSEINI provides a sorted (ranked) list of all possible candidates (in FIG 1: rank results according to model prediction scores; (page 65 § 2.2) candidate ranker), such that the candidate with the highest score can be determined to be the best match (exceeding the threshold of the next highest score)).
Regarding dependent claim 8, incorporating the rejection from claim 1, HOSSEINI further teaches generating a cosine similarity score based on the embedded input vector and the embedded candidate vector (FIG 1 legend: These vectors are then used to rank candidates according to different metrics (e.g., L2-norm distance, cosine similarity and prediction scores). See also page 65 § 2.2 which explains that the candidate ranker can use the different metrics and how the selection of which metric can affect the computation time).
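The ranking of candidates by cosine similarity or L2-norm distance referenced above may be sketched as follows. This is an illustration only and does not reproduce the DeezyMatch programming interface: the function names `cosine_similarity`, `l2_distance`, and `rank_candidates` are hypothetical, and the short vectors stand in for learned vector representations.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def l2_distance(u, v):
    """Euclidean (L2-norm) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_candidates(query_vector, candidate_vectors):
    """Sort candidate names from most to least similar to the query."""
    scored = [(name, cosine_similarity(query_vector, vector))
              for name, vector in candidate_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up vectors standing in for embedded query and candidate names.
query = [1.0, 0.0]
candidates = {"close": [0.9, 0.1], "far": [0.0, 1.0]}
ranking = rank_candidates(query, candidates)
print([name for name, _ in ranking])  # ['close', 'far']
```

Under this sketch, the highest-ranked candidate is the one whose score exceeds every other candidate's score, which corresponds to the reading of the ranked list applied in the rejection of claim 7.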
Regarding dependent claim 9, incorporating the rejection of claim 1, HOSSEINI in view of FAN, combined at least for the reasons discussed above, further teaches the regularized character set includes the matching character, a non-matching character, and at least one gap character (see FAN: this is precisely how CIGAR alignment works; identify character-by-character alignment (M) matches, (N) alignment gaps, (D) deletions, and/or (I) insertions {insertions/deletions can represent non-matching characters which have either been added or removed, as well as different types of gaps}).
Regarding dependent claim 10, incorporating the rejection of claim 9, HOSSEINI in view of FAN, combined at least for the reasons discussed above, further suggests the at least one gap character comprises a first gap character for the input name and a second gap character for the candidate name (see FAN: this is precisely how CIGAR alignment works; identify character-by-character alignment (M) matches, (N) alignment gaps, (D) deletions, and/or (I) insertions {insertions/deletions can represent non-matching characters which have either been added or removed, as well as different types of gaps}; note that the position of gaps, as well as any initial padding needed (the offset) will be different depending upon whether the query is in view of the reference, or the reference is in view of the query).
Regarding dependent claim 11, incorporating the rejection of claim 1, HOSSEINI further teaches the input name is a misspelled version of the candidate name (this is a feature of the data being analyzed, and is intended to represent the intended use of the method to recognize misspellings; broadly taught in the various data sets tested in § 3.1.2 starting page 66 for toponym (place name) resolution; for example dealing with OCR errors such as ‘DORSETSIIIRR’ for ‘Dorsetshire’).
Dependent claim 12 – canceled.
Regarding dependent claim 13, incorporating the rejection of claim 1, HOSSEINI further teaches identifying an alternative candidate name for the input name; and indicating that the input name does not correspond to the alternative candidate name because a ranked (e.g. sorted) list of results is returned (see FIG 1), with the assumption that the highest ranked is the best match.
Regarding claim 14, HOSSEINI similarly teaches the apparatus for natural language processing (computer implemented system executing software to perform functions), comprising:
a character encoder (pre-processing software executing on suitable hardware) configured to encode an input name and a candidate name (FIG 1: Query mention, candidate mention, dashed lines showing input for candidate ranker) to produce a regularized input sequence and a regularized candidate sequence, respectively, wherein the regularized input sequence and the regularized candidate sequence are formed by selecting characters from a regularized character set having fewer characters than a natural language character set of the input name and the candidate name (during training, candidate pairs are converted to lower case and normalized (page 63 § 2.1 col 2); during testing of the candidate ranker, inputs are also converted to lower-case (see page 67 § 3.1.2 col 1); thus the regularized character set has fewer (only lower case) characters than the natural language character set (which has both upper and lower case characters));
a character embedding component (software) configured to embed each character of the regularized input sequence and the regularized candidate sequence to produce regularized input character embeddings and regularized candidate character embeddings, respectively (as can be seen in FIG 1, relying on the dashed lines for candidate ranker, after preprocessing (e.g. conversion to lower case), for each query and candidate pair, learned vector representations are first generated using a DeezyMatch model. These vectors are then used to rank candidates according to different metrics (e.g., L2-norm distance, cosine similarity and prediction scores); creating vector representations of character strings necessarily requires embedding the characters prior to generating the vectors); 
a word embedding component (software) configured to embed the regularized input character embeddings and the regularized candidate character embeddings to produce an embedded input vector and an embedded candidate vector, respectively (as can be seen in FIG 1, relying on the dashed lines for candidate ranker, after preprocessing (e.g. conversion to lower case), for each query and candidate pair, learned vector representations are first generated using a DeezyMatch model. These vectors are then used to rank candidates according to different metrics (e.g., L2-norm distance, cosine similarity and prediction scores)); and
a scoring component (software) configured to compute a similarity score based on the embedded input vector and the embedded candidate vector (in FIG 1, this is the “prediction score” or “similarity metric” generated from the fully-connected layers by the pair classifier).
As indicated above, HOSSEINI cannot be relied upon to expressly disclose wherein the regularized character set includes a matching character indicating a match between a character of the input name and a character of the candidate name. Incorporating the teachings of FAN, for the reasons discussed in the rejection of claim 1 and as evidenced by the teachings of BEN EZRA and BAR-YOSSEF, cures this deficiency.
Regarding dependent claim 15, incorporating the rejection of claim 14, HOSSEINI further teaches the word embedding component comprises a long short-term memory (LSTM) network (FIG 1, embeddings are fed to RNN/GRU/LSTM; page 63 § 2.1 col 2 “Currently, DeezyMatch supports Elman Recurrent neural network (RNN) (Elman, 1990), Long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997) and Gated Recurrent Unit (GRU) (Cho et al., 2014) architectures”).
Regarding dependent claim 16, incorporating the rejection of claim 14, HOSSEINI further teaches the word embedding component comprises a transformer network (per instant application [0019], “transformer network” is an example of deep learning network; see HOSSEINI FIG 1, embeddings are fed to RNN/GRU/LSTM; page 63 § 2.1 col 2 “Currently, DeezyMatch supports Elman Recurrent neural network (RNN) (Elman, 1990), Long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997) and Gated Recurrent Unit (GRU) (Cho et al., 2014) architectures”).
Regarding dependent claim 17, incorporating the rejection of claim 14, HOSSEINI in view of FAN, combined at least for the reasons discussed above, further teaches wherein: the regularized character set includes the matching character, a non-matching character, and at least one gap character (see FAN: this is precisely how CIGAR alignment works; identify character-by-character alignment (M) matches, (N) alignment gaps, (D) deletions, and/or (I) insertions {insertions/deletions can represent non-matching characters which have either been added or removed, as well as different types of gaps}; note that the position of gaps, as well as any initial padding needed (the offset) will be different depending upon whether the query is in view of the reference, or the reference is in view of the query).
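As a minimal sketch of the kind of character-by-character alignment encoding discussed above, and not a reproduction of FAN's method, the following uses Python's standard difflib module to map a query string and a reference string onto a small alphabet of alignment symbols. The symbol choices are assumptions made for illustration: M for a matching character, X for a non-matching (substituted) character, D for a character present only in the query, and I for a character present only in the reference. The function name encode_alignment is hypothetical.

```python
from difflib import SequenceMatcher


def encode_alignment(query, reference):
    """Encode how `query` aligns to `reference` using only the symbols M, X, D and I."""
    ops = []
    matcher = SequenceMatcher(None, query, reference, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        q_len, r_len = i2 - i1, j2 - j1
        if tag == "equal":
            ops.append("M" * q_len)              # characters that match
        elif tag == "replace":
            paired = min(q_len, r_len)
            ops.append("X" * paired)             # substituted (non-matching) characters
            ops.append("D" * (q_len - paired))   # leftover query characters (gap in reference)
            ops.append("I" * (r_len - paired))   # leftover reference characters (gap in query)
        elif tag == "delete":
            ops.append("D" * q_len)              # present only in the query
        elif tag == "insert":
            ops.append("I" * r_len)              # present only in the reference
    return "".join(ops)


# Lower-casing first mirrors the preprocessing step described for HOSSEINI.
encoded = encode_alignment("dorsetsiiirr", "dorsetshire")
```

Whatever the two input strings contain, the encoded sequence is drawn from a four-symbol set, which is smaller than any natural language character set. Swapping the two arguments exchanges the roles of D and I, consistent with the observation above that gap positions depend on which string is treated as the reference.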
Regarding claim 18, HOSSEINI teaches the method for training a machine learning model (interpreted as training the pair classifier of FIG 1 which is subsequently used with the method of claim 1), comprising:
identifying a training pair comprising an input name and a candidate name, and further comprising a ground truth match information indicating whether the input name corresponds to the candidate name (selecting an element from the query-candidate pairs dataset for training; (page 63 § 2.1) DeezyMatch’s pair-classifier component has at its core a siamese deep neural network classifier. The network takes query-candidate pairs as inputs which can be further processed (e.g., lower-cased and normalized) and tokenized at different levels (character, n-gram and word). Such pairs are either possible referents of the same entity or not, which form the positive and negative examples for training and testing);
encoding the input name and the candidate name to produce a regularized input sequence and a regularized candidate sequence, respectively, wherein the regularized input sequence and the regularized candidate sequence are formed by selecting characters from a regularized character set having fewer characters than a natural language character set ((page 63 § 2.1), lower-cased and normalized);
embedding the regularized input sequence and the regularized candidate sequence to produce an embedded input vector and an embedded candidate vector, respectively, using a word embedding component ((page 63 § 2.1 col 2 to page 64 col 1) during training…read, preprocessed, tokenized…converted into dense vectors…fed into two parallel recurrent units to generate vector representations…vectors are combined in different ways…given as input to a feed-forward network with one hidden layer and with Rectified Linear Unit (ReLU) as the activation function. The output layer has one unit with a sigmoid activation function for producing the final prediction);
computing a similarity score based on the embedded input vector and the embedded candidate vector (page 63 § 2.1 col 2 to page 64 col 1, as above);
HOSSEINI further teaches, without expressly disclosing:
computing a loss function based on the similarity score and the ground truth match information and
updating parameters of the word embedding component based on the loss function (per instant application [0116] The term loss function refers to a function that impacts how a machine learning model is trained in a supervised learning model. Specifically, during each training iteration, the output of the model is compared to the known annotation information in the training data. The loss function provides a value for how close the predicted annotation data is to the actual annotation data. After computing the loss function, the parameters of the model are updated accordingly, and a new set of predictions are made during the next iteration.)
because ((page 64 col 1) During training, the target and the predicted outputs are compared by the Binary Cross Entropy criterion…Similar to Tam et al. (2019), it also calculates mean average precision (MAP), which evaluates the quality of candidate ranks per query) and the use of a loss function to modify the parameters of the various networks (as noted on (page 63 col 2) Currently, DeezyMatch supports Elman Recurrent neural network (RNN) (Elman, 1990), Long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997) and Gated Recurrent Unit (GRU) (Cho et al., 2014) architectures) to implement the classifier is a well-known technique in machine learning.  Clearly, the system taught in HOSSEINI is a supervised learning model.
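By way of illustration only, the well-known technique referred to above (comparing a predicted output with a target using binary cross entropy, then updating parameters from that loss) can be sketched for a single sigmoid output unit. This is a simplified sketch under stated assumptions and is not the training code of HOSSEINI: the two-element feature vector stands in for the combined vector representations, plain gradient descent stands in for whatever optimizer is actually used, and all names and values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(prediction, target):
    """Binary cross entropy between a predicted probability and a 0/1 target."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def training_step(weights, bias, features, target, learning_rate=0.1):
    """One supervised iteration: predict, compute the loss, update the parameters."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    prediction = sigmoid(z)
    loss = binary_cross_entropy(prediction, target)
    error = prediction - target  # derivative of the loss with respect to z
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias, loss


# Hypothetical training pair: features for a query-candidate pair labeled as a match (1.0).
weights, bias = [0.0, 0.0], 0.0
features, target = [1.0, -2.0], 1.0
weights, bias, first_loss = training_step(weights, bias, features, target)
_, _, second_loss = training_step(weights, bias, features, target)
```

The loss computed on the second iteration is lower than on the first, which is the behavior described in paragraph [0116] of the instant application: the parameters are updated after each comparison with the ground truth, and a new prediction is made on the next iteration.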
As further evidence, a copy of Hochreiter and Schmidhuber’s 1997 paper on LSTM was provided with the previous Office action which explains how an error formula is used to adjust the parameters of the LSTM. 
Additionally, a copy of Cho et al.’s 2014 paper (Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation) was also provided which explains using Adadelta and stochastic gradient descent to train the RNN Encoder–Decoder with hyperparameters.
As indicated above, HOSSEINI cannot be relied upon to expressly disclose wherein the regularized character set includes a matching character indicating a match between a character of the input name and a character of the candidate name. Incorporating the teachings of FAN, for the reasons discussed in the rejection of claim 1 and as evidenced by the teachings of BEN EZRA and BAR-YOSSEF, cures this deficiency.
Regarding dependent claim 19, incorporating the rejection of claim 18, HOSSEINI further teaches embedding each character of the regularized input sequence and the regularized candidate sequence to produce regularized input character embeddings and regularized candidate character embeddings, respectively, wherein the embedded input vector and the embedded candidate vector are generated based on the regularized input character embeddings and the regularized candidate character embeddings, respectively ((page 63 § 2.1 col 2 to page 64 col 1) during training…read, preprocessed, tokenized…converted into dense vectors…fed into two parallel recurrent units to generate vector representations…vectors are combined in different ways…given as input to a feed-forward network with one hidden layer and with Rectified Linear Unit (ReLU) as the activation function. The output layer has one unit with a sigmoid activation function for producing the final prediction).
Claim 20 is rejected under 35 USC 103 as unpatentable over HOSSEINI in view of FAN, further in view of MUMCUYAN et al. (Pub. No.: US 2021/0287069 A1, previously cited).
Regarding dependent claim 20, incorporating the rejection of claim 18, HOSSEINI, while clearly teaching a supervised learning model for fuzzy string matching as discussed above, does not appear to expressly disclose modifying the candidate name to produce the input name for the training pair. At best, HOSSEINI states (page 63 col 2) Such pairs are either possible referents of the same entity or not, which form the positive and negative examples for training and testing. Note, however, that one of the training sets used in the testing of the system of HOSSEINI includes an actual place name and one that was generated by an OCR technique (in § 3.1.2 starting page 66 for toponym (place name) resolution; for example, dealing with OCR errors such as ‘DORSETSIIIRR’ for ‘Dorsetshire’).
MUMCUYAN is similarly directed to (abstract) Name Matching Engine that integrates two Machine Learning (ML) module options.
As can be seen in FIG 1, a query 104 is provided to a name matching engine 100 on computing system 140 and the system outputs candidate matching names 130.  FIG 2 shows some names in a first column and whether the name in the first column matches or does not match a name in the second column (called a “golden dataset” [0013]). The golden dataset is used to train ML models [0038].  Note that a golden dataset includes matching name pairs (i.e., pairs 202-206, 210-212, and 216-218), and also non-matching name pairs (i.e., pairs 208 and 214). The matching name pairs in golden dataset 200 represent various types of differences that commonly occur in different strings representing matching names, such as: inclusion of an initial to represent a word (as in pairs 202, 210, and 218); out of order words (as in pairs 202, 204, 210, 214, 216, and 218); hyphen-insertion (as in pairs 204 and 206); typos (as in pairs 204 and 206); dropped whitespace (as in pairs 210, 212, and 216); and word truncation (as in pair 212).
Thus, the golden dataset used to train the model includes a candidate name and one or more modifications of the candidate name, such that the pair (candidate name, modified candidate name) is used in machine learning for name matching.
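The following is a hypothetical sketch, offered for illustration and not taken from MUMCUYAN, of how positive training pairs might be produced by modifying a candidate name in the ways listed above (initial substitution, out-of-order words, hyphen insertion, dropped whitespace, and word truncation). The function names, the example name, and the specific rules (such as truncating the last word to three characters) are assumptions.

```python
def make_variants(name):
    """Return a dict mapping a modification type to a modified copy of `name`."""
    words = name.split()
    variants = {}
    if len(words) >= 2:
        # First word replaced by its initial, e.g. "Jane Smith" -> "J. Smith".
        variants["initial"] = " ".join([words[0][0] + "."] + words[1:])
        # Words placed out of order.
        variants["reordered"] = " ".join(reversed(words))
        # Whitespace between words dropped.
        variants["dropped_whitespace"] = "".join(words)
        # Hyphen inserted between words.
        variants["hyphenated"] = "-".join(words)
    if words and len(words[-1]) > 3:
        # Last word truncated to its first three characters.
        variants["truncated"] = " ".join(words[:-1] + [words[-1][:3]])
    return variants


def make_training_pairs(candidate_name):
    """Build (input name, candidate name, ground truth match) triples labeled as matches."""
    return [(modified, candidate_name, True)
            for modified in make_variants(candidate_name).values()]


pairs = make_training_pairs("Jane Smith")
```

Each resulting triple pairs a modified input name with the unmodified candidate name and a ground truth label indicating a match; non-matching pairs would have to be supplied separately, for example by pairing unrelated names.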
Accordingly, it would have been obvious to one having ordinary skill in the art at the time the invention was effectively filed, having the teachings of HOSSEINI in view of FAN, combined as above (training a string matching system using different datasets) and MUMCUYAN (training a string matching system using a “golden dataset” which explicitly includes a candidate string and one or more modifications of the candidate string), to have used the training set of MUMCUYAN to train and test the string matching system of HOSSEINI in view of FAN by simply substituting one training set with another, with a reasonable expectation of success and a predictable result (the system of HOSSEINI will be trained to recognize the name strings in the “golden dataset” of MUMCUYAN just as it was trained by the described data sets to recognize the OCR variation for ‘Dorsetshire’). The motivation for this combination is “simple substitution” (see MPEP 2143(B)), that is “the substitution of one known element for another yields predictable results to one of ordinary skill in the art”.
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned. They are part of the literature of the art, relevant for all they contain.” In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)). Further, a reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including nonpreferred embodiments. Merck & Co. v. Biocraft Laboratories, 874 F.2d 804, 10 USPQ2d 1843 (Fed. Cir.), cert. denied, 493 U.S. 975 (1989). See also Upsher-Smith Labs. v. Pamlab, LLC, 412 F.3d 1319, 1323, 75 USPQ2d 1213, 1215 (Fed. Cir. 2005); Celeritas Technologies Ltd. v. Rockwell International Corp., 150 F.3d 1354, 1361, 47 USPQ2d 1516, 1522-23 (Fed. Cir. 1998).

	
CONCLUSION
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMY M LEVY whose telephone number is 571-270-3771.  The examiner can normally be reached on Mon-Fri 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KIEU VU can be reached on 571-272-4057.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/Amy M Levy/Primary Examiner, Art Unit 2173