DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 8-14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent-eligible subject matter because claims 8-14 are directed merely to code to be executed and are therefore interpreted as software per se, which is not one of the patent-eligible categories.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 lines 5-6 recite “separate a string into a total number of tokens, comprising a token and another token”. Does applicant mean a first token and a second token different from the first token?
Claim 1 lines 7-9 recite “identify a pattern comprising an entity, another entity and a total number of entities that equals the total number of tokens, and another pattern comprising a same total number of entities that equals the total number of tokens”. It is not clear how “a pattern” is related to “another pattern”, how the claimed patterns are related to the string of lines 5-6, or how “a pattern” and “another pattern” are identified.
Claim 1 lines 10-13 recite “determine a combined probability that combines a probability based on a number of entries in a dictionary which stores the token and is associated with the entity, and another probability based on a number of character types in the other entity that match characters in the other token”. It is not clear which “token” is stored in a dictionary and is associated with the entity. It is also not clear which entity is “the entity”, since claim 1 recites a total number of entities, an entity and another entity. Furthermore, it is not clear what is meant by “based on a number of entries in a dictionary which stores the token”.
Claim 1 lines 14-15 recite “determine whether the combined probability associated with the pattern is greater than another combined probability associated with the other pattern”. However the combined probability does not further clarify the recited patterns.
Claim 1 lines 16-20 recite “match the prospective record to an existing record in the system based on recognizing the token as the entity and the other token as the other entity, in response to a determination that the combined probability associated with the pattern is greater than the other combined probability associated with the other pattern”. However the limitations do not further clarify the “pattern” recited at lines 7-9.
Claim 2 merely describes a field of street address or school name or person name is associated with the string.
Claim 3 merely adds converting the strings into lower case letters and removing special characters.
Claim 4 merely adds dictionaries associated with data providers.
Claim 5 merely describes the “other probability” based on character types.
Claim 6 merely adds different weights associated with the patterns.
Claim 7 merely recognizes both tokens as associated with the other pattern based on the combined probability.  
Claims 8-14 and 15-20 essentially recite the limitations of claims 1-7 in the form of a computer program product and a method, respectively, and thus contain similar defects.
Therefore although the dependent claims are more detailed than their parent claims, they do not cure the deficiencies of their parent claims.
Art rejection is applied to claims 1-20 as best understood in light of the rejection under 35 U.S.C. 112 discussed above.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 6-11, 13-17 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Galle et al (EP 2664997), in view of Delaney et al (US 20140280353).
Regarding claim 1, Galle substantially discloses, teaches or suggests a system for adaptive recognition of entities, the system comprising:
one or more processors (Figure 1 item 18); and
a non-transitory computer readable medium storing a plurality of instructions (see at least 0059), which when executed, cause the one or more processors to:
separate a string into a total number of tokens, comprising a token and another token, in response to receiving a prospective record comprising the string (see at least 0071: The entity extractor 40 used for identifying the instances can be any suitable linguistic processor, such as a parser, which partitions the text strings, e.g., sentences, of the document 26 into tokens, possibly filters the tokens to identify those that serve as noun phrases (nouns and longer phrases which serve as nouns) in the sentences, and compares these noun phrases to the list of candidate named entities in the database 30 to identify matching instances.);
Galle further teaches the concept of identifying a pattern comprising an entity, another entity, and a total number of entities that equals the total number of tokens, and another pattern comprising a same total number of entities that equals the total number of tokens (see at least 0084-0085);
match the prospective record to an existing record in the system based on recognizing the token as the entity and the other token as the other entity (see at least 0071);
The difference is Galle does not specifically show:
determine a combined probability that combines a probability based on a number of entries in a dictionary which stores the token and is associated with the entity, and another probability based on a number of character types in the other entity that match characters in the other token;
determine whether the combined probability associated with the pattern is greater than another combined probability associated with the other pattern; and
match the prospective record to an existing record in the system based on recognizing the token as the entity and the other token as the other entity, in response to a determination that the combined probability associated with the pattern is greater than the other combined probability associated with the other pattern.
However, Galle clearly shows determining matched entities based on a probabilistic method (see at least 0087). Delaney, in the same field of endeavor, teaches the concept of combined probability in entity detection (see at least Delaney 0010-0012) and a dictionary of tokens (see at least Delaney 0052).
Since the system of Galle recognizes matching entities in databases upon parsing input text string into a token and another token, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include a dictionary and the combined probability technique taught by Delaney in order to accurately match entities to tokens of the input string.
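
For illustration only, the combined-probability limitations discussed above can be sketched under one possible reading. The sketch is not taken from the application, from Galle or from Delaney. The dictionary contents, the character-type templates, the entity names and the use of a product to combine the two probabilities are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical dictionaries: entity name -> how often each token is stored for it.
DICTIONARIES = {
    "street_name": Counter({"main": 8, "oak": 2}),
    "city": Counter({"main": 1, "austin": 9}),
}

# Hypothetical character-type templates: "D" = digit, "A" = letter.
TEMPLATES = {"house_number": "DDD", "zip": "DDDDD"}


def char_type(ch):
    """Map a character to a coarse character type."""
    if ch.isdigit():
        return "D"
    if ch.isalpha():
        return "A"
    return "O"  # anything else


def dictionary_probability(token, entity):
    """Assumed reading: share of the entity's dictionary entries that store the token."""
    counts = DICTIONARIES.get(entity)
    if not counts:
        return 0.0
    return counts[token] / sum(counts.values())


def char_type_probability(token, entity):
    """Assumed reading: share of positions whose character type matches the template."""
    template = TEMPLATES.get(entity)
    if not template:
        return 0.0
    matches = sum(1 for ch, t in zip(token, template) if char_type(ch) == t)
    return matches / max(len(token), len(template))


def combined_probability(tokens, pattern):
    """Combine the per-token probabilities of a pattern (assumed here to be a product)."""
    if len(tokens) != len(pattern):
        raise ValueError("a pattern must have one entity per token")
    result = 1.0
    for token, entity in zip(tokens, pattern):
        if entity in DICTIONARIES:
            result *= dictionary_probability(token, entity)
        else:
            result *= char_type_probability(token, entity)
    return result


def best_pattern(tokens, patterns):
    """Return the pattern with the greatest combined probability."""
    return max(patterns, key=lambda pattern: combined_probability(tokens, pattern))


tokens = ["123", "main"]
patterns = [("house_number", "street_name"), ("zip", "city")]
print(best_pattern(tokens, patterns))
```

With these assumed values, the first pattern scores higher because "123" fully matches the three-digit template and "main" accounts for most of the street-name entries, so the tokens would be recognized as a house number and a street name.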

Regarding claim 2, Galle/Delaney teaches the system of claim 1, wherein the string is associated with a field that stores one of a street address, a school name, and a person name (see at least Delaney 0024). 

Regarding claim 3, Galle/Delaney does not specifically show the system of claim 1, wherein separating the string into the total number of tokens comprises converting any uppercase letters in the string to lowercase letters, and stripping the string of any special characters. However, any text string includes uppercase, lowercase and special characters. Furthermore, Galle clearly shows that the system and method utilize a database of named entities (which may have been obtained through use of a parser). The system and method find relationships of synonymy between entries of this database. In other words, for a given real named entity, there may be multiple entries (candidate named entities) in the database, with different canonical strings. Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include converting the text string as claimed in order to find multiple matching entries in the database of Galle.
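
For illustration only, the normalization recited in claim 3 can be sketched as follows. The function name is hypothetical, and two assumptions are made that the claim does not state: a "special character" is anything other than an ASCII letter, digit or whitespace, and each one is replaced by a space so that adjacent tokens do not merge.

```python
import re


def normalize_and_tokenize(text):
    """Lowercase the string, strip special characters, and split into tokens."""
    lowered = text.lower()
    # Assumption: anything that is not a-z, 0-9 or whitespace is "special".
    stripped = re.sub(r"[^a-z0-9\s]", " ", lowered)
    return stripped.split()


print(normalize_and_tokenize("123 Main St., Apt #4"))
# ['123', 'main', 'st', 'apt', '4']
```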

Regarding claim 4, Galle/Delaney does not specifically show the system of claim 1, wherein the dictionary is associated with a data provider and another dictionary is associated with another data provider and another pattern comprising a same total number of entities that equals the total number of tokens; determine a combined probability that combines a probability based on a number of entries in a dictionary which stores the token and is associated with the entity, and another probability based on a number of character types in the other entity that match characters in the other token; determine whether the combined probability associated with the pattern is greater than another combined probability associated with the other pattern; and match the prospective record to an existing record in the system based on recognizing the token as the entity and the other token as the other entity, in response to a determination that the combined probability associated with the pattern is greater than the other combined probability associated with the other pattern.
However, the claimed dictionary is merely “associated with a data provider” and thus reads on the database of named entities of Galle (see at least 0028) or the predefined dictionaries of Delaney (see at least 0057). Delaney teaches the concept of combined probability in entity detection (see at least Delaney 0010-0012). Galle further teaches the concept of probabilities associated with patterns when Galle shows identification of candidate named entities appropriate for merging (see at least 0027). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include any matching criteria, including the claimed number of entries in dictionaries of data providers and the number of character types, depending on user and application requirements.
Regarding claim 6, Galle/Delaney teaches or suggests the system of claim 1, wherein determining whether the combined probability associated with the pattern is greater than the other combined probability associated with the other pattern is based on a weight associated with the pattern and another weight associated with the other pattern (see at least Delaney 0061). 
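
For illustration only, the weighted comparison recited in claim 6 can be sketched as follows. The function name is hypothetical, and the sketch assumes that each weight scales its pattern's combined probability by multiplication, which neither the claim nor the cited paragraph of Delaney is confirmed to require.

```python
def prefers_first_pattern(p_first, w_first, p_second, w_second):
    """Return True when the weighted first pattern strictly exceeds the second."""
    # Assumption: a weight scales its pattern's combined probability.
    return w_first * p_first > w_second * p_second


# A lower raw probability can still win when its pattern carries more weight.
print(prefers_first_pattern(0.5, 1.0, 0.6, 0.5))  # True
print(prefers_first_pattern(0.5, 1.0, 0.6, 1.0))  # False
```

When the comparison returns False, including on a tie, the tokens would be recognized under the other pattern, which is the "not greater than" branch recited in claim 7.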

Regarding claim 7, Galle/Delaney teaches or suggests the system of claim 1, wherein the plurality of instructions further causes the processor to match the prospective record to an existing record in the system based on recognizing the token as an entity associated with the other pattern and the other token as another entity associated with the other pattern, in response to a determination that the combined probability associated with the pattern is not greater than the other combined probability associated with the other pattern (see at least Galle 0071-0072).

Claims 8-11, 13-14 and 15-17, 19-20 correspond to computer program products and methods respectively for system claims 1-4, 6-7, thus are rejected for the same reasons discussed in claims 1-4, 6-7 above.


Claims 5, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Galle et al (EP 2664997), in view of Delaney et al (US 20140280353), further in view of Aerts et al (US 9042662).
Regarding claim 5, Galle/Delaney does not specifically show the system of claim 1, wherein the other probability based on the number of character types in the other entity that match characters in the other token comprises a heuristically determined probability. However, it is well known in the art to use heuristics in probability analysis, as shown by Aerts (see at least Figure 3 blocks 110n, 115, col.3 lines 28-34). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include such features while implementing the system of Galle/Delaney in order to benefit from a standardized technique in probability analysis.

Claims 12, 18 correspond to a computer program product and method respectively for system claim 5, thus are rejected for the same reasons discussed in claim 5 above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Hassanzadeh et al (US 20110106821) teach a method of semantic-aware record matching includes receiving source and target string record specifications associated with a source string record and a target string record, receiving semantic knowledge referring to tokens of the source string record and target string record, creating a first set of tokens for the source string record and a second set of tokens for the target string record based on the semantic knowledge, assigning a similarity score to the source string record and the target string record based on a semantic relationship between the first set of tokens and the second set of tokens, and matching the source string record and the target string record based on the similarity score.
Adams et al (US 20080243832) teach embodiments of systems and methods for comparing attributes of a data record are presented herein. In some embodiments, a weight is based on a comparison of the name (or other) attributes of data records. In some embodiments, an information score may be calculated for each of two name attributes to be compared to get an average information score for the two name attributes. The two name attributes may then be compared against one another to generate a weight between the two attributes. This weight can then be normalized to generate a final weight between the two business name attributes. Comparing attributes according to embodiments disclosed herein can facilitate linking data records even if they comprise attributes in languages which do not use the Latin alphabet.
Jagota (US 20120066160) teaches a structured learning system for extracting contact data from quotes. An input string is obtained through a search for timely material associated with the stored contact. The input string is parsed using probabilistic tendencies to extract entities corresponding to those stored with the contact. Secondary entities are used to assist in the identification of the primary entities. The contact is then updated (or added if new) using the extracted primary entities.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to UYEN T LE whose telephone number is (571)272-4021. The examiner can normally be reached M-F 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Vital, can be reached on 571-272-4215. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/UYEN T LE/
Primary Examiner, Art Unit 2162
18 June 2022