DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 23 February 2022 has been entered.

Introductory Remarks
In response to communications filed on 23 February 2022, claims 1, 6, and 15 are amended per Applicant’s request. Therefore, claims 1-20 are presently pending in the application, of which claims 1, 6, and 15 are presented in independent form.

No IDS has been received since the mailing of the last Office action. 

EXAMINER'S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 

Authorization for this examiner’s amendment was given in an interview with Scott Forbes (Reg. No. 73,513) on 7 March 2022.

The application has been amended as follows: 
AMENDMENTS TO THE CLAIMS

1. (Currently amended) A method comprising: 
obtaining a first plurality of records, wherein each record of the first plurality of records is associated with a respective entity and comprises a first one or more fields; 
obtaining a second plurality of records, wherein each record of the second plurality of records is associated with a respective entity and comprises a second one or more fields; 
generating a plurality of record pairs, wherein each record pair in the plurality of record pairs comprises a respective first record from the first plurality of records and a respective second record from the second plurality of records, and wherein at least one field of the first record differs from a corresponding field of the second record; 
applying a machine learning model to determine a respective match score for each of the plurality of record pairs, the respective match scores comprising probabilities that the respective first record and second record of the respective record pairs are associated with a respective same entity; 
 below a pre-established threshold in its assessment of whether the first record and second record of the indeterminate record pair are associated with the same entity; 
causing a client computing device to present the indeterminate record pair to a user; 
receiving, from the client computing device, user feedback indicating whether the first and second record of the indeterminate record pair are associated with the same entity; 
retraining the machine learning model and revising the match score of the indeterminate record pair based at least in part on the user feedback; 
identifying, for each record in the first plurality of records, a respective cluster of record pairs, wherein each record pair in the cluster includes the record; 
determining, based at least in part on the respective match scores and one or more criteria for evaluating clusters, that each cluster of record pairs corresponds to a respective entity; 
identifying, for each cluster of record pairs, a respective record in the second plurality of records based at least in part on the match scores of the record pairs in the cluster; and 
outputting the clusters of record pairs and the respective record in the second plurality of records to the client computing device.



3. (Original) The method of Claim 1, wherein individual records of the second plurality of records include a canonical location for a respective entity.

4. (Original) The method of Claim 1 further comprising determining, for the record that is included in each record pair of a first cluster of record pairs, a canonical value for at least one field of the first one or more fields based at least in part on the match scores of the record pairs in the first cluster.

5. (Original) The method of Claim 4, wherein the canonical value comprises a name, street address, city, business, or phone number.

6. (Currently amended) A system comprising: 
a data store configured to store a first plurality of records and a second plurality of records, wherein each record in the first and second pluralities of records is associated with a respective entity; and 
a computing device including a processor in communication with the data store, wherein the processor is configured to execute computer-executable instructions to at least: 
generate a plurality of record pairs, wherein each record pair in the plurality of record pairs comprises a respective first record from the first plurality of records and a 
apply a machine learning model to determine a respective match score for each record pair of the plurality of record pairs, the match score indicating a probability that the first and second records in the record pair are associated with a respective same entity; 
identify, based at least in part on the respective match scores for individual record pairs of the plurality of record pairs, an indeterminate record pair of the plurality of record pairs, wherein the match score of the indeterminate record pair indicates that the machine learning model had  below a pre-established threshold in its assessment of whether the first and second records in the indeterminate record pair are associated with the same entity; 
receive user feedback indicating whether the first and second record of the indeterminate record pair are associated with the same entity; 
retrain the machine learning model and revise the match score of the indeterminate record pair based at least in part on the user feedback; 
identify, for each record in the first plurality of records, a respective cluster of record pairs, wherein each record pair in the cluster includes the record; 
determine, based at least in part on the respective match scores, that each cluster of record pairs corresponds to a respective entity; and 
output the clusters of record pairs and the respective entity for each cluster to a client computing device.

7. (Original) The system of Claim 6, wherein the processor is further configured to at least identify, for each cluster of record pairs, a geographical location corresponding to the cluster based at least in part on the respective match scores.

8. (Original) The system of Claim 7, wherein the processor is further configured to at least cause the client computing device to display a heat map including, for individual clusters of record pairs, information regarding the size of the cluster at the geographical location corresponding to the cluster.

9. (Original) The system of Claim 6, wherein each record in the first plurality of records is associated with a respective merchant, and wherein each record in the second plurality of records is associated with a respective transaction.

10. (Original) The system of Claim 6, wherein no two records of the first plurality of records are associated with the same entity.

11. (Original) The system of Claim 6, wherein the processor is further configured to apply one or more cleaning functions to a first field of each record in the first plurality of records and a corresponding second field of each record in the second plurality of records.



13. (Original) The system of Claim 12, wherein at least one cluster of record pairs is generated based at least in part on the first group.

14. (Original) The system of Claim 12, wherein the processor is further configured to: 
identify a second subset of the second plurality of records as a second group based at least in part on a second field of individual records in the second subset; and 
identify a third subset of the second plurality of records as a third group based at least in part on the first group and the second group.

15. (Currently amended) A non-transitory computer-readable storage medium including computer-executable instructions that, when executed by a processor, cause the processor to: 
generate a plurality of record pairs, wherein each record pair in the plurality of record pairs comprises a respective first record from a first plurality of records and a respective second record from a second plurality of records, and wherein at least one field of the first record differs from a corresponding field of the second record; 
identify, for each record in the first plurality of records, a respective cluster of record pairs based at least in part on probabilities that first and second records in individual record pairs are associated with a respective same entity, the probabilities determined by a machine being below a pre-established threshold for individual record pairs of the plurality of record pairs; 
determine, based at least in part on the probabilities, a respective entity associated with each cluster of record pairs; and 
output the clusters of record pairs and the respective entity associated with each cluster to a client computing device.

16. (Original) The non-transitory computer-readable storage medium of Claim 15, wherein the computer-executable instructions further cause the processor to determine the probabilities that first and second records in individual record pairs are associated with a respective same entity.

17. (Original) The non-transitory computer-readable storage medium of Claim 15, wherein each record pair in the respective cluster of record pairs for each record in the first plurality of records includes the record.

18. (Original) The non-transitory computer-readable storage medium of Claim 15, wherein the computer-executable instructions further cause the processor to filter the record pairs in a first cluster of record pairs, and wherein the entity associated with the first cluster of record pairs is determined based at least in part on the filtered record pairs.

19. (Original) The non-transitory computer-readable storage medium of Claim 15, wherein the computer-executable instructions further cause the processor to prune the record pairs in each cluster of record pairs to produce a bipartite graph.

20. (Original) The non-transitory computer-readable storage medium of Claim 19, wherein the record pairs are pruned based at least in part on the probabilities.
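
For illustration only, the flow recited in claim 1 (flag an indeterminate record pair, revise its score from user feedback, group record pairs by first record, and pick a second record per cluster) can be sketched as below. This is a minimal sketch under stated assumptions and is not the applicant's implementation. The match scores are supplied as a plain dictionary in place of a machine learning model, and retraining is omitted. The quantity compared against the pre-established threshold is not legible in the claim text above, so the sketch assumes a hypothetical confidence measure equal to the distance of the probability from 0.5. All function names, record identifiers, and score values are hypothetical.

```python
from collections import defaultdict


def confidence(score):
    """Assumed confidence measure: 0.0 at p = 0.5, 1.0 at p = 0 or p = 1."""
    return abs(score - 0.5) * 2


def find_indeterminate(scores, threshold):
    """Return record pairs whose assumed confidence falls below the threshold."""
    return [pair for pair, s in scores.items() if confidence(s) < threshold]


def cluster_by_first_record(scores):
    """Group record pairs so that every pair in a cluster shares its first record."""
    clusters = defaultdict(list)
    for (first_id, second_id), s in scores.items():
        clusters[first_id].append(((first_id, second_id), s))
    return dict(clusters)


def best_second_record(cluster):
    """Pick the second record of the highest-scoring pair in a cluster."""
    pair, _ = max(cluster, key=lambda item: item[1])
    return pair[1]


# Hypothetical match scores keyed by (first record id, second record id).
scores = {
    ("a1", "b1"): 0.95,
    ("a1", "b2"): 0.52,
    ("a2", "b2"): 0.10,
    ("a2", "b3"): 0.88,
}

indeterminate = find_indeterminate(scores, threshold=0.2)

# Stand-in for user feedback: the user reports that a1 and b2 are different
# entities, so the score is revised downward (model retraining is omitted).
scores[("a1", "b2")] = 0.05

clusters = cluster_by_first_record(scores)
matches = {first_id: best_second_record(c) for first_id, c in clusters.items()}
```

With these hypothetical values, only the pair ("a1", "b2") is flagged for user review, and after the revision each first record is associated with its highest-scoring second record.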

Response to Arguments
Applicant’s arguments, see page 9 et seq., filed 23 February 2022, with respect to the rejections under 35 U.S.C. 103 have been fully considered and are persuasive.  The obviousness rejections of all pending claims have been withdrawn. 
	
Allowable Subject Matter
Claims 1-20 are allowed.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TYLER J TORGRIMSON, whose telephone number is (571) 270-5550. The examiner can normally be reached Monday - Friday, 9 am - 5:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksander Kerzhner, can be reached at 571-270-1760. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/TYLER J TORGRIMSON/Primary Examiner, Art Unit 2165