DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Introductory Remarks
In response to communications filed on 19 August 2021, claim(s) 1, 6, and 15 is/are amended per Applicant’s request. Therefore, claims 1-20 are presently pending in the application, of which, claim(s) 1, 6, and 15 is/are presented in independent form.

No IDS has been received since the mailing of the last Office action. 

The previously raised 101 rejection of claims 1-20 is withdrawn in view of the amendments to the independent claims. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3, 9, 10, and 12-18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Motte et al. (U.S. PGPub No. 2016/0239496 A1) (hereinafter Motte) in view of Souza et al. (U.S. PGPub No. 2013/0080521 A1) (hereinafter Souza) in view of Miserendino et al. (U.S. PGPub No. 2017/0032279 A1) (hereinafter Miserendino) in view of Lee et al. (U.S. PGPub No. 2008/0168054 A1) (hereinafter Lee).

As per claim 1, Motte teaches a method comprising: 
obtaining a first plurality of records, wherein each record of the first plurality of records is associated with a respective entity and comprises a first one or more fields (0030 and 0034 – each document, or even sentence, may be considered a record); 
obtaining a second plurality of records, wherein each record of the second plurality of records is associated with a respective entity and comprises a second one or more fields (0030 and 0034); 
generating a plurality of record pairs, wherein each record pair in the plurality of record pairs comprises a respective first record from the first plurality of records and a respective second record from the second plurality of records, and wherein at least one field of the first record differs from a corresponding field of the second record (0045); 
determining a respective match score for each of the plurality of record pairs, the respective match scores comprising probabilities that the respective first record and second record of the respective record pairs are associated with a respective same entity (0045 – reliability score); 
identifying, for each record in the first plurality of records, a respective cluster of record pairs, wherein each record pair in the cluster includes the record (0054-56); 
determining, based at least in part on the respective match scores and one or more criteria for evaluating clusters, that each cluster of record pairs corresponds to a respective entity (0055 – extrinsic criteria = criteria for evaluating); 
identifying, for each cluster of record pairs, a respective record in the second plurality of records based at least in part on the match scores of the record pairs in the cluster (Step 344; see also 0054-56, 0052, and 0045); and 
outputting the respective record in the second plurality of records to [a] client computing device (0091; Figure 7 and corresponding description).
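
For illustration only, and not as a characterization of Motte or of Applicant's disclosure, the steps mapped above (pairing, scoring, clustering per first record, and selecting a second record) can be sketched as follows. The names `match_score` and `link_records`, the `id` and `name` fields, the string-similarity score, and the 0.5 threshold are all assumptions introduced for this sketch.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def match_score(first, second):
    """Hypothetical match score: name similarity in [0, 1]."""
    a = first["name"].lower()
    b = second["name"].lower()
    return SequenceMatcher(None, a, b).ratio()


def link_records(first_records, second_records, threshold=0.5):
    """Pair, score, cluster, and select one second record per cluster."""
    # Generate every (first, second) record pair and score it.
    scored = [
        (f["id"], s["id"], match_score(f, s))
        for f in first_records
        for s in second_records
    ]
    # Cluster: for each first record, collect the pairs that include it.
    clusters = defaultdict(list)
    for first_id, second_id, score in scored:
        clusters[first_id].append((second_id, score))
    # Evaluate each cluster with one simple criterion (assumed here):
    # the best score in the cluster must reach the threshold.
    best = {}
    for first_id, pairs in clusters.items():
        second_id, score = max(pairs, key=lambda p: p[1])
        if score >= threshold:
            best[first_id] = second_id
    return dict(clusters), best
```

Under these assumptions, a first record named "ACME CORP #123" would be linked to a second record named "Acme Corp" rather than to one named "Zenith Ltd".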


Motte does not appear to explicitly disclose:
applying a machine learning model to determine a respective match score for each of the plurality of record pairs…;
identifying an indeterminate record pair of the plurality of record pairs, wherein the match score of the indeterminate record pair indicates that the machine learning model had low confidence in its assessment of whether the first record and second record of the indeterminate record pair are associated with the same entity; 
causing a client computing device to present the indeterminate record pair to a user; 
receiving, from the client computing device, user feedback indicating whether the first and second record of the indeterminate record pair are associated with the same entity; 
retraining the machine learning model and revising the match score of the indeterminate record pair based at least in part on the user feedback; and
outputting the clusters of record pairs and the respective record in the second plurality of records to the client computing device. (Emphasis added).

Souza teaches: 
identifying an indeterminate record pair of the plurality of record pairs, wherein the match score of the indeterminate record pair indicates that the machine learning model had low confidence in its assessment of whether the first record and second record of the indeterminate record pair are associated with the same entity; 
causing a client computing device to present the indeterminate record pair to a user; 
receiving, from the client computing device, user feedback indicating whether the first and second record of the indeterminate record pair are associated with the same entity; 
revising the match score of the indeterminate record pair based at least in part on the user feedback. 

It would have been obvious to one of ordinary skill in the art to incorporate the teachings of Souza into the invention of Motte in order to identify an indeterminate record pair of the plurality of record pairs, wherein the match score of the indeterminate record pair indicates that the machine learning model had low confidence in its assessment of whether the first record and second record of the indeterminate record pair are associated with the same entity; cause a client computing device to present the indeterminate record pair to a user; receive, from the client computing device, user feedback indicating whether the first and second record of the indeterminate record pair are associated with the same entity; and revise the match score of the indeterminate record pair based at least in part on the user feedback. This would have been clearly advantageous as it would allow users to aid in the establishment of the entity matches that the system is unable to determine itself. The combination hereinafter MS.

MS does not appear to explicitly disclose:
applying a machine learning model to determine a respective match score for each of the plurality of record pairs…;
retraining the machine learning model…; or 
outputting the clusters of record pairs and the respective record in the second plurality of records to the client computing device. (Emphasis added).

Miserendino teaches the use of a machine learning model to classify data. See Miserendino at 0009. Miserendino demonstrates that the machine learning model can be retrained based at least in part on user input. Miserendino at 0020 and 0045. It would have been obvious to one of ordinary skill in the art to incorporate the teachings of Miserendino into the combination MS in order to apply a machine learning model to determine a respective match score for each of the plurality of record pairs, and to retrain the machine learning model and revise the match score of the indeterminate record pair based at least in part on the user feedback. This would have been clearly advantageous as incorporating the use of machine learning models into the combination of MS would allow for greater adaptability in classifying entity matches, and further the ability to retrain the model via user feedback would aid in ensuring the greatest accuracy (fewest false positives and false negatives). The combination hereinafter MSM.
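
For illustration only, the indeterminate-pair, user-feedback, and retraining limitations discussed above can be sketched as follows. This is a minimal sketch and is not drawn from Souza, Miserendino, or Applicant's disclosure: the `PairModel` class, the single `similarity` feature, the use of logistic regression trained by gradient descent, and the rule that "indeterminate" means "score closest to 0.5" are all assumptions.

```python
import math


class PairModel:
    """Hypothetical one-feature logistic model over a pair similarity."""

    def __init__(self):
        self.w, self.b = 1.0, 0.0

    def predict(self, x):
        # Probability that the two records refer to the same entity.
        return 1.0 / (1.0 + math.exp(-(self.w * x + self.b)))

    def fit(self, xs, ys, lr=0.5, epochs=2000):
        # Plain batch gradient descent on the log loss.
        n = len(xs)
        for _ in range(epochs):
            errs = [self.predict(x) - y for x, y in zip(xs, ys)]
            self.w -= lr * sum(e * x for e, x in zip(errs, xs)) / n
            self.b -= lr * sum(errs) / n


def most_indeterminate(model, pairs):
    """Return the pair whose score is closest to 0.5 (lowest confidence)."""
    return min(pairs, key=lambda p: abs(model.predict(p["similarity"]) - 0.5))


def apply_feedback(model, xs, ys, pair, same_entity):
    """Add the user's label, retrain, and return the revised match score."""
    xs.append(pair["similarity"])
    ys.append(1.0 if same_entity else 0.0)
    model.fit(xs, ys)
    return model.predict(pair["similarity"])
```

In such a sketch, the pair returned by `most_indeterminate` would be the one presented to the user, and the value returned by `apply_feedback` would be the revised match score for that pair.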

But MSM does not appear to explicitly disclose: outputting the clusters of record pairs and the respective record in the second plurality of records to the client computing device. (Emphasis added).

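Lee does teach outputting the clusters of record pairs. Lee at 0041. It would have been obvious to one of ordinary skill in the art to incorporate the teachings of Lee into the invention of MSM in order to output the clusters of record pairs and the respective record in the second plurality of records to the client computing device. The combination hereinafter MSML.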

As per claim 6, MSML teaches a system (Motte at 0010) comprising: 
a data store configured to store a first plurality of records and a second plurality of records, wherein each record in the first and second pluralities of records is associated with a respective entity (Motte at 0106-09 and 0055); and 
a computing device including a processor in communication with data store, wherein the processor is configured to execute computer-executable instructions (Motte at 0105) to at least: 
For the remaining limitations, see the examiner’s remarks regarding claim 1.

As per claim 15, MSML teaches a non-transitory computer-readable storage medium including computer-executable instructions (Motte at claim 20) that, when executed by a processor, cause the processor to: 
For the remaining limitations, see the examiner’s remarks regarding claim 1.

As per claims 2 and 3, MSML does not appear to explicitly disclose: 
The method of Claim 1, wherein individual records of the second plurality of records include a canonical name for a respective entity;
The method of Claim 1, wherein individual records of the second plurality of records include a canonical location for a respective entity;
However, the differences in the claim limitations are only found in the nonfunctional descriptive material and are not functionally involved in the steps recited. The method would be performed the same regardless of the contents of the individual records of the second plurality of records. Thus, this descriptive material will not distinguish the claimed invention from the prior art in terms of patentability. See In re Gulack, 703 F.2d 1381, 1385, 217 USPQ 401, 404 (Fed. Cir. 1983); In re Lowry, 32 F.3d 1579, 32 USPQ2d 1031 (Fed. Cir. 1994).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time the invention was made to have the individual records of the second plurality of records include a canonical name or location for a respective entity. This is so because such data does not functionally relate to the steps in the method claimed and because the subjective interpretation of the data does not patentably distinguish the claimed invention.

As per claim 7, MSML teaches the system of Claim 6, wherein the processor is further configured to at least identify, for each cluster of record pairs, a geographical location corresponding to the cluster based at least in part on the respective match scores (Motte at 0065 and Figure 5A and its corresponding description).

As per claim 9, MSML teaches the system of Claim 6, wherein each record in the first plurality of records is associated with a respective merchant, and wherein each record in the second plurality of records is associated with a respective transaction (Motte at 0042).

As per claim 10, MSML does not appear to explicitly disclose: The system of Claim 6, wherein no two records of the first plurality of records are associated with the same entity. However, this is a variation that is obvious to try. In practice, there are only a finite number of options for whether the records of the first plurality of records are associated with the same entity or not. There is always a design pressure in the art to efficiently store and retrieve data, and in particular data about a particular entity (i.e. disambiguated). It would have been obvious to one of ordinary skill in the art to try having no two records of the first plurality of records be associated with the same entity in the system of MSML in an attempt to provide a more efficient, via disambiguation, entity storage, as a person with ordinary skill has a good reason to pursue the known options for record storage within his or her technical grasp. In turn, because having the first plurality of records with no two records being associated with the same entity when used in the system of MSML has the predicted properties of the more efficient storage, it would have been obvious.

As per claim 12, MSML teaches the system of Claim 6, wherein the processor is further configured to identify a first subset of the second plurality of records as a first group based at least in part on a first field of individual records in the first subset (Motte at Step 344; see also 0054-56, 0052, and 0045).

As per claim 13, MSML teaches the system of Claim 12, wherein at least one cluster of record pairs is generated based at least in part on the first group (Motte at Step 344; see also 0054-56, 0052, and 0045).

As per claim 14, MSML teaches the system of Claim 12, wherein the processor is further configured to: identify a second subset of the second plurality of records as a second group based at least in part on a second field of individual records in the second subset; and identify a third subset of the second plurality of records as a third group based at least in part on the first group and the second group (Motte at Step 344; see also 0054-56, 0052, and 0045).

As per claim 16, see examiner’s remarks regarding claims 1 and 15.

As per claim 17, see examiner’s remarks regarding claims 1 and 15.

As per claim 18, MSML teaches the non-transitory computer-readable storage medium of Claim 15, wherein the computer-executable instructions further cause the processor to filter the record pairs in a first cluster of record pairs, and wherein the entity associated with the first cluster of record pairs is determined based at least in part on the filtered record pairs (Motte at 0014; see also examiner’s remarks regarding claims 1 and 15).

Claims 4 and 5 is/are rejected under 35 U.S.C. 103 as being unpatentable over Motte in view of Souza in view of Miserendino in view of Lee as applied to claim 1 above, and further in view of Martinez et al. (U.S. PGPub No. 2010/0125604 A1) (hereinafter Martinez).

As per claim 4, MSML does not appear to explicitly disclose: The method of Claim 1 further comprising determining, for the record that is included in each record pair of a first cluster of record pairs, a canonical value for at least one field of the first one or more fields based at least in part on the match scores of the record pairs in the first cluster. However, Martinez does teach determining a canonical value. See e.g. Martinez at 0004. It would have been obvious to one of ordinary skill in the art to incorporate the teachings of Martinez into the combination of MSML in order to determine, for the record that is included in each record pair of a first cluster of record pairs, a canonical value for at least one field of the first one or more fields based at least in part on the match scores of the record pairs in the first cluster. This would have been clearly advantageous as it would allow the system to better understand the context of the values being interpreted. See generally Martinez at 0004 and Motte at 0033. The combination hereinafter MLM.

As per claim 5, MLM teaches the method of Claim 4, wherein the canonical value comprises a name, street address, city (Martinez at 0134-40), business, or phone number.

Claim 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Motte in view of Souza in view of Miserendino in view of Lee as applied to claim 1 above, and further in view of Yip et al. (U.S. PGPub No. 2013/0016106 A1) (hereinafter Yip).

As per claim 8, MSML does not appear to explicitly disclose: the system of Claim 7, wherein the processor is further configured to at least cause the client computing device to display a heat map including, for individual clusters of record pairs, information regarding the size of the cluster at the geographical location corresponding to the cluster. However, Yip does explicitly teach that which MSML does not. Yip at 0007. It would have been obvious to one of ordinary skill in the art to incorporate the teachings of Yip into the combination of MSML in order to display a heat map including, for individual clusters of record pairs, information regarding the size of the cluster at the geographical location corresponding to the cluster. This would have been clearly advantageous as it would more effectively convey information about the clusters to the users of the system of MSML.

Claim 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Motte in view of Souza in view of Miserendino in view of Lee as applied to claim 1 above, and further in view of Griffith (U.S. PGPub No. 2011/0282890 A1) (hereinafter Griffith).

As per claim 11, MSML does not appear to explicitly disclose: The system of Claim 6, wherein the processor is further configured to apply one or more cleaning functions to a first field of each record in the first plurality of records and a corresponding second field of each record in the second plurality of records. However, Griffith does explicitly teach that which MSML does not. Griffith at 0024 and 0095. It would have been obvious to one of ordinary skill in the art to incorporate the teachings of Griffith into the combination of MSML in order to have one or more cleaning functions applied to a first field of each record in the first plurality of records and a corresponding second field of each record in the second plurality of records. This would have been clearly advantageous as it would improve the functioning of the system of MSML by correcting data that has been determined to be erroneous.
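
For illustration only, applying cleaning functions to a field of each record in one plurality and to the corresponding field in the other can be sketched as follows. The functions `clean_name` and `apply_cleaning`, the `name` field, and the particular normalizations (case folding, punctuation removal, whitespace collapsing) are assumptions for this sketch and are not taken from Griffith or from Applicant's disclosure.

```python
import re


def clean_name(value):
    """Hypothetical cleaning function: fold case, drop punctuation, trim."""
    value = value.casefold()
    value = re.sub(r"[^\w\s]", " ", value)   # punctuation becomes a space
    return re.sub(r"\s+", " ", value).strip()  # collapse runs of whitespace


def apply_cleaning(records, field, cleaners):
    """Return copies of the records with each cleaner applied to one field."""
    cleaned = []
    for record in records:
        value = record[field]
        for cleaner in cleaners:
            value = cleaner(value)
        cleaned.append({**record, field: value})
    return cleaned
```

Calling `apply_cleaning` once on each plurality with the same `cleaners` list would leave the two corresponding fields in a comparable form before any pairs are scored.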

Claims 19 and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Motte in view of Souza in view of Miserendino in view of Lee as applied to claim 1 above, and further in view of Singh et al (U.S. PGPub No. 2011/0173189 A1) (hereinafter Singh).

As per claim 19, MSML does not appear to explicitly disclose: The non-transitory computer-readable storage medium of Claim 15, wherein the computer-executable instructions further cause the processor to prune the record pairs in each cluster of record pairs to produce a bipartite graph. However, Singh does explicitly teach that which MSML does not. Singh at 0059; see also Singh at 0090. It would have been obvious to one of ordinary skill in the art to incorporate the teachings of Singh into the combination of MSML in order to prune the record pairs in each cluster of record pairs to produce a bipartite graph. This would have been clearly advantageous as it would aid in modeling the relationships between different classes of objects (e.g. entity and address), thereby aiding in disambiguation and clustering. The combination hereinafter MLS.

As per claim 20, MLS teaches the non-transitory computer-readable storage medium of Claim 19, wherein the record pairs are pruned based at least in part on the probabilities (See examiner’s remarks regarding claims 19, 1, and 15).

Response to Arguments
Applicant’s arguments, see Applicant’s Response at page 9 et seq., filed 19 August 2021, with respect to the rejection(s) of claim(s) 1-20 under 35 USC 103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Souza and Miserendino, and the other references of record.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TYLER J TORGRIMSON whose telephone number is (571)270-5550.  The examiner can normally be reached on Monday - Friday 9 am - 5:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksander Kerzhner, can be reached on 571.270.1760.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

/TYLER J TORGRIMSON/Primary Examiner, Art Unit 2165