DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to communication filed on 15 March 2022. Claims 1-21 are pending in the case. Claims 1, 4, 9, 13, 16, and 21 were amended. Claims 1, 9, 13, and 21 are the independent claims. This action is non-final. 

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on March 15th, 2022 has been entered.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-3, 5, 6, 8-10, 12-15, 17, 18, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Cassidy et al. (US 2017/0308557 A1) in view of Lang et al. (US 2003/0225770 A1) in view of DeVries et al. (US 2006/0117238 A1) in view of Thomas et al. (US 2016/0048770 A1), further in view of McKee (US 2004/0220955 A1).
Regarding claim 1, Cassidy teaches a method comprising:
receiving transactional data from a plurality of payment systems, each payment system in the plurality of payment systems configured to process a set of data records representing user-initiated point of sale (POS) transactions for products or services of entities, the transactional data comprising a plurality of input records, each input record in the plurality of input records comprising one or more string components used to represent an entity specified in a user-initiated POS transaction processed by a payment system in the plurality of payment systems (see Cassidy, Paragraph [0002], “The data is gathered from a variety of different data sources and is electronically stored in various formats as records in databases. Examples of data sources may include, but are not limited to, employee database, sales database, contact center database, offline records, customer escalation records, company's social media followers records, customer query records and mailing lists records.”);
storing master records redundantly in a distributed master record data store comprising a plurality of master record data stores (see Cassidy, Paragraphs [0035], [0062], [0073], “The cleansing module 204 standardizes the data included in the database (say the database 106)… In an embodiment, the merging of records is done based on the master record. In an embodiment, the merged records are then stored in an external database... a centralized or distributed database, and/or associated caches and servers) that store the one or more instructions 624 or data structures.”);

However, Cassidy does not explicitly teach:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record; 

Lang teaches:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record (see Lang, Paragraph [0045], “a database administrator may use a database string search or other query function to identify master records that are similar but not identical in one or more computer systems.”);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing), and arrived at a method that incorporates similarity searching master records. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of improving data cleansing (see Lang, Paragraph [0044]). In addition, both the references (Cassidy and Lang) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between both of the references highly suggests an expectation of success.

However, the combination of Cassidy and Lang does not explicitly teach:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record;

DeVries teaches:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record (see DeVries, Paragraph [0043], “The cleansing/curation unit 540 may perform cleansing operations on the staging database 535 using a synchronous or asynchronous scheme, as described in conjunction with FIG. 7.”);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing), further in view of DeVries (teaching a method and system for information workflows), and arrived at a method that incorporates asynchronous data cleansing. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of efficiently cleansing a database (see DeVries, Paragraph [0049]). In addition, the references (Cassidy, Lang, and DeVries) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

However, the combination of Cassidy, Lang, and DeVries does not explicitly teach:
generating a set of weights for a first set of features based on a training set;

Thomas teaches:
generating a set of weights for a first set of features based on a training set (see Thomas, Paragraph [0034], “The weights can be learned by machine learning system 106, and revised over time. Therefore, in one example, weighting component 179 obtains the latest attribute weights from machine learning system 106 and combines the weighted matching attributes to obtain a pairwise match result indicative of the combination of weighted attribute matches.”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows), further in view of Thomas (teaching entity resolution incorporating data from various data sources), and arrived at a method that incorporates weights with a machine learning algorithm. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of improving the identification of matching records (see Thomas, Paragraph [0034]). In addition, the references (Cassidy, Lang, DeVries, and Thomas) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.
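
For illustration only, the following minimal sketch shows one way weights for a set of features could be generated from a training set and then applied to the features of a candidate record pair. It is not drawn from Thomas, Cassidy, or Applicant's disclosure. The choice of logistic regression, the two feature names, the training values, and the summing of the weighted products into a single score are all assumptions made for the example.

```python
import math

def train_weights(examples, labels, epochs=500, lr=0.5):
    """Fit one weight per feature (plus a bias) with plain logistic regression."""
    n = len(examples[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            z = bias + sum(w * f for w, f in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted match probability
            err = p - y                      # gradient of the log loss w.r.t. z
            weights = [w - lr * err * f for w, f in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def final_score(features, weights, bias=0.0):
    """Multiply each feature by its weight and sum the products (assumed)."""
    return bias + sum(w * f for w, f in zip(weights, features))

# Hypothetical training set: [name_similarity, address_similarity] -> match label
TRAIN_X = [[0.95, 0.90], [0.90, 0.80], [0.15, 0.20], [0.10, 0.10]]
TRAIN_Y = [1, 1, 0, 0]
WEIGHTS, BIAS = train_weights(TRAIN_X, TRAIN_Y)
```

In this sketch a score above 0 corresponds to a predicted match probability above 0.5, so 0 could serve as the threshold; any other threshold would be an equally hypothetical design choice.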

The combination of Cassidy, Lang, DeVries, and Thomas further teaches:
for each input record, generating a second set of features based on the input record and the most similar master record corresponding to each input record and generating a final score by multiplying each feature in the second set of features with a weight in the set of weights for a corresponding feature in the first set of features, wherein the final score indicates a likelihood of a match between a particular input record and the most similar master record (see Cassidy, Paragraph [0035]. Also, see Thomas, Paragraphs [0025]-[0034], “The weights can be learned by machine learning system 106, and revised over time. Therefore, in one example, weighting component 179 obtains the latest attribute weights from machine learning system 106 and combines the weighted matching attributes to obtain a pairwise match result indicative of the combination of weighted attribute matches.”);
and asynchronously updating the most similar master records in the distributed master record data store when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said asynchronously updating including (see Cassidy, Paragraph [0062], “The unclassified vectors are then labeled by the processor 404, as matched or unmatched, by applying the machine learning model on the unclassified vectors. Further, all the vectors labeled as match are processed by the processor 404 to create clusters of records that are duplicates of each other in the cleansed database. In an embodiment, the processor 404 identifies a master record in each cluster of records. Subsequently, the processor 404 merges records in each cluster to obtain a de-duplicated cleansed database using predefined consolidated rules. In an embodiment, the merging of records is done based on the master record. In an embodiment, the merged records are then stored in an external database.” Also, see Thomas, Paragraph [0034]):

However, the combination of Cassidy, Lang, DeVries, and Thomas does not explicitly teach:
and asynchronously updating the most similar master records in the distributed master record data store when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said asynchronously updating including:

McKee teaches:
and asynchronously updating the most similar master records in the distributed master record data store when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said asynchronously updating including (see McKee, Paragraph [0031], “because underlying record 110-12 has the same NJ license # as master record 100-1, underlying record 110-12 is added to the stack corresponding to master record 100-1.”):

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows) in view of Thomas (teaching entity resolution incorporating data from various data sources), further in view of McKee (teaching information processing system and method), and arrived at a method that updates master records. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of deduplication (see McKee, Abstract). In addition, the references (Cassidy, Lang, DeVries, Thomas, and McKee) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

The combination of Cassidy, Lang, DeVries, Thomas, and McKee further teaches:
generating a plurality of difference records comprising one or more string components that are different between particular input records and corresponding most similar master records; grouping the plurality of difference records having a same most similar master record retrieved from different master record data stores in the distributed master record data store (see Cassidy, Paragraphs [0046]-[0047], “The cluster creation module 216 processes all match-pairs and thereafter creates clusters. Master records are identified using defined rules. For example, the most complete record may be considered as the master record in each cluster.” Also, see Lang, Paragraph [0045]. Also, see McKee, Figure 2, Paragraphs [0022]-[0023], [0032], “Turning now to FIG. 4A, the process of combining a new underlying record 110-12 with an existing master record 100-1 is illustrated assuming that the master records (100-1 to 100-6) and underlying records (110-1 to 110-6) of FIG. 2 already exist within the system. The system of this illustrated example includes the matching criteria that if the NJ License # of a new underlying record (regardless of source) matches the NJ License # field of a master record, then the two underlying records are considered to refer to the same entity and should be included in the same stack corresponding to that entity's master record. Accordingly, because underlying record 110-12 has the same NJ license # as master record 100-1, underlying record 110-12 is added to the stack corresponding to master record 100-1. In addition, the data (i.e., SS# and Gender) that was not initially available in the master record 100-1 are added thereto from the new underlying record 110-12.”).
receiving transactional data from a plurality of payment systems, each payment system in the plurality of payment systems configured to process a set of data records representing user-initiated point of sale (POS) transactions for products or services of entities, the transactional data comprising a plurality of input records, each input record in the plurality of input records comprising one or more string components used to represent an entity specified in a user-initiated POS transaction processed by a payment system in the plurality of payment systems (see Thomas, Paragraph [0004], “The data (e.g., entities or other business records) or other information can exist in disparate applications sourced for different business functions. Some of those functions can include, for instance, sales, marketing, customer service, e-commerce, among others.”).
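
For illustration only, the difference-record and grouping limitations addressed above can be pictured with the following minimal sketch. The record layout, the master record identifiers ("m1", "m2"), and the sample string components are hypothetical and are not taken from the claims or from any cited reference.

```python
from collections import defaultdict

def difference_record(master_id, input_components, master_components):
    """Keep the input string components that the master record lacks."""
    missing = [c for c in input_components if c not in master_components]
    return {"master_id": master_id, "components": missing}

def group_by_master(diff_records):
    """Group difference records that point at the same master record."""
    groups = defaultdict(list)
    for rec in diff_records:
        groups[rec["master_id"]].append(rec)
    return dict(groups)

# Hypothetical master records and (master_id, input components) match pairs.
MASTERS = {"m1": ["ACME", "COFFEE"], "m2": ["BOB'S", "DINER"]}
MATCHES = [("m1", ["ACME", "COFFEE", "SEATTLE"]),
           ("m2", ["BOB'S", "DINER", "#12"]),
           ("m1", ["ACME", "COFFEE", "WA"])]
DIFFS = [difference_record(mid, comps, MASTERS[mid]) for mid, comps in MATCHES]
GROUPS = group_by_master(DIFFS)
```

Grouping by master record identifier means that each master record need only be read and rewritten once per batch, regardless of how many input records matched it.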

Regarding claim 2, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 1. McKee further teaches:
asynchronously retrieving most similar master records for each group of difference records from the distributed master record data store based on a master record identification in each corresponding difference record; appending the string components in the difference records in a group of difference records to a corresponding retrieved master record to produce appended master records; and asynchronously inserting the appended master records into the distributed master record data store (see McKee, Paragraph [0032], “A number of implementations may achieve the addition of a new record to the system. In a first embodiment, a separate record is added to the table that stores all the underlying records, where one part of the key (acting as a “backward link”) ties it to the master record and another part ties it to its source (or layer). (The data duplicated between the records could be deleted.) In a second embodiment, separate tables are used for each information source, so the new underlying record is added to the table for the corresponding information source. This reduces the need for storing a reference to the source of the data; it is inherently known by the table that the record is stored in.”).
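
For illustration only, the retrieve, append, and insert sequence recited in claim 2 can be sketched as follows. The in-memory store class, its get and put methods, and the sample data are hypothetical stand-ins; neither McKee nor the claims specify this interface, and Python's asyncio is used merely as one way to express asynchronous operation.

```python
import asyncio

class InMemoryMasterStore:
    """Hypothetical stand-in for a distributed master record data store."""
    def __init__(self, records):
        self._records = {k: list(v) for k, v in records.items()}

    async def get(self, master_id):
        await asyncio.sleep(0)  # yield control, as a network call would
        return list(self._records[master_id])

    async def put(self, master_id, components):
        await asyncio.sleep(0)
        self._records[master_id] = list(components)

async def apply_group(store, master_id, diff_group):
    """Retrieve one master record, append the group's components, reinsert it."""
    components = await store.get(master_id)
    for diff in diff_group:
        for c in diff:
            if c not in components:
                components.append(c)
    await store.put(master_id, components)
    return components

async def apply_all(store, groups):
    """Process every group concurrently rather than one after another."""
    results = await asyncio.gather(
        *(apply_group(store, mid, grp) for mid, grp in groups.items()))
    return dict(zip(groups, results))

STORE = InMemoryMasterStore({"m1": ["ACME"], "m2": ["DINER"]})
UPDATED = asyncio.run(
    apply_all(STORE, {"m1": [["SEATTLE"], ["WA"]], "m2": [["#12"]]}))
```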

Regarding claim 3, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 1. Cassidy further teaches:
generating the second set of features for each input record comprises comparing string components in the input record and in the most similar master record (see Cassidy, Paragraphs [0041]-[0044], “the similarity vectors are generated using one or more string matching algorithms. One non limiting example of string matching algorithm is Jaro-Winkler string matching algorithm. Each vector corresponds to a pairwise comparison of two distinct records. Each vector has as many components as there are fields in the cleansed data set... The machine learning algorithm module 212 analyses the labeled vector and classifies the remaining non labeled vector. The set of labeled vectors is used as a training and test set for a machine learning model for classification of unlabeled vector. An example of machine learning algorithm is an adaptive boosting algorithm. In an embodiment, the machine learning algorithm module 212 classifies the remaining vectors and returns a confidence level with each label. The labeled vectors are then checked by users using the human assisted checking module 214.”).
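
For illustration only, a field-by-field similarity vector of the kind described in the quoted passage can be sketched as follows. Cassidy names the Jaro-Winkler algorithm as one example; because that algorithm is not part of the Python standard library, this sketch substitutes difflib.SequenceMatcher, which is a different similarity measure. The field names and sample records are hypothetical.

```python
from difflib import SequenceMatcher

def similarity_vector(input_record, master_record, fields):
    """Return one similarity value in [0, 1] per field, in field order."""
    vector = []
    for field in fields:
        a = input_record.get(field, "").lower()
        b = master_record.get(field, "").lower()
        # SequenceMatcher is a stand-in here, not Jaro-Winkler.
        vector.append(SequenceMatcher(None, a, b).ratio())
    return vector

FIELDS = ["name", "city"]
VEC = similarity_vector({"name": "ACME COFFEE #12", "city": "Seattle"},
                        {"name": "Acme Coffee", "city": "Seattle"}, FIELDS)
```

The resulting vector has as many components as there are compared fields, which is consistent with the vector structure described in the quoted passage.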

Regarding claim 5, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 1. Lang further teaches:
sending a first instruction to a primary one of the plurality of master record data stores; and delegating the first instruction for execution by a first secondary one of the plurality of master record data stores (see Lang, Paragraph [0057], “The processor sends a blocking message to each computer system involved in the collaborative data cleansing process 400. The blocking message notifies the receiving computer system that the cleansing case is being processed (step 463). A computer system that receives a blocking message for a particular cleansing case is prohibited from processing the particular cleansing case until a message is received that unblocks the particular cleansing case.”).

Regarding claim 6, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 5. Lang further teaches:
sending a second instruction to the primary one of the plurality of master data stores before a result of the first instruction is received from the secondary one of the master data stores; and generating a notification when the first instruction has finished executing (see Lang, Paragraph [0057], “The processor sends a blocking message to each computer system involved in the collaborative data cleansing process 400. The blocking message notifies the receiving computer system that the cleansing case is being processed (step 463). A computer system that receives a blocking message for a particular cleansing case is prohibited from processing the particular cleansing case until a message is received that unblocks the particular cleansing case.”).
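
For illustration only, the claim language of claims 5 and 6 (delegating an instruction from a primary store to a secondary store, sending a second instruction before the first result is received, and generating a notification on completion) can be sketched as follows. This sketch does not depict Lang's blocking-message scheme. The Primary and Secondary classes, the round-robin delegation, and the instruction names are hypothetical.

```python
import asyncio

class Secondary:
    """Hypothetical secondary data store that executes delegated instructions."""
    async def execute(self, instruction):
        await asyncio.sleep(0.01)  # simulated work
        return f"done:{instruction}"

class Primary:
    """Hypothetical primary store that delegates each instruction to a secondary."""
    def __init__(self, secondaries):
        self._secondaries = secondaries
        self._next = 0
        self.notifications = []

    def submit(self, instruction):
        # Round-robin delegation is an assumption made for this example.
        secondary = self._secondaries[self._next % len(self._secondaries)]
        self._next += 1
        task = asyncio.create_task(secondary.execute(instruction))
        # Record a notification once the delegated instruction has finished.
        task.add_done_callback(lambda t: self.notifications.append(t.result()))
        return task

async def demo():
    primary = Primary([Secondary(), Secondary()])
    first = primary.submit("search-1")
    second = primary.submit("search-2")  # sent before `first` has a result
    sent_before_result = not first.done()
    await asyncio.gather(first, second)
    return sent_before_result, primary.notifications

SENT_EARLY, NOTES = asyncio.run(demo())
```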

Regarding claim 8, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 1. Cassidy further teaches:
sequentially mapping, before the similarity searches, one or more of the string components in each input record into one or more corresponding master string components using a plurality of program execution threads (see Cassidy, Paragraph [0070], “The application program 626 includes instructions that are capable of cleansing and de-duplication of data in a database. The application program 626 is executed by the processor 602. The database is either associated with the computer system 600 or is a part of different computer system. In an embodiment, the application program 626 performs all the function of performed by different modules included in the data cleansing and de-duplicating system 108 (explained in conjunction with FIG. 2).”).
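
For illustration only, mapping input string components to master string components using a plurality of program execution threads can be sketched as follows. The alias table, the normalization rule, and the worker count are hypothetical; Cassidy's quoted passage does not describe this implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical lookup from raw POS strings to standardized master components.
ALIASES = {"ACME CFE": "ACME COFFEE", "BOBS DNR": "BOB'S DINER"}

def normalize(component):
    """Trim and upper-case a component, then apply any known alias."""
    token = component.strip().upper()
    return ALIASES.get(token, token)

def map_components(records, workers=4):
    """Normalize every record's components on a thread pool, keeping input order."""
    def map_record(components):
        return [normalize(c) for c in components]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input records.
        return list(pool.map(map_record, records))

MAPPED = map_components([[" acme cfe ", "seattle"], ["bobs dnr"]])
```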

Regarding claim 9, Cassidy teaches a method comprising:
receiving transactional data from a plurality of payment systems, each payment system in the plurality of payment systems configured to process a set of data records representing user-initiated point of sale (POS) transactions for products or services of entities, the transactional data comprising a plurality of input records, each input record in the plurality of input records comprising one or more string components used to represent an entity specified in a user-initiated POS transaction processed by a payment system in the plurality of payment systems (see Cassidy, Paragraph [0002], “The data is gathered from a variety of different data sources and is electronically stored in various formats as records in databases. Examples of data sources may include, but are not limited to, employee database, sales database, contact center database, offline records, customer escalation records, company's social media followers records, customer query records and mailing lists records.”);
storing master records redundantly in a distributed master record data store, the master records comprising a plurality of master string components, and the distributed master record data store comprising a primary master record data store and a plurality of secondary master record data stores (see Cassidy, Paragraphs [0035], [0062], [0073], “The cleansing module 204 standardizes the data included in the database (say the database 106)… In an embodiment, the merging of records is done based on the master record. In an embodiment, the merged records are then stored in an external database... a centralized or distributed database, and/or associated caches and servers) that store the one or more instructions 624 or data structures.”);

However, Cassidy does not explicitly teach:
asynchronously processing a plurality of similarity search queries, each similarity search query comprising at least one of the string components from a particular input record, the similarity search queries retrieving master records having most similar string components in the distributed master record data store corresponding to each similarity search query;

Lang teaches:
asynchronously processing a plurality of similarity search queries, each similarity search query comprising at least one of the string components from a particular input record, the similarity search queries retrieving master records having most similar string components in the distributed master record data store corresponding to each similarity search query (see Lang, Paragraph [0045], “a database administrator may use a database string search or other query function to identify master records that are similar but not identical in one or more computer systems.”);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing), and arrived at a method that incorporates similarity searching master records. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of improving data cleansing (see Lang, Paragraph [0044]). In addition, both the references (Cassidy and Lang) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between both of the references highly suggests an expectation of success.

However, the combination of Cassidy and Lang does not explicitly teach:
asynchronously processing a plurality of similarity search queries, each similarity search query comprising at least one of the string components from a particular input record, the similarity search queries retrieving master records having most similar string components in the distributed master record data store corresponding to each similarity search query;

DeVries teaches:
asynchronously processing a plurality of similarity search queries, each similarity search query comprising at least one of the string components from a particular input record, the similarity search queries retrieving master records having most similar string components in the distributed master record data store corresponding to each similarity search query (see DeVries, Paragraph [0043], “The cleansing/curation unit 540 may perform cleansing operations on the staging database 535 using a synchronous or asynchronous scheme, as described in conjunction with FIG. 7.”);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing), further in view of DeVries (teaching a method and system for information workflows), and arrived at a method that incorporates asynchronous data cleansing. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of efficiently cleansing a database (see DeVries, Paragraph [0049]). In addition, the references (Cassidy, Lang, and DeVries) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

The combination of Cassidy, Lang, and DeVries further teaches: 
receiving the plurality of similarity search queries in the primary master record data store; delegating one or more of the similarity search queries from the primary master record data store to the secondary master record data stores; executing each query to retrieve master records having most similar string components in the distributed master record data store corresponding to each similarity search query (see Lang, Paragraph [0045], “a database administrator may use a database string search or other query function to identify master records that are similar but not identical in one or more computer systems.”);

However, the combination of Cassidy, Lang, and DeVries does not explicitly teach:
generating a set of weights for a first set of features based on a training set;

Thomas teaches:
generating a set of weights for a first set of features based on a training set (see Thomas, Paragraph [0034], “The weights can be learned by machine learning system 106, and revised over time. Therefore, in one example, weighting component 179 obtains the latest attribute weights from machine learning system 106 and combines the weighted matching attributes to obtain a pairwise match result indicative of the combination of weighted attribute matches.”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows), further in view of Thomas (teaching entity resolution incorporating data from various data sources), and arrived at a method that incorporates weights with a machine learning algorithm. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of improving the identification of matching records (see Thomas, Paragraph [0034]). In addition, the references (Cassidy, Lang, DeVries, and Thomas) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

The combination of Cassidy, Lang, DeVries, and Thomas further teaches:
for each input record, generating a second set of features based on the input record and the most similar master record corresponding to each input record and generating a final score by multiplying each feature in the second set of features with a weight in the set of weights for a corresponding feature in the first set of features, wherein the final score indicates a likelihood of a match between a particular input record and the most similar master record (see Cassidy, Paragraph [0035]. Also, see Thomas, Paragraphs [0025]-[0034], “The weights can be learned by machine learning system 106, and revised over time. Therefore, in one example, weighting component 179 obtains the latest attribute weights from machine learning system 106 and combines the weighted matching attributes to obtain a pairwise match result indicative of the combination of weighted attribute matches.”);
modifying the retrieved master records based on the corresponding input records when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said modifying including (see Cassidy, Paragraph [0062], “the processor 404 identifies a master record in each cluster of records. Subsequently, the processor 404 merges records in each cluster to obtain a de-duplicated cleansed database using predefined consolidated rules. In an embodiment, the merging of records is done based on the master record. In an embodiment, the merged records are then stored in an external database.” Also, see Thomas, Paragraph [0034]):

However, the combination of Cassidy, Lang, DeVries, and Thomas does not explicitly teach:
modifying the retrieved master records based on the corresponding input records when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said modifying including:

McKee teaches:
modifying the retrieved master records based on the corresponding input records when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said modifying including (see McKee, Paragraph [0031], “because underlying record 110-12 has the same NJ license # as master record 100-1, underlying record 110-12 is added to the stack corresponding to master record 100-1.”):

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows) in view of Thomas (teaching entity resolution incorporating data from various data sources), further in view of McKee (teaching information processing system and method), and arrived at a method that updates master records. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of deduplication (see McKee, Abstract). In addition, the references (Cassidy, Lang, DeVries, Thomas, and McKee) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

The combination of Cassidy, Lang, DeVries, Thomas, and McKee further teaches:
receiving a plurality of difference records comprising one or more string components that are different between particular input records and corresponding most similar master records; grouping the plurality of difference records having a same most similar master record retrieved from different master record data stores in the distributed master record data store (see Cassidy, Paragraphs [0046]-[0047], “The cluster creation module 216 processes all match-pairs and thereafter creates clusters. Master records are identified using defined rules. For example, the most complete record may be considered as the master record in each cluster.” Also, see Lang, Paragraph [0045]. Also, see McKee, Figure 2, Paragraphs [0022]-[0023], [0032], “Turning now to FIG. 4A, the process of combining a new underlying record 110-12 with an existing master record 100-1 is illustrated assuming that the master records (100-1 to 100-6) and underlying records (110-1 to 110-6) of FIG. 2 already exist within the system. The system of this illustrated example includes the matching criteria that if the NJ License # of a new underlying record (regardless of source) matches the NJ License # field of a master record, then the two underlying records are considered to refer to the same entity and should be included in the same stack corresponding to that entity's master record. Accordingly, because underlying record 110-12 has the same NJ license # as master record 100-1, underlying record 110-12 is added to the stack corresponding to master record 100-1. In addition, the data (i.e., SS# and Gender) that was not initially available in the master record 100-1 are added thereto from the new underlying record 110-12.”).
receiving transactional data from a plurality of payment systems, each payment system in the plurality of payment systems configured to process a set of data records representing user-initiated point of sale (POS) transactions for products or services of entities, the transactional data comprising a plurality of input records, each input record in the plurality of input records comprising one or more string components used to represent an entity specified in a user-initiated POS transaction processed by a payment system in the plurality of payment systems (see Thomas, Paragraph [0004], “The data (e.g., entities or other business records) or other information can exist in disparate applications sourced for different business functions. Some of those functions can include, for instance, sales, marketing, customer service, e-commerce, among others.”);

Regarding claim 10, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 9. McKee further teaches:
asynchronously retrieving most similar master records for each group of difference records from the distributed master record data store based on a master record identification in each corresponding difference record; and appending the string components in the difference records in a group of difference records to a corresponding retrieved master record to produce appended master records (see McKee, Paragraph [0032], “A number of implementations may achieve the addition of a new record to the system. In a first embodiment, a separate record is added to the table that stores all the underlying records, where one part of the key (acting as a “backward link”) ties it to the master record and another part ties it to its source (or layer). (The data duplicated between the records could be deleted.) In a second embodiment, separate tables are used for each information source, so the new underlying record is added to the table for the corresponding information source. This reduces the need for storing a reference to the source of the data; it is inherently known by the table that the record is stored in.”).

Regarding claim 12, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 9. Cassidy further teaches:
storing each master record in each of the secondary master record data stores and the primary master record data store; and when a particular master record is changed in one master record data store of the distributed master record data store, propagating the changes to each of the other master record data stores (see Cassidy, Paragraph [0062], “the processor 404 identifies a master record in each cluster of records. Subsequently, the processor 404 merges records in each cluster to obtain a de-duplicated cleansed database using predefined consolidated rules. In an embodiment, the merging of records is done based on the master record. In an embodiment, the merged records are then stored in an external database.”).

Regarding claim 13, Cassidy teaches a non-transitory machine-readable medium storing a program executable by at least one processing unit of a computer, the program comprising sets of instructions for: 
receiving transactional data from a plurality of payment systems, each payment system in the plurality of payment systems configured to process a set of data records representing user-initiated point of sale (POS) transactions for products or services of entities, the transactional data comprising a plurality of input records, each input record  in the plurality of input records comprising one or more string components used to represent an entity specified in a user-initiated POS transaction processed by a payment system in the plurality of payment systems (see Cassidy, Paragraph [0002], “The data is gathered from a variety of different data sources and is electronically stored in various formats as records in databases. Examples of data sources may include, but are not limited to, employee database, sales database, contact center database, offline records, customer escalation records, company's social media followers records, customer query records and mailing lists records.”);
storing master records redundantly in a distributed master record data store comprising a plurality of master record data stores (see Cassidy, Paragraphs [0035], [0062], [0073], “The cleansing module 204 standardizes the data included in the database (say the database 106)… In an embodiment, the merging of records is done based on the master record. In an embodiment, the merged records are then stored in an external database... a centralized or distributed database, and/or associated caches and servers) that store the one or more instructions 624 or data structures.”);

However, Cassidy does not explicitly teach:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record;

Lang teaches:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record (see Lang, Paragraph [0045], “a database administrator may use a database string search or other query function to identify master records that are similar but not identical in one or more computer systems.”);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing), and arrived at a machine that incorporates similarity searching master records. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of improving data cleansing (see Lang, Paragraph [0044]). In addition, both the references (Cassidy and Lang) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between both of the references highly suggests an expectation of success.

However, the combination of Cassidy and Lang does not explicitly teach:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record;

DeVries teaches:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record (see DeVries, Paragraph [0043], “The cleansing/curation unit 540 may perform cleansing operations on the staging database 535 using a synchronous or asynchronous scheme, as described in conjunction with FIG. 7.”);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing), further in view of DeVries (teaching a method and system for information workflows), and arrived at a machine that incorporates asynchronous data cleansing. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of efficiently cleansing a database (see DeVries, Paragraph [0049]). In addition, the references (Cassidy, Lang, and DeVries) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

However, the combination of Cassidy, Lang, and DeVries does not explicitly teach:
generating a set of weights for a first set of features based on a training set;

Thomas teaches:
generating a set of weights for a first set of features based on a training set (see Thomas, Paragraph [0034], “The weights can be learned by machine learning system 106, and revised over time. Therefore, in one example, weighting component 179 obtains the latest attribute weights from machine learning system 106 and combines the weighted matching attributes to obtain a pairwise match result indicative of the combination of weighted attribute matches.”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows), further in view of Thomas (teaching entity resolution incorporating data from various data sources), and arrived at a machine that incorporates weights with a machine learning algorithm. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of improving the identification of matching records (see Thomas, Paragraph [0034]). In addition, the references (Cassidy, Lang, DeVries, and Thomas) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

The combination of Cassidy, Lang, DeVries, and Thomas further teaches:
for each input record, generating a second set of features based on the input record and the most similar master record corresponding to each input record and generating a final score by multiplying each feature in the second set of features with a weight in the set of weights for a corresponding feature in the first set of features, wherein the final score indicates a likelihood of a match between a particular input record and the most similar master record (see Cassidy, Paragraph [0035]. Also, see Thomas, Paragraphs [0025]-[0034], “The weights can be learned by machine learning system 106, and revised over time. Therefore, in one example, weighting component 179 obtains the latest attribute weights from machine learning system 106 and combines the weighted matching attributes to obtain a pairwise match result indicative of the combination of weighted attribute matches.”);
and asynchronously updating the most similar master records in the distributed master record data store when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said asynchronously updating including (see Cassidy, Paragraph [0062], “The unclassified vectors are then labeled by the processor 404, as matched or unmatched, by applying the machine learning model on the unclassified vectors. Further, all the vectors labeled as match are processed by the processor 404 to create clusters of records that are duplicates of each other in the cleansed database. In an embodiment, the processor 404 identifies a master record in each cluster of records. Subsequently, the processor 404 merges records in each cluster to obtain a de-duplicated cleansed database using predefined consolidated rules. In an embodiment, the merging of records is done based on the master record. In an embodiment, the merged records are then stored in an external database.” Also, see Thomas, Paragraph [0034]):

However, the combination of Cassidy, Lang, DeVries, and Thomas does not explicitly teach:
and asynchronously updating the most similar master records in the distributed master record data store when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said asynchronously updating including:

McKee teaches:
and asynchronously updating the most similar master records in the distributed master record data store when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said asynchronously updating including (see McKee, Paragraph [0031], “because underlying record 110-12 has the same NJ license # as master record 100-1, underlying record 110-12 is added to the stack corresponding to master record 100-1.”):

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows) in view of Thomas (teaching entity resolution incorporating data from various data sources), further in view of McKee (teaching information processing system and method), and arrived at a machine that updates master records. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of deduplication (see McKee, Abstract). In addition, the references (Cassidy, Lang, DeVries, Thomas, and McKee) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

The combination of Cassidy, Lang, DeVries, Thomas, and McKee further teaches:
generating a plurality of difference records comprising one or more string components that are different between particular input records and corresponding most similar master records; grouping the plurality of difference records having a same most similar master record retrieved from different master record data stores in the distributed master record data store (see Cassidy, Paragraphs [0046]-[0047], “The cluster creation module 216 processes all match-pairs and thereafter creates clusters. Master records are identified using defined rules. For example, the most complete record may be considered as the master record in each cluster.” Also, see Lang, Paragraph [0045]. Also, see McKee, Figure 2, Paragraphs [0022]-[0023], [0032], “Turning now to FIG. 4A, the process of combining a new underlying record 110-12 with an existing master record 100-1 is illustrated assuming that the master records (100-1 to 100-6) and underlying records (110-1 to 110-6) of FIG. 2 already exist within the system. The system of this illustrated example includes the matching criteria that if the NJ License # of a new underlying record (regardless of source) matches the NJ License # field of a master record, then the two underlying records are considered to refer to the same entity and should be included in the same stack corresponding to that entity's master record. Accordingly, because underlying record 110-12 has the same NJ license # as master record 100-1, underlying record 110-12 is added to the stack corresponding to master record 100-1. In addition, the data (i.e., SS# and Gender) that was not initially available in the master record 100-1 are added thereto from the new underlying record 110-12.”).
receiving transactional data from a plurality of payment systems, each payment system in the plurality of payment systems configured to process a set of data records representing user-initiated point of sale (POS) transactions for products or services of entities, the transactional data comprising a plurality of input records, each input record  in the plurality of input records comprising one or more string components used to represent an entity specified in a user-initiated POS transaction processed by a payment system in the plurality of payment systems (see Thomas, Paragraph [0004], “The data (e.g., entities or other business records) or other information can exist in disparate applications sourced for different business functions. Some of those functions can include, for instance, sales, marketing, customer service, e-commerce, among others.”);

Regarding claim 14, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 13. McKee further teaches:
asynchronously retrieving most similar master records for each group of difference records from the distributed master record data store based on a master record identification in each corresponding difference record; appending the string components in the difference records in a group of difference records to a corresponding retrieved master record to produce appended master records; and asynchronously inserting the appended master records into the distributed master record data store (see McKee, Paragraph [0032], “A number of implementations may achieve the addition of a new record to the system. In a first embodiment, a separate record is added to the table that stores all the underlying records, where one part of the key (acting as a “backward link”) ties it to the master record and another part ties it to its source (or layer). (The data duplicated between the records could be deleted.) In a second embodiment, separate tables are used for each information source, so the new underlying record is added to the table for the corresponding information source. This reduces the need for storing a reference to the source of the data; it is inherently known by the table that the record is stored in.”).
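
For illustration only, the grouping and appending steps recited in this limitation can be sketched as follows. The sketch is not taken from McKee or from the application: the record layout (`master_id`, `components`) and both function names are hypothetical, and the asynchronous retrieval and insertion against a distributed data store are omitted for brevity.

```python
from collections import defaultdict


def group_difference_records(difference_records):
    """Group difference records by the master record they refer to.

    Each difference record is assumed (hypothetically) to be a dict with a
    'master_id' key and a 'components' list of string components.
    """
    groups = defaultdict(list)
    for record in difference_records:
        groups[record["master_id"]].append(record)
    return dict(groups)


def append_components(master_record, group):
    """Return a copy of the master record with new string components appended.

    Components already present in the master record are not duplicated, and
    the original master record is left unmodified.
    """
    appended = dict(master_record)
    components = list(master_record.get("components", []))
    for record in group:
        for component in record["components"]:
            if component not in components:
                components.append(component)
    appended["components"] = components
    return appended


if __name__ == "__main__":
    diffs = [
        {"master_id": "m1", "components": ["ACME"]},
        {"master_id": "m2", "components": ["BOLT"]},
        {"master_id": "m1", "components": ["ACME CORP", "ACME"]},
    ]
    groups = group_difference_records(diffs)
    master = {"id": "m1", "components": ["Acme Inc"]}
    print(append_components(master, groups["m1"]))
```

In this sketch each group is keyed on the master record identification, so a single retrieval per group would suffice before the components are appended.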

Regarding claim 15, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 13. Cassidy further teaches:
generating the second set of features for each input record comprises comparing string components in the input record and in the most similar master record (see Cassidy, Paragraphs [0041]-[0044], “the similarity vectors are generated using one or more string matching algorithms. One non limiting example of string matching algorithm is Jaro-Winkler string matching algorithm. Each vector corresponds to a pairwise comparison of two distinct records. Each vector has as many components as there are fields in the cleansed data set... The machine learning algorithm module 212 analyses the labeled vector and classifies the remaining non labeled vector. The set of labeled vectors is used as a training and test set for a machine learning model for classification of unlabeled vector. An example of machine learning algorithm is an adaptive boosting algorithm. In an embodiment, the machine learning algorithm module 212 classifies the remaining vectors and returns a confidence level with each label. The labeled vectors are then checked by users using the human assisted checking module 214.”).
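
For illustration only, generating a feature set by comparing string components, and combining it with learned weights into a final score as recited in claim 13, can be sketched as follows. The sketch is not drawn from Cassidy, Thomas, or the application. It substitutes Python's `difflib.SequenceMatcher` ratio for the Jaro-Winkler algorithm named in Cassidy, and the field names, weights, threshold, and function names are all hypothetical.

```python
from difflib import SequenceMatcher


def second_feature_set(input_record, master_record, fields):
    """Compare string components field by field.

    Returns one similarity value in [0, 1] per field. SequenceMatcher is a
    stand-in here; it is not the Jaro-Winkler measure cited in Cassidy.
    """
    features = []
    for field in fields:
        a = input_record.get(field, "").lower()
        b = master_record.get(field, "").lower()
        features.append(SequenceMatcher(None, a, b).ratio())
    return features


def final_score(features, weights):
    """Multiply each feature by its corresponding weight and sum the products."""
    return sum(f * w for f, w in zip(features, weights))


if __name__ == "__main__":
    fields = ["name", "city"]   # hypothetical string components
    weights = [0.7, 0.3]        # hypothetical weights, assumed learned elsewhere
    threshold = 0.8             # hypothetical match threshold
    incoming = {"name": "Acme Hardware", "city": "Trenton"}
    master = {"name": "ACME Hardware Inc", "city": "Trenton"}
    score = final_score(second_feature_set(incoming, master, fields), weights)
    print(score, score > threshold)
```

A score above the threshold would, under these assumptions, mark the input record as a likely match for the retrieved master record.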

Regarding claim 17, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 13. Lang further teaches:
sending a first instruction to a primary one of the plurality of master record data stores; and delegating the first instruction for execution by a first secondary one of the plurality of master record data stores (see Lang, Paragraph [0057], “The processor sends a blocking message to each computer system involved in the collaborative data cleansing process 400. The blocking message notifies the receiving computer system that the cleansing case is being processed (step 463). A computer system that receives a blocking message for a particular cleansing case is prohibited from processing the particular cleansing case until a message is received that unblocks the particular cleansing case.”).

Regarding claim 18, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 17. Lang further teaches:
sending a second instruction to the primary one of the plurality of master data stores before a result of the first instruction is received from the secondary one of the master data stores; and generating a notification when the first instruction has finished executing (see Lang, Paragraph [0057], “The processor sends a blocking message to each computer system involved in the collaborative data cleansing process 400. The blocking message notifies the receiving computer system that the cleansing case is being processed (step 463). A computer system that receives a blocking message for a particular cleansing case is prohibited from processing the particular cleansing case until a message is received that unblocks the particular cleansing case.”).

Regarding claim 20, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee teaches all the limitations of claim 13. Cassidy further teaches:
sequentially mapping, before the similarity searches, one or more of the string components in each input record into one or more corresponding master string components using a plurality of program execution threads (see Cassidy, Paragraph [0070], “The application program 626 includes instructions that are capable of cleansing and de-duplication of data in a database. The application program 626 is executed by the processor 602. The database is either associated with the computer system 600 or is a part of different computer system. In an embodiment, the application program 626 performs all the function of performed by different modules included in the data cleansing and de-duplicating system 108 (explained in conjunction with FIG. 2).”).

Regarding claim 21, Cassidy teaches a system comprising:
a processor; a memory; and non-transitory machine-readable medium storing a program executable by the processor, the program comprising sets of instructions for (see Cassidy, Paragraph [0007], “The system includes a memory and a processor. The memory stores instructions for cleansing and de-duplicating data. The processor is operatively coupled with the memory to fetch instructions from the memory for cleansing and de-duplicating data.”):
receiving transactional data from a plurality of payment systems, each payment system in the plurality of payment systems configured to process a set of data records representing user-initiated point of sale (POS) transactions for products or services of entities, the transactional data comprising a plurality of input records, each input record  in the plurality of input records comprising one or more string components used to represent an entity specified in a user-initiated POS transaction processed by a payment system in the plurality of payment systems (see Cassidy, Paragraph [0002], “The data is gathered from a variety of different data sources and is electronically stored in various formats as records in databases. Examples of data sources may include, but are not limited to, employee database, sales database, contact center database, offline records, customer escalation records, company's social media followers records, customer query records and mailing lists records.”);
storing master records redundantly in a distributed master record data store comprising a plurality of master record data stores (see Cassidy, Paragraphs [0035], [0062], [0073], “The cleansing module 204 standardizes the data included in the database (say the database 106)… In an embodiment, the merging of records is done based on the master record. In an embodiment, the merged records are then stored in an external database... a centralized or distributed database, and/or associated caches and servers) that store the one or more instructions 624 or data structures.”);

However, Cassidy does not explicitly teach:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record;

Lang teaches:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record (see Lang, Paragraph [0045], “a database administrator may use a database string search or other query function to identify master records that are similar but not identical in one or more computer systems.”);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing), and arrived at a system that incorporates similarity searching master records. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of improving data cleansing (see Lang, Paragraph [0044]). In addition, both the references (Cassidy and Lang) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between both of the references highly suggests an expectation of success.

However, the combination of Cassidy and Lang does not explicitly teach:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record;

DeVries teaches:
asynchronously performing similarity searches using string components from the input records across the distributed master record data store, the similarity searches each configured to retrieve a most similar master record comprising master string components, wherein the most similar master record comprises most similar master string components in the distributed master record data store to string components in each input record (see DeVries, Paragraph [0043], “The cleansing/curation unit 540 may perform cleansing operations on the staging database 535 using a synchronous or asynchronous scheme, as described in conjunction with FIG. 7.”);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing), further in view of DeVries (teaching a method and system for information workflows), and arrived at a system that incorporates asynchronous data cleansing. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of efficiently cleansing a database (see DeVries, Paragraph [0049]). In addition, the references (Cassidy, Lang, and DeVries) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

However, the combination of Cassidy, Lang, and DeVries does not explicitly teach:
generating a set of weights for a first set of features based on a training set;

Thomas teaches:
generating a set of weights for a first set of features based on a training set (see Thomas, Paragraph [0034], “The weights can be learned by machine learning system 106, and revised over time. Therefore, in one example, weighting component 179 obtains the latest attribute weights from machine learning system 106 and combines the weighted matching attributes to obtain a pairwise match result indicative of the combination of weighted attribute matches.”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows), further in view of Thomas (teaching entity resolution incorporating data from various data sources), and arrived at a system that incorporates weights with a machine learning algorithm. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of improving the identification of matching records (see Thomas, Paragraph [0034]). In addition, the references (Cassidy, Lang, DeVries, and Thomas) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.
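For illustration only, the combination of weighted attribute matches described in the Thomas passage quoted above can be sketched as a weighted sum. This is a minimal sketch under stated assumptions, not Thomas's implementation: the function name `final_score`, the feature names, and the weight values are hypothetical.

```python
def final_score(features: dict, weights: dict) -> float:
    """Multiply each feature by the weight learned for that feature and sum.

    Features with no learned weight contribute nothing (assumed default 0.0).
    """
    return sum(value * weights.get(name, 0.0) for name, value in features.items())


# Hypothetical learned weights and pairwise comparison features.
weights = {"name_match": 0.6, "address_match": 0.4}
features = {"name_match": 1.0, "address_match": 0.5}
score = final_score(features, weights)
```

A higher score would indicate a higher likelihood that the compared records refer to the same entity.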

The combination of Cassidy, Lang, DeVries, and Thomas further teaches:
for each input record, generating a second set of features based on the input record and the most similar master record corresponding to each input record and generating a final score by multiplying each feature in the second set of features with a weight in the set of weights for a corresponding feature in the first set of features, wherein the final score indicates a likelihood of a match between a particular input record and the most similar master record (see Cassidy, Paragraph [0035]. Also, see Thomas, Paragraphs [0025]-[0034], “The weights can be learned by machine learning system 106, and revised over time. Therefore, in one example, weighting component 179 obtains the latest attribute weights from machine learning system 106 and combines the weighted matching attributes to obtain a pairwise match result indicative of the combination of weighted attribute matches.”);
and asynchronously updating the most similar master records in the distributed master record data store when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said asynchronously updating including (see Cassidy, Paragraph [0062], “The unclassified vectors are then labeled by the processor 404, as matched or unmatched, by applying the machine learning model on the unclassified vectors. Further, all the vectors labeled as match are processed by the processor 404 to create clusters of records that are duplicates of each other in the cleansed database. In an embodiment, the processor 404 identifies a master record in each cluster of records. Subsequently, the processor 404 merges records in each cluster to obtain a de-duplicated cleansed database using predefined consolidated rules. In an embodiment, the merging of records is done based on the master record. In an embodiment, the merged records are then stored in an external database.” Also, see Thomas, Paragraph [0034]):

However, the combination of Cassidy, Lang, Thomas, and DeVries does not explicitly teach:
and asynchronously updating the most similar master records in the distributed master record data store when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said asynchronously updating including:

McKee teaches:
and asynchronously updating the most similar master records in the distributed master record data store when the final scores are greater than a threshold by adding at least some of the string components from the input records into at least a portion of the most similar master records, said asynchronously updating including (see McKee, Paragraph [0031], “because underlying record 110-12 has the same NJ license # as master record 100-1, underlying record 110-12 is added to the stack corresponding to master record 100-1.”):

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows) in view of Thomas (teaching entity resolution incorporating data from various data sources), further in view of McKee (teaching information processing system and method), and arrived at a system that updates master records. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of deduplication (see McKee, Abstract). In addition, the references (Cassidy, Lang, DeVries, Thomas, and McKee) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.
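For illustration only, updating a master record with data from a matching input record when a score exceeds a threshold can be sketched as follows. This is a minimal sketch under stated assumptions and is not McKee's implementation: the function name `update_master`, the field names, and the threshold value of 0.8 are hypothetical.

```python
def update_master(master: dict, input_record: dict, score: float,
                  threshold: float = 0.8) -> dict:
    """Return a copy of the master record, enriched when score > threshold.

    Only fields that are missing or empty in the master record are filled
    from the input record; existing master values are left unchanged.
    """
    merged = dict(master)
    if score <= threshold:
        return merged
    for field, value in input_record.items():
        if value and not merged.get(field):
            merged[field] = value
    return merged
```

Returning a copy rather than modifying the master record in place is a design choice of this sketch, made so that an update can be applied or discarded as a unit.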

The combination of Cassidy, Lang, DeVries, Thomas, and McKee further teaches:
generating a plurality of difference records comprising one or more string components that are different between particular input records and corresponding most similar master records; grouping the plurality of difference records having a same most similar master record retrieved from different master record data stores in the distributed master record data store (see Cassidy, Paragraphs [0046]-[0047], “The cluster creation module 216 processes all match-pairs and thereafter creates clusters. Master records are identified using defined rules. For example, the most complete record may be considered as the master record in each cluster.” Also, see Lang, Paragraph [0045]. Also, see McKee, Figure 2, Paragraphs [0022]-[0023], [0032], “Turning now to FIG. 4A, the process of combining a new underlying record 110-12 with an existing master record 100-1 is illustrated assuming that the master records (100-1 to 100-6) and underlying records (110-1 to 110-6) of FIG. 2 already exist within the system. The system of this illustrated example includes the matching criteria that if the NJ License # of a new underlying record (regardless of source) matches the NJ License # field of a master record, then the two underlying records are considered to refer to the same entity and should be included in the same stack corresponding to that entity's master record. Accordingly, because underlying record 110-12 has the same NJ license # as master record 100-1, underlying record 110-12 is added to the stack corresponding to master record 100-1. In addition, the data (i.e., SS# and Gender) that was not initially available in the master record 100-1 are added thereto from the new underlying record 110-12.”).
receiving transactional data from a plurality of payment systems, each payment system in the plurality of payment systems configured to process a set of data records representing user-initiated point of sale (POS) transactions for products or services of entities, the transactional data comprising a plurality of input records, each input record  in the plurality of input records comprising one or more string components used to represent an entity specified in a user-initiated POS transaction processed by a payment system in the plurality of payment systems (see Thomas, Paragraph [0004], “The data (e.g., entities or other business records) or other information can exist in disparate applications sourced for different business functions. Some of those functions can include, for instance, sales, marketing, customer service, e-commerce, among others.”);

Claims 4, 11, and 16 are being rejected under 35 U.S.C. 103 as being unpatentable over Cassidy in view of Lang in view of DeVries in view of Thomas in view of McKee, further in view of Tereshkov et al. (US 2016/0180245 A1).
Regarding claim 4, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee, teaches all the limitations of claim 3. However, the combination of Cassidy, Lang, DeVries, Thomas, and McKee does not explicitly teach:
wherein the set of features includes dice coefficients associated with comparing a name string component and an address string component of the each input record with a name string component and an address string component of the most similar master record.

Tereshkov teaches:
wherein the set of features includes dice coefficients associated with comparing a name string component and an address string component of the each input record with a name string component and an address string component of the most similar master record (see Tereshkov, Paragraph [0068], “As discussed above, some of the distance algorithms used for atomic distance assessment between two different fields of the two compared records include Dice Coefficient”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows) in view of Thomas (teaching entity resolution incorporating data from various data sources) in view of McKee (teaching information processing system and method), further in view of Tereshkov (teaching a method and system for linking heterogeneous data sources), and arrived at a method that incorporates dice coefficients. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of assessing distance (see Tereshkov, Paragraph [0068]). In addition, the references (Cassidy, Lang, DeVries, McKee, Thomas, and Tereshkov) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.
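For illustration only, a Dice coefficient computed between corresponding name and address string components can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Tereshkov or of the claimed invention: it uses the common character-bigram form of the Sorensen-Dice coefficient, and the function names and the `name` and `address` field names are hypothetical.

```python
from collections import Counter


def bigrams(text: str) -> list:
    """Return the character bigrams of a case- and whitespace-normalized string."""
    s = " ".join(text.lower().split())
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice_coefficient(a: str, b: str) -> float:
    """Sorensen-Dice coefficient over character bigrams (multiset form)."""
    x, y = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        # Both strings are too short to form a bigram; fall back to equality.
        return 1.0 if a.strip().lower() == b.strip().lower() else 0.0
    overlap = sum((x & y).values())  # bigrams shared by both strings
    return 2.0 * overlap / total


def dice_features(input_record: dict, master_record: dict) -> dict:
    """Compare name with name and address with address."""
    return {
        "name_dice": dice_coefficient(input_record["name"], master_record["name"]),
        "address_dice": dice_coefficient(input_record["address"], master_record["address"]),
    }
```

For example, "night" and "nacht" share one of their eight total bigrams ("ht"), giving a coefficient of 2 × 1 / 8 = 0.25.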

Regarding claim 11, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee, teaches all the limitations of claim 9. Cassidy further teaches:
wherein generating the second set of features for each input record comprises comparing string components in the input record and in the most similar master record (see Cassidy, Paragraphs [0041]-[0044], “the similarity vectors are generated using one or more string matching algorithms. One non limiting example of string matching algorithm is Jaro-Winkler string matching algorithm. Each vector corresponds to a pairwise comparison of two distinct records. Each vector has as many components as there are fields in the cleansed data set... The machine learning algorithm module 212 analyses the labeled vector and classifies the remaining non labeled vector. The set of labeled vectors is used as a training and test set for a machine learning model for classification of unlabeled vector. An example of machine learning algorithm is an adaptive boosting algorithm. In an embodiment, the machine learning algorithm module 212 classifies the remaining vectors and returns a confidence level with each label. The labeled vectors are then checked by users using the human assisted checking module 214.”),

However, the combination of Cassidy, Lang, DeVries, Thomas, and McKee does not explicitly teach:
wherein the second set of features includes dice coefficients associated with comparing a name string component and an address string component of the each input record with a name string component and an address string component of the most similar master record.

Tereshkov teaches:
wherein the second set of features includes dice coefficients associated with comparing a name string component and an address string component of the each input record with a name string component and an address string component of the most similar master record (see Tereshkov, Paragraph [0068], “As discussed above, some of the distance algorithms used for atomic distance assessment between two different fields of the two compared records include Dice Coefficient”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows) in view of Thomas (teaching entity resolution incorporating data from various data sources) in view of McKee (teaching information processing system and method), further in view of Tereshkov (teaching a method and system for linking heterogeneous data sources), and arrived at a method that incorporates dice coefficients. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of assessing distance (see Tereshkov, Paragraph [0068]). In addition, the references (Cassidy, Lang, DeVries, Thomas, McKee, and Tereshkov) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

Regarding claim 16, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee, teaches all the limitations of claim 15. However, the combination of Cassidy, Lang, DeVries, Thomas, and McKee does not explicitly teach:
wherein the set of features includes dice coefficients associated with comparing a name string component and an address string component of the each input record with a name string component and an address string component of the most similar master record.

Tereshkov teaches:
wherein the set of features includes dice coefficients associated with comparing a name string component and an address string component of the each input record with a name string component and an address string component of the most similar master record (see Tereshkov, Paragraph [0068], “As discussed above, some of the distance algorithms used for atomic distance assessment between two different fields of the two compared records include Dice Coefficient”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows) in view of Thomas (teaching entity resolution incorporating data from various data sources) in view of McKee (teaching information processing system and method), further in view of Tereshkov (teaching a method and system for linking heterogeneous data sources), and arrived at a machine that incorporates dice coefficients. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of assessing distance (see Tereshkov, Paragraph [0068]). In addition, the references (Cassidy, Lang, DeVries, Thomas, McKee, and Tereshkov) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

Claims 7 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Cassidy in view of Lang in view of DeVries in view of Thomas in view of McKee, further in view of Dirac et al. (US 2015/0379430 A1).
Regarding claim 7, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee, teaches all the limitations of claim 1. However, the combination of Cassidy, Lang, DeVries, Thomas, and McKee does not explicitly teach:
a gradient descent function is used to generate the set of weights for each input record based on the training set.

Dirac teaches:
a gradient descent function is used to generate the set of weights for each input record based on the training set (see Dirac, Paragraph [0263], “the previously-stored parameters or weights may be updated if needed in one or more learning iterations, e.g., using a stochastic gradient descent technique or some similar optimization approach.”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows) in view of Thomas (teaching entity resolution incorporating data from various data sources) in view of McKee (teaching information processing system and method), further in view of Dirac (teaching efficient duplicate detection for machine learning data sets), and arrived at a method that incorporates a gradient descent function. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of optimizing the training model (see Dirac, Paragraph [0263]). In addition, the references (Cassidy, Lang, DeVries, Thomas, McKee, and Dirac) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.
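For illustration only, generating a set of weights from a training set by stochastic gradient descent can be sketched as follows. This is a minimal sketch under stated assumptions, not Dirac's implementation: a logistic model is assumed, the first feature is a constant 1.0 acting as a bias term, and the function names, learning rate, epoch count, and training values are hypothetical.

```python
import math
import random


def predict(weights: list, features: list) -> float:
    """Logistic model: probability that a record pair is a match."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_weights(training_set: list, n_features: int,
                  lr: float = 0.5, epochs: int = 200, seed: int = 0) -> list:
    """Learn one weight per feature by stochastic gradient descent."""
    rng = random.Random(seed)
    weights = [0.0] * n_features
    examples = list(training_set)
    for _ in range(epochs):
        rng.shuffle(examples)  # visit the examples in a random order each epoch
        for features, label in examples:
            error = predict(weights, features) - label  # gradient of log loss w.r.t. z
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights


# Hypothetical training set: ([bias, name similarity, address similarity], label)
TRAINING_SET = [
    ([1.0, 0.90, 0.80], 1),
    ([1.0, 0.95, 0.90], 1),
    ([1.0, 0.10, 0.20], 0),
    ([1.0, 0.20, 0.10], 0),
]
learned = train_weights(TRAINING_SET, n_features=3)
```

Under these assumptions, the learned weights score pairs with high name and address similarity above 0.5 and pairs with low similarity below 0.5.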

Regarding claim 19, Cassidy in view of Lang in view of DeVries in view of Thomas, further in view of McKee, teaches all the limitations of claim 13. However, the combination of Cassidy, Lang, DeVries, Thomas, and McKee does not explicitly teach:
a gradient descent function is used to generate the set of weights for each input record based on the training set.

Dirac teaches:
a gradient descent function is used to generate the set of weights for each input record based on the training set (see Dirac, Paragraph [0263], “the previously-stored parameters or weights may be updated if needed in one or more learning iterations, e.g., using a stochastic gradient descent technique or some similar optimization approach.”).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined Cassidy (teaching a method and system for cleansing and de-duplicating data) in view of Lang (teaching collaborative data cleansing) in view of DeVries (teaching a method and system for information workflows) in view of Thomas (teaching entity resolution incorporating data from various data sources) in view of McKee (teaching information processing system and method), further in view of Dirac (teaching efficient duplicate detection for machine learning data sets), and arrived at a method that incorporates a gradient descent function. One of ordinary skill in the art would have been motivated to make such a combination for the purposes of optimizing the training model (see Dirac, Paragraph [0263]). In addition, the references (Cassidy, Lang, DeVries, Thomas, McKee, and Dirac) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as data cleansing. The close relation between the references highly suggests an expectation of success.

Response to Arguments
Applicant’s Arguments, filed March 15th, 2022, have been fully considered, but are not persuasive. 

Applicant argues on page 12 of Applicant's Remarks that the cited references do not teach “receiving transactional data from a plurality of payment systems, each payment system in the plurality of payment systems configured to process a set of data records representing user-initiated point of sale (POS) transactions for products or services of entities, the transactional data comprising a plurality of input records, each input record  in the plurality of input records comprising one or more string components used to represent an entity specified in a user-initiated POS transaction processed by a payment system in the plurality of payment systems.” The Examiner respectfully disagrees.

Cassidy discloses that the data is retrieved from multiple sources and stored as records, in which the “data sources may include, but are not limited to, employee database, sales database, contact center database, offline records, customer escalation records, company's social media followers records, customer query records and mailing lists records” (see Cassidy, Paragraph [0002]). Therefore, Cassidy teaches the amended limitation because a sales database can contain any sales-related data.
In addition, Thomas discloses in paragraph [0004] that “The data (e.g., entities or other business records) or other information can exist in disparate applications sourced for different business functions. Some of those functions can include, for instance, sales, marketing, customer service, e-commerce, among others.” Thomas therefore also teaches retrieving data for sales, marketing, customer service, and e-commerce. Because e-commerce involves the online transaction of goods and services, it is believed that the amended limitation is taught.

For the above reasons, it is believed that the rejections should be sustained.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUSAM TURKI SAMARA whose telephone number is (571)272-6803.  The examiner can normally be reached on Monday - Thursday, Alternate Fridays.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Apu Mofiz, can be reached at (571)-272-4080.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HUSAM TURKI SAMARA/
Examiner, Art Unit 2161

/APU M MOFIZ/
Supervisory Patent Examiner, Art Unit 2161