Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Vaver (US 8676799 B1, herein Vaver) in view of Pang et al (CN 110659961 A, herein Pang), in further view of Basu et al (“Semi-supervised Clustering by Seeding”, herein Basu).
Regarding claim 1, Vaver teaches a method (col. 1 lines 42-26 recite “one innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of storing data identifying a plurality of geographic entities; using a first clustering algorithm to cluster the plurality of entities into a first set of clusters”), comprising:
accessing a geographical location and one or more attributes of each entity of a plurality of entities (Vaver col. 7 lines 30-43 recite “As part of the clustering and evaluating, the cluster selection engine 104 uses geographic entity data 106. The geographic entity data 106 describes each geographic entity being clustered. The description for a given geographic entity includes the physical coordinates of boundaries associated with the geographic entity and optionally the physical coordinates of the center of the geographic entity. The data can optionally include other descriptive details for the geographic entity, for example, the population of the geographic entity, the volume of internet activity in the geographic entity (e.g., a search volume or a volume of visits to particular web sites), the number of businesses of a particular type that are located within the geographic entity, and any sub-entities associated with the geographic entity” (i.e. accessing geographical and attribute data for a plurality of entities)), wherein a [third] subset of the entities of the plurality of entities each have an unknown [offline] presence (col. 11 lines 35-40 recite “The process 500 determines a cluster measurement for the set of clusters resulting from the clustering algorithm from a quantification of an attribute of each of the definitively classified geographic entities and a quantification of a same attribute of each of the ambiguously classified geographic entities (508).” (i.e. each entity yet to be classified));
identifying, from the [third] subset of the entities and based on the accessing, one or more first entities having geographical locations within a predefined distance from the geographical location of any of the entities in the first subset of the entities, and one or more second entities having geographical locations within the predefined distance from the geographical location of any of the entities in the second subset of the entities (col. 11 lines 18-23 recite “The process 500 stores data identifying geographic entities (502), for example, as described above with reference to FIG. 15 by the experiment system itself, for example, Internet usage data indicating queries submitted by users or websites visited by users. This data is preferably anonymized to preserve user privacy. The process 500 uses a clustering algorithm to cluster the geographic entities into a set of clusters (504). For example, one of the clustering algorithms described above with reference to §2.0, or another clustering algorithm, can be used.” Col. 10 lines 40-52 recite “Each geographic entity shown in the plot 402 is either a definitively classified geographic entity or an ambiguously classified geographic entity. A definitively classified entity is an entity that is more than a threshold distance metric, e.g., twenty miles, from a geographic entity in any cluster other than the cluster to which it is assigned. The distance metric can be distance itself, or a value derived from the distance. The distance between two geographic entities can be measured, for example, from the center of the physical region corresponding to the geographic entities. For example, geographic entity 404 is a definitively classified entity because it is at least a threshold distance away from geographic entities in the other clusters” (i.e. clustering entities based on each entity being a threshold distance from the other entities in the cluster));
grouping the one or more identified first entities into the first subset of the entities and grouping the one or more identified second entities into the second subset of the entities; removing the one or more identified first entities and second entities from the third subset of the entities (Col. 19 lines 10-21 recite “In other implementations, the termination condition is satisfied when each geographic entity has been assigned to a cluster. In these implementations, process 1000 optionally stores data after each iteration that identifies the number of assigned geographic entities after the iteration and the cluster measurement determined for the iteration. This data can later be used by system 100 to determine whether some of the geographic entities should be omitted from an experiment, e.g., not considered when evaluating the results of an experiment. Removing some of the geographic entities from the experiment reduces geolocation uncertainty and results in more certain results from the experiment” (i.e. the process of clustering entities is repeated until all entities have been assigned or removed from the subset));
grouping the one or more identified third entities into the first subset of the entities and grouping the one or more identified fourth entities into the second subset of the entities; and removing the one or more identified third entities and fourth entities from the third subset of the entities (Col. 18 lines 8-15 recite “FIG. 10 is a flow diagram of an example process 1000 for incrementally adding geographic entities to clusters and evaluating the tradeoff between entity coverage and clustering accuracy. The clustering algorithm used by process 1000 is another example clustering algorithm. The process 1000 can be implemented, for example, by the system 100, described above with reference to FIG. 1.” Col. 18 lines 26-29 recite “The process 1000 then repeats the following steps until a termination condition is met. In some implementations, the process 1000 repeats the following steps for multiple sets of initial cluster centroids” (i.e. the method from Vaver can be repeated for multiple groups of entities)).
However, Vaver does not explicitly teach wherein a first subset of the entities of the plurality of entities each have an offline presence, respectively, wherein a second subset of the entities of the plurality of entities each have no offline presence.
Pang teaches wherein a first subset of the entities of the plurality of entities each have an offline presence, respectively, wherein a second subset of the entities of the plurality of entities each have no offline presence (page 1 para. 3 recites “[A] merchant may contract transaction platform, the transaction platform to manage the transaction of the merchant, such as electronic payment and settlement. merchant may include offline merchants and online merchant. [An] offline merchant refers to a merchant's offline entity operation point, and the online merchant is not offline entity operating point”. Page 3 para. 4 recites “the method further comprises: a trusted transaction if the transaction cluster is equal to or higher than the first threshold, it is determined that the transaction cluster represents a trusted point-of-transaction of the merchant, and if trusted transaction the transaction cluster is lower than the first threshold value, (i.e. the merchant has an offline presence) it is determined that the transaction cluster represents a potential risk, point-of transaction (i.e. the merchant has no offline presence)”. Page 6 para. 2 recites “FIG. 1 is an offline merchant for identifying the system architecture diagram according to one embodiment of the present disclosure. various merchants (e.g., including offline merchants and online merchant) may contract one or more transaction platform, the electronic payment transaction platform of the terminal device, or application can be installed transaction platform on a computer or terminal device” (i.e. the merchants may or may not have an offline presence)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by using the online/offline labels from Pang to identify geographical entities from Vaver. Vaver and Pang are both directed to evaluating location attributes of an entity, but Vaver does not teach whether its entities correspond to online or offline entities. One of ordinary skill would be able to apply the known technique of clustering entities based on geographic location to the known system of online/offline entity data from Pang to cluster online and offline entities.
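For illustration only, the distance-based identification and grouping steps recited in claim 1 can be sketched as follows. This is a minimal sketch and is not taken from Vaver, Pang, or the application: the function name `propagate_by_distance`, the planar Euclidean distance, the label strings, and the choice to leave an entity unknown when it lies near seeds of both subsets are all assumptions made for the example.

```python
import math


def propagate_by_distance(labeled, unlabeled, max_distance):
    """Label unknown entities that lie within max_distance of a labeled entity.

    labeled:   dict name -> ((x, y), label), label "offline" or "no_offline"
    unlabeled: dict name -> (x, y)
    Returns (newly_labeled, still_unlabeled).
    """
    newly_labeled = {}
    still_unlabeled = {}
    for name, point in unlabeled.items():
        # Collect the labels of every labeled entity within the threshold.
        nearby = {
            label
            for seed_point, label in labeled.values()
            if math.dist(point, seed_point) <= max_distance
        }
        if len(nearby) == 1:
            newly_labeled[name] = (point, nearby.pop())
        else:
            # Assumption: no neighbor, or neighbors from both subsets,
            # leaves the entity in the unknown (third) subset.
            still_unlabeled[name] = point
    return newly_labeled, still_unlabeled


labeled = {"a": ((0, 0), "offline"), "b": ((10, 0), "no_offline")}
unlabeled = {"c": (1, 0), "d": (9, 0), "e": (5, 0), "f": (50, 50)}
new, rest = propagate_by_distance(labeled, unlabeled, max_distance=2)
```

In this toy input, entities "c" and "d" are moved into the first and second subsets respectively, while "e" and "f" remain in the third subset.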
However, the combination of Vaver and Pang does not teach training a machine learning model based on the one or more attributes; determining, via the machine learning model, a first probability and a second probability for each remaining entity in the third subset of the entities, the first probability corresponding to a probability of having the offline presence, the second probability corresponding to a probability of having no offline presence; identifying, from the third subset of the entities and based on the determined first probability and second probability, one or more third entities each having the determined first probability exceeding a first predefined threshold and one or more fourth entities each having the determined second probability exceeding a second predefined threshold.
Basu teaches training a machine learning model based on the one or more attributes (section I para. 1 recites “In semi-supervised clustering, some labeled data is used along with the unlabeled data to obtain a better clustering. This paper explores the use of labeled data to generate seed clusters that initialize a clustering algorithm, as well as the use of constraints generated from the labeled data to guide the clustering process” (i.e. training a model using the attributes from Vaver));
determining, via the machine learning model, a first probability and a second probability for each remaining entity in the third subset of the entities, the first probability corresponding to a probability [of having the offline presence], the second probability corresponding to a probability [of having no offline presence] (section 3.1 para. 1 recites “Given a dataset X , as previously mentioned, KMeans clustering of the dataset generates a K-partitioning {Xl}Ki=1 of X so that the KMeans objective is locally minimized. Let S ⊂ X , called the seed set, be the subset of data-points on which supervision is provided as follows: for each xi ϵ S, the user provides the cluster Xl of the partition to which it belongs”. Section 3.2 para. 1 recites “In Seeded-KMeans, the seed clustering is used to initialize the KMeans algorithm. Thus, rather than initializing KMeans from K random means, the mean of the i-th cluster is initialized with the mean of the l-th partition Sl of the seed set. The seed clustering is only used for initialization, and the seeds are not used in the following steps of the algorithm” (i.e. the labelled entities from the first and second subsets are not evaluated again). Section 3.3 para. 1 recites “The EM (expectation maximization) algorithm is a very general method of finding the maximum-likelihood estimate of the parameters of an underlying distribution, or, more generally, a probabilistic data generation process, from a set of observed data that has incomplete or missing values”);
identifying, from the third subset of the entities and based on the determined first probability and second probability, one or more third entities each having the determined first probability exceeding a first predefined threshold and one or more fourth entities each having the determined second probability exceeding a second predefined threshold (step 2a from Algorithm 1 and section 3.3 para. 2 recites “The KMeans clustering algorithm is essentially an EM (expectation maximization) algorithm on a mixture of K Gaussians under certain assumptions. The data-generation process in KMeans is assumed to be as follows – first, one Gaussian is chosen out of the K following their prior probability distribution; then, a data-point is sampled following the distribution of the chosen Gaussian. Let X = {x1, . . .,  xN} be the set of data-points we want to cluster with each xi ϵ Rd. The missing data Z is the cluster assignment of the data-points” (i.e. the k-means algorithm assigns each data point to a given cluster when the probability that that data point belongs in the given cluster is high enough)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by using the k-means clustering method from Basu to cluster entities based on attribute data other than geographic location with the clustering method based on geographic location from Vaver (as modified by Pang). Vaver and Basu are both directed to methods of clustering data, so one of ordinary skill in the art would benefit from using another method to ensure that any entities that could not be clustered by the method from Vaver (see Vaver column 19 lines 1-25) may be clustered by the method from Basu, which can use any kind of attribute data.
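For illustration only, the probability-and-threshold step discussed above can be sketched as a soft cluster assignment. This is a sketch under stated assumptions, not the method of Basu or of the application: it assumes two spherical Gaussian components with equal priors and a shared standard deviation `sigma`, and the names `membership_probabilities` and `select_confident` and the threshold values are hypothetical.

```python
import math


def membership_probabilities(point, centroids, sigma=1.0):
    """Posterior probability of each cluster for one point.

    Assumes equal priors and a shared spherical variance sigma**2.
    """
    sq_dist = {label: math.dist(point, c) ** 2 for label, c in centroids.items()}
    # Subtract the smallest squared distance for numerical stability.
    smallest = min(sq_dist.values())
    weights = {
        label: math.exp(-(d2 - smallest) / (2 * sigma ** 2))
        for label, d2 in sq_dist.items()
    }
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def select_confident(unlabeled, centroids, thr_offline, thr_no_offline):
    """Split unknown entities by whether either probability exceeds its threshold."""
    third, fourth = [], []
    for name, point in unlabeled.items():
        probs = membership_probabilities(point, centroids)
        if probs["offline"] > thr_offline:
            third.append(name)
        elif probs["no_offline"] > thr_no_offline:
            fourth.append(name)
    return third, fourth


centroids = {"offline": (0, 0), "no_offline": (4, 0)}
unknown = {"p": (0, 0), "q": (4, 0), "r": (2, 0)}
third, fourth = select_confident(unknown, centroids, 0.9, 0.9)
```

Entity "r" is equidistant from both centroids, so both of its probabilities are 0.5 and it stays in the unknown subset.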
Regarding claim 2, the combination of Vaver, Pang, and Basu teaches the method of claim 1, further comprising, before the identifying the one or more first entities: determining, via digital media of the first subset of the entities, that the first subset of entities each have the offline presence; or determining, via digital media of the second subset of the entities, that the second subset of entities each have no offline presence (Pang page 2 para. 2 recites “A method for identifying the present offline merchant is using service white list”. Vaver col. 7 lines 30-32 recite “the cluster selection engine 104 uses geographic entity data 106. The geographic entity data 106 describes each geographic entity being clustered”. Vaver col. 12 lines 11-12 recite “The data for each geographic entity can be received, for example, from various commercial sources” (i.e. the offline presence of an entity can be determined using public digital information)).
Regarding claim 3, the combination of Vaver, Pang, and Basu teaches the method of claim 1, further comprising, before the identifying the one or more first entities: determining, via one or more humans, that the first subset of entities each have the offline presence or that the second subset of entities each have no offline presence (Basu section 4.4 para. 2 recites “In a real-life application, since the semi-supervision will be provided by a human user, there is a chance that the supervision may be erroneous in some cases”. Pang page 6 para. 2 recites “FIG. 1 is an offline merchant for identifying the system architecture diagram according to one embodiment of the present disclosure. various merchants (e.g., including offline merchants and online merchant) may contract one or more transaction platform, the electronic payment transaction platform of the terminal device, or application can be installed transaction platform on a computer or terminal device” (i.e. using humans to provide initial entity labelling on whether an entity does or does not have an offline presence)).
Regarding claim 4, the combination of Vaver, Pang, and Basu teaches the method of claim 1, wherein the accessing comprises retrieving data pertaining to the geographical location and the one or more attributes from an electronic database (Vaver col. 7 lines 30-32 recite “the cluster selection engine 104 uses geographic entity data 106. The geographic entity data 106 describes each geographic entity being clustered”. Vaver col. 7 lines 36-43 recite “The data can optionally include other descriptive details for the geographic entity, for example, the population of the geographic entity, the volume of internet activity in the geographic entity (e.g., a search volume or a volume of visits to particular web sites), the number of businesses of a particular type that are located within the geographic entity, and any sub-entities associated with the geographic entity”. Col. 12 lines 11-12 recite “The data for each geographic entity can be received, for example, from various commercial sources” (i.e. retrieving geographical location data)).
Regarding claim 5, the combination of Vaver, Pang, and Basu teaches the method of claim 1, wherein the plurality of entities comprises a plurality of merchants that each have at least an online presence (Pang page 1 para. 3 recites “[A] merchant may contract transaction platform, the transaction platform to manage the transaction of the merchant, such as electronic payment and settlement. merchant may include offline merchants and online merchant. [An] offline merchant refers to a merchant's offline entity operation point, and the online merchant is not offline entity operating point”. Pang page 6 para. 2 recites “FIG. 1 is an offline merchant for identifying the system architecture diagram according to one embodiment of the present disclosure. various merchants (e.g., including offline merchants and online merchant) may contract one or more transaction platform, the electronic payment transaction platform of the terminal device, or application can be installed transaction platform on a computer or terminal device” (i.e. the merchants each have an online presence)).
Regarding claim 6, the combination of Vaver, Pang, and Basu teaches the method of claim 5, wherein the offline presence is a physical location of a respective merchant of the plurality of merchants at which at least some transactions are conducted in person with customers of the respective merchant (Pang page 1 para. 3 recites “[A] merchant may contract transaction platform, the transaction platform to manage the transaction of the merchant, such as electronic payment and settlement. merchant may include offline merchants and online merchant. [An] offline merchant refers to a merchant's offline entity operation point, and the online merchant is not offline entity operating point”. Page 2 para. 5 recites “If the transaction cluster comprises more than a predetermined number of transaction, then it can confirm that the merchant is offline merchant and the transaction cluster may represent a line, point-of-transaction of the merchant” (i.e. merchants may have an offline location where transactions are conducted in person)).
Regarding claim 7, the combination of Vaver, Pang, and Basu teaches the method of claim 1, wherein the training is based on the one or more attributes of the first subset of the entities and the second subset of the entities but not based on the one or more attributes of the third subset of the entities (Basu section 3.2 para. 1 recites “In Seeded-KMeans, the seed clustering is used to initialize the KMeans algorithm. Thus, rather than initializing KMeans from K random means, the mean of the i-th cluster is initialized with the mean of the l-th partition Sl of the seed set. The seed clustering is only used for initialization, and the seeds are not used in the following steps of the algorithm” (i.e. the method is trained/initialized using labelled entities from the supervised/seeded set, but not unlabeled entities yet to be clustered)).
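For illustration only, the seeded initialization described in the quoted passage can be sketched as follows. This is a minimal sketch, assuming Euclidean distance and tuple-valued points; the function names `mean` and `seeded_kmeans` and the toy data are hypothetical and are not taken from Basu.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def seeded_kmeans(points, seeds, max_iter=100):
    """K-means whose initial centroids are the means of labeled seed sets.

    points: list of tuples to cluster
    seeds:  dict label -> non-empty list of labeled tuples
    The seed labels are used only for initialization.
    """
    centroids = {label: mean(pts) for label, pts in seeds.items()}
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_assignment = [
            min(centroids, key=lambda lab: math.dist(p, centroids[lab]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Recompute each centroid from its current members.
        for label in centroids:
            members = [p for p, a in zip(points, assignment) if a == label]
            if members:
                centroids[label] = mean(members)
    return centroids, assignment


points = [(0, 0), (1, 0), (9, 0), (10, 0)]
seeds = {"offline": [(0, 0)], "no_offline": [(10, 0)]}
centroids, assignment = seeded_kmeans(points, seeds)
```

Here the initial centroids come only from the two labeled seeds; the remaining points influence the centroids only in the later update steps.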
Regarding claim 8, the combination of Vaver, Pang, and Basu teaches the method of claim 1, further comprising: repeating the identifying the one or more first entities and the one or more second entities, the grouping the one or more identified first entities and the one or more identified second entities, the removing the one or more identified first entities and second entities, the training, the determining, the identifying the one or more third entities and the one or more fourth entities, the grouping the one or more identified third entities and fourth entities, and the removing the one or more identified third entities and fourth entities one or more times (Vaver col. 8 lines 31-41 recite “The algorithm assigns each of the geographic entities to a respective cluster whose centroid is the closest to the geographic entity according to a distance metric. The algorithm then updates the centroids for each cluster and re-assigns the geographic entities to the cluster having an updated centroid that is the closest to the geographic entity, according to the distance metric. The k-means clustering algorithm repeats the updating of the centroids and the reassignment of the geographic entities until no geographic entities are reassigned, or until a convergence criterion or predefined iteration limit is met”. Col. 18 lines 8-15 recite “FIG. 10 is a flow diagram of an example process 1000 for incrementally adding geographic entities to clusters and evaluating the tradeoff between entity coverage and clustering accuracy. The clustering algorithm used by process 1000 is another example clustering algorithm. The process 1000 can be implemented, for example, by the system 100, described above with reference to FIG. 1.” Col. 18 lines 26-29 recite “The process 1000 then repeats the following steps until a termination condition is met. In some implementations, the process 1000 repeats the following steps for multiple sets of initial cluster centroids” (i.e. the steps of the method from Vaver are iterative and therefore repeat until the convergence criterion is met)).
Regarding claim 9, the combination of Vaver, Pang, and Basu teaches the method of claim 8, wherein the repeating is performed until: every entity in the third subset of the entities has been grouped into the first subset of the entities or into the second subset of the entities; or no first, second, third, or fourth entities can be identified from the third subset of the entities (Vaver col. 19 lines 1-2 recite “If the termination condition is not satisfied (1008), the process 1000 returns to step 1004.” Col. 19 lines 10-21 recite “In other implementations, the termination condition is satisfied when each geographic entity has been assigned to a cluster. In these implementations, process 1000 optionally stores data after each iteration that identifies the number of assigned geographic entities after the iteration and the cluster measurement determined for the iteration. This data can later be used by system 100 to determine whether some of the geographic entities should be omitted from an experiment, e.g., not considered when evaluating the results of an experiment. Removing some of the geographic entities from the experiment reduces geolocation uncertainty and results in more certain results from the experiment” (i.e. the process of clustering entities is repeated until all entities have been assigned or removed from the subset)).
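For illustration only, the two termination conditions recited in claim 9 can be sketched as a loop that stops when the unknown subset is empty or when a full pass labels nothing new. This is a sketch under stated assumptions: the name `run_until_done`, the representation of each step as a callable returning newly found labels, and the toy phases are hypothetical and do not appear in the cited references.

```python
def run_until_done(unknown, phases):
    """Repeat labeling phases until nothing is unknown or no phase makes progress.

    unknown: iterable of entity ids with unknown status
    phases:  list of callables; each takes the current unknown ids and
             returns a dict id -> label for the entities it can label
    Returns (labels, remaining_unknown).
    """
    unknown = set(unknown)  # work on a copy
    labels = {}
    while unknown:  # condition 1: every entity has been grouped
        progress = False
        for phase in phases:
            found = phase(frozenset(unknown))
            # Ignore anything a phase reports that is no longer unknown.
            found = {k: v for k, v in found.items() if k in unknown}
            if found:
                labels.update(found)
                unknown.difference_update(found)
                progress = True
        if not progress:  # condition 2: no further entities can be identified
            break
    return labels, unknown


def phase_a(u):
    # Toy stand-in for the distance-based step.
    return {"a": "offline"} if "a" in u else {}


def phase_b(u):
    # Toy stand-in for the model-based step; succeeds only after "a" is labeled.
    return {"b": "no_offline"} if "a" not in u and "b" in u else {}


labels, rest = run_until_done({"a", "b", "c"}, [phase_a, phase_b])
```

In the toy run, "a" and "b" are labeled in the first pass, the second pass finds nothing, and "c" is left in the unknown subset.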
Regarding claim 10, the combination of Vaver, Pang, and Basu teaches the method of claim 1, wherein the first predefined threshold is equal to the second predefined threshold (Basu step 2a from Algorithm 1 and section 3.3 para. 2 recites “The KMeans clustering algorithm is essentially an EM (expectation maximization) algorithm on a mixture of K Gaussians under certain assumptions. The data-generation process in KMeans is assumed to be as follows – first, one Gaussian is chosen out of the K following their prior probability distribution; then, a data-point is sampled following the distribution of the chosen Gaussian. Let X = {x1, . . .,  xN} be the set of data-points we want to cluster with each xi ϵ Rd. The missing data Z is the cluster assignment of the data-points” (i.e. the k-means algorithm assigns each data point to a given cluster when the probability that that data point belongs in the given cluster is high enough – unless specified otherwise each cluster has the same threshold)).
Regarding claim 11, the combination of Vaver, Pang, and Basu teaches the method of claim 1, wherein the one or more attributes comprise economic traits of each of the plurality of entities (Vaver col. 7 lines 36-43 recite “The data can optionally include other descriptive details for the geographic entity, for example, the population of the geographic entity, the volume of internet activity in the geographic entity (e.g., a search volume or a volume of visits to particular web sites), the number of businesses of a particular type that are located within the geographic entity, and any sub-entities associated with the geographic entity” (i.e. the data includes economic traits of the plurality of entities)).
Regarding claim 12, the combination of Vaver, Pang, and Basu teaches the method of claim 1, wherein at least the training the machine learning model is performed by one or more hardware processors (Vaver col. 20 lines 12-15 recite “The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output”).
Claim 13 is a system claim whose limitations are included in claim 1, except for the limitations regarding the plurality of merchants, which are included in claim 5. The only difference is that claim 13 requires a system (Vaver col. 20 lines 12-15 recite “The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output”). Therefore, claim 13 is rejected for the same reasons as claim 5 (which includes the limitations of claim 1 on which it depends).
Regarding claim 14, the combination of Vaver, Pang, and Basu teaches the system of claim 13, wherein the performing the first phase of the machine learning process comprises labeling the one or more first merchants as having the offline location and labeling the one or more second merchants as having no offline location (Pang page 3 para. 4 recites “the method further comprises: a trusted transaction if the transaction cluster is equal to or higher than the first threshold, it is determined that the transaction cluster represents a trusted point-of-transaction of the merchant, and if trusted transaction the transaction cluster is lower than the first threshold value, (i.e. the merchant has an offline presence) it is determined that the transaction cluster represents a potential risk, point-of transaction (i.e. the merchant has no offline presence)”. Page 6 para. 2 recites “FIG. 1 is an offline merchant for identifying the system architecture diagram according to one embodiment of the present disclosure. various merchants (e.g., including offline merchants and online merchant) may contract one or more transaction platform, the electronic payment transaction platform of the terminal device, or application can be installed transaction platform on a computer or terminal device” (i.e. labelling a merchant as having or not having an offline presence)).
Regarding claim 15, the combination of Vaver, Pang, and Basu teaches the system of claim 13, wherein the operations further comprise: performing a fourth phase of the machine learning process by repeating the second phase and the third phase one or more times (Vaver col. 18 lines 8-15 recite “FIG. 10 is a flow diagram of an example process 1000 for incrementally adding geographic entities to clusters and evaluating the tradeoff between entity coverage and clustering accuracy. The clustering algorithm used by process 1000 is another example clustering algorithm. The process 1000 can be implemented, for example, by the system 100, described above with reference to FIG. 1.” Col. 18 lines 26-29 recite “The process 1000 then repeats the following steps until a termination condition is met. In some implementations, the process 1000 repeats the following steps for multiple sets of initial cluster centroids” (i.e. the steps of the method from Vaver are iterative and therefore repeat until the convergence criterion is met)).
Regarding claim 16, the combination of Vaver, Pang, and Basu teaches the system of claim 15, wherein the repeating the second phase comprises: labeling one or more seventh merchants of the plurality of merchants as having the offline location in response to the one or more seventh merchants being located within the first predefined geographical distance from at least one of the one or more first, third, or fifth merchants; or labeling one or more eighth merchants of the plurality of merchants as having no offline location in response to the one or more eighth merchants being located within the second predefined geographical distance from at least one of the one or more second, fourth, or sixth merchants (Vaver col. 8 lines 31-41 recite “The algorithm assigns each of the geographic entities to a respective cluster whose centroid is the closest to the geographic entity according to a distance metric. The algorithm then updates the centroids for each cluster and re-assigns the geographic entities to the cluster having an updated centroid that is the closest to the geographic entity, according to the distance metric. The k-means clustering algorithm repeats the updating of the centroids and the reassignment of the geographic entities until no geographic entities are reassigned, or until a convergence criterion or predefined iteration limit is met”. Col. 18 lines 8-15 recite “FIG. 10 is a flow diagram of an example process 1000 for incrementally adding geographic entities to clusters and evaluating the tradeoff between entity coverage and clustering accuracy. The clustering algorithm used by process 1000 is another example clustering algorithm. The process 1000 can be implemented, for example, by the system 100, described above with reference to FIG. 1.” Col. 18 lines 26-29 recite “The process 1000 then repeats the following steps until a termination condition is met. In some implementations, the process 1000 repeats the following steps for multiple sets of initial cluster centroids” (i.e. the steps of the method from Vaver are iterative and therefore repeat until the convergence criterion is met)).
Regarding claim 17, the combination of Vaver, Pang, and Basu teaches the system of claim 15, wherein the repeating the third phase comprises: labeling one or more seventh merchants of the plurality of merchants as having the offline location in response to the predicted first probability of the one or more seventh merchants exceeding the first predefined confidence threshold; or labeling one or more eighth merchants of the plurality of merchants as having no offline location in response to the predicted second probability of the one or more eighth merchants exceeding the second predefined confidence threshold (Vaver col. 18 lines 8-15 recite “FIG. 10 is a flow diagram of an example process 1000 for incrementally adding geographic entities to clusters and evaluating the tradeoff between entity coverage and clustering accuracy. The clustering algorithm used by process 1000 is another example clustering algorithm. The process 1000 can be implemented, for example, by the system 100, described above with reference to FIG. 1.” Col. 18 lines 26-29 recite “The process 1000 then repeats the following steps until a termination condition is met. In some implementations, the process 1000 repeats the following steps for multiple sets of initial cluster centroids”. Basu step 2a from Algorithm 1 and section 3.3 para. 2 recites “The KMeans clustering algorithm is essentially an EM (expectation maximization) algorithm on a mixture of K Gaussians under certain assumptions. The data-generation process in KMeans is assumed to be as follows – first, one Gaussian is chosen out of the K following their prior probability distribution; then, a data-point is sampled following the distribution of the chosen Gaussian. Let X = {x1, . . .,  xN} be the set of data-points we want to cluster with each xi ϵ Rd. The missing data Z is the cluster assignment of the data-points” (i.e. the k-means algorithm assigns each data point to a given cluster when the probability that that data point belongs in the given cluster is high enough)).
Regarding claim 18, the combination of Vaver, Pang, and Basu teaches the system of claim 15, wherein the fourth phase is performed until: every merchant of the plurality of merchants has been labeled as having the offline location or having no offline location; or no merchant can be labeled as one of the third or fourth merchants in the second phase of the machine learning process and no merchant can be labeled as one of the fifth or sixth merchants in the third phase of the machine learning process (Vaver col. 19 lines 1-2 recite “If the termination condition is not satisfied (1008), the process 1000 returns to step 1004.” Col. 19 lines 10-21 recite “In other implementations, the termination condition is satisfied when each geographic entity has been assigned to a cluster. In these implementations, process 1000 optionally stores data after each iteration that identifies the number of assigned geographic entities after the iteration and the cluster measurement determined for the iteration. This data can later be used by system 100 to determine whether some of the geographic entities should be omitted from an experiment, e.g., not considered when evaluating the results of an experiment. Removing some of the geographic entities from the experiment reduces geolocation uncertainty and results in more certain results from the experiment” (i.e. the process of clustering entities is repeated until all entities have been assigned or removed from the subset)).
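For illustration only, the threshold-based labeling of claim 17 and the two stopping conditions of claim 18 can be sketched as follows. This is a minimal sketch and is not taken from Vaver, Pang, Basu, or the application. The names `iterate_labels`, `nearest_label_predict`, and `POS`, the threshold values, and the toy distance-based confidence score are all hypothetical assumptions standing in for whatever trained model would supply the predicted probabilities.

```python
from typing import Callable, Dict, Optional, Tuple

Labels = Dict[str, Optional[bool]]  # True = offline location, False = none, None = unlabeled

def iterate_labels(
    labels: Labels,
    predict: Callable[[Labels, str], Tuple[float, float]],
    t_offline: float = 0.9,   # hypothetical first confidence threshold
    t_none: float = 0.9,      # hypothetical second confidence threshold
    max_rounds: int = 100,
) -> Labels:
    """Repeat threshold-based labeling until a stopping condition is met."""
    labels = dict(labels)
    for _ in range(max_rounds):
        unlabeled = [m for m, v in labels.items() if v is None]
        if not unlabeled:
            break  # stopping condition 1: every merchant has been labeled
        newly: Labels = {}
        for m in unlabeled:
            p_offline, p_none = predict(labels, m)
            if p_offline > t_offline:
                newly[m] = True
            elif p_none > t_none:
                newly[m] = False
        if not newly:
            break  # stopping condition 2: no further merchant can be labeled
        labels.update(newly)
    return labels

# Toy data (assumed): merchants placed on a line, two of them already labeled.
POS = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 10.0, "e": 50.0}

def nearest_label_predict(labels: Labels, merchant: str) -> Tuple[float, float]:
    """Toy stand-in for a model: confidence decays with distance to the nearest labeled merchant."""
    known = [(abs(POS[merchant] - POS[m]), v) for m, v in labels.items() if v is not None]
    distance, value = min(known)
    confidence = 1.0 / (1.0 + distance)
    return (confidence, 0.0) if value else (0.0, confidence)

start: Labels = {"a": True, "b": None, "c": None, "d": False, "e": None}
result = iterate_labels(start, nearest_label_predict, t_offline=0.45, t_none=0.45)
```

In this toy run, "b" is labeled in the first round and "c" in the second, while "e" never reaches either threshold, so the loop ends on the second stopping condition with "e" still unlabeled.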
Claim 19 is a non-transitory machine-readable medium claim and its limitation is included in claim 13. The only difference is that claim 19 requires a non-transitory machine-readable medium (Vaver col. 20 lines 38-44 recite “Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.”). Therefore, claim 19 is rejected for the same reasons as claim 13.
Regarding claim 20, the combination of Vaver, Pang, and Basu teaches the non-transitory machine-readable medium of claim 19, wherein the operations further comprise: 
repeating the identifying the one or more first merchants and the one or more second merchants, the labeling the one or more identified first merchants, the labeling the one or more identified second merchants, the training the machine learning model, the predicting, the identifying the one or more third merchants and the one or more fourth merchants, the labeling the one or more identified third merchants, and the labeling the one or more identified fourth merchants one or more times until: every merchant of the plurality of the merchants has been labeled as belonging to the first subset or to the second subset; or no merchant of the plurality of the merchants can be labeled as belonging to the first subset or to the second subset (Vaver col. 8 lines 31-41 recite “The algorithm assigns each of the geographic entities to a respective cluster whose centroid is the closest to the geographic entity according to a distance metric. The algorithm then updates the centroids for each cluster and re-assigns the geographic entities to the cluster having an updated centroid that is the closest to the geographic entity, according to the distance metric. The k-means clustering algorithm repeats the updating of the centroids and the reassignment of the geographic entities until no geographic entities are reassigned, or until a convergence criterion or predefined iteration limit is met”. Col. 18 lines 8-15 recite “FIG. 10 is a flow diagram of an example process 1000 for incrementally adding geographic entities to clusters and evaluating the tradeoff between entity coverage and clustering accuracy. The clustering algorithm used by process 1000 is another example clustering algorithm. The process 1000 can be implemented, for example, by the system 100, described above with reference to FIG. 1.” Col. 18 lines 26-29 recite “The process 1000 then repeats the following steps until a termination condition is met. In some implementations, the process 1000 repeats the following steps for multiple sets of initial cluster centroids” (i.e. the steps of the method from Vaver are iterative and therefore repeat until the convergence criterion is met). Vaver col. 19 lines 1-2 recite “If the termination condition is not satisfied (1008), the process 1000 returns to step 1004.” Col. 19 lines 10-21 recite “In other implementations, the termination condition is satisfied when each geographic entity has been assigned to a cluster. In these implementations, process 1000 optionally stores data after each iteration that identifies the number of assigned geographic entities after the iteration and the cluster measurement determined for the iteration. This data can later be used by system 100 to determine whether some of the geographic entities should be omitted from an experiment, e.g., not considered when evaluating the results of an experiment. Removing some of the geographic entities from the experiment reduces geolocation uncertainty and results in more certain results from the experiment” (i.e. the process of clustering entities is repeated until all entities have been assigned or removed from the subset)).
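For illustration only, the k-means iteration quoted from Vaver col. 8 lines 31-41 (assign each entity to the nearest centroid, update the centroids, and repeat until no entity is reassigned or an iteration limit is met) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Vaver: the function name `kmeans`, the use of Euclidean distance as the distance metric, the default iteration limit, and the sample coordinates are all hypothetical.

```python
from math import dist
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]

def kmeans(
    points: Sequence[Point],
    initial_centroids: Sequence[Point],
    max_iter: int = 100,  # assumed predefined iteration limit
) -> Tuple[List[Optional[int]], List[Point]]:
    """Assign points to nearest centroids and update centroids until nothing is reassigned."""
    centroids = list(initial_centroids)
    assignment: List[Optional[int]] = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to the closest centroid (Euclidean distance assumed).
        new_assignment = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no entity was reassigned, so the loop terminates
        assignment = new_assignment
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids

# Hypothetical entity coordinates forming two well-separated groups.
entities = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
assignment, centroids = kmeans(entities, [(0.0, 0.0), (10.0, 10.0)])
```

With these sample points the first pass assigns the entities and updates the centroids, and the second pass finds no reassignment and stops.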

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20210097583 A1 (Ramachandran et al) teaches identifying a physical location associated with content interaction data between a user and an online content provider and determining a set of offline interaction conversion data based on the set of content interaction data and a set of offline interaction data.
US 20190340520 A1 (Oyamada et al) teaches a clustering method based on predicting the relationships between types of ID data.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEAH M FEITL whose telephone number is (571)272-8350. The examiner can normally be reached on M-F 0800-1700.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen, can be reached on (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
	/L.M.F./	Examiner, Art Unit 2121


	/Li B. Zhen/	Supervisory Patent Examiner, Art Unit 2121