Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

1.	The pending claims 1-22 are presented for examination.

Claim Rejections - 35 USC § 101
2.	35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


	Claims 1-2, 4, 6-14, 16, and 18-22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis below of the claims’ subject matter eligibility follows the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50-57 (January 7, 2019) (“2019 PEG”).
Regarding claim 1, the claim recites: A method for data entries deduplication, comprising:
	indexing an input data set, wherein the input data set is in a tabular format and the indexing includes providing a unique Row identifier (RowID), wherein rows are the data entries;
	computing attribute similarity for each column across each pair of rows;
	computing, for each pair of rows, row-to-row similarity as a weighted sum of attribute similarities;
	clustering pairs of rows based on their row-to-row similarities; and
	providing an output data set including at least the clustered pairs of rows.
Step 1 Analysis: Claim 1 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to an abstract idea. In particular, the claim recites mental processes that are concepts performed in the human mind (including an observation, evaluation, judgment, opinion).
The above-noted limitations of indexing, computing, computing, clustering, and providing, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. That is, nothing in the claim precludes these steps from practically being performed in the mind. For example, indexing input data to provide a unique row identifier, computing similarity between rows, clustering pairs of rows based on the similarity, and providing the clustered pairs of rows, in the context of this claim, encompass a concept performed in the human mind (including observations to compare the records and then evaluation, judgment, and opinion to cluster similar rows) and can be performed with pen and paper.
If the claim limitations, under their broadest reasonable interpretation, cover performance of the limitations in the mind, then they fall within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
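For illustration only, the five steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the case-insensitive exact-match comparator, the column weights, and the similarity threshold are all hypothetical and do not come from the claims or the specification.

```python
from itertools import combinations


def index_rows(rows):
    """Indexing: give each data entry a unique RowID (here, its position)."""
    return {row_id: row for row_id, row in enumerate(rows)}


def attribute_similarity(a, b):
    """Assumed comparator: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if str(a).strip().lower() == str(b).strip().lower() else 0.0


def row_similarity(row_a, row_b, weights):
    """Row-to-row similarity as a weighted sum of attribute similarities."""
    return sum(w * attribute_similarity(row_a[col], row_b[col])
               for col, w in weights.items())


def similar_pairs(indexed, weights, threshold):
    """Score every pair of rows; keep pairs at or above the threshold."""
    pairs = []
    for i, j in combinations(sorted(indexed), 2):
        score = row_similarity(indexed[i], indexed[j], weights)
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


def cluster(indexed, pairs):
    """Group RowIDs that are linked by at least one similar pair."""
    parent = {r: r for r in indexed}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path halving
            r = parent[r]
        return r

    for i, j, _ in pairs:
        parent[find(j)] = find(i)
    groups = {}
    for r in indexed:
        groups.setdefault(find(r), []).append(r)
    return sorted(groups.values())


rows = [
    {"name": "Ann Lee", "city": "Boston"},
    {"name": "ann lee", "city": "Boston"},
    {"name": "Bob Roy", "city": "Austin"},
]
weights = {"name": 0.75, "city": 0.25}  # hypothetical column weights
indexed = index_rows(rows)
pairs = similar_pairs(indexed, weights, threshold=0.8)
clusters = cluster(indexed, pairs)  # output data set: clustered RowIDs
```

In this toy input, rows 0 and 1 differ only in letter case and are clustered together, while row 2 remains alone.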

Regarding claim 2, the claim recites prior to the indexing input data set, the method further comprises:
	standardizing the input data set into a predefined and unified format; and
	segmenting the standardized input data set, wherein each segment includes a subset of the rows included in the input data set.
Step 1 Analysis: Claim 2 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to an abstract idea. In particular, the claim recites mental processes that are concepts performed in the human mind (including an observation, evaluation, judgment, opinion). Claim 2 is dependent on claim 1, which, as indicated in the analysis above, is directed to an abstract idea without significantly more. Claim 2 recites the additional element of “prior to the indexing input data set, the method further comprises:
	standardizing the input data set into a predefined and unified format; and
	segmenting the standardized input data set, wherein each segment includes a subset of the rows included in the input data set.” That is, the claim recites standardizing the input data into a predefined and unified format, and segmenting the standardized data into subsets of rows. The above-noted limitation of claim 2, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, standardizing the input data into a predefined and unified format and segmenting the standardized data into subsets of rows, in the context of this claim, encompasses a concept performed in the human mind (including observations to compare the records and then evaluation, judgment, and opinion to cluster similar rows) and can be performed with pen and paper. If the claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.
This additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Accordingly, the judicial exception is not integrated into a practical application.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 4, the claim recites utilizing a comparator based on a type of an attribute to compute the attribute similarity, wherein the comparator is any one of: exact matching and fuzzy matching.
Step 1 Analysis: Claim 4 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to an abstract idea. In particular, the claim recites mental processes that are concepts performed in the human mind (including an observation, evaluation, judgment, opinion). Claim 4 is dependent on claim 1, which, as indicated in the analysis above, is directed to an abstract idea without significantly more. Claim 4 recites the additional element of “utilizing a comparator based on a type of an attribute to compute the attribute similarity, wherein the comparator is any one of: exact matching and fuzzy matching.” That is, the claim recites using either exact matching or fuzzy matching to compute the similarity. The above-noted limitation of claim 4, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, using either exact matching or fuzzy matching to compute the similarity, in the context of this claim, encompasses a concept performed in the human mind (including observations to compare the records and then evaluation, judgment, and opinion to cluster similar rows) and can be performed with pen and paper. If the claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.
This additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Accordingly, the judicial exception is not integrated into a practical application.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
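For illustration only, the comparator selection recited in claim 4 can be sketched as follows. This is a hedged sketch, not the applicant's implementation: the attribute type names ("id", "numeric", "text"), the mapping from type to comparator, and the use of Python's standard difflib ratio as the fuzzy measure are assumptions that do not come from the claims.

```python
from difflib import SequenceMatcher


def exact_match(a, b):
    """Exact matching: 1.0 when the two values are equal, else 0.0."""
    return 1.0 if a == b else 0.0


def fuzzy_match(a, b):
    """Fuzzy matching: a similarity ratio in [0, 1] on lower-cased strings."""
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


# Hypothetical mapping from attribute type to comparator.
COMPARATORS = {
    "id": exact_match,
    "numeric": exact_match,
    "text": fuzzy_match,
}


def attribute_similarity(a, b, attr_type):
    """Pick the comparator by attribute type; default to exact matching."""
    comparator = COMPARATORS.get(attr_type, exact_match)
    return comparator(a, b)


same_number = attribute_similarity(10, 10, "numeric")
near_name = attribute_similarity("Jon Smith", "John Smith", "text")
different_id = attribute_similarity("abc", "abd", "id")
```

Here the two spellings of the name score close to, but below, 1.0 under fuzzy matching, whereas the identifier comparison is all-or-nothing.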

Regarding claim 6, the claim recites
	generating a graph including nodes and edges, wherein the nodes represent rows and edges represent the row-to-row similarities; and
	applying a greedy algorithm on the graph to cluster rows, wherein each cluster includes at least two similar data entries.
Step 1 Analysis: Claim 6 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to an abstract idea. In particular, the claim recites mental processes that are concepts performed in the human mind (including an observation, evaluation, judgment, opinion). Claim 6 is dependent on claim 1, which, as indicated in the analysis above, is directed to an abstract idea without significantly more. Claim 6 recites the additional element of:
	generating a graph including nodes and edges, wherein the nodes represent rows and edges represent the row-to-row similarities; and
	applying a greedy algorithm on the graph to cluster rows, wherein each cluster includes at least two similar data entries.
The above-noted limitation of claim 6, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, generating such a graph and applying a greedy algorithm on the graph to cluster rows, in the context of this claim, encompasses a concept performed in the human mind (including observations to compare the records and then evaluation, judgment, and opinion to cluster similar rows) and can be performed with pen and paper. If the claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.
This additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Accordingly, the judicial exception is not integrated into a practical application.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
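For illustration only, the graph generation and greedy clustering recited in claim 6 can be sketched as follows. The claim does not say which greedy algorithm is used, so this sketch assumes one possible strategy: visit edges from the highest similarity to the lowest and merge the clusters of their endpoints. The threshold and all names are hypothetical. Rows with no qualifying edge remain as single-row clusters.

```python
def build_graph(row_ids, similarities, threshold):
    """Nodes are RowIDs; an edge (i, j, s) is kept when s >= threshold."""
    nodes = list(row_ids)
    edges = [(i, j, s) for (i, j), s in similarities.items() if s >= threshold]
    return nodes, edges


def greedy_cluster(nodes, edges):
    """Greedy pass: take edges in descending similarity and merge clusters."""
    cluster_of = {n: n for n in nodes}   # every node starts in its own cluster
    members = {n: [n] for n in nodes}
    for i, j, _ in sorted(edges, key=lambda e: e[2], reverse=True):
        ci, cj = cluster_of[i], cluster_of[j]
        if ci == cj:
            continue                      # endpoints already clustered together
        if len(members[ci]) < len(members[cj]):
            ci, cj = cj, ci               # merge the smaller cluster into the larger
        for n in members[cj]:
            cluster_of[n] = ci
        members[ci].extend(members.pop(cj))
    return sorted(sorted(m) for m in members.values())


# Hypothetical row-to-row similarities keyed by (RowID, RowID).
similarities = {(0, 1): 0.95, (1, 2): 0.85, (2, 3): 0.40, (3, 4): 0.10}
nodes, edges = build_graph(range(5), similarities, threshold=0.8)
clusters = greedy_cluster(nodes, edges)
```

With these assumed scores, rows 0, 1, and 2 form one cluster and rows 3 and 4 are isolated.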

Regarding claim 7, the claim recites
	the clustering results in isolated rows, wherein each isolated row is individually clustered.
Step 1 Analysis: Claim 7 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to an abstract idea. In particular, the claim recites mental processes that are concepts performed in the human mind (including an observation, evaluation, judgment, opinion). Claim 7 is dependent on claims 1 and 6, which, as indicated in the analysis above, are directed to an abstract idea without significantly more. Claim 7 recites the additional element of “the clustering results in isolated rows, wherein each isolated row is individually clustered.” The above-noted limitation of claim 7, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, individually clustering each isolated row that results from the clustering, in the context of this claim, encompasses a concept performed in the human mind (including observations to compare the records and then evaluation, judgment, and opinion to cluster similar rows) and can be performed with pen and paper. If the claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.
This additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Accordingly, the judicial exception is not integrated into a practical application.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 8, the claim recites
	determining clusters that are substantially related based on a cluster signature; and
	iteratively merging clusters that are substantially related.
Step 1 Analysis: Claim 8 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to an abstract idea. In particular, the claim recites mental processes that are concepts performed in the human mind (including an observation, evaluation, judgment, opinion). Claim 8 is dependent on claims 1 and 6, which, as indicated in the analysis above, are directed to an abstract idea without significantly more. Claim 8 recites the additional element of:
	determining clusters that are substantially related based on a cluster signature; and
	iteratively merging clusters that are substantially related.
The above-noted limitation of claim 8, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, determining clusters that are substantially related based on a cluster signature and iteratively merging those clusters, in the context of this claim, encompasses a concept performed in the human mind (including observations to compare the records and then evaluation, judgment, and opinion to cluster similar rows) and can be performed with pen and paper. If the claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.
This additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Accordingly, the judicial exception is not integrated into a practical application.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
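For illustration only, the iterative merging recited in claim 8 can be sketched as follows. The claim does not define "cluster signature" or "substantially related", so this sketch assumes a hypothetical signature (the set of lower-cased tokens appearing in a cluster's rows) and a hypothetical relatedness test (Jaccard overlap of two signatures at or above a threshold). None of these choices comes from the claims.

```python
def signature(cluster, rows):
    """Assumed signature: the set of lower-cased tokens in the cluster's rows."""
    return {tok for row_id in cluster for tok in rows[row_id].lower().split()}


def jaccard(a, b):
    """Overlap of two token sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_related(clusters, rows, threshold):
    """Repeat until no two clusters have signatures related above threshold."""
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sig_x = signature(clusters[x], rows)
                sig_y = signature(clusters[y], rows)
                if jaccard(sig_x, sig_y) >= threshold:
                    clusters[x] = sorted(clusters[x] + clusters.pop(y))
                    merged = True
                    break
            if merged:
                break  # signatures changed, so restart the scan
    return clusters


# Hypothetical single-column rows keyed by RowID.
rows = {
    0: "acme corp boston",
    1: "acme corp",
    2: "acme corporation boston",
    3: "zenith labs",
}
result = merge_related([[0, 1], [2], [3]], rows, threshold=0.5)
```

Restarting the scan after each merge matters because the merged cluster's signature grows and may become related to clusters that were previously below the threshold.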

Regarding claim 9, the claim recites
	the output data set further includes the input data set, a cluster identification, a cluster anchor information, and a confidence score.
Step 1 Analysis: Claim 9 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to an abstract idea. In particular, the claim recites mental processes that are concepts performed in the human mind (including an observation, evaluation, judgment, opinion). Claim 9 is dependent on claims 1, 6, and 8, which, as indicated in the analysis above, are directed to an abstract idea without significantly more. Claim 9 recites the additional element of “the output data set further includes the input data set, a cluster identification, a cluster anchor information, and a confidence score.” The above-noted limitation of claim 9, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, including the input data set, a cluster identification, cluster anchor information, and a confidence score in the output data set, in the context of this claim, encompasses a concept performed in the human mind (including observations to compare the records and then evaluation, judgment, and opinion to cluster similar rows) and can be performed with pen and paper. If the claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.
This additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Accordingly, the judicial exception is not integrated into a practical application.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 10, the claim recites the format of the output data set is any one of: a table and a graph.
Step 1 Analysis: Claim 10 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to an abstract idea. In particular, the claim recites mental processes that are concepts performed in the human mind (including an observation, evaluation, judgment, opinion). Claim 10 is dependent on claims 1, 6, and 8, which, as indicated in the analysis above, are directed to an abstract idea without significantly more. Claim 10 recites the additional element of “the format of the output data set is any one of: a table and a graph.” The above-noted limitation of claim 10, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, presenting the output data set as a table or a graph, in the context of this claim, encompasses a concept performed in the human mind (including observations to compare the records and then evaluation, judgment, and opinion to cluster similar rows) and can be performed with pen and paper. If the claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.
This additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Accordingly, the judicial exception is not integrated into a practical application.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 11, the claim recites the input data set is sourced from a plurality of data sources.
Step 1 Analysis: Claim 11 is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to an abstract idea. In particular, the claim recites mental processes that are concepts performed in the human mind (including an observation, evaluation, judgment, opinion). Claim 11 is dependent on claims 1, 6, 8, and 10, which, as indicated in the analysis above, are directed to an abstract idea without significantly more. Claim 11 recites the additional element of “the input data set is sourced from a plurality of data sources.” The above-noted limitation of claim 11, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, sourcing the input data set from a plurality of data sources, in the context of this claim, encompasses a concept performed in the human mind (including observations to compare the records and then evaluation, judgment, and opinion to cluster similar rows) and can be performed with pen and paper. If the claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application.
This additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Accordingly, the judicial exception is not integrated into a practical application.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.

Claim 12 is rejected under 35 U.S.C. 101 with the same rationale as claim 1.
Claims 13-14, 16, and 19-22 are rejected under 35 U.S.C. 101 with the same rationale as claims 1-2, 4, 6 and 8-11.

Claim Rejections - 35 USC § 103
3.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
4.	This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
5.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

6.	The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
7.	Claims 1-2 and 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Lang et al. (US 20080140707 A1, hereinafter “Lang”) in view of Gibson (U.S. Patent 8832547 B2, hereinafter “Gibson”).
8.	With respect to claim 1,
Lang discloses
a method for data entries deduplication, comprising:
	indexing an input data set, wherein the input data set is in a tabular format and the indexing includes providing a Row identifier (RowID), wherein rows are the data entries;
	computing attribute similarity for each column across each pair of rows;
	computing, for each pair of rows, row-to-row similarity as a weighted sum of attribute similarities;
	clustering pairs of rows based on their row-to-row similarities; and
	providing an output data set including at least the clustered pairs of rows (Lang [0022], [0028] – [0029], [0031] – [0035], [0038] – [0040] e.g. Clustering Using Indexes for a Matrix
[0022] The present invention is generally directed towards a system and method for clustering objects using indexes for a matrix representing a collection of objects. More particularly, objects to be clustered may be represented as a rectangular matrix. An index may be created for accessing the rows of the matrix and an index may be created for accessing the columns of the matrix based upon the connectivity of the edges between rows and columns of the matrix. Each node represented by a row may be joined to a nearest node represented by another row to produce disjoint sets of nodes. The disjoint sets of nodes may represent clusters that may then be output for use by an application.
[0028] FIG. 3 presents a flowchart generally representing the steps undertaken in one embodiment for clustering objects using indexes for a matrix representing the objects. At step 302, a rectangular matrix with each row representing an object from a collection of objects to be clustered may be received. Each user may be represented by a row in the matrix and each service or class of services may be represented by a column in the matrix.
[0029] Once the relationship between objects and classes of attributes may be represented as an m×n matrix, indexes may be created at step 304 for the M nodes and the N nodes based on the connectivity of the edges between M and N. In an embodiment, a forward index for the M nodes representing rows of the matrix may be created. For example, an array, which may be denoted as R, may be created that includes a list of nonzero columns for each row and another array that stores the offset to the array R for each row. Thus, the forward index may map objects to attributes.
[0031] FIG. 4 presents a flowchart generally representing the steps undertaken in one embodiment for producing disjoint sets of nearest nodes of a matrix accessed using the indexes. For each row node, the forward index on M may be used at step 402 to map a row node xm to a subset Z of column nodes connected to xm by an edge in the bipartite graph representing the matrix.
[0034] At step 406, correlated nodes in M may be determined by using the overlaps ov(k) to compute one of several similarity scores (including correlation and cosine similarity) between the current node xm and each node yk in C. The row node ym in C that may be most correlated with xm may then be chosen.
[0035] In other embodiments, weights may be used for nodes or edges or both to determine correlation. In such embodiments, edge weights on indexes may be pre-computed and stored, and node weights on an indexed array may be pre-computed and stored.
[0038] At step 504, the nodes represented by rows of the matrix may be joined to produce disjoint sets representing clusters of a level of the hierarchical clustering. In an embodiment, the steps of FIG. 4 may be executed for producing disjoint sets of nearest nodes of the matrix that may represent correlated clusters of a level of the hierarchical clustering.
[0039] At step 506, the disjoint sets representing clusters of a level of the hierarchical clustering may be stored. And it may be determined at step 508 whether the number of levels of the hierarchical clustering may be less than a threshold. If so, then the objects of a disjoint set may be combined for each of the disjoint sets to create a rectangular matrix of meta-objects and processing may continue at step 504. The objects of a disjoint set may be combined in an embodiment by OR'ing or summing the rows of objects belonging to the disjoint set, or by contracting the object nodes of a disjoint set in the bipartite graph view of the relationship of the collection of objects or clusters. Note that the rectangular matrix of meta-objects may represent the relationship between clusters and attributes, or clusters of attributes at a level of the hierarchical clustering. In various embodiments, a weighted version of the clustering algorithm may be used for clustering at levels 2 and above of the hierarchical clustering.
[0041] In an alternate embodiment, a hierarchical clustering may be produced by iterating the steps generally described in conjunction with FIGS. 3 and 4, and by using the initial dataset of the collection of object when computing the similarities of all pairs of objects, or clusters, that have nonzero overlap at each level of the hierarchical clustering [as
	indexing (e.g. index) an input data set, wherein the input data set is in a tabular format (e.g. matrix) and the indexing includes providing a Row identifier (RowID) (e.g. row – object, such as each user), wherein rows are the data entries;
	computing attribute similarity (e.g. similarity) for each column (e.g. column) across each pair of rows (e.g. all pairs of objects (rows));
	computing, for each pair of rows, row-to-row similarity as a weighted sum (e.g. weighted similarity) of attribute similarities;
	clustering (e.g. cluster) pairs of rows based on their row-to-row similarities; and
providing an output data set including at least the clustered pairs of rows (e.g. Each node represented by a row may be joined to a nearest node represented by another row to produce disjoint sets of nodes.  The disjoint sets of nodes may represent clusters that may then be output for use by an application)]).
Although Lang substantially teaches the claimed invention, Lang does not explicitly indicate a unique Row identifier (RowID).
Gibson teaches the limitations by stating indexing an input data set, wherein the input data set is in a tabular formant and the indexing includes providing a unique Row identifier (RowID), wherein rows are the data entries (Gibson claim 1 e.g. retrieving a table object that includes data associated with the table in the webpage, the table object having an indexed array of row objects, each of the row objects containing a unique row identifier assigned to a respective row of the table).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention, in view of the teachings of Lang and Gibson, to provide a way to more efficiently perform hierarchical clustering for identifying related groups of users or objects (Lang [0004]). 
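For illustration only, the steps recited in claim 1 as mapped above (indexing with a unique RowID, per-column attribute similarity, a weighted sum per pair of rows, and clustering of pairs) can be sketched as follows. This sketch is not taken from Lang, Gibson, or the application under examination; the function names, the exact-match comparator, the weights, and the threshold are all assumptions chosen to keep the example small.

```python
from itertools import combinations

def attribute_similarity(a, b):
    """Assumed comparator: 1.0 when two column values are equal, else 0.0."""
    return 1.0 if a == b else 0.0

def row_similarities(rows, weights):
    """Map each (RowID, RowID) pair to the weighted sum of column similarities."""
    indexed = dict(enumerate(rows))  # enumerate() assigns a unique RowID per row
    sims = {}
    for (i, r1), (j, r2) in combinations(indexed.items(), 2):
        sims[(i, j)] = sum(
            w * attribute_similarity(a, b) for w, a, b in zip(weights, r1, r2)
        )
    return sims

def cluster_pairs(sims, threshold):
    """Keep the row pairs whose row-to-row similarity meets the threshold."""
    return [pair for pair, score in sims.items() if score >= threshold]

# Hypothetical tabular input: (first name, last name, state).
rows = [
    ("Ann", "Lee", "NY"),
    ("Ann", "Lee", "NJ"),
    ("Bob", "Ray", "CA"),
]
weights = (0.4, 0.4, 0.2)  # assumed column weights
sims = row_similarities(rows, weights)
pairs = cluster_pairs(sims, threshold=0.7)
```

Under these assumed weights, rows 0 and 1 agree on two of three columns and are the only pair retained in the output.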
9.	With respect to claim 2,
	Lang further discloses wherein prior to the indexing input data set, the method further comprises:
	standardizing the input data set into a predefined and unified format (Lang [0022], [0028] – [0029], [0031] – [0035], [0038] – [0040] e.g. matrix); and
	segmenting the standardized input data set, wherein each segment includes a subset of the rows included in the input data set (Lang [0022], [0028] – [0029], [0031] – [0035], [0038] – [0040] e.g. subset of row nodes).
10.	Claim 12 is the same as claim 1 and is rejected for the same reasons as applied hereinabove.
11.	Claims 13-14 are the same as claims 1-2 and are rejected for the same reasons as applied hereinabove.

12.	Claims 3 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Lang in view of Gibson, and further in view of Parkkinen et al (U.S. Patent 10114908 B2 hereinafter, “Parkkinen”).
13.	With respect to claim 3,
Lang further discloses
indexing each segment (Lang [0022], [0028] – [0029], [0031] – [0035], [0038] – [0040] e.g. index).
Although the combination of Lang and Gibson substantially teaches the claimed invention, it does not explicitly indicate using a text search engine.
Parkkinen teaches the limitations by stating indexing each segment using a text search engine (Parkkinen Claim 17 e.g. a database search engine having at least a processor for managing a data search index structure, said search index having a plurality of reference values each of a first or a second type such associating data to said in-memory and disk memory storage, wherein the search index has densely indexed rows that correspond to the subset of the rows of the database stored in the in-memory storage and sparsely indexed rows that correspond to the subset of the rows of the database stored in the disk memory storage).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention, in view of the teachings of Lang, Gibson and Parkkinen, to provide a way to more efficiently perform hierarchical clustering for identifying related groups of users or objects (Lang [0004]). 
14.	Claim 15 is the same as claim 3 and is rejected for the same reasons as applied hereinabove.

15.	Claims 4-7 and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Lang in view of Gibson, and further in view of Malak et al (U.S. 20190102441 A1 hereinafter, “Malak”).
16.	With respect to claim 4,
Although the combination of Lang and Gibson substantially teaches the claimed invention, it does not explicitly indicate utilizing a comparator based on a type of an attribute to compute the attribute similarity, wherein the comparator is any one of: exact matching and fuzzy matching.
Malak teaches the limitations by stating utilizing a comparator based on a type of an attribute to compute the attribute similarity, wherein the comparator is any one of: exact matching (Malak [0010], [0021], [0147] e.g. exact matching) and fuzzy matching (Malak [0113] e.g. Clustering may be performed using a "fuzzy group-by" operation that is much like a SQL GROUP BY operation for grouping representations based upon similarity scores).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention, in view of the teachings of Lang, Gibson and Malak, to provide a way to more efficiently perform hierarchical clustering for identifying related groups of users or objects (Lang [0004]). 
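For illustration only, selecting a comparator based on the type of an attribute, as recited in claim 4, can be sketched as follows. This sketch does not reproduce Malak's "fuzzy group-by" operation or any method of the application; the attribute type labels and function names are hypothetical, and the fuzzy comparator is a stand-in built on the Python standard library's difflib.SequenceMatcher.

```python
from difflib import SequenceMatcher

def exact_match(a, b):
    """Exact comparator: 1.0 only when the two values are identical."""
    return 1.0 if a == b else 0.0

def fuzzy_match(a, b):
    """Fuzzy comparator (assumed): case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Hypothetical mapping from attribute type to comparator.
COMPARATORS = {"id": exact_match, "text": fuzzy_match}

def attribute_similarity(attr_type, a, b):
    """Dispatch to the comparator registered for this attribute type."""
    return COMPARATORS[attr_type](a, b)

id_score = attribute_similarity("id", "123", "124")
name_score = attribute_similarity("text", "Jon Smith", "John Smith")
```

Here the identifier column is scored all-or-nothing, while the free-text column receives partial credit for near matches.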
17.	With respect to claim 5,
Malak further discloses wherein row-to-row similarity demonstrates pairs of rows are similar, and wherein the weights are determined based on a machine learning model (Malak [0061] e.g. Unsupervised Learning).
18.	With respect to claim 6,
	Lang further discloses generating a graph including nodes and edges, wherein the nodes represent rows and edges represent the row-to-row similarities (Lang [0022], [0028] – [0029], [0031] – [0035], [0037] – [0042] e.g. graph).
Malak further discloses applying a greedy algorithm on the graph to cluster rows, wherein each cluster includes at least two similar data entries (Malak [0085] e.g. clustering may be performed according to a greedy algorithm).
19.	With respect to claim 7,
	Lang further discloses wherein the clustering results in isolated rows, wherein each isolated row is individually clustered (Lang [0022], [0028] – [0029], [0031] – [0035], [0038] – [0040] e.g. Clustering Using Indexes for a Matrix [0022] The present invention is generally directed towards a system and method for clustering objects using indexes for a matrix representing a collection of objects.  More particularly, objects to be clustered may be represented as a rectangular matrix.  An index may be created for accessing the rows of the matrix and an index may be created for accessing the columns of the matrix based upon the connectivity of the edges between rows and columns of the matrix.  Each node represented by a row may be joined to a nearest node represented by another row to produce disjoint sets of nodes.  The disjoint sets of nodes may represent clusters that may then be output for use by an application).
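For illustration only, the graph-based clustering addressed in claims 6 and 7 can be sketched as follows: rows are nodes, row-to-row similarities are edge weights, edges are processed greedily from strongest to weakest, and any row left without a qualifying edge forms its own cluster. This is one possible greedy strategy under an assumed threshold, not the algorithm of Lang, Malak, or the application; all names are hypothetical.

```python
def greedy_cluster(n_rows, edges, threshold):
    """Cluster rows; edges is a list of (rowid_a, rowid_b, similarity)."""
    parent = list(range(n_rows))  # each row starts as its own cluster

    def find(x):
        # Follow parent links to the cluster representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Greedy step: take the strongest remaining edge first.
    for a, b, sim in sorted(edges, key=lambda e: e[2], reverse=True):
        if sim < threshold:
            break  # all remaining edges are weaker still
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for row in range(n_rows):
        clusters.setdefault(find(row), []).append(row)
    return sorted(clusters.values())

# Hypothetical graph over five rows; row 4 has no edges at all.
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.3)]
clusters = greedy_cluster(5, edges, threshold=0.7)
```

In this example rows 0, 1 and 2 are merged, while rows 3 and 4 remain isolated and are each returned as a single-row cluster.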
20.	Claims 16-18 are the same as claims 4-6 and are rejected for the same reasons as applied hereinabove.

21.	Claims 8-11 and 19-22 are rejected under 35 U.S.C. 103 as being unpatentable over Lang in view of Gibson and Malak, and further in view of Morton et al (U.S. 20150199363 A1 hereinafter, “Morton”).
22.	With respect to claim 8,
Lang further discloses iteratively merging clusters that are substantially related (Lang [0037], [0041] e.g. [0037] A hierarchical clustering may be produced by iterating the steps generally described in conjunction with FIGS. 3 and 4 to produce clusters at each level of the hierarchical clustering.  [0041] In an alternate embodiment, a hierarchical clustering may be produced by iterating the steps generally described in conjunction with FIGS. 3 and 4, and by using the initial dataset of the collection of object when computing the similarities of all pairs of objects, or clusters, that have nonzero overlap at each level of the hierarchical clustering).
Although the combination of Lang, Gibson and Malak substantially teaches the claimed invention, it does not explicitly indicate determining clusters that are substantially related based on a cluster signature.
Morton teaches the limitations by stating determining clusters that are substantially related based on a cluster signature (Morton [0041] – [0048], [0088] – [0093] e.g. cluster ID; attribute(s) (such as Last Name, First Name, Address, Social Security Number, etc.,) with attribute ID assigned to a cluster).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the invention, in view of the teachings of Lang, Gibson, Malak and Morton, to provide a way to more efficiently perform hierarchical clustering for identifying related groups of users or objects (Lang [0004]). 
23.	With respect to claim 9,
Morton further discloses wherein the output data set further includes the input data set, a cluster identification, a cluster anchor information, and a confidence score (Morton [0041] – [0048], [0088] – [0093] e.g. cluster ID; attribute(s) (such as Last Name, First Name, Address, Social Security Number, etc.,) with attribute ID assigned to a cluster; similarity – match score).
24.	With respect to claim 10,
	Lang further discloses wherein the format of the output data set is any one of: a table and a graph (Lang [0022], [0028] – [0029], [0031] – [0035], [0037] – [0042] e.g. graph).
25.	With respect to claim 11,
	Malak further discloses wherein the input data set is sourced from a plurality of data sources (Malak [0126], [0203] e.g. [0126] Reference data 408 may comprise information published by web sites, web services, curated knowledge stores, and other sources. [0203] one or more third party information sources).
26.	Claims 19-22 are the same as claims 8-11 and are rejected for the same reasons as applied hereinabove.

Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, if any, is considered pertinent to applicant's disclosure.
27.	The examiner requests, in response to this office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
28.	When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SyLing Yen whose telephone number is 571-270-1306.  The examiner can normally be reached on Mon-Fri 8:30am - 5:00pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Featherstone, can be reached at 571-270-3750.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


SyLing Yen
Examiner
Art Unit 2166



/SYLING YEN/Primary Examiner, Art Unit 2166
October 20, 2021