DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks
This action is in response to the amendments received on 11/3/21.  Claims 21-40 are pending in the application.  Claims 1-20 were cancelled.  Applicants' arguments have been carefully and respectfully considered.
Claims 21-40 are rejected under 35 U.S.C. 112.
Claims 21-26, 28-36, and 38-40 are rejected under 35 U.S.C. 103 as being unpatentable over Kardes et al. (US 2015/0269494), and further in view of Ye et al. (US 8,452,755).
Claims 27 and 37 are rejected under 35 U.S.C. 103 as being unpatentable over Kardes in view of Ye, and further in view of Healing et al. (US 2018/0227190).

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 21-40 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claims 21, 31, and 40 recite “determining a subset of attributes of the set of attributes, the subset of attributes including a next attribute having a next assigned order number and every attribute having an assigned order number that is less than the next assigned order number, the next assigned order number immediately following the determined level number.”  The specification does not disclose an “assigned order number that is less than the next assigned order number, the next assigned order number immediately following the determined level number.”  Paragraph 0069 

Claims 21, 31, and 40 recite the limitation "the unique value" in the “for the first order attribute” and “storing each record” limitations.  The “for the first order attribute” limitation discloses “each unique value of the first order attribute” which suggests there can be more than one.  The recitation of “the unique value” refers to a specific unique value.  It is unclear what is intended by the recitation of “for each unique value of the first order attribute.”

Claims 21, 31, and 40 recite the limitation "the data block" in the “determining, for the particular first level data block” limitation.  There is insufficient antecedent basis for this limitation in the claim.  This will be interpreted as referring to “a data block.”

Claims 21, 31, and 40 recite the limitation "the order number" in the “determining, for the particular first level data block” limitation.  There is insufficient antecedent basis for this limitation in the claim.  This will be interpreted as referring to “an order number.”

Claim Rejections - 35 USC § 103
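The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.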

Claims 21-26, 28-36, and 38-40 are rejected under 35 U.S.C. 103 as being unpatentable over Kardes et al. (US 2015/0269494), and further in view of Ye et al. (US 8,452,755).

With respect to claim 21, Kardes teaches a method for blocking data including a number of records, each record having a value corresponding to one or more attributes associated therewith, the method comprising:
for the first order attribute, creating a first level data block for each unique value of the first order attribute (Kardes, pa 0055, all the input records are grouped into blocks defined by the top-level properties), each created first level data block having a predetermined maximum number of records that include the unique value of the first order attribute that can be assigned thereto (Kardes, pa 0055, taking as input a maximum block size M); 
storing each record in a created first level data block that corresponds to the unique value of the first order attribute of the record (Kardes, pa 0055, all the input records are grouped into blocks defined by the top-level properties); 
in response to determining, for a particular first level data block, that a number of records stored therein exceeds the corresponding predetermined maximum number of records (Kardes, pa 0055, We traverse the tree breadth-first and only recurse into nodes above the maximum block size.): 
determining, for the particular first level data block, a level number of the data block (Kardes, pa 0055, taking as input a set of records … all the input records are grouped into blocks defined by the top-level properties. Examiner note: top-level properties each provide a “level” of a data block); 
determining a subset of attributes of the set of attributes (Kardes, pa 0055, The remaining oversized blocks are partitioned into sub-blocks by sub-blocking properties that the records they contain share, and those properties are appended to the key (block 404).); 
creating, for the particular first level data block, a next level data block for each unique value of the next attribute (Kardes, pa 0055, The remaining oversized blocks are partitioned into sub-blocks by sub-blocking properties that the records they contain share, and those properties are appended to the key (block 404). In the example embodiment, the process is continued recursively until all sub-blocks have been whittled down to an acceptable size.); and 
storing each record in the created next level data block that corresponds to the unique value of the next attribute of the record (Kardes, pa 0055, The remaining oversized blocks are partitioned into sub-blocks by sub-blocking properties that the records they contain share, and those properties are appended to the key (block 404).); and 
creating a group of data blocks associated with the first order attribute, the group of data blocks including each first level data block and each next level data block (Kardes, pa 0054, explore the space of possible sub-blocks in cardinality order for a given branch & pa 0055, All the input records are grouped into blocks defined by 
Kardes does not expressly discuss assigning sequential order numbers to a set of attributes, a first order corresponding to a first attribute; determining, for the particular first level data block, a level number of the data block, the determined level number corresponding to the order number of the attribute for which the data block was created; determining a subset of attributes of the set of attributes, the subset of attributes including a next attribute having a next assigned order number and every attribute having an assigned order number that is less than the next assigned order number, the next assigned order number immediately following the determined level number.
Ye teaches assigning sequential order numbers to a set of attributes, a first order corresponding to a first attribute (Ye, Col. 8 Li. 30-38, determine to order attribute classes by ordering attribute classes with a lower number of distinct data values prior to attribute classes with a higher number of distinct data values); 
determining, for the particular first level data block, a level number of the data block, the determined level number corresponding to the order number of the attribute for which the data block was created (Ye, Col. 8 Li. 35-38, The system 200 may determine that a first attribute class has a lower number of distinct values than a second attribute class and, therefore, order the first attribute class prior to the second attribute class in the determined relative order.); 
determining a subset of attributes of the set of attributes, the subset of attributes including a next attribute having a next assigned order number and every attribute having an assigned order number that is less than the next assigned order number, the next assigned order number immediately following the determined level number (Ye, Col. 12 Li. 33-40, the system 200 may access a rule that indicates a preference to maintain an order of attribute classes within a dimension of attribute classes despite redundancy characteristics. A dimension of attribute classes may define a subset of multiple attribute classes that have a parent-child relationship (e.g., a dimension of time may have a parent class of "year" and child classes of "month," "day," and "hour").).
It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Kardes in view of Ye because it reduces the time spent in performing grouping and filtering operations (Ye, Col. 1 Li. 54-58).
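
For illustration only, the combination discussed above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Kardes, Ye, or the application: the function names order_attributes and build_blocks, the dictionary layout of the blocks, and the sample records are all hypothetical. Order numbers are assigned by ascending count of distinct values (the ordering described at Ye, Col. 8 Li. 30-38), and only blocks whose record count exceeds the maximum are split further (Kardes, pa 0055).

```python
from collections import defaultdict

def order_attributes(records, attributes):
    """Assign sequential order numbers 1, 2, ... to the attributes.

    Assumption: attributes with fewer distinct values come first;
    ties are broken by attribute name so the result is deterministic.
    """
    ranked = sorted(attributes,
                    key=lambda a: (len({r.get(a) for r in records}), a))
    return {number: attr for number, attr in enumerate(ranked, start=1)}

def build_blocks(records, order, max_records):
    """Return {(level, key): [records]}.

    The key of a level-n block holds the values of attributes 1..n,
    i.e. the next attribute plus every attribute with a lower order number.
    Parent blocks are kept, so the result is the whole group of blocks.
    """
    blocks = defaultdict(list)
    # First level: one block per unique value of the first order attribute.
    for rec in records:
        blocks[(1, (rec.get(order[1]),))].append(rec)
    level = 1
    while level < len(order):
        # Only blocks that exceed the maximum are split at the next level.
        oversized = [k for k, v in blocks.items()
                     if k[0] == level and len(v) > max_records]
        subset = [order[n] for n in range(1, level + 2)]
        for key in oversized:
            for rec in blocks[key]:
                sub_key = tuple(rec.get(a) for a in subset)
                blocks[(level + 1, sub_key)].append(rec)
        level += 1
    return dict(blocks)

# Hypothetical sample data.
records = [
    {"state": "TX", "city": "austin"},
    {"state": "TX", "city": "austin"},
    {"state": "TX", "city": "dallas"},
    {"state": "CA", "city": "fresno"},
]
order = order_attributes(records, ["city", "state"])
blocks = build_blocks(records, order, max_records=2)
```

In this sample, "state" receives order number 1 because it has fewer distinct values than "city", and only the "TX" block, which holds three records against a maximum of two, is split into second level blocks.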

With respect to claim 22, Kardes in view of Ye teaches the method of claim 21, wherein the values corresponding to the attributes of the subset of attributes comprise an empty value corresponding to an attribute other than the first order attribute (Kardes, pa 0032, all records first go through a cleaning process that starts with the removal of bogus, junk and spam records. Then all records are normalized to an approximately common representation. Finally, all major noise types and inconsistencies are addressed, such as empty/bogus fields, field duplication, outlier values and encoding issues.).

With respect to claim 23, Kardes in view of Ye teaches the method of claim 21, further comprising: assigning sequential order numbers to a further set of attributes; and repeating the method for the further set of attributes (Kardes, pa 0054, We traverse the tree breadth-first and only recurse into nodes above the maximum block size. This allows us to explore the space of possible sub-blocks in cardinality order for a given branch, stopping as soon as we have a small enough sub-block.).

With respect to claim 24, Kardes in view of Ye teaches the method of claim 23, wherein the set of attributes and the further set of attributes include different attributes (Kardes, Fig. 3, different token combinations).

With respect to claim 25, Kardes in view of Ye teaches the method of claim 23, wherein: each created next level data block has a predetermined maximum number of records that include the unique value of the next attribute that can be assigned thereto (Kardes, pa 0055, taking as input a maximum block size M); and the method further comprises: receiving a record having a non-empty value corresponding to an attribute; identifying at least one group of data blocks associated with the attribute; and for each identified group of data blocks: identifying all full data blocks, wherein full data blocks are those in which the number of records stored exceeds the corresponding predetermined maximum number of records; and identifying an additional data block that has fewer than the corresponding predetermined maximum number of records stored therein and that has a level number that is at least as large as the highest level number of the identified full data blocks; and comparing the received record with each record of the identified full data blocks and with the additional data block (Kardes, pa 0033, Since 

With respect to claim 26, Kardes in view of Ye teaches the method of claim 21, wherein: each created next level data block has a predetermined maximum number of records that include the unique value of the next attribute that can be assigned thereto (Kardes, pa 0055, taking as input a maximum block size M); and the method further comprises: receiving a record having a non-empty value corresponding to the first order attribute; identifying from the group of data blocks: all full data blocks, wherein full data blocks are those in which the number of records stored exceeds the corresponding predetermined maximum number of records; and an additional data block that has fewer than the corresponding predetermined maximum number of records stored therein and that has a level number that is at least as large as the highest level number of the identified full data blocks; and comparing the received record with each record of the identified full data blocks and with the additional data block. (Kardes, pa 0033, Since comparing all pairs of records is quadratic in the number of records and hence is intractable for large data sets, the blocking 10 groups records by shared properties to determine which pairs of records should be examined by the pairwise linker 20 as potential duplicates).
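
For illustration only, the selection of blocks recited in claim 26 can be sketched as follows. This is a hedged sketch, not the method of Kardes or of the application: the function name candidate_records and the block layout {(level, key): [records]} are hypothetical, and a block holding exactly the maximum number of records is treated here as the "additional" block, which is an assumption the claim language does not settle.

```python
def candidate_records(blocks, order, record, max_records):
    """Pick the blocks a received record is compared against.

    blocks: {(level, key): [records]}, where key holds the values of
            attributes 1..level (hypothetical layout).
    order:  {order_number: attribute_name}.
    Returns (full_keys, additional_key, records_to_compare).
    """
    full, additional = [], None
    for level in range(1, len(order) + 1):
        key = (level,
               tuple(record.get(order[n]) for n in range(1, level + 1)))
        members = blocks.get(key)
        if members is None:
            break  # no block exists for this value combination
        if len(members) > max_records:
            full.append(key)   # "full": record count exceeds the maximum
        else:
            additional = key   # first non-full block, at a deeper level
            break
    selected = full + ([additional] if additional is not None else [])
    # A record may sit in a parent and a child block; compare it only once.
    seen, to_compare = set(), []
    for key in selected:
        for rec in blocks[key]:
            if id(rec) not in seen:
                seen.add(id(rec))
                to_compare.append(rec)
    return full, additional, to_compare

# Hypothetical sample data.
r1 = {"state": "TX", "city": "austin"}
r2 = {"state": "TX", "city": "austin"}
r3 = {"state": "TX", "city": "dallas"}
r4 = {"state": "CA", "city": "fresno"}
blocks = {
    (1, ("TX",)): [r1, r2, r3],
    (2, ("TX", "austin")): [r1, r2],
    (2, ("TX", "dallas")): [r3],
    (1, ("CA",)): [r4],
}
order = {1: "state", 2: "city"}
full, additional, to_compare = candidate_records(
    blocks, order, {"state": "TX", "city": "austin"}, max_records=2)
```

Because the walk proceeds from the first level downward, the additional block is always found at a level number at least as large as that of any full block identified before it.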

With respect to claim 28, Kardes in view of Ye teaches the method of claim 26, further comprising deduplicating the data based on the comparison result (Kardes, pa 0078, The second job, TC-Dedup 304, just deduplicates the output of the CCF-Iterate job).

With respect to claim 29, Kardes in view of Ye teaches the method of claim 26, wherein comparing the received record includes loading into a memory the identified full data blocks and the additional data block for performing the comparison in the memory (Kardes, pa 0057, The reducer function iterates over all the records in a newly created sub-block, counting them to determine whether or not the block is small enough or needs to be further subdivided…Care is taken that the memory requirements of the reducer function are constant in the size of a fixed buffer because otherwise the reducer runs out of memory on large blocks.).

With respect to claim 30, Kardes in view of Ye teaches the method of claim 29, wherein the predetermined maximum number of records is determined based on the size of the memory (Kardes, pa 0057, Care is taken that the memory requirements of the reducer function are constant in the size of a fixed buffer because otherwise the reducer runs out of memory on large blocks.).

With respect to claims 31-36, 38, and 39, the limitations are essentially the same as claims 21-26 and 28-30, and are thus rejected for the same reasons.

With respect to claim 40, the limitations are essentially the same as claim 21, and are thus rejected for the same reasons.

Claims 27 and 37 are rejected under 35 U.S.C. 103 as being unpatentable over Kardes in view of Ye, and further in view of Healing et al. (US 2018/0227190).

With respect to claim 27, Kardes in view of Ye teaches the method of claim 26, as discussed above.  Kardes in view of Ye does not expressly discuss the additional data block being further associated with an attribute of the subset of attributes that has an empty value.
Healing teaches wherein the additional data block is further associated with an attribute of the subset of attributes that has an empty value. (Healing, pa 0090, steps performed for each attribute in the cluster & pa 0093, one of the attribute values may be null).
	It would have been obvious at the effective filing date of the invention to a person having ordinary skill in the art to which said subject matter pertains to have modified Kardes in view of Ye with the teachings of Healing because it creates an accurate representation of the data for comparing across attributes (Healing, pa 0025).

	With respect to claim 37, the limitations are essentially the same as claim 27, and are thus rejected for the same reasons.

Conclusion
THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRITTANY N ALLEN whose telephone number is (571)270-3566.  The examiner can normally be reached on M-F 9 am - 5:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at 571-272-4046.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BRITTANY N ALLEN/           Primary Examiner, Art Unit 2169