Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

RELEVANT ART CITED BY THE EXAMINER
The following prior art made of record and not relied upon is cited to establish the level of skill in the applicant’s art and those arts considered reasonably pertinent to applicant’s disclosure. See MPEP 707.05(c).

Amit et al. (US 2017/0262466) teaches “The representative fingerprint memory 240 stores the representative fingerprints 235A and makes them available to the fingerprint comparison module 250. The fingerprint comparison module 250 compares the sampled fingerprints 235B to representative fingerprints 235A and indicates to the deduplication estimation module 260, which (or how many) chunks 215B that are sampled from the measured machine image region 220, match chunks 215A within the representative images 210. The deduplication estimation module 260 may estimate an overall deduplication ratio for the measured machine image region(s) and/or a computing environment or infrastructure from the information provided by the fingerprint comparison module.” ([0023]).

Saliba et al. (US 2014/0136490) teaches “when a CRC match occurs, the "matching" blocks might be compared bit-by-bit, byte-by-byte, word-by-word, etc. in order to determine that an actual match has occurred. Alternatively, in another embodiment, another CRC might be calculated. For example, when a match occurs new CRCs might be calculated for the "matching" blocks. The new CRCs might use more bits. In this way the probability of a collision may be 
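
For illustration only, the two-stage comparison described in the Saliba passage (a CRC match followed by a byte-by-byte confirmation) might be sketched as below. The function name `blocks_match` and the use of CRC-32 from Python's `zlib` module are assumptions made for this sketch and are not taken from the reference.

```python
import zlib


def blocks_match(block_a: bytes, block_b: bytes) -> bool:
    """Return True only when two blocks are confirmed to be identical.

    Hypothetical helper: CRC-32 stands in for whatever CRC the
    reference contemplates.
    """
    # Cheap first stage: differing CRCs prove the blocks differ.
    if zlib.crc32(block_a) != zlib.crc32(block_b):
        return False
    # A CRC match may be a collision, so confirm byte-by-byte.
    return block_a == block_b
```

The second stage is what guards against a CRC collision being treated as an actual match.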

Harnik et al. (US 2018/0074745) teaches: “a volume sketch consists of all the hashes that are in the volume and have a special property (herein, "special hashes"). An example is all the hashes that start with 16 zero bits. In such a case, a sketch would hold approximately 1/65,536 of the hashes in the volume. By this downsizing, the process is made computationally feasible. The choice of 16 leading zeros as a filter to determine special hashes is a specific example. In general, the special hashes can be determined by any predefined or specific pattern of bits at any location in the bit string. For example, strings ending with the bit pattern "10011010" would be an equally a viable option as an 8 bit string, meaning that the sketch holds approximately 1/256 
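
As a minimal sketch of the "special hashes" filter described in the Harnik passage, the following keeps only hashes whose leading bits are all zero. The names `is_special` and `build_sketch`, the choice of SHA-256 as the hash, and the representation of the sketch as a Python set are assumptions made for illustration and do not come from the reference.

```python
import hashlib


def is_special(digest: bytes, leading_zero_bits: int = 16) -> bool:
    """True when the first `leading_zero_bits` bits of the digest are zero."""
    value = int.from_bytes(digest, "big")
    total_bits = len(digest) * 8
    # Shifting away everything but the leading bits leaves 0 only if
    # those leading bits are all zero.
    return value >> (total_bits - leading_zero_bits) == 0


def build_sketch(chunks, leading_zero_bits: int = 16) -> set:
    """Collect the special hashes of the given chunks (hypothetical helper)."""
    sketch = set()
    for chunk in chunks:
        digest = hashlib.sha256(chunk).digest()  # SHA-256 is an assumption
        if is_special(digest, leading_zero_bits):
            sketch.add(digest)
    return sketch
```

With 16 leading zero bits, a uniformly distributed hash passes the filter with probability 1/2^16, which corresponds to the approximately 1/65,536 sampling rate stated in the quoted passage.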

REASONS FOR ALLOWANCE
Per the instant office action, claims 1-7, 9-15, 17-18, and 20-23 are considered allowable subject matter.
The reasons for allowance of claim 1 are that the prior art of record, including the reference(s) cited above, neither anticipates, nor renders obvious the recited combination as a whole; including the limitations of an apparatus comprising “… to identify a dataset to be scanned to generate a deduplication estimate for that dataset; to designate a subset inclusion characteristic to be utilized in the scan; for each of a plurality of pages of the dataset, to scan the page by: computing a polynomial-based signature for the page; determining whether or not the polynomial-based signature satisfies the designated subset inclusion characteristic; and responsive to the polynomial-based signature satisfying the designated subset inclusion characteristic, computing a content-based signature for the page and updating a corresponding entry of a deduplication estimate table for the dataset based at least in part on the content-based signature; and to generate the deduplication estimate for the dataset based at least in part on contents of the deduplication estimate table; wherein for a first one of the pages for which the polynomial-based signature of that page satisfies the designated subset inclusion characteristic, a ; the designated subset inclusion characteristic specifying that application of a designated function to the polynomial-based signature produces a particular result: and the designated function comprising a designated modulo arithmetic operation.”
The reasons for allowance of claim 15 are that the prior art of record, including the reference(s) cited above, neither anticipates, nor renders obvious the recited combination as a whole; including the limitations of a method comprising “identifying a dataset to be scanned to generate a deduplication estimate for that dataset; designating a subset inclusion characteristic to be utilized in the scan; for each of a plurality of pages of the dataset, scanning the page by: computing a polynomial-based signature for the page; determining whether or not the polynomial-based signature satisfies the designated subset inclusion characteristic; and responsive to the polynomial-based signature satisfying the designated subset inclusion characteristic, computing a content-based signature for the page and updating a corresponding entry of a deduplication estimate table for the dataset based at least in part on the content-based signature; and generating the deduplication estimate for the dataset based at least in part on contents of the deduplication estimate table; wherein for a first one of the pages for which the polynomial-based signature of that page satisfies the designated subset inclusion characteristic, a content-based signature is computed for that page and an update is made to the deduplication estimate table; and wherein for a second one of the pages for which the polynomial-based 
The reasons for allowance of claim 18 are that the prior art of record, including the reference(s) cited above, neither anticipates, nor renders obvious the recited combination as a whole; including the limitations of a method comprising “to identify a dataset to be scanned to generate a deduplication estimate for that dataset; to designate a subset inclusion characteristic to be utilized in the scan; for each of a plurality of pages of the dataset, to scan the page by: computing a polynomial-based signature for the page; determining whether or not the polynomial-based signature satisfies the designated subset inclusion characteristic; and responsive to the polynomial-based signature satisfying the designated subset inclusion characteristic, computing a content-based signature for the page and updating a corresponding entry of a deduplication estimate table for the dataset based at least in part on the content-based signature; and to generate the deduplication estimate for the dataset based at least in part on contents of the deduplication estimate table; wherein for a first one of the pages for which the polynomial-based signature of that page satisfies the designated subset inclusion characteristic, a content-based signature is computed for that page and an update is made to the deduplication estimate table; and wherein for a second one of the pages for which the polynomial-based 
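
For illustration only, and not as a construction of the claims, the scan recited in the independent claims above might be sketched as follows. Every concrete choice here is an assumption: CRC-32 stands in for the polynomial-based signature, SHA-256 for the content-based signature, a counter keyed by content-based signature for the deduplication estimate table, and the names `scan_dataset`, `estimate_dedup_ratio`, `modulus`, and `residue` are hypothetical. Skipping pages that fail the subset inclusion test is likewise assumed.

```python
import hashlib
import zlib
from collections import Counter


def scan_dataset(pages, modulus: int = 4, residue: int = 0) -> Counter:
    """Build a deduplication estimate table from a sampled subset of pages."""
    table = Counter()
    for page in pages:
        # Inexpensive polynomial-based signature (CRC-32 assumed).
        poly_sig = zlib.crc32(page)
        # Subset inclusion characteristic: a modulo operation on the
        # signature must produce a particular result.
        if poly_sig % modulus != residue:
            continue  # assumed: page is skipped, no table update
        # More expensive content-based signature, computed only for
        # pages in the sampled subset (SHA-256 assumed).
        content_sig = hashlib.sha256(page).digest()
        table[content_sig] += 1
    return table


def estimate_dedup_ratio(table: Counter) -> float:
    """Ratio of sampled pages to unique sampled pages (assumed metric)."""
    sampled = sum(table.values())
    unique = len(table)
    return sampled / unique if unique else 1.0
```

In this sketch the modulus controls the sampling rate: roughly one page in `modulus` is expected to pass the test, so the content-based signature is computed for only that fraction of the dataset.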

Dependent claims 2-7, 9-14, 17, and 20-23 are allowable at least for the reasons recited above including all the limitations of the allowable independent base claim upon which they depend.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”


DIRECTION OF FUTURE CORRESPONDENCES

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GIOVANNA B COLAN whose telephone number is (571)272-2752.  The examiner can normally be reached on Mon - Fri 8:30-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/GIOVANNA B COLAN/Primary Examiner, Art Unit 2615
February 23, 2021