Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This action is in response to the communication filed on October 19, 2021. After a thorough search and examination of the present application, and in light of the prior art made of record, claims 2-10, 12-18, and 20-21 (renumbered as 1-18) are allowed.

Claims 1, 11, and 19 have been canceled, either previously or in this action.

EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in a telephone interview with Timothy Hwang (Reg. # 61145) on October 28, 2021.

This listing of claims will replace all prior versions, and listings, of claims in the Application: 

1. (Canceled)
2. (Currently Amended)	A method, comprising:
scanning a set of files to determine how to divide the set of files into a plurality of subsets of files;
assigning a corresponding subset of the plurality of subsets of files to each node of a plurality of nodes of a cluster;
providing the corresponding assigned subsets of files to each of the cluster nodes, wherein each of the cluster nodes is configured to: 
perform, in parallel, deduplication with respect to the corresponding assigned subset of files, comprising to: 
create, using a fingerprint algorithm, variable sized chunks of data associated with a file of the corresponding assigned subset of files;
create fingerprints of the variable sized chunks using a hash algorithm; 
determine whether a fingerprint of the fingerprints already exists or is present in a parallel database; and
in response to a determination that the fingerprint does not already exist or is not present in the parallel database:
add a new entry to a database table, wherein the new entry identifies a current chunk associated with the fingerprint, a file offset in the file where the current chunk is stored, and the length of the current chunk;
update a lookup table with location information associated with the current chunk; and

generate corresponding deduplication statistics associated with the corresponding assigned subset of files; and 
assigning an additional subset of files to a node from which a notification that deduplication has been completed with respect to an assigned subset of files is received;
aggregating from each of the cluster nodes the corresponding deduplication statistics.
3. (Previously presented)	The method of claim 2, wherein the set of files is associated with a directory or folder.
4. (Previously presented)	The method of claim 2, further comprising dividing the set of files into the plurality of subsets of files.
5. (Previously presented)	The method of claim 4, wherein dividing the set of files into the plurality of subsets of files comprises assigning one or more files included in the set of files to a subset of files of the plurality of subsets of files until a threshold size of files is assigned to the subset of files.
6. (Previously presented)	The method of claim 4, wherein a node of the plurality of cluster nodes is assigned an additional subset of files after each of the other cluster nodes is assigned an initial subset of files.
7. (Currently Amended)	The method of claim 2
8. (Currently Amended)	The method of claim 2, wherein a node of the plurality of cluster nodes is comprised of a plurality of compute containers.
9. (Previously presented)	The method of claim 8, wherein the node assigns a received subset of files one of the plurality of compute containers.
10. (Currently Amended)	The method of claim 8, further comprising receiving [[an]] the notification that deduplication has been completed with respect to the assigned 
11. (Canceled)
12. (Currently Amended)	A computer program product, the computer program product being embodied in non-transitory computer readable medium and comprising instructions for:
scanning a set of files to determine how to divide the set of files into a plurality of subsets of files;
assigning a corresponding subset of the plurality of subsets of files to each node of a plurality of nodes of a cluster;
providing the corresponding assigned subsets of files to each of the cluster nodes, wherein each of the cluster nodes is configured to:
perform, in parallel, deduplication with respect to the corresponding assigned subset of files, comprising to: 
create, using a fingerprint algorithm, variable sized chunks of data associated with a file of the corresponding assigned subset of files;
create fingerprints of the variable sized chunks using a hash algorithm; 
determine whether a fingerprint of the fingerprints already exists or is present in a parallel database; and
in response to a determination that the fingerprint does not already exist or is not present in the parallel database:
add a new entry to a database table, wherein the new entry identifies a current chunk associated with the fingerprint, a file offset in the file where the current chunk is stored, and the length of the current chunk;
update a lookup table with location information associated with the current chunk; and
update a block table to, for the file, indicate the fingerprint, offset, and length information; and
generate corresponding deduplication statistics associated with the corresponding assigned subset of files; and
assigning an additional subset of files to a node from which a notification that deduplication has been completed with respect to an assigned subset of files is received;
aggregating from each of the cluster nodes the corresponding deduplication statistics.
13. (Previously presented)	The computer program product of claim 12, further comprising instructions for dividing the set of files into the plurality of subsets of files.
14. (Previously presented)	The computer program product of claim 13, wherein dividing the set of files into the plurality of subsets of files comprises assigning one or more files included in the set of files to a subset of files of the plurality of subsets of files until a threshold size of files is assigned to the subset of files.
15. (Previously presented)	The computer program product of claim 13, wherein a node of the plurality of cluster nodes is assigned an additional subset of files after each of the other cluster nodes is assigned an initial subset of files.
16. (Previously presented)	The computer program product of claim 12, wherein a node of the plurality of cluster nodes is comprised of a plurality of compute containers.
17. (Previously presented)	The computer program product of claim 16, wherein the node assigns a received subset of files one of the plurality of compute containers.
18. (Currently Amended)	The computer program product of claim 16, further comprising instructions for receiving [[a]] the notification that deduplication has been completed with respect to the assigned 
19. (Canceled)
20. (Previously presented)	The computer program product of claim 12, wherein the set of files is associated with a directory or folder.
21. (Currently Amended)	A system, comprising:
a processor; and
a memory coupled with the processor, wherein the memory is configured to provide the processor with instructions which when executed cause the processor to:
scan a set of files to determine how to divide the set of files into a plurality of subsets of files;
assign a corresponding subset of the plurality of subsets of files to each node of a plurality of nodes of a cluster;
provide the corresponding assigned subsets of files to each of the cluster nodes, wherein each of the cluster nodes is configured to:
perform, in parallel, deduplication with respect to the corresponding assigned subset of files, comprising to: 
create, using a fingerprint algorithm, variable sized chunks of data associated with a file of the corresponding assigned subset of files;
create fingerprints of the variable sized chunks using a hash algorithm; 
determine whether a fingerprint of the fingerprints already exists or is present in a parallel database; and
in response to a determination that the fingerprint does not already exist or is not present in the parallel database:
add a new entry to a database table, wherein the new entry identifies a current chunk associated with the fingerprint, a file offset in the file where the current chunk is stored, and the length of the current chunk;
update a lookup table with location information associated with the current chunk; and
update a block table to, for the file, indicate the fingerprint, offset, and length information; and
generate corresponding deduplication statistics associated with the corresponding assigned subset of files; and
assign an additional subset of files to a node from which a notification that deduplication has been completed with respect to an assigned subset of files is received;
aggregate from each of the cluster nodes the corresponding deduplication statistics.
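
For illustration only, and forming no part of the claims or of the record, the per-node deduplication steps recited in the independent claims might be sketched in Python as follows. Everything named here is an assumption made for the example: the toy rolling hash stands in for the "fingerprint algorithm", SHA-256 stands in for the "hash algorithm", plain dictionaries stand in for the parallel database, lookup table, and block table, and the chunk size parameters are arbitrary. The sketch also assumes the block table is updated for every chunk of a file, which is one reading of how the claim language nests.

```python
import hashlib


def chunk_boundaries(data: bytes, mask: int = 0x3F,
                     min_size: int = 16, max_size: int = 256):
    """Yield (offset, length) pairs for variable-sized chunks.

    A toy rolling hash (hypothetical, not a production chunker) cuts a
    chunk where the low bits of the hash are zero, so boundaries depend
    on content rather than on fixed positions.
    """
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (rolling & mask) == 0) or size >= max_size:
            yield start, size
            start = i + 1
            rolling = 0
    if start < len(data):
        # Whatever remains after the last cut forms the final chunk.
        yield start, len(data) - start


def deduplicate_file(name, data, chunk_table, lookup_table, block_table):
    """Deduplicate one file and return its statistics.

    chunk_table, lookup_table, and block_table are plain dicts standing in
    for the database table, lookup table, and block table of the claims.
    """
    stats = {"chunks": 0, "new_chunks": 0, "bytes": 0, "unique_bytes": 0}
    for offset, length in chunk_boundaries(data):
        chunk = data[offset:offset + length]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        stats["chunks"] += 1
        stats["bytes"] += length
        if fingerprint not in chunk_table:
            # New fingerprint: record the chunk, its file offset, and length.
            chunk_table[fingerprint] = {
                "file": name, "offset": offset, "length": length,
            }
            lookup_table[fingerprint] = (name, offset)
            stats["new_chunks"] += 1
            stats["unique_bytes"] += length
        # Assumption: every chunk is listed so the file can be reassembled.
        block_table.setdefault(name, []).append((fingerprint, offset, length))
    return stats
```

In this sketch, running deduplicate_file on two files with identical contents against the same tables adds entries only on the first pass; the second pass reports zero new chunks while its block table entry still lists every chunk of the file.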


Reasons for Allowance
The following is an examiner’s statement of reasons for allowance: In interpreting the claims, in light of the specification, the examiner finds the claimed invention to be patentably distinct from the prior art of record. The prior art made of record does not teach or fairly suggest the combination of elements, as recited in independent claims 2, 12, and 21. Particularly the prior art of record fails to teach perform, in parallel, deduplication with respect to the corresponding assigned subset of files, comprising to: create, using a fingerprint algorithm, variable sized chunks of data associated with a file of the corresponding assigned subset of files; add a new entry to a database table, wherein the new entry identifies a current chunk associated with the fingerprint, a file offset in the file where the current chunk is stored, and the length of the current chunk; update a lookup table with location information associated with the current chunk; and update a block table to, for the file, indicate the fingerprint, offset, and length information; and generate corresponding deduplication statistics associated with the corresponding assigned subset of files; and assigning an additional subset of files to a node from which a notification that deduplication has been completed with respect to an assigned subset of files is received; aggregating from each of the cluster nodes the corresponding deduplication statistics.
The above features, together with the other limitations of the independent claims, are novel and non-obvious over the prior art of record. Dependent claims 3-10, 13-18, and 20, being definite, enabled by the specification, and further limiting of the independent claims, are also allowable.
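
For illustration only, and forming no part of the record, the assignment and aggregation features noted in the reasons for allowance (dividing files into subsets up to a threshold size, assigning an additional subset to a node that reports completion, and aggregating per-node statistics) might be sketched in Python as follows. The function names, the single-process simulation of the cluster, and the process_subset callback are all assumptions made for the example; in a real cluster the callback's return would correspond to a completion notification received from a remote node.

```python
from collections import Counter, deque


def divide_into_subsets(file_sizes, threshold):
    """Group files into subsets, closing each subset once the total size
    of its files reaches the threshold."""
    subsets, current, total = [], [], 0
    for name, size in file_sizes.items():
        current.append(name)
        total += size
        if total >= threshold:
            subsets.append(current)
            current, total = [], 0
    if current:
        subsets.append(current)
    return subsets


def run_cluster(subsets, node_ids, process_subset):
    """Simulate the coordinator: give each node an initial subset, then
    hand an additional subset to whichever node finishes, and aggregate
    the statistics each node returns."""
    pending = deque(subsets)
    totals = Counter()
    assignments = {node: [] for node in node_ids}
    active = deque()
    for node in node_ids:
        if pending:
            active.append((node, pending.popleft()))
    while active:
        node, subset = active.popleft()
        # Hypothetical stand-in for remote work plus completion notification.
        stats = process_subset(node, subset)
        assignments[node].append(subset)
        totals.update(stats)
        if pending:
            # The node that just reported completion receives more work.
            active.append((node, pending.popleft()))
    return totals, assignments
```

A caller would supply file sizes, a threshold, a list of node identifiers, and a function that returns a dictionary of counts per subset; the returned totals are the sum of those counts across all nodes.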
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MD I UDDIN whose telephone number is (571)270-3559. The examiner can normally be reached M-F, 8:00 am to 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at 571-272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/MD I UDDIN/Primary Examiner, Art Unit 2169