Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	This action is in response to the communication filed on October 06, 2021.
Response to Amendment
3.	Applicant’s amendment filed on 10/06/2021 with respect to claims 1-20 has been received, entered into the record and considered.
4.	As a result of the amendment, claims 1-11 and 14-20 have been amended.
5.	Claims 1-20 remain pending in this office action.
Information Disclosure Statement
6.	The information disclosure statements (IDS) submitted on 10/06/2021 and 10/21/2021 have been received.  The submissions are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 112
7.	The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

8.	Claims 1, 10 and 19 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to 
Amended claims 1, 10 and 19 recite the limitation "without checking whether chunks of the first plurality of chunks other than the subset are in the chunk hash data structure." On page 8 of the remarks, applicant states that support for this amendment may be found in, at least, Para [0013] and [0062]-[0067] of the originally filed application. However, after a close review of these paragraphs, the examiner is unable to find support for the amended limitation "without checking whether chunks of the first plurality of chunks other than the subset are in the chunk hash data structure." Therefore, the amended limitation introduces new matter.
Claim Rejections - 35 USC § 103
9.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

10.	Claims 1, 9, 10 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Gaonkar et al (US 2012/0158709 A1), in view of Das (US 2014/0188822 A1).
	As per claim 1, Gaonkar discloses:
	- a method of deduplicating a first file, the method comprising (method of determining similarity (i.e. duplication) between two or more data sets (i.e. first file and second file), Abstract, Fig. 3, items 108, 114),
	- separating the first file into a first plurality of chunks (separating the first file into a plurality of blocks (i.e. chunks), Para [0048], Fig. 4a, item 402),
	- choosing a first chunk of the first file (selecting (i.e. choosing) a block of the first file, Fig. 4a, 4b, Para [0050]),
	- determining, for a subset of the plurality of chunks that is a percentage of the first plurality of chunks and that is less than all of the plurality of chunks whether a hash of each of the chunks of the subset is in the chunk data structure, without checking whether chunks of the first plurality of chunks other than the subset are in the chunk hash data structure (Fig. 4b, item 412, the selected portion (i.e. subset of the first plurality of chunks), which is a percentage (i.e. the k value, in this case k=5) of the first plurality of chunks (i.e. item 410) and which is less than all of the plurality of chunks (i.e. 5 is less than the total number of chunks in item 410), without checking the chunks in item 410 other than the selected portion, item 412 (i.e. subset of the plurality of chunks), Para [0045], [0050]-[0051]),
	- based on the determining for the subset that none of the hashes of the chunks of the subset are in the chunk hash data structure, including at least one of the chunks of the subset in the hash data structure without including all of the first plurality of chunks in the chunk hash data structure (determining the degree of similarity between two files or datasets, without including all of the first plurality of chunks, Para [0012], “…the similarity value is re-determined without resorting or hashing the blocks of a dataset other than the blocks of the subset, resulting in an increased performance of the similarity comparison”).
Gaonkar does not explicitly disclose determining a hash of the first chunk is not in a chunk hash data structure stored in a chunk store. However, in the same field of endeavor Das in an analogous art discloses determining a hash of the first chunk is not in a chunk hash data structure stored in a chunk store (determining whether a hash is present or not present in the hash data structure, Fig. 5, Para [0037]).
Therefore, it would have been obvious to a person of the ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Gaonkar with the teaching of Das by modifying Gaonkar such that similarity analysis of Gaonkar to detect 
	As per claim 9, rejection of claim 1 is incorporated, and further Gaonkar discloses:
	- separating a second file into a second plurality of chunks (Fig. 4b, Para [0050]),
	- choosing a second chunk of the second file (selecting (i.e. choosing) a block of the second file, Fig. 4a, 4b, Para [0050]),
Gaonkar does not explicitly disclose determining a second hash of the second chunk of the second file is in the chunk hash data structure. However, in the same field of endeavor Das in an analogous art discloses determining a second hash of the second chunk of the second file is in the chunk hash data structure (determining that a hash is present in the hash data structure, Fig. 5, Para [0029], [0037]).
Gaonkar does not explicitly disclose performing a process for including all of the second plurality of chunks of the second file in the chunk store, wherein the process for including all of the second plurality of chunks of the second file in the chunk store comprises deduplicating the second file using the chunk store. However, in the same field of endeavor Das in an analogous art discloses this limitation (deduplicating a file using a chunk store (i.e. hash table store), Abstract, lines 1-10, Para [0005]-[0006], [0027], [0037], Fig. 5).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Gaonkar with the teaching of Das by modifying Gaonkar such that the similarity analysis of Gaonkar is used to detect duplicate chunks of a file or data stream as in Das, for efficient de-duplication of data. The motivation for doing so would be to determine the degree of similarity between two files with similar content efficiently (Gaonkar, Para [0012]).
As per claims 10 and 18, 

Claim 19 is the system claim corresponding to method claim 1 and is rejected for the same reasons set forth in the rejection of claim 1 above.
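For illustration only, the subset-checking limitation of claim 1 discussed above may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application, Gaonkar, or Das: the fixed chunk size, the use of SHA-256, the helper names `split_into_chunks`, `chunk_hash` and `check_subset`, and the choice of the leading chunks as the subset are all hypothetical.

```python
import hashlib


def split_into_chunks(data: bytes, chunk_size: int = 4) -> list:
    """Separate a file's bytes into fixed-size chunks (chunk size is an assumption)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_hash(chunk: bytes) -> str:
    """Hash a chunk; SHA-256 is an assumed choice of hash function."""
    return hashlib.sha256(chunk).hexdigest()


def check_subset(chunks: list, chunk_hashes: set, percentage: float) -> list:
    """Look up only a percentage of the chunks in the chunk hash data structure.

    Chunks outside the subset are never hashed or looked up. If none of the
    subset's hashes are present, one chunk of the subset is recorded, not all
    chunks of the file.
    """
    count = min(len(chunks), max(1, int(len(chunks) * percentage)))
    subset = chunks[:count]  # assumed: the subset is the leading chunks
    found = [chunk_hash(c) in chunk_hashes for c in subset]
    if subset and not any(found):
        chunk_hashes.add(chunk_hash(subset[0]))
    return found


# Hypothetical usage: a 16-byte "file" split into four chunks, 50% sampled.
store = set()
chunks = split_into_chunks(b"aaaabbbbccccdddd")
first_pass = check_subset(chunks, store, 0.5)
second_pass = check_subset(chunks, store, 0.5)
```

In this sketch the first pass finds no hash of the subset in the structure and records a single chunk hash, while the second pass finds that hash without ever examining the remaining chunks.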
Claim Rejections - 35 USC § 103
11.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

12.	Claims 2, 6-8, 11, 15-17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Gaonkar et al (US 2012/0158709 A1), in view of Das (US 2014/0188822 A1), as applied to claims 1, 10 and 19 above, and further in view of Vincent (US 9,298,723 B1).
As per claim 2, rejection of claim 1 is incorporated, 
Combined method of Gaonkar and Das does not explicitly disclose wherein the chunk hash data structure comprises a first plurality of key-value mappings between first plurality of keys and first plurality of values, the first plurality of keys each being a hash of a corresponding chunk. However, in the same field of endeavor Vincent in an analogous art discloses wherein the chunk hash data structure comprises a first plurality of key-value mappings between first plurality of keys and first plurality of values, the first plurality of keys each being a hash of a corresponding chunk (key-value mapping, column 3, line 60-67, column 4, line 1-10).

As per claim 6, rejection of claim 1 is incorporated,
The combined method of Gaonkar and Das does not explicitly disclose the method further comprising, based on determining the hash of the first chunk is not in the chunk hash data structure: adding a first key-value mapping to the chunk hash data structure, the first key-value mapping comprising (a) a first key that is the hash of the first chunk, and (b) a first value that is a chunk ID of the first chunk. However, in the same field of endeavor Vincent in an analogous art discloses the method further comprising, based on determining the hash of the first chunk is not in the chunk hash data structure: adding a first key-value mapping to the chunk hash data structure, the first key-value mapping comprising (a) a first key that is the hash of the first chunk, and (b) a first value that is a chunk ID of the first chunk (adding a key-value mapping in the data store, column 13, lines 35-67, and mapping between chunk and chunk ID, column 6, lines 50-60).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the key-value mapping of chunk hashes of Vincent into the combined method of Gaonkar and Das for efficient de-duplication (Das, Para [0024]).
As per claim 7, rejection of claim 6 is incorporated, 
The combined method of Gaonkar and Das does not explicitly disclose the method further comprising, based on determining the hash of the first chunk is not in the chunk hash data structure: adding a second key-value mapping to a chunk ID data structure, wherein the chunk ID data structure comprises second plurality of key-value mappings between second plurality of keys and second plurality of values, the second plurality of keys being the chunk IDs of the chunk hash data structure, and the second plurality of values being sets of information about a corresponding chunk. However, in the same field of endeavor Vincent in an analogous art discloses the method further comprising, based on determining the hash of the first chunk is not in the chunk hash data structure: adding a second key-value mapping to a chunk ID data structure, wherein the chunk ID data structure comprises second plurality of key-value mappings between second plurality of keys and second plurality of values, the second plurality of keys being the chunk IDs of the chunk hash data structure, and the second plurality of values being sets of information about a corresponding chunk (adding a key-value mapping in the data store, column 13, lines 35-67, and second plurality of values with second chunk ID, column 13, lines 25-35).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the key-value mapping of chunk hashes of Vincent into the combined method of Gaonkar and Das for efficient de-duplication (Das, Para [0024]).
As per claim 8, rejection of claim 7 is incorporated, 
Combined method of Gaonkar and Das does not explicitly disclose wherein the set of information of the corresponding chunk comprises at least one of: (a) the hash of the corresponding chunk, (b) a pointer to the corresponding chunk, or (c) a reference count of the corresponding chunk. However, in the same field of endeavor Vincent in an analogous art discloses wherein the set of information of the corresponding chunk comprises at least one of: (a) the hash of the corresponding chunk, (b) a pointer to the corresponding chunk, or (c) a reference count of the corresponding chunk (pointing or referencing to the chunk, column 13, line 15-25).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the key-value mapping of chunk hashes of Vincent into the combined method of Gaonkar and Das for efficient de-duplication (Das, Para [0024]).
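For illustration only, the two key-value mappings recited in claims 2 and 6-8 may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from Vincent: the class names `ChunkStore` and `ChunkInfo`, the integer chunk IDs, the list index used as a pointer, and the use of SHA-256 are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ChunkInfo:
    """Set of information about a chunk: its hash, a pointer, and a reference count."""
    hash: str
    pointer: int      # assumed: index of the chunk in an in-memory list
    ref_count: int


class ChunkStore:
    def __init__(self):
        self.chunk_hash_map = {}   # key: hash of a chunk   -> value: chunk ID
        self.chunk_id_map = {}     # key: chunk ID          -> value: ChunkInfo
        self.blobs = []            # assumed backing storage for chunk contents
        self._next_id = 0

    def add_chunk(self, chunk: bytes) -> int:
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_id = self.chunk_hash_map.get(digest)
        if chunk_id is not None:
            # Hash already present: the chunk is a duplicate, so only count the reference.
            self.chunk_id_map[chunk_id].ref_count += 1
            return chunk_id
        # Hash not present: add a mapping to each of the two data structures.
        chunk_id = self._next_id
        self._next_id += 1
        self.blobs.append(chunk)
        self.chunk_hash_map[digest] = chunk_id
        self.chunk_id_map[chunk_id] = ChunkInfo(digest, len(self.blobs) - 1, 1)
        return chunk_id


# Hypothetical usage: the third call repeats the first chunk.
chunk_store = ChunkStore()
id_a = chunk_store.add_chunk(b"alpha")
id_b = chunk_store.add_chunk(b"beta")
id_c = chunk_store.add_chunk(b"alpha")
```

In this sketch the repeated chunk receives the same chunk ID as its first occurrence and is stored only once, with its reference count incremented.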
As per claims 11 and 15-17,  

Claim 20 is the system claim corresponding to method claim 2 and is rejected for the same reasons set forth in the rejection of claim 2 above.
13.	Claims 3-5 and 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Gaonkar et al (US 2012/0158709 A1), in view of Das (US 2014/0188822 A1), as applied to claims 1, 10 and 19 above, and further in view of Williams (US 5,999,810).
As per claim 3, rejection of claim 1 is incorporated, 
The combined method of Gaonkar and Das does not explicitly disclose wherein the first chunk of the first file is in a given position within the first file, wherein the position is chosen non-randomly. However, in the same field of endeavor Williams in an analogous art discloses wherein the first chunk of the first file is in a given position within the first file, wherein the position is chosen non-randomly (randomness selection (i.e. non-random selection), column 14, lines 5-10).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the chunk position determination of Williams into the combined method of Gaonkar and Das for identifying identical portions of data to increase the efficiency of systems that store and communicate data (Williams, column 1, lines 35-40).
As per claim 4, rejection of claim 3 is incorporated, 
The combined method of Gaonkar and Das does not explicitly disclose wherein the given position is a leading position of the first file. However, in the same field of endeavor Williams in an analogous art discloses wherein the given position is a leading position of the first file (starting position (i.e. leading position), column 2, lines 5-11).
Therefore, it would have been obvious to a person of the ordinary skill in the art, before the effective filing date of the claimed invention to determine a chunk position of Williams into 
	As per claim 5, rejection of claim 3 is incorporated,
	The combined method of Gaonkar and Das does not explicitly disclose wherein each chunk of the subset of the first plurality of chunks is randomly selected from among all of the first plurality of chunks of the first file. However, in the same field of endeavor Williams in an analogous art discloses wherein each chunk of the subset of the first plurality of chunks is randomly selected from among all of the first plurality of chunks of the first file (random selection of chunks, column 11, lines 10-35).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the chunk position determination of Williams into the combined method of Gaonkar and Das for identifying identical portions of data to increase the efficiency of systems that store and communicate data (Williams, column 1, lines 35-40).
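For illustration only, the distinction drawn in claims 3-5 between a non-randomly chosen position (for example, the leading position) and a randomly selected subset of chunks may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from Williams: the function names `leading_position` and `random_subset_positions` and the optional seed parameter are hypothetical.

```python
import random


def leading_position(num_chunks: int) -> int:
    """Non-random choice: always the leading (first) chunk position of the file."""
    if num_chunks <= 0:
        raise ValueError("file has no chunks")
    return 0


def random_subset_positions(num_chunks: int, count: int, seed=None) -> list:
    """Random choice: draw `count` distinct chunk positions from all positions.

    The seed is an assumed convenience that makes the selection repeatable.
    """
    rng = random.Random(seed)
    count = min(count, num_chunks)
    return sorted(rng.sample(range(num_chunks), count))


# Hypothetical usage on a file of ten chunks.
first = leading_position(10)
sampled = random_subset_positions(10, 3, seed=1)
```

In this sketch the leading position is the same for every file, whereas the sampled positions vary from draw to draw unless a seed is fixed.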
Response to Arguments
14.	Applicant’s arguments, filed on 10/06/2021, with respect to claims 1-20 have been considered but are moot because of the new ground of rejection necessitated by the amendment to the claims.
	On page 9 of the remarks, applicant argues that Das does not teach, disclose, or suggest all of the features recited in independent claim 1. For example, applicant argues that Das does not teach, disclose, or suggest "determining, for a subset of the first plurality of chunks that is a percentage of the first plurality of chunks and that is less than all the first plurality of chunks, whether a hash of each of the chunks of the subset is in the chunk hash data structure, without checking whether chunks of the first plurality of chunks [of the first file] other than the subset are in the chunk hash data structure" and "based on the determining for the subset that none of the hashes of the chunks of the subset are in the chunk hash data structure, including at least one of the chunks of the subset in the chunk hash data structure without including all of the first plurality of chunks in the chunk hash data structure." The examiner respectfully responds that these amended limitations are taught by the newly cited reference Gaonkar et al.
	Gaonkar teaches determining, for a subset of the plurality of chunks that is a percentage of the first plurality of chunks and that is less than all of the plurality of chunks whether a hash of each of the chunks of the subset is in the chunk data structure, without checking whether chunks of the first plurality of chunks other than the subset are in the chunk hash data structure (Fig. 4b, item 412, the selected portion (i.e. subset of the first plurality of chunks), which is a percentage (i.e. the k value, in this case k=5) of the first plurality of chunks (i.e. item 410) and which is less than all of the plurality of chunks (i.e. 5 is less than the total number of chunks in item 410), without checking the chunks in item 410 other than the selected portion, item 412 (i.e. subset of the plurality of chunks), Para [0045], [0050]-[0051]). Gaonkar also teaches, based on the determining for the subset that none of the hashes of the chunks of the subset are in the chunk hash data structure, including at least one of the chunks of the subset in the hash data structure without including all of the first plurality of chunks in the chunk hash data structure (determining the degree of similarity between two files or datasets, without including all of the first plurality of chunks, Para [0012], “…the similarity value is re-determined without resorting or hashing the blocks of a dataset other than the blocks of the subset, resulting in an increased performance of the similarity comparison”).
	Therefore, the combination of Gaonkar and Das teaches these amended limitations as claimed.
 Conclusion
15.	THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
			Contact Information
16.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMED R UDDIN whose telephone number is (571)270-3138.  The examiner can normally be reached on M-F: 9:00 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Beausoliel Robert can be reached on 571-272-3645.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/MOHAMMED R UDDIN/Primary Examiner, Art Unit 2167