Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
2.	Claims 19-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent-eligible subject matter because they are directed to a computer program per se. Specifically, claims 19 and 20 are directed to a machine-readable storage medium. The broadest reasonable interpretation of “a machine-readable storage medium” covers transitory propagating signals, which are non-statutory. Paragraph [0294] of the disclosure states that computing devices typically include a variety of media, which can include computer-readable storage media and machine-readable storage media, but it does not explicitly state that such a medium is non-transitory, hardware, or tangible. Therefore, the broadest reasonable interpretation of claims 19-20 covers a transitory signal. When the broadest reasonable interpretation of a claim covers a signal per se, the claim must be rejected under 35 U.S.C. 101 as covering non-statutory subject matter. See In re Nuijten, 500 F.3d 1346, 1356-57 (Fed. Cir. 2007) (transitory embodiments are not directed to statutory subject matter) and Interim Examination Instructions for Evaluating Subject Matter Eligibility under 35 U.S.C. 101, Aug. 24, 2009; p. 2. To overcome this rejection, applicant should insert -- non-transitory -- before “computer readable storage medium”. Such an amendment is not considered new 

Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

3.	 Claims 1-10 and 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over WANG; Wenguang (US 20210064582 A1) in view of BALDWIN; Duane Mark (US 20130268497 A1).

Regarding independent claim 1, WANG; Wenguang (US 20210064582 A1) teaches, a method, comprising: initiating, by a system comprising a processor (Fig. 1, Paragraph [0018]), a write operation to initiate writing a set of data to a first data store (Paragraph [0011] An improvement in deduplication improves the way a computer stores and retrieves data in memory and in storage (i.e., writing a set of data to the first data store. Examiner interprets memory 110 as the first data store, which stores programs and data for the host computer));
and during the write operation, determining, by the system, … data deduplication is to be performed to remove a first subset of data of the set of data from the write operation based at least in part on a first result of determining whether a hash value associated with the first subset of data satisfies a first match criterion with respect to a stored hash value stored in a memory index (Fig. 3, Paragraph [0032] Method 300 may be triggered when deduplication module 144 identifies within chunk store 134 a new file 200 that has not been previously deduplicated (i.e., data deduplication is a process that eliminates excessive copies of data). Paragraph [0033] step 305 shows that the file is divided into chunks or subsets. Paragraph [0034] step 310 shows that a hash is computed for each chunk or subset. Paragraph [0036] At step 315, deduplication module 144 determines whether the hash of chunk 202, computed at step 310, is in chunk hash table 140. If so, then the identical contents of chunk 202 have been previously processed by deduplication module 144, such as for example as part of a previous execution of method 300 (i.e., determining whether the chunk/subset satisfies the first criterion, that is, matching the hash of chunk 202 against chunk hash table 140, which is stored in the cache. Examiner interprets the cache as the memory index. See Fig. 1B)).
WANG et al fails to explicitly teach, …an inline data deduplication is to be performed to remove a first … set of data from the write operation …
However, BALDWIN et al teaches, an inline data deduplication is to be performed to remove a first … set of data from the write operation … (Paragraph [0015] As mentioned previously, with the emergence of storage cloud services, a new set of issues for storage cloud service providers are present in the area of data de-duplication, specifically when the storage cloud services providers want to reduce the consumption of their storage space using techniques such as deduplication. A storage cloud services provider may elect to use post process deduplication and/or in-line deduplication. In-line deduplication is the process where the deduplication hash calculations are created on the target device as the data enters the device in real time. If the device spots a block that the device already stored on the storage system, the device does not store the new block, but rather, simply makes a reference to the existing block).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of WANG et al to provide systems and methods for increased in-line deduplication efficiency in a computing environment, as taught by BALDWIN et al (Paragraph [0016]).
It would have been obvious to one of ordinary skill in the art to provide a system and method for increased in-line deduplication efficiency in a computing environment. As taught by BALDWIN et al (Paragraph [0017]), the benefit of in-line deduplication over post-process deduplication is that in-line deduplication requires less storage, as data is not duplicated. On the other hand, because hash calculations and lookup operations in the hash table index experience significant time delays, data ingestion is significantly slower and efficiency is decreased as the backup throughput of the device is reduced.
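For illustration only, the in-line deduplication behavior described in BALDWIN et al (Paragraph [0015]) can be sketched as follows. This sketch is not taken from either reference: the class name InlineDedupStore, the fixed 4-byte block size, and the use of SHA-256 are assumptions chosen to keep the example small.

```python
import hashlib


class InlineDedupStore:
    """Hypothetical store that deduplicates blocks while they are being written."""

    def __init__(self):
        self.index = {}   # block hash -> block id (stands in for the hash index)
        self.blocks = []  # unique blocks actually stored
        self.files = {}   # file name -> list of block ids (references)

    def write(self, name, data, block_size=4):
        refs = []
        for start in range(0, len(data), block_size):
            block = data[start:start + block_size]
            digest = hashlib.sha256(block).digest()
            block_id = self.index.get(digest)
            if block_id is None:
                # Unseen block: store it and record its hash in the index.
                block_id = len(self.blocks)
                self.blocks.append(block)
                self.index[digest] = block_id
            # Seen or unseen, the file keeps only a reference to the block.
            refs.append(block_id)
        self.files[name] = refs
        return refs

    def read(self, name):
        # Rebuild the file by following its block references.
        return b"".join(self.blocks[block_id] for block_id in self.files[name])
```

In this sketch a repeated block is never stored a second time; the write path only records a reference to the existing block, which corresponds to the storage saving attributed to in-line deduplication above. The hash computation and index lookup sit on the write path, which is the throughput cost noted in BALDWIN et al (Paragraph [0017]).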

Regarding dependent claim 2, WANG et al and BALDWIN et al teach, the method of claim 1. 
WANG et al further teaches, further comprising: … determining, by the system, whether the hash value associated with the first subset of data satisfies the first match criterion with respect to the stored hash value associated with a stored subset of data stored in a second data store and associated with a file that is associated with the memory index (Paragraph [0036] At step 315, deduplication module 144 determines whether the hash of chunk 202, computed at step 310, is in chunk hash table 140 (i.e., first criterion being matching the hash of chunk to chunk hash table). If so, then the identical contents of chunk 202 have been previously processed by deduplication module 144, such as for example as part of a previous execution of method 300. Also if so, then a chunk identical to chunk 202 is already present within chunk store 134. [0038] At step 320, deduplication module 144 adds an entry for chunk 202 to chunk hash table 140 (i.e., first criterion being matching the hash of chunk to chunk hash table). As discussed above, an entry in chunk hash table 140 includes a key-value mapping between (a) the key, which is the hash of the contents of chunk 202 (i.e., chunk hash 150), and (b) the value, which is a chunk ID 152. [0039] At step 325, deduplication module 144 adds an entry for chunk 202 to chunk ID table 142. As described above, an entry in chunk ID table 142 includes a key-value mapping between (a) the key, which is the chunk ID 152 assigned at step 320, and (b) the value, which is a set of information 158 about chunk 202 corresponding to that chunk ID 152 (i.e., chunk ID is associated with the stored 
and in response to determining that the hash value satisfies the first match criterion, determining, by the system, whether the first subset of data is to be removed from the write operation and not written to the first data store based at least in part on a second result of determining whether the first subset of data satisfies a second match criterion with respect to the stored subset of data (Paragraph [0031] Chunk ID table 142 is shown in detail in FIG. 1C. Chunk ID table 142 is a key-value data structure that, when given a key, returns a value that is mapped to that key. The key-value mappings are mappings from the key to the value. Chunk ID table 142 includes key-value mappings, each mapping being between (a) the key, which is chunk ID 152 (e.g., obtained from chunk hash table 140), and (b) the value, which is a set of information 158 about chunk 202 corresponding to that chunk ID 152. Set of information 158 may include: chunk hash 150, a pointer 154 to the contents of chunk 202 within chunk store 134, and a reference count 156 of chunk 202. [0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158 (i.e., based on the second result is equated to set of information which includes the chunk content/address. If the second criterion matches then the data is already in the storage and hence it is not written that is data is removed from the write operation). 
BALDWIN et al further teaches, during the write operation, determining, by the system, whether the hash value associated with the first subset of data satisfies the first match criterion with respect to the stored hash value associated with a stored subset of data stored in a second data store  …(Paragraph [0017], [0020] In current systems for inline deduplication over objects (Fixed amount of chunks/subsets are extracted from an object. In other words the object file is divided/segmented into subsets/chunks as taught in Paragraph [0020]), the technique of calculating the fingerprint (e.g., hash value) of the file received over Hypertext Transfer Protocol (HTTP) is to compare the calculated fingerprint for the entire object with the set of available fingerprints of the existing files on the storage system). 

Regarding dependent claim 3, WANG et al and BALDWIN et al teach, the method of claim 2. 
WANG et al further teaches, further comprising: in response to determining that the first subset of data satisfies the second match criterion with respect to the stored subset of data, determining, by the system, … data deduplication is to be performed to remove the first subset of data from the write operation (Paragraph [0031] Chunk ID table 142 is shown in detail in FIG. 1C. Chunk ID table 142 is a key-value data structure that, when given a key, returns a value that is mapped to that key. The key-value mappings are mappings from the key to the value. Chunk ID table 142 includes key-value mappings, each mapping being between (a) the key, which is chunk ID 152 (e.g., obtained from chunk hash table 140), and (b) the value, which is a set of information 158 about chunk 202 corresponding to that chunk ID 152. Set of information 158 may include: chunk hash 150, a pointer 154 to the contents of chunk 202 within chunk store 134, and a reference count 156 of chunk 202. [0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158 (i.e., based on the second result is equated to set of information which includes the chunk content/address. If the second criterion matches then the data is already in the storage and hence it is not written that is data is removed from the write operation). 
performing, by the system, the …deduplication to remove the first subset of data from the write operation (Paragraph [0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158 (i.e., the second result is equated to the set of information, which includes the chunk content/address. If the second criterion matches, then the data is already in the storage and hence is not written; that is, the data is removed from the write operation));
and inserting, by the system, a reference value in the file, wherein the reference value indicates a storage location, in the second data store, of the stored subset of data that corresponds to the first subset of data (Paragraph [0031] Set of information 158 may be considered "metadata" about chunk 202 corresponding to chunk ID 152 mapped to the set of information 158. Set of information 158 may include: chunk hash 150, a pointer 154 to the contents of chunk 202 within chunk store 134, and a reference count 156 of chunk 202. Pointer 154 to the contents of chunk 202 may include an address, such as a logical or physical address. Pointer 154 may be a plurality of pointers 154 pointing to locations of file 200 within storage(s) 114. Pointer 154 may be a plurality of pointers if, for example, file 200 is a fragmented file, stored in more than one location within storage(s) 114. In an embodiment, pointer 154 is a logical pointer 154. Reference count 156 of chunk 202 may be the number of pointers (e.g., pointers 154 and pointers of files 200) that point to the contents of chunk 202 (i.e., inserting a reference value in the file which indicates a storage location).
BALDWIN et al further teaches, …inline data deduplication is to be performed to remove the first subset of data from the write operation (Paragraph [0017], [0020] In current systems for inline deduplication over objects (Fixed amount of chunks/subsets are extracted from an object. In other words the object file is divided/segmented into subsets/chunks as taught in Paragraph [0020]), the technique of calculating the fingerprint (e.g., hash value) of the file received over Hypertext Transfer Protocol (HTTP) is to compare the calculated fingerprint for the entire object with the set of available fingerprints of the existing files on the storage system). 
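For illustration only, the two key-value tables of WANG et al cited above (chunk hash table 140 and chunk ID table 142, Paragraphs [0031], [0036]-[0039], [0047]) can be sketched as follows. This is a minimal sketch under stated assumptions, not the reference's implementation: the names ChunkInfo, ChunkTables, and dedup_chunk are hypothetical, the pointer is modeled as a list position, and SHA-256 is an assumed hash.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ChunkInfo:
    chunk_hash: bytes  # plays the role of chunk hash 150
    pointer: int       # plays the role of pointer 154 (here: position in chunk_store)
    ref_count: int     # plays the role of reference count 156


class ChunkTables:
    """Hypothetical model of a chunk hash table and a chunk ID table."""

    def __init__(self):
        self.chunk_hash_table = {}  # chunk hash -> chunk ID
        self.chunk_id_table = {}    # chunk ID -> ChunkInfo
        self.chunk_store = []       # stored chunk contents

    def dedup_chunk(self, chunk):
        digest = hashlib.sha256(chunk).digest()
        chunk_id = self.chunk_hash_table.get(digest)
        if chunk_id is None:
            # Hash not found: add entries to both tables and store the chunk.
            chunk_id = len(self.chunk_id_table)
            self.chunk_store.append(chunk)
            self.chunk_hash_table[digest] = chunk_id
            self.chunk_id_table[chunk_id] = ChunkInfo(digest, len(self.chunk_store) - 1, 1)
            return chunk_id
        info = self.chunk_id_table[chunk_id]
        # Second check: the stored hash must equal the computed hash.
        if info.chunk_hash != digest:
            raise RuntimeError("hash mismatch; aborting")
        # Match confirmed: do not store the chunk again, only count the reference.
        info.ref_count += 1
        return chunk_id
```

The first lookup corresponds to the first match criterion as mapped above, and the comparison against the stored hash corresponds to the check the Examiner maps to the second match criterion.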

Regarding dependent claim 4, WANG et al and BALDWIN et al teach, the method of claim 1. 
WANG et al further teaches, further comprising: in response to determining that the hash value associated with the first subset of data does not satisfy the first match criterion with respect to any of the hash values in the memory index, determining, by the system, … data deduplication is not to be performed to remove the first subset of data from the write operation (Paragraph  [0037] If the hash of chunk 202 is not in chunk hash table 140, then the contents of chunk 202 have not been previously deduplicated through the processing of method 300, and method 300 proceeds to step 320 (i.e., data deduplication is not to be performed, that means that the data is unique and needs to be stored and not removed from the write operation)).
inserting, by the system, the hash value, a description of a file associated with the hash value (Paragraph [0031] Set of information 158 may be considered "metadata" about chunk 202 corresponding to chunk ID 152 mapped to the set of information 158 (i.e., the metadata will have a description of the file)), and an offset value associated with the first subset of data in the memory index (Paragraph [0026] Deduplication module 144 chooses a small window size and computes a hash for a byte window starting at every byte offset of file 200 (i.e., an offset associated with the subset of data));
and writing, by the system, the first subset of data to the data store (Paragraph [0037] If the hash of chunk 202 is not in chunk hash table 140, then the contents of chunk 202 have not been previously deduplicated through the processing of method 300, and method 300 proceeds to step 320 (i.e., data deduplication is not to be performed, that means that the data is unique and needs to be stored and not removed from the write operation)).
BALDWIN et al further teaches, …inline data deduplication is to be performed to remove the first subset of data from the write operation (Paragraph [0020], [0021] The mechanisms of matching the hash values continue the foregoing processes until a match is found. The mechanisms may break the processing if a mismatch is found, and then, the mechanisms may insert the hash value in the hash table for the nth iteration (HTi). The mechanisms determine that the mismatch of sampled data indicates that the sampled data is a unique data object).

Regarding dependent claim 5, WANG et al and BALDWIN et al teach, the method of claim 1. 
WANG et al further teaches, wherein the hash value is a first hash value (Paragraph  [0037] the hash of chunk 202 is the first hash value), and wherein the method further comprises: in response to determining that the first hash value associated with the first subset of data does not satisfy the first match criterion with respect to any stored hash values stored in the memory index (Paragraph  [0037] If the hash of chunk 202 is not in chunk hash table 140, then the contents of chunk 202 have not been previously deduplicated through the processing of method 300, and method 300 proceeds to step 320), determining, by the system, whether the first hash value satisfies the first match criterion with respect to a second hash value associated with a set of pending updates associated with the write operation (Paragraph [0032] FIG. 3 depicts a flow diagram of a method 300 of deduplicating a file 200, according to an embodiment. Method 300 may be performed by deduplication module 144. Method 300 may be performed in the background, asynchronously relative to I/O operations directed to chunk store 134.Method 300 may be triggered when deduplication module 144 identifies within chunk store 134 a new file 200 that has not been previously deduplicated. [0052] At step 402, deduplication module 144 (or hypervisor 116 or an operating system of host 105 or VM 120) marks memory pages of a shared chunk 202 as COW (i.e., marked copy on write (COW) corresponds to pending updates to the storage. Examiner interprets pending updates as second hash value). Step 402 may be performed as part of method 300, such as part of step 350 of method 300. [0053] At step 404, chunk store 134 or hypervisor 116 receives an operation to update a file 200 that references the shared chunk 202, and the update operation is directed at contents of shared chunk 202).
and in response to determining that the first hash value satisfies the first match criterion with respect to the second hash value associated with the set of pending updates  (Paragraph [0032] FIG. 3 depicts a flow diagram of a method 300 of deduplicating a file 200, according to an embodiment. Method 300 may be performed by deduplication module 144. Method 300 may be performed in the background, asynchronously relative to I/O operations directed to chunk store 134.Method 300 may be triggered when deduplication module 144 identifies within chunk store 134 a new file 200 that has not been previously deduplicated. Method 300 may be triggered when deduplication module 144 identifies within chunk store 134 a file 200 that has been updated recently but has not been updated for a threshold length of time. The threshold length of time may be, for example, 24 hours. "Recently" may mean a time range that is between (a) the time that the file was last updated, and (b) the current time (i.e., determining by the system that a new file is uploaded based on the write operation is still in pending updates which is associated with second hash value)), determining, by the system, whether the first subset of data associated with the first hash value satisfies the second match criterion  (Paragraph [0031] Chunk ID table 142 is shown in detail in FIG. 1C. Chunk ID table 142 is a key-value data structure that, when given a key, returns a value that is mapped to that key. The key-value mappings are mappings from the key to the value. Chunk ID table 142 includes key-value mappings, each mapping being between (a) the key, which is chunk ID 152 (e.g., obtained from chunk hash table 140), and (b) the value, which is a set of information 158 about chunk 202 corresponding to that chunk ID 152. Set of information 158 may include: chunk hash 150, a pointer 154 to the contents of chunk 202 within chunk store 134, and a reference count 156 of chunk 202. 
[0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158 (i.e., the second result is equated to the set of information, which includes the chunk content/address. If the second criterion matches, then the data is already in the storage and hence is not written; that is, the data is removed from the write operation)) with respect to a second subset of data that is associated with the second hash value, wherein the pending updates comprises the second subset of data (Paragraph [0050], [0052], [0053] At step 355, deduplication module 144 determines whether more chunks 202 of file 200 (of step 305) need to be processed by method 300. If so, method 300 returns to step 310. [0034] At step 310, deduplication module 144 chooses a first or next chunk 202 for processing in subsequent steps of method 300. If step 310 is reached from step 305, then method 300 has just began its first iteration, and so deduplication module 144 chooses the first chunk 202 of file 200. If step 310 is reached from step 355, then method 300 is restarting a new iteration, and so deduplication module 144 chooses the next chunk 202 of file 200 (i.e., when the first subset of data matches the first criterion, that is, the hash value, the system checks for more chunks/subsets of the file to be processed and the process iterates again for the second subset of data; if new data is written in the second chunk, then the system marks it as copy on write, that is, as pending updates to the storage, as explained earlier in paragraphs [0052], [0053])).

Regarding dependent claim 6, WANG et al and BALDWIN et al teach, the method of claim 5. 
WANG et al further teaches, further comprising: in response to determining that the first subset of data associated with the first hash value satisfies the second match criterion (Paragraph [0030] Chunk hash table 140 is shown in detail in FIG. 1C. Chunk hash table 140 is a key-value data structure that, when given a key, returns a value that is mapped to that key. The key-value mappings are mappings from the key to the value. Chunk hash table 140 includes key-value mappings, each mapping being between (a) the key, which is the hash of the contents of chunk 202 (i.e., chunk hash 150), and (b) the value, which is a chunk identifier (ID) 152. 8 (i.e., based on the second result that is the set of information which includes the chunk content/address). [0047] If the hashes match, then deduplication module 144 performs a write to the storage block copied into cache at step 340. The write increases reference count 156, within the set of information 158, by one. The increase by one indicates that the portion of file 200 corresponding to chunk 202 chosen at step 310 is now pointing to the chunk 202 that had already been in chunk store 134 (and whose set of information 158 was obtained at previous steps)) with respect to the second subset of data associated with the second hash value [0052] At step 402, deduplication module 144 (or hypervisor 116 or an operating system of host 105 or VM 120) marks memory pages of a shared chunk 202 as COW (i.e., marked copy on write corresponds to pending updates to the storage). Step 402 may be performed as part of method 300, such as part of step 350 of method 300. 
[0053] At step 404, chunk store 134 or hypervisor 116 receives an operation to update a file 200 that references the shared chunk 202, and the update operation is directed at contents of shared chunk 202), removing, by the system, the first subset of data and the second subset of data from the write operation (Paragraph  [0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158 (i.e., based on the second result that is the set of information which has the chunk content/address). If not, then method 300 may abort and an administrator may be notified (i.e., the first subset and second subset of data is not written to the data store or removed from the write operation); 
and scheduling, by the system, inserting a first reference value associated with the first subset of data and a second reference value associated with the second subset of data in a file stored in the first data store and associated with the memory index (Fig. 1B  i.e. memory 110 which is first data store is associated with memory index which is cache), wherein the first reference value and the second reference value indicate a storage location, in the second data store, of a stored subset of data that corresponds to the first subset of data and the second subset of data (Fig. 4 Paragraph [0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158. If not, then method 300 may abort and an administrator may be notified. If the hashes match, then deduplication module 144 performs a write to the storage block copied into cache at step 340. The write increases reference count 156, within the set of information 158, by one. The increase by one indicates that the portion of file 200 corresponding to chunk 202 chosen at step 310 is now pointing to the chunk 202 that had already been in chunk store 134 (and whose set of information 158 was obtained at previous steps. Paragraphs [0053]-[0057] when a new data is written to the chunks, the chunk store which is second data store receives an update operation to the updated file which is marked as copy on write. Once updated reference values corresponds to the respective subsets of chunks/data by decreasing the reference count in shared chunk/data and increasing the reference count for the new chunks/data)).

Regarding dependent claim 7, WANG et al and BALDWIN et al teach, the method of claim 1. 
WANG et al further teaches, further comprising: …in response to determining that the hash value satisfies the first match criterion with respect to the stored hash value associated with a stored subset of data stored in the first data store and associated with a file that is associated with the memory index  (Fig. 3 Paragraph [0032] Method 300 may be triggered when deduplication module 144 identifies within chunk store 134 a new file 200 that has not been previously deduplicated (i.e., Data deduplication is a process that eliminates excessive copies of data).  [0033] step 305 shows that the file is divided into chunks or subsets. Paragraph [0034] step 310 shows each chunk or subset computes a hash of the chunk, Paragraph [0036] At step 315, deduplication module 144 determines whether the hash of chunk 202, computed at step 310, is in chunk hash table 140. If so, then the identical contents of chunk 202 have been previously processed by deduplication module 144, such as for example as part of a previous execution of method 300 (i.e., determining whether the chunk/subset satisfies the first criterion that is matching hash of chunk 202 to chunk hash table which is stored in the chunk hash table which is the memory index), determining, by the system, … data deduplication is not to be performed (Paragraph [0052] At step 402, deduplication module 144 (or hypervisor 116 or an operating system of host 105 or VM 120) marks memory pages of a shared chunk 202 as COW (i.e., marked copy on write corresponds to pending updates to the storage so that the system decides that the data deduplication is not to be performed as the data is pending and not yet updated in the storage). Step 402 may be performed as part of method 300, such as part of step 350 of method 300. [0053] At step 404, chunk store 134 or hypervisor 116 receives an operation to update a file 200 that references the shared chunk 202, and the update operation is directed at contents of shared chunk 202);
and removing, by the system, the first subset of data from the write operation (Paragraph [0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158 (i.e., based on the second result, that is, the set of information which has the chunk content/address). If not, then method 300 may abort and an administrator may be notified (i.e., the first subset of data is not written to the data store or is removed from the write operation)).
BALDWIN et al further teaches, during the write operation, …that the inline data deduplication is not to be performed (Paragraph [0020], [0021] The mechanisms of matching the hash values continue the foregoing processes until a match is found. The mechanisms may break the processing if a mismatch is found, and then, the mechanisms may insert the hash value in the hash table for the nth iteration (HTi). The mechanisms determine that the mismatch of sampled data indicates that the sampled data is a unique data object).

Regarding dependent claim 8, WANG et al and BALDWIN et al teach, the method of claim 7.
WANG et al further teaches, further comprising: redirecting, by the system, the first subset of data to be written to the second data store in a separate write operation (Paragraph [0053] At step 404, chunk store 134 or hypervisor 116 receives an operation to update a file 200 that references the shared chunk 202, and the update operation is directed at contents of shared chunk 202 (i.e., receiving an operation to update a file is redirecting the first subset of data to the storage location));
and inserting, by the system, a reference value associated with the first subset of data in the file that is stored in the first data store and associated with the memory index, wherein the reference value indicates a storage location of the first subset of data in the second data store (Fig. 4 Paragraph [0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158. If not, then method 300 may abort and an administrator may be notified. If the hashes match, then deduplication module 144 performs a write to the storage block copied into cache at step 340. The write increases reference count 156, within the set of information 158, by one. The increase by one indicates that the portion of file 200 corresponding to chunk 202 chosen at step 310 is now pointing to the chunk 202 that had already been in chunk store 134 (and whose set of information 158 was obtained at previous steps. Paragraphs [0053]-[0057] when a new data is written to the chunks, the chunk store which is second data store receives an update operation to the updated file which is marked as copy on write. Once updated reference values corresponds to the respective subsets of chunks/data by decreasing the reference count in shared chunk/data and increasing the reference count for the new chunks/data)).
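For illustration only, the copy-on-write handling of a shared chunk, as characterized above with respect to WANG et al (Paragraphs [0052]-[0057]), can be sketched as follows. The names Chunk and update_chunk, and the modeling of a file as a list of chunk IDs, are assumptions made for the sketch and do not come from the reference.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    data: bytes
    ref_count: int


def update_chunk(store, file_refs, position, new_data):
    """Update one chunk of a file; copy on write if the chunk is shared.

    store     -- dict mapping chunk ID -> Chunk (hypothetical chunk store)
    file_refs -- list of chunk IDs that make up the file being updated
    """
    old_id = file_refs[position]
    old = store[old_id]
    if old.ref_count > 1:
        # Shared chunk: leave its contents untouched for the other files,
        # drop this file's reference, and write the new data to a new chunk.
        old.ref_count -= 1
        new_id = max(store) + 1
        store[new_id] = Chunk(new_data, 1)
        file_refs[position] = new_id
    else:
        # Unshared chunk: it can be overwritten in place.
        old.data = new_data
    return file_refs[position]
```

In this sketch the reference count of the shared chunk decreases and a reference to the newly written chunk is inserted into the updated file, while other files that reference the shared chunk are unaffected.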

Regarding dependent claim 9, WANG et al and BALDWIN et al teach, the method of claim 1. 
WANG et al further teaches, further comprising: segmenting, by the system, the set of data into respective subsets of data, comprising the first subset of data (Fig. 3 Paragraph [0033] step 305 shows that the file is divided or segmented into chunks or subsets of data);
and generating, by the system, respective hash values associated with the respective subsets of data, comprising the hash value associated with the first subset of data (Paragraph [0034], [0035] step 310 shows each chunk or subset computes a hash of the chunk for respective subsets/chunks of data).
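For illustration only, the segmenting and hashing steps mapped above (steps 305 and 310 of WANG) can be sketched as follows. The sketch is not taken from WANG: the fixed 4096-byte chunk size and the names segment and hash_chunks are assumptions chosen for the example, and SHA-256 is used only because WANG names it as one possible hash algorithm.

```python
import hashlib
from typing import List


def segment(data: bytes, chunk_size: int = 4096) -> List[bytes]:
    """Split data into fixed-size chunks; the final chunk may be shorter.

    The fixed chunk size is an assumption made for this sketch.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def hash_chunks(chunks: List[bytes]) -> List[str]:
    """Compute one SHA-256 hex digest per chunk."""
    return [hashlib.sha256(chunk).hexdigest() for chunk in chunks]
```

Chunks with identical contents produce identical hash values, which is the property the later hash comparison relies on.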

Regarding dependent claim 10, WANG et al and BALDWIN et al teach, the method of claim 9. 
WANG et al further teaches, wherein the generating the respective hash values comprises generating the respective hash values based at least in part on the respective subsets of data and a hash algorithm, wherein the hash algorithm satisfies a defined hashing speed criterion and a defined criterion (Paragraph [0034], [0035] As part of step 310, deduplication module 144 computes a hash of the data of chosen chunk 202. The hash may be computed by using a hash algorithm such as a secure hash algorithm (SHA), for example, SHA-256 or SHA-512. In an embodiment, the computed hash may be truncated (e.g., a SHA-512 hash may be truncated to 256 bits (i.e., a hash algorithm that satisfies a hashing speed criterion, as defined in specification Paragraph [0059], and a defined criterion)), and the truncated hash is the hash that is "computed at step 310" for subsequent steps of method 300), and wherein the method further comprises: comparing, by the system, the hash value to stored hash values, comprising the stored hash value, that are stored in the memory index; and determining, by the system, whether the hash value matches any of the stored hash values, in accordance with the first match criterion (Paragraph [0036] At step 315, deduplication module 144 determines whether the hash of chunk 202, computed at step 310, is in chunk hash table 140 (i.e., comparing the hash of the chunk to the stored chunk hashes in the chunk hash table)).
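For illustration only, the truncated hash and the hash-table comparison mapped above can be sketched as follows. The names truncated_sha512 and find_match are hypothetical, and the chunk hash table is modeled as a plain dictionary, which is an assumption of this sketch. Keeping the leading bytes of a SHA-512 digest is also an assumption about how the truncation is done; it is not the same function as the standardized SHA-512/256 variant.

```python
import hashlib
from typing import Dict, Optional


def truncated_sha512(chunk: bytes, bits: int = 256) -> bytes:
    """Return the leading `bits` bits of the SHA-512 digest of a chunk."""
    if bits <= 0 or bits > 512 or bits % 8 != 0:
        raise ValueError("bits must be a multiple of 8 between 8 and 512")
    return hashlib.sha512(chunk).digest()[: bits // 8]


def find_match(chunk_hash: bytes, chunk_hash_table: Dict[bytes, int]) -> Optional[int]:
    """Return the chunk ID mapped to the hash, or None when no stored hash matches."""
    return chunk_hash_table.get(chunk_hash)
```

A result of None corresponds to the "hash not in chunk hash table" branch; any other result corresponds to the match branch.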

Regarding dependent claim 13, WANG et al and BALDWIN et al teach, the method of claim 1. 
WANG et al further teaches, further comprising: during a read operation to read a second set of data from the data store (Paragraph [0011] An improvement in deduplication improves the way a computer stores and retrieves data in memory and in storage (i.e., reading a set of data from the data store, which can be a second set of data)), segmenting, by the system, the second set of data into respective second subsets of data (Paragraph [0033] step 305 shows that the file is divided into chunks or subsets);
generating, by the system, respective hash values associated with the respective second subsets of data (Paragraph [0034], [0035] step 310 shows each chunk or subset computes a hash of the chunk for respective subsets/chunks of data); 
and storing, by the system, the respective hash values in the memory index (Paragraph  [0017] System memory 110 is hardware allowing information, such as executable instructions, configurations, and other data, to be stored and retrieved).

Regarding dependent claim 14, WANG et al and BALDWIN et al teach, the method of claim 13. 
WANG et al further teaches, further comprising: determining, by the system, whether a second hash value of the respective hash values satisfies the first match criterion with respect to a second stored hash value stored in the memory index (Paragraph [0032] FIG. 3 depicts a flow diagram of a method 300 of deduplicating a file 200, according to an embodiment. Method 300 may be performed by deduplication module 144. Method 300 may be performed in the background, asynchronously relative to I/O operations directed to chunk store 134. Method 300 may be triggered when deduplication module 144 identifies within chunk store 134 a new file 200 that has not been previously deduplicated. [0052] At step 402, deduplication module 144 (or hypervisor 116 or an operating system of host 105 or VM 120) marks memory pages of a shared chunk 202 as COW (i.e., marking as copy on write corresponds to pending updates to the storage. Examiner interprets pending updates as the second hash value). Step 402 may be performed as part of method 300, such as part of step 350 of method 300. [0053] At step 404, chunk store 134 or hypervisor 116 receives an operation to update a file 200 that references the shared chunk 202, and the update operation is directed at contents of shared chunk 202);
and in response to determining that the second hash value satisfies the first match criterion with respect to the second stored hash value stored in the memory index, transferring, by the system, the second hash value and the second stored hash value to an asynchronous data deduplication process to perform data deduplication with regard to the second hash value and the second stored hash value (Paragraph [0023] Deduplication module 144 may be a background process working asynchronously relative to input/output (I/O) operations directed to chunk store 134, such as asynchronously relative to I/O operations by hosts 105 or VMs 120. [0056], [0057]  At step 410, the portion of updated file 200 that previously pointed to shared chunk 202 is remapped to point to new chunk 202. Because file 200 is remapped to a new chunk, shared chunk 200 may no longer be a "shared chunk" at step 410. As part of step 410 or as part of another step of method 400, the memory pages of previously shared chunk 202 may be unmarked COW and the deduplication process is performed with regard to the second hash value which is pending updates/COW to the second stored hash value).

Regarding independent claim 15, WANG; Wenguang (US 20210064582 A1) teaches, a system, comprising: a memory that stores computer executable components (Paragraph [0017]); 
and a processor that executes computer executable components stored in the memory (Paragraph [0018] a processor of a computer system), wherein the computer executable components comprise: a write component that initiates execution of a write operation to write chunks of data to a data store (Paragraph [0011] An improvement in deduplication improves the way a computer stores and retrieves data in memory and in storage (i.e., writing a set of data to the data store);   
and a data management component that, during the write operation, determines whether an … data deduplication is to be executed to remove a first chunk of data of the chunks of data from the write operation to prevent the first chunk of data from being written to the first data store based at least in part on a first result of a first determination regarding whether a hash associated with the first chunk of data satisfies a first match criterion in relation to a stored hash stored in a memory index (Fig. 3 Paragraph [0032] Method 300 may be triggered when deduplication module 144 identifies within chunk store 134 a new file 200 that has not been previously deduplicated (i.e., Data deduplication is a process that eliminates excessive copies of data).  [0033] step 305 shows that the file is divided into chunks or subsets. Paragraph [0034] step 310 shows each chunk or subset computes a hash of the chunk, Paragraph [0036] At step 315, deduplication module 144 determines whether the hash of chunk 202, computed at step 310, is in chunk hash table 140. If so, then the identical contents of chunk 202 have been previously processed by deduplication module 144, such as for example as part of a previous execution of method 300 (i.e., determining whether the chunk/subset satisfies the first criterion that is matching hash of chunk 202 to chunk hash table which is stored in the chunk hash table which is the memory index);
WANG et al fails to explicitly teach, …an inline data deduplication …
BALDWIN; Duane Mark (US 20130268497 A1) teaches, …an inline data deduplication is to be performed to remove a first … set of data from the write operation … (Paragraph [0015] As mentioned previously, with the emergence of storage cloud services, a new set of issues for storage cloud service providers are present in the area of data de-duplication, specifically when the storage cloud services providers want to reduce the consumption of their storage space using techniques such as deduplication. A storage cloud services provider may elect to use post process deduplication and/or in-line deduplication. In-line deduplication is the process where the deduplication hash calculations are created on the target device as the data enters the device in real time. If the device spots a block that the device already stored on the storage system, the device does not store the new block, but rather, simply makes a reference to the existing block).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the filing of the invention to have modified the teachings of WANG et al to provide systems and methods for increased in-line deduplication efficiency in a computing environment as taught by BALDWIN et al (Paragraph [0016]).
It would have been obvious to one of ordinary skill in the art to provide a system and method for increased in-line deduplication efficiency in a computing environment. The benefit of in-line deduplication over post-process deduplication is that in-line deduplication requires less storage as data is not duplicated. On the other hand, because hash calculations and lookup operations in the hash table index experience significant time delays resulting in data ingestion being significantly slower, efficiency is decreased as the backup throughput of the device is reduced, as taught by BALDWIN et al (Paragraph [0017]).
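For illustration only, the in-line deduplication behavior described in BALDWIN Paragraph [0015] (a block already stored is not stored again, and a reference to the existing block is made instead) can be sketched as follows. The class InlineDedupStore, its fields, and the use of SHA-256 are assumptions of this sketch and are not drawn from either reference.

```python
import hashlib
from typing import Dict


class InlineDedupStore:
    """Toy in-memory store that deduplicates blocks as they are written."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}    # hash -> stored block contents
        self.ref_counts: Dict[str, int] = {}  # hash -> number of references

    def write(self, block: bytes) -> str:
        """Store the block unless an identical one exists; return its reference."""
        key = hashlib.sha256(block).hexdigest()
        existing = self.blocks.get(key)
        if existing is None:
            # Unique block: write it and record the first reference.
            self.blocks[key] = block
            self.ref_counts[key] = 1
        elif existing == block:
            # Duplicate block: skip the write and add a reference instead.
            self.ref_counts[key] += 1
        else:
            # Same hash, different bytes: refuse rather than corrupt data.
            raise RuntimeError("hash collision detected")
        return key
```

The byte comparison before counting a duplicate is a design choice of the sketch; it guards against treating a hash collision as a match.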

Regarding dependent claim 16, WANG et al and BALDWIN et al teach, the system of claim 15. 
WANG et al further teaches, wherein, based at least in part on the first result indicating that the hash satisfies the first match criterion in relation to the stored hash, the data management component determines whether the first chunk of data is to be removed from the write operation and not written to the first data store (Paragraph [0036] At step 315, deduplication module 144 determines whether the hash of chunk 202, computed at step 310, is in chunk hash table 140 (i.e., first criterion being matching the hash of chunk to chunk hash table). If so, then the identical contents of chunk 202 have been previously processed by deduplication module 144, such as for example as part of a previous execution of method 300. Also if so, then a chunk identical to chunk 202 is already present within chunk store 134. [0038] At step 320, deduplication module 144 adds an entry for chunk 202 to chunk hash table 140 (i.e., first criterion being matching the hash of chunk to chunk hash table). As discussed above, an entry in chunk hash table 140 includes a key-value mapping between (a) the key, which is the hash of the contents of chunk 202 (i.e., chunk hash 150), and (b) the value, which is a chunk ID 152. [0039] At step 325, deduplication module 144 adds an entry for chunk 202 to chunk ID table 142. 
As described above, an entry in chunk ID table 142 includes a key-value mapping between (a) the key, which is the chunk ID 152 assigned at step 320, and (b) the value, which is a set of information 158 about chunk 202 corresponding to that chunk ID 152 (i.e., chunk ID is associated with the stored subset of data in the second data store which is one of the storages and chunk hash table which is the memory index) based at least in part on a second result of a second determination regarding whether the first chunk of data satisfies a second match criterion in relation to a stored chunk of data stored in a second data store and associated with a file that is associated with the memory index (Paragraph [0031] Chunk ID table 142 is shown in detail in FIG. 1C. Chunk ID table 142 is a key-value data structure that, when given a key, returns a value that is mapped to that key. The key-value mappings are mappings from the key to the value. Chunk ID table 142 includes key-value mappings, each mapping being between (a) the key, which is chunk ID 152 (e.g., obtained from chunk hash table 140), and (b) the value, which is a set of information 158 about chunk 202 corresponding to that chunk ID 152. Set of information 158 may include: chunk hash 150, a pointer 154 to the contents of chunk 202 within chunk store 134, and a reference count 156 of chunk 202. [0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158 (i.e., based on the second result is equated to set of information which includes the chunk content/address. If the second criterion matches then the data is already in the storage and hence it is not written that is data is removed from the write operation). 
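For illustration only, the two-table lookup mapped above can be sketched as follows. The field names follow the elements WANG describes (chunk hash 150, pointer 154, reference count 156), but the dataclass, the dictionaries standing in for chunk hash table 140 and chunk ID table 142, and the function name lookup_and_verify are assumptions of this sketch rather than WANG's implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ChunkInfo:
    chunk_hash: bytes  # corresponds to chunk hash 150
    pointer: int       # corresponds to pointer 154 (address of chunk contents)
    ref_count: int     # corresponds to reference count 156


def lookup_and_verify(
    chunk_hash: bytes,
    chunk_hash_table: Dict[bytes, int],
    chunk_id_table: Dict[int, ChunkInfo],
) -> Optional[ChunkInfo]:
    """Resolve hash -> chunk ID -> info, then re-check the stored hash."""
    chunk_id = chunk_hash_table.get(chunk_hash)
    if chunk_id is None:
        return None  # no match: the chunk is unique and must be written
    info = chunk_id_table[chunk_id]
    if info.chunk_hash != chunk_hash:
        # Mirrors the abort path described for step 345.
        raise ValueError("stored chunk hash does not match computed hash")
    info.ref_count += 1  # the file now points at the existing chunk
    return info
```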

Regarding dependent claim 17, WANG et al and BALDWIN et al teach, the system of claim 16. 
WANG et al further teaches, wherein, based at least in part on the second result indicating that the first chunk of data satisfies the second match criterion in relation to the stored chunk of data, the data management component determines that the … data deduplication is to be performed to remove the first chunk of data from the write operation (Paragraph [0031] Chunk ID table 142 is shown in detail in FIG. 1C. Chunk ID table 142 is a key-value data structure that, when given a key, returns a value that is mapped to that key. The key-value mappings are mappings from the key to the value. Chunk ID table 142 includes key-value mappings, each mapping being between (a) the key, which is chunk ID 152 (e.g., obtained from chunk hash table 140), and (b) the value, which is a set of information 158 about chunk 202 corresponding to that chunk ID 152. Set of information 158 may include: chunk hash 150, a pointer 154 to the contents of chunk 202 within chunk store 134, and a reference count 156 of chunk 202. [0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158 (i.e., based on the second result is equated to set of information which includes the chunk content/address. If the second criterion matches then the data is already in the storage and hence it is not written that is data is removed from the write operation), performs the …data deduplication to remove the first chunk of data from the write operation (Paragraph [0047] At step 345, deduplication module 144 checks that the hash calculated at step 310 is the same as chunk hash 150 within the obtained set of information 158 (i.e., based on the second result is equated to set of information which includes the chunk content/address. 
If the second criterion matches then the data is already in the storage and hence it is not written that is data is removed from the write operation), and inserts a reference value in the file, wherein the reference value indicates a storage location, in the second data store, of the stored chunk of data that corresponds to the first chunk of data (Paragraph [0031] Set of information 158 may be considered "metadata" about chunk 202 corresponding to chunk ID 152 mapped to the set of information 158. Set of information 158 may include: chunk hash 150, a pointer 154 to the contents of chunk 202 within chunk store 134, and a reference count 156 of chunk 202. Pointer 154 to the contents of chunk 202 may include an address, such as a logical or physical address. Pointer 154 may be a plurality of pointers 154 pointing to locations of file 200 within storage(s) 114. Pointer 154 may be a plurality of pointers if, for example, file 200 is a fragmented file, stored in more than one location within storage(s) 114. In an embodiment, pointer 154 is a logical pointer 154. Reference count 156 of chunk 202 may be the number of pointers (e.g., pointers 154 and pointers of files 200) that point to the contents of chunk 202);
(i.e., pointer 154 in the set of information 158 described in Paragraph [0031] corresponds to inserting a reference value in the file, which indicates a storage location).
BALDWIN et al further teaches, …inline data deduplication is to be performed to remove the first subset of data from the write operation (Paragraph [0017], [0020] In current systems for inline deduplication over objects (Fixed amount of chunks/subsets are extracted from an object. In other words the object file is divided/segmented into subsets/chunks as taught in Paragraph [0020]), the technique of calculating the fingerprint (e.g., hash value) of the file received over Hypertext Transfer Protocol (HTTP) is to compare the calculated fingerprint for the entire object with the set of available fingerprints of the existing files on the storage system). 

Regarding dependent claim 18, WANG et al and BALDWIN et al teach, the system of claim 15. 
WANG et al further teaches, wherein, based at least in part on the first result indicating that the first chunk of data does not satisfy the first match criterion in relation to the stored chunk of data, the data management component determines that the inline data deduplication is not to be performed to remove the first chunk of data from the write operation (Paragraph  [0037] If the hash of chunk 202 is not in chunk hash table 140, then the contents of chunk 202 have not been previously deduplicated through the processing of method 300, and method 300 proceeds to step 320 (i.e., data deduplication is not to be performed, that means that the data is unique and needs to be stored and not removed from the write operation)), inserts the hash, a description of a file associated with the hash (Paragraph [0031] Set of information 158 may be considered "metadata" about chunk 202 corresponding to chunk ID 152 mapped to the set of information 158 (i.e., metadata will have description of the file), and an offset value associated with the first chunk of data in the memory index (Paragraph [0026]  Deduplication module 144 chooses a small window size and computes a hash for a byte window starting at every byte offset of file 200 (i.e., an offset associated with the subset of data);
and writes the first chunk of data to the first data store (Paragraph [0038] At step 320, deduplication module 144 adds an entry for chunk 202 to chunk hash table 140. As discussed above, an entry in chunk hash table 140 includes a key-value mapping between (a) the key, which is the hash of the contents of chunk 202 (i.e., chunk hash 150), and (b) the value, which is a chunk ID 152. Chunk hash 150 was computed at step 310. Chunk ID 152 is assigned to chunk 202 as described above with reference to FIG. 2. If chunk 202 chosen at step 310 is the first chunk 202 of a file (e.g., chunk 202.sub.A of file 200.sub.1), then chunk ID 152 may be assigned arbitrarily. If chunk 202 chosen at step 310 is a second or subsequent chunk 202 (e.g., chunk 202.sub.B of file 200.sub.1), then chunk ID may be the next sequential identifier after chunk ID 152 assigned to the previous chunk 202. Previous chunk 202 may be, for example, chunk 202.sub.A of file 200.sub.1 (i.e., writing the first subset of data to first data store. Examiner interprets memory 110 as first data store which stores programs and data for the host computer)).

Regarding independent claim 19, WANG; Wenguang (US 20210064582 A1) teaches, a machine-readable storage medium (Paragraph [0061]), comprising executable instructions that, when executed by a processor (Paragraph [0018]), facilitate performance of operations, comprising: initiating execution of a write operation to initiate writing data to a data store (Paragraph [0011] An improvement in deduplication improves the way a computer stores and retrieves data in memory and in storage (i.e., writing a set of data to the data store. Examiner interprets memory 110 as first data store which stores programs and data for the host computer));   
and during the write operation, determining whether an … data deduplication is to be performed to remove a subset of the data from the write operation based at least in part on a first result of determining whether a hash value associated with the subset of the data satisfies a first match criterion in relation to a stored hash value stored in a memory index (Fig. 3 Paragraph [0032] Method 300 may be triggered when deduplication module 144 identifies within chunk store 134 a new file 200 that has not been previously deduplicated (i.e., Data deduplication is a process that eliminates excessive copies of data).  [0033] step 305 shows that the file is divided into chunks or subsets. Paragraph [0034] step 310 shows each chunk or subset computes a hash of the chunk, Paragraph [0036] At step 315, deduplication module 144 determines whether the hash of chunk 202, computed at step 310, is in chunk hash table 140. If so, then the identical contents of chunk 202 have been previously processed by deduplication module 144, such as for example as part of a previous execution of method 300 (i.e., determining whether the chunk/subset satisfies the first criterion that is matching hash of chunk 202 to chunk hash table which is stored in the chunk hash table which is the memory index);
WANG et al fails to explicitly teach, …an inline data deduplication …
BALDWIN; Duane Mark (US 20130268497 A1) teaches, …an inline data deduplication is to be performed to remove a first subset of data from the write operation … (Paragraph [0015] As mentioned previously, with the emergence of storage cloud services, a new set of issues for storage cloud service providers are present in the area of data de-duplication, specifically when the storage cloud services providers want to reduce the consumption of their storage space using techniques such as deduplication. A storage cloud services provider may elect to use post process deduplication and/or in-line deduplication. In-line deduplication is the process where the deduplication hash calculations are created on the target device as the data enters the device in real time. If the device spots a block that the device already stored on the storage system, the device does not store the new block, but rather, simply makes a reference to the existing block).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the filing of the invention to have modified the teachings of WANG et al to provide systems and methods for increased in-line deduplication efficiency in a computing environment as taught by BALDWIN et al (Paragraph [0016]).
It would have been obvious to one of ordinary skill in the art to provide a system and method for increased in-line deduplication efficiency in a computing environment. The benefit of in-line deduplication over post-process deduplication is that in-line deduplication requires less storage as data is not duplicated. On the other hand, because hash calculations and lookup operations in the hash table index experience significant time delays resulting in data ingestion being significantly slower, efficiency is decreased as the backup throughput of the device is reduced, as taught by BALDWIN et al (Paragraph [0017]).

Regarding dependent claim 20, WANG et al and BALDWIN et al teach, the machine-readable storage medium of claim 19. 
WANG et al further teaches, wherein the operations further comprise: one of: in response to determining that the hash value associated with the subset of the data does not satisfy the first match criterion based at least in part on the first result, determining that the … data deduplication is not to be performed to remove the subset of the data from the write operation (Paragraph  [0037] If the hash of chunk 202 is not in chunk hash table 140, then the contents of chunk 202 have not been previously deduplicated through the processing of method 300, and method 300 proceeds to step 320 (i.e., data deduplication is not to be performed, that means that the data is unique and needs to be stored)).
or in response to determining that the hash value satisfies the first match criterion based at least in part on the first result, determining whether the subset of the data is to be removed from the write operation and not written to the data store based at least in part on a second result of determining whether the subset of the data satisfies a second match criterion with respect to a stored subset of data that is associated with the stored hash value and is in a file stored in the memory index, and in response to determining that the subset of the data satisfies the second match criterion with respect to the stored subset of data based at least in part on the second result, determining that the inline data deduplication is to be performed to remove the subset of the data from the write operation.

4.	 Claims 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over WANG; Wenguang (US 20210064582 A1) in view of BALDWIN; Duane Mark (US 20130268497 A1) and in further view of SIBBALD; Kern (US 20160055169 A1).
 
Regarding dependent claim 11, WANG et al and BALDWIN et al teach, the method of claim 1. 
WANG et al and BALDWIN et al fail to explicitly teach, further comprising: determining, by the system, whether a second subset of data of the set of data contains only data bits having zero values, wherein the second subset of data is a block of data; and in response to determining that the block of data contains only the data bits having the zero values, eliminating, by the system, the block of data, and replacing, by the system, the zero values of the block of data with a sparse region in metadata associated with a file that is stored in the data store.
SIBBALD; Kern (US 20160055169 A1) teaches, further comprising: determining, by the system, whether a second subset of data of the set of data contains only data bits having zero values, wherein the second subset of data is a block of data; and in response to determining that the block of data contains only the data bits having the zero values, eliminating, by the system, the block of data, and replacing, by the system, the zero values of the block of data with a sparse region in metadata associated with a file that is stored in the data store (Paragraph [0040] Chunk/subset: a series of contiguous bytes that may be of any size (the chunk or subset can be any subset of data). [0054] Sparse file: when a file or a volume is written, it is written in blocks as described above. In doing so it is possible to skip certain block addresses by advancing the write address and thus leaving a hole in a file that is not written. A file may have multiple holes, and is typically called a sparse file (As per the specification [00196] zero.block: the number of write blocks containing only zero values (e.g., with all zero bits) that are converted to sparse. Therefore, the system determines the blocks of data that have zero values, skips/eliminates those blocks of data, and replaces them with a sparse region). [0081] Also depicted in FIG. 7, are the Metadata and Aligned Volume formats described herein. The Metadata Volume contains all the header and metadata compacted and serialized in an efficient way without any alignment. In contrast the files are written to the Aligned Volume with the first block of the file aligned at a File Alignment address which may create an unallocated block or blocks preceding certain files such as shown for files 3, 5, and 6 in the figure. The end of each file also may have a Padding Size to fill the last unfilled block (or partial block) with zeros. 
A padded block will thus be allocated, but any additional blocks between the end of the padded block and the beginning of the next block that is the beginning of a new client file will be unallocated (i.e., replacing the zero values of the block with a sparse region in the metadata). These methods plus a careful choice of the Block Size, creates an archive volume that can be optimally deduplicated. [0062] 4. Sparse file: A sparse file is a file that has holes or blocks that have never been written. Blocks not written will use no space on the disk and will generally return all zeros if read).
Therefore, it would have been obvious to one of ordinary skill in the art at the time of the filing of the invention to have modified the teachings of WANG et al and BALDWIN et al to provide a method for creating a volume that contains data from an original stream of multiple files, and which can be optimally deduplicated by an underlying deduplication storage system as taught by SIBBALD (Abstract).
It would have been obvious to one of ordinary skill in the art to provide a method that produces a Universal Deduplication Volume where archive data is efficiently stored permitting any deduplicating storage device to deduplicate the file data in the most optimal manner. It takes advantage of having a separate deduplicating filesystem or storage device on which or in which the Universal Deduplication Volume is stored, as taught by SIBBALD (Paragraph [0014]), which helps eliminate blocks of data containing zeros.
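For illustration only, the zero-block handling recited in claim 11 (detecting a block that contains only zero values, eliminating it, and recording a sparse region in its place) can be sketched as follows. The function name split_sparse, the 4096-byte block size, and the representation of sparse regions as (offset, length) pairs are assumptions of this sketch and are not drawn from SIBBALD or from the application.

```python
from typing import Dict, List, Tuple


def split_sparse(
    data: bytes, block_size: int = 4096
) -> Tuple[Dict[int, bytes], List[Tuple[int, int]]]:
    """Separate all-zero blocks from blocks that must be stored.

    Returns (stored, holes): stored maps byte offset -> block contents,
    and holes lists (offset, length) for each all-zero block that is
    dropped and recorded only as a sparse region in metadata.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    stored: Dict[int, bytes] = {}
    holes: List[Tuple[int, int]] = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        if not any(block):
            # Every byte is zero: record a hole instead of storing the block.
            holes.append((offset, len(block)))
        else:
            stored[offset] = block
    return stored, holes
```

Running this check before any hash is computed means zero blocks never reach the deduplication decision, which matches the ordering recited in claim 12.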
 
Regarding dependent claim 12, WANG et al, BALDWIN et al and SIBBALD teach, the method of claim 11. 
SIBBALD further teaches, wherein the determining whether the second subset of data contains only the data bits having the zero values, the eliminating of the block of data, and the replacing the zero values of the block of data with the sparse region occur prior to determining whether the … data deduplication is to be performed to remove the second subset of data from the write operation (Paragraph [0040] Chunk/subset: a series of contiguous bytes that may be of any size (the chunk or subset can be any subset of data). [0054] Sparse file: when a file or a volume is written, it is written in blocks as described above. In doing so it is possible to skip certain block addresses by advancing the write address and thus leaving a hole in a file that is not written. A file may have multiple holes, and is typically called a sparse file (As per the specification [00196] zero.block: the number of write blocks containing only zero values (e.g., with all zero bits) that are converted to sparse. Here, the system determines the blocks of data that have zero values, skips/eliminates those blocks of data, and replaces them with a sparse region). [0081] Also depicted in FIG. 7, are the Metadata and Aligned Volume formats described herein. The Metadata Volume contains all the header and metadata compacted and serialized in an efficient way without any alignment. In contrast the files are written to the Aligned Volume with the first block of the file aligned at a File Alignment address which may create an unallocated block or blocks preceding certain files such as shown for files 3, 5, and 6 in the figure. The end of each file also may have a Padding Size to fill the last unfilled block (or partial block) with zeros. 
A padded block will thus be allocated, but any additional blocks between the end of the padded block and the beginning of the next block that is the beginning of a new client file will be unallocated (i.e., replacing the zero values of the block with a sparse region in the metadata). These methods plus a careful choice of the Block Size, creates an archive volume that can be optimally deduplicated).
BALDWIN et al further teaches, …inline data deduplication is to be performed to remove the second subset of data from the write operation (Paragraphs [0020], [0021] The mechanisms of matching the hash values continue the foregoing processes until a match is found. The mechanisms may break the processing if a mismatch is found, and then, the mechanisms may insert the hash value in the hash table for the nth iteration (HTi). The mechanisms determine that the mismatch of sampled data indicates that the sampled data is a unique data object).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SUMAN RAJAPUTRA whose telephone number is (571) 272-4669. The examiner can normally be reached between 8:00 AM - 5:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashish Thomas, can be reached at (571) 272-0631. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only.
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/S. R./
Examiner, Art Unit 2164

/ASHISH THOMAS/
Supervisory Patent Examiner, Art Unit 2164