DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This communication is responsive to the amendment to the original application. This action is Final. Claims 1-20 are pending and have been examined.  
Response to Amendments
In the reply filed 9/16/22, claims 1–3, 9, 11–13, 15–17 and 19 were amended. Accordingly, claims 1 – 20 are pending. 
Response to Arguments
Applicant's arguments with respect to claims 1 – 20 have been carefully considered but are moot and not deemed persuasive in view of the rejections below.
Examiner withdraws the 35 U.S.C. 101 rejections based on the substantial amendments. However, Examiner respectfully disagrees with Applicant’s arguments on pages 13 – 14 that the prior art fails to teach the substantial amendments to the independent claims. All claim rejections below have been updated with clarifying prior art citations. Kindly let me know if you have any questions. Thanks.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 – 20 are rejected under 35 U.S.C. 103 as being unpatentable over Harnik, U.S. Patent Application Publication No. 2018/0074745 (hereinafter “Harnik”), and further in view of Dangi et al., U.S. Patent No. 10,853,324 (hereinafter “Dangi”).
Regarding claim 1, Harnik teaches, a non-transitory machine-readable storage medium comprising instructions that upon execution cause a system comprising a hardware processor to (Harnik [0030]: processor):
receive write requests to add a first data value to and remove a second data value from a deduplication data store in a storage system comprising persistent storage (Harnik [0034]: To maintain the sketches on every write, the system uses the process illustrated in FIG. 4. In step 410, the system computes a hash value. In step 420, the system determines whether or not the hash is a special hash. If the hash is not a special hash, the process ends in step 430. If the hash is a special hash, the system determines if the special hash already exists in the sketch of the corresponding volume in step 440. If the hash does not already exist in the sketch of the corresponding volume, the hash is added to the sketch of the corresponding volume in step 450. Otherwise, the system adds 1 to the hash reference count in step 560.);
perform deduplication for the first data value (Harnik [0013]: In embodiments, systems comprises a plurality of logical volumes, and a storage controller coupled to the plurality of logical volumes, where the storage controller comprises a software application comprising a deduplication process that generates a set of hash values corresponding to the data chunks on the plurality of logical volumes, and is configured to maintain a volume sketch table to track relationships between underlying data stored on the plurality of volumes and determine an estimated deduplication ratio for the plurality of volumes based on the corresponding volume sketches, where the volume sketch table comprises at least one special hash, and where the at least one special hash comprises a subset of the set of hash values.);
add, in response to adding the first data value, a first storage location indicator to the deduplication data store, the first storage location indicator indicating a storage location of the first data value (Harnik [0034]: To maintain the sketches on every write, the system uses the process illustrated in FIG. 4. In step 410, the system computes a hash value. In step 420, the system determines whether or not the hash is a special hash. If the hash is not a special hash, the process ends in step 430. If the hash is a special hash, the system determines if the special hash already exists in the sketch of the corresponding volume in step 440. If the hash does not already exist in the sketch of the corresponding volume, the hash is added to the sketch of the corresponding volume in step 450. Otherwise, the system adds 1 to the hash reference count in step 560.);
remove, in response to removing the second data value, a second storage location indicator from the deduplication data store, the second storage location indicator indicating a storage location of the second data value (Harnik [0035]: Similarly, for data overwrites and deletions the system uses the process illustrated in FIG. 5. In step 510, the system identifies the hash of the deleted/freed chunk. In step 520, the system determines if the hash is a special hash. If the hash is not a special hash, the process ends in step 530. If the hash is a special hash, the system reduces the reference count of the hash in the corresponding volume by 1 in step 540. In step 550, the system checks if the reference count is zero. If the reference count is zero, the system deletes the corresponding volume sketch and the process terminates in step 570. Otherwise, the process terminates in step 560.);
Harnik does not clearly teach: compute respective first and second output values based on applying a function on the first and second storage location indicators. However, Dangi [Col. 7 lines 20 – 43] teaches, “After the data stream is stored by storage deduplication server 102, the stored data stream may be restored. (For example, restoring a stored data stream includes reconstructing the data stream using the stored data blocks and/or references to stored data blocks associated with the data stream.) … In various embodiments, the restored data stream itself can be used to verify its correctness without requiring a master copy of the original data stream and/or the parameters used to generate the original data stream, thereby eliminating the need to maintain a master copy of the data stream for verification purposes. In various embodiments, a data stream can be verified in the same manner regardless if the data stream is compressible or non-compressible.”;
wherein each respective storage location indicator of the first and second storage location indicators represents presence of a unique data value in the deduplication data store (Dangi [Col. 24 lines 14 – 21]:  The blocks that are duplicates (of previously stored data) are detected and only the unique blocks are stored in the deduplication storage. Instead of storing a duplicate block multiple times, a reference to the previously stored block is stored. The reference requires significantly less storage space than the duplicate data blocks would have required.);
update a first entry of an estimator based on the first output value, to reflect the addition of the first storage location indicator to the deduplication data store (Dangi [Col. 5 lines 7 – 26]: “Embodiments of modifying a data stream with a predictable change rate are described herein. In various embodiments, a change rate parameter comprising a specified change rate value is received. For example, the change rate comprises a desired percentage by which to modify a data stream. In various embodiments, the data stream to be modified by an amount determined based on a received change rate can be either a compressible or non-compressible data stream. For example, if the change rate value were R percent, then the modified data stream would have data that is R percent different from the original data stream and also have data that is (100−R) percent in common with the original data stream. For example, the data stream and the modified data stream can be used together in a quality assurance and/or storage deduplication testing system to test a deduplication technique in which the modified data stream is compared to the original data stream to determine whether the deduplication technique can determine the correct amount of data by which the modified data stream differs from the original data stream.” Here, updating an estimator is similar to modifying a data stream based on the predictable change rate.);
update a second entry of the estimator based on the second output value, to reflect the removal of the second storage location indicator from the deduplication data store (Dangi [Col. 36 lines 48 – 67]: Each instance of such a modified data stream associated with a particular change rate parameter is generated using an additional parameter that is referred to as a "change rate revision parameter," in some embodiments.  Note that the "change rate revision parameter" used with a change rate parameter, as will be described below, is different from the "revision parameter" corresponding to a seed value that was used to select at least two prime numbers to use to generate a merged data stream, as described above. As shown earlier, a data stream can be seen as a sequence of blocks.  Any minute change (a "corruption") within a block of the data stream can result in a modified block.  Even if one bit is changed, the modified block is determined to be a new block to a deduplicating storage server.  A block can be modified in various ways.  In the examples described below, the block size of each data block of a data stream is 8 KiB (e.g., because the average block size is 8 KiB (8,192 bytes) in a deduplicating storage server) and the data stream comprises of alternating 32-bit (4 byte) values each from one of two different sequences.); 
compute, using the updated estimator, a parameter relating to data deduplication at the deduplication data store (Dangi [Col. 7 line 44 – Col. 8 line 3]: To test the quality and/or effectiveness of the storage deduplication techniques used by storage deduplication server 102, quality assurance server 106 is configured to generate a modified version of the previously generated data stream that was sent to storage deduplication server 102 over network 104. In some embodiments, quality assurance server 106 is configured to generate a modified data stream based on the parameters associated with the original data stream (e.g., the seed value, the revision value, and/or the two prime numbers) and an additional parameter such as a change rate parameter. For example, a change rate value of a change rate parameter comprises a percentage by which to modify the original data stream to generate the modified data stream. Put another way, a modified data stream differs from the original data stream by the percentage specified by the change rate. In some embodiments, this modified data stream is sent by quality assurance server 106 over network 104 to storage deduplication server 102 (e.g., as part of the same or a different test backup operation) for storage. Storage deduplication server 102 is configured to segment the modified data stream into data blocks (e.g., of variable sizes) and store only the new data blocks (e.g., data blocks that have not already been stored at storage device 108). Given the data blocks stored at storage device 108 for the original data stream, storage deduplication server 102 should store only those new data blocks from the modified data stream that differ from the original data stream.). 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to incorporate the teaching of Harnik into Dangi’s system by adding the feature of deduplication ratio. The references (Harnik and Dangi) are analogous art and are directed to the same field of endeavor, such as data storage. A person of ordinary skill in the art would have been motivated to do so to provide Harnik’s system with enhanced data. (See Dangi [Col. 7 lines 20 – 43], [Col. 7 line 44 – Col. 8 line 3], [Col. 36 lines 48 – 67]). One of the biggest advantages of network machine learning algorithms is their ability to improve over time. Machine learning technology typically improves efficiency and accuracy thanks to the ever-increasing amounts of data that are processed.
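For illustration only, the sketch maintenance quoted above from Harnik [0034] (writes) and [0035] (overwrites and deletions) may be pictured as follows. This is a minimal sketch under stated assumptions, not Harnik's implementation: the use of SHA-256, the leading-zero-bit test for a "special hash", the dictionary representation of a volume sketch, and the function names are all hypothetical. In particular, removing the hash entry when its reference count reaches zero is one assumed reading of the quoted deletion step.

```python
import hashlib

HASH_BITS = 256  # width of the assumed SHA-256 chunk hash


def chunk_hash(chunk: bytes) -> int:
    """Hash a data chunk to an integer (SHA-256 is an assumption)."""
    return int.from_bytes(hashlib.sha256(chunk).digest(), "big")


def is_special(h: int, zero_bits: int) -> bool:
    """Assumed test: a hash is 'special' if its top zero_bits bits are all zero."""
    return h >> (HASH_BITS - zero_bits) == 0


def on_write(sketch: dict, chunk: bytes, zero_bits: int = 3) -> None:
    """Write path: add a new special hash or increment its reference count."""
    h = chunk_hash(chunk)
    if not is_special(h, zero_bits):
        return  # non-special hashes are not tracked
    sketch[h] = sketch.get(h, 0) + 1


def on_delete(sketch: dict, chunk: bytes, zero_bits: int = 3) -> None:
    """Delete path: decrement the reference count, dropping the entry at zero."""
    h = chunk_hash(chunk)
    if not is_special(h, zero_bits) or h not in sketch:
        return
    sketch[h] -= 1
    if sketch[h] == 0:
        del sketch[h]  # assumed reading of the quoted step for a zero count
```

With `zero_bits=0` every hash is treated as special, which makes the reference counting easy to observe: two writes of the same chunk yield one entry with a count of 2, and two deletions empty the sketch again.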
Regarding claim 2, the non-transitory machine-readable storage medium of claim 1, wherein the function is a hash function, the first output value is a first hash value based on applying the hash function on the first storage location indicator, and the second output value is a second hash value based on applying the hash function on the second storage location indicator (Dangi [Col. 3 lines 47 – 66]: In some embodiments, a first sequence is generated using a first prime number and the initialization parameter. In some embodiments, a second sequence is generated using a second prime number and the initialization parameter. In some embodiments, the first prime number and the second prime number are selected based on a revision parameter that is received. In some embodiments, each of the first prime number and the second prime number is selected from a constrained modified set of prime numbers. A data stream is generated by merging (e.g., interleaving) the first sequence and the second sequence. In various embodiments, a "data stream" refers to a sequence of values that is determined by the merging (e.g., interleaving) of at least two sequences. In some embodiments, a data stream can be referred to as a "merged sequence." In some embodiments, a data stream is not deduplicatable. In various embodiments, a non-deduplicatable data stream comprises a data stream that does not include duplicate blocks of data (e.g., than can be identified by a deduplication system for a block size recognized by the deduplication system).).
Regarding claim 3, the non-transitory machine-readable storage medium of claim 1, wherein the updating of the first entry of the estimator based on the first output value comprises advancing a count in the first entry, and wherein the updating of the second entry of the estimator based on the second output value comprises reversing a count in the second entry (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment.  At each point, the deduplicated storage size for one or more volumes can be estimated by the following process. …  In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes.).
Regarding claim 4, the non-transitory machine-readable storage medium of claim 1, wherein the parameter represents how many unique data values are present in a portion of the deduplication data store (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. … If the same hash value appears in more than one sketch then it is united into a single hash value in step 630. In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes. In the example above (of 16 leading zeros), the multiplier would be 65,536. In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8. Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
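For illustration only, the arithmetic in the quoted Harnik [0037] passage (a sketch holding two special hashes, a special-hash ratio of 1/8, and a multiplier of 65,536 for 16 leading zeros) can be reproduced as below. The function name, the representation of each volume sketch as a set of hash values, and the assumption that every chunk has the same size (so that space can be counted in chunks) are hypothetical.

```python
def estimate_physical_chunks(volume_sketches, leading_zero_bits):
    """Estimate deduplicated size, in chunks, from per-volume sketches.

    volume_sketches: iterable of sets of special hash values (assumed layout).
    leading_zero_bits: number of leading zero bits that marks a hash as special.
    """
    # Step 630 as quoted: a hash seen in several sketches is united into one.
    combined = set().union(*volume_sketches)
    # Step 640 as quoted: scale by the ratio of all hashes to special hashes.
    multiplier = 2 ** leading_zero_bits
    return len(combined) * multiplier


# Two special hashes with a 1/8 special-hash ratio (3 leading zero bits).
print(estimate_physical_chunks([{0x11, 0x22}], 3))  # 16
# Multiplier for the 16-leading-zeros example.
print(2 ** 16)  # 65536
```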
Regarding claim 5, the non-transitory machine-readable storage medium of claim 1, wherein the parameter represents a similarity between a plurality of data volumes of the deduplication data store (Dangi [Col. 45 lines 11 – 24]: At 2106, a set of new data blocks of a plurality of data blocks associated with the modified data stream is identified relative to a plurality of data blocks associated with the data stream. In the deduplication process, only the data blocks of the modified data stream that are identified as not having been previously stored (e.g., at the test backup storage location) and are therefore a set of new data blocks, are stored. (For example, in the deduplication process, the modified data stream is segmented into data blocks of variable block sizes and each data block is compared against previously stored data and only those data blocks that are not duplicates of previously stored data are determined as new data to be stored.)).
Regarding claim 6, the non-transitory machine-readable storage medium of claim 1, wherein the parameter represents how many unique data values are present in a first portion of the deduplication data store and not present in any other portion of the deduplication data store (Dangi [Col. 8 lines 60 – 62]: In some embodiments, a revision parameter is a revision value associated with a given "seed value" that uniquely maps to at least two prime numbers.).
Regarding claim 7, the non-transitory machine-readable storage medium of claim 1, wherein the parameter represents a deduplication ratio of a portion of the deduplication data store (Harnik [0014]: The storage controller may be optionally configured to estimate a combined compression and deduplication ratio by recording a compression ratio of the data chunks with hashes corresponding to the hashes in the at least one special hashes, calculating a space required to store the chunks corresponding to the at least one special hash, and producing a data reduction estimate in embodiments.).         
Regarding claim 8, the non-transitory machine-readable storage medium of claim 1, wherein the parameter represents how many physical blocks of the deduplication data store would be reclaimed responsive to a deletion of a portion of the deduplication data store (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. … If the same hash value appears in more than one sketch then it is united into a single hash value in step 630. In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes. In the example above (of 16 leading zeros), the multiplier would be 65,536. In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8. Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
Regarding claim 9, the non-transitory machine-readable storage medium of claim 1, wherein the estimator comprises an array of entries, and wherein each corresponding value of the first and second values maps to a corresponding entry of the array of entries (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment.  At each point, the deduplicated storage size for one or more volumes can be estimated by the following process. …  In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes.).
Regarding claim 10, the non-transitory machine-readable storage medium of claim 9, wherein the corresponding value comprises a first value portion and a second value portion, the first value portion mapping to a column of the array of entries, and the second value portion mapping to a row of the array of entries (Dangi [Col. 3 lines 47 – 66]: In some embodiments, a first sequence is generated using a first prime number and the initialization parameter. In some embodiments, a second sequence is generated using a second prime number and the initialization parameter. In some embodiments, the first prime number and the second prime number are selected based on a revision parameter that is received. In some embodiments, each of the first prime number and the second prime number is selected from a constrained modified set of prime numbers. A data stream is generated by merging (e.g., interleaving) the first sequence and the second sequence. In various embodiments, a "data stream" refers to a sequence of values that is determined by the merging (e.g., interleaving) of at least two sequences. In some embodiments, a data stream can be referred to as a "merged sequence." In some embodiments, a data stream is not deduplicatable. In various embodiments, a non-deduplicatable data stream comprises a data stream that does not include duplicate blocks of data (e.g., than can be identified by a deduplication system for a block size recognized by the deduplication system).).
Regarding claim 11, the non-transitory machine-readable storage medium of claim 10, wherein a position of the first entry containing a non-zero count in the array of entries indicates a first estimate of how many unique data values are in a portion of the deduplication data store (Dangi [Col. 8 lines 60 – 62]: In some embodiments, a revision parameter is a revision value associated with a given "seed value" that uniquely maps to at least two prime numbers.).
Regarding claim 12, the non-transitory machine-readable storage medium of claim 11, wherein a position of the second entry containing a non-zero count in the array of entries indicates a second estimate of how many unique data values are in the portion of the deduplication data store, wherein the computing of the parameter is based on the first estimate and the second estimate (Dangi [Col. 8 lines 60 – 62]: In some embodiments, a revision parameter is a revision value associated with a given "seed value" that uniquely maps to at least two prime numbers.).
Regarding claim 13, the non-transitory machine-readable storage medium of claim 2, wherein a quantity of trailing zeros in the first hash value indicates an estimate of how many unique data values are in a portion of the deduplication data store (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. … If the same hash value appears in more than one sketch then it is united into a single hash value in step 630. In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes. In the example above (of 16 leading zeros), the multiplier would be 65,536. In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8. Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
Regarding claim 14, the non-transitory machine-readable storage medium of claim 13, wherein the estimator tracks hash values with quantities of trailing zeros within a specified range, and the estimator does not track hash values with quantities of trailing zeros outside the specified range (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. … If the same hash value appears in more than one sketch then it is united into a single hash value in step 630. In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes. In the example above (of 16 leading zeros), the multiplier would be 65,536. In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8. Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
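For illustration only, the limitation language of claims 13 and 14 (a quantity of trailing zeros in a hash value indicating an estimate of unique values, with tracking limited to a specified range) may be pictured with a generic probabilistic-counting sketch in the style of Flajolet and Martin. This is an assumed, simplified technique chosen for illustration; it is not taken from the application, Harnik, or Dangi, and the function names, the 64-bit hash width, and the 2^(max trailing zeros) estimate are hypothetical.

```python
def trailing_zeros(h: int, width: int = 64) -> int:
    """Count trailing zero bits of h; an all-zero hash counts as width zeros."""
    if h == 0:
        return width
    # h & -h isolates the lowest set bit; its bit position is the zero count.
    return (h & -h).bit_length() - 1


def rough_unique_estimate(hashes, lo: int = 0, hi: int = 64) -> int:
    """Rough unique-value estimate as 2 ** (largest tracked trailing-zero count).

    Hashes whose trailing-zero count falls outside [lo, hi] are not tracked.
    """
    tracked = [tz for tz in map(trailing_zeros, hashes) if lo <= tz <= hi]
    return 2 ** max(tracked) if tracked else 0


# Hash values 1, 12 and 8 have 0, 2 and 3 trailing zeros respectively.
print(rough_unique_estimate([1, 12, 8]))              # 8
print(rough_unique_estimate([1, 12, 8], lo=0, hi=2))  # 4
```

A single maximum of this kind has high variance; practical estimators of this family typically average over many buckets, which is omitted here for brevity.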
Regarding claim 15, Harnik teaches, a system comprising: 
a processor; and a non-transitory storage medium storing instructions executable on the processor to (Harnik [0030]: processor): 
receive write requests to add a first data value to and remove a second data value from a deduplication data store in a storage system comprising persistent storage (Harnik [0034]: To maintain the sketches on every write, the system uses the process illustrated in FIG. 4. In step 410, the system computes a hash value. In step 420, the system determines whether or not the hash is a special hash. If the hash is not a special hash, the process ends in step 430. If the hash is a special hash, the system determines if the special hash already exists in the sketch of the corresponding volume in step 440. If the hash does not already exist in the sketch of the corresponding volume, the hash is added to the sketch of the corresponding volume in step 450. Otherwise, the system adds 1 to the hash reference count in step 560.);
perform deduplication for the first data value (Harnik [0013]: In embodiments, systems comprises a plurality of logical volumes, and a storage controller coupled to the plurality of logical volumes, where the storage controller comprises a software application comprising a deduplication process that generates a set of hash values corresponding to the data chunks on the plurality of logical volumes, and is configured to maintain a volume sketch table to track relationships between underlying data stored on the plurality of volumes and determine an estimated deduplication ratio for the plurality of volumes based on the corresponding volume sketches, where the volume sketch table comprises at least one special hash, and where the at least one special hash comprises a subset of the set of hash values.);
add, in response to adding the first data value, a first storage location indicator to the deduplication data store, the first storage location indicator indicating a storage location of the first data value (Harnik [0034]: To maintain the sketches on every write, the system uses the process illustrated in FIG. 4. In step 410, the system computes a hash value. In step 420, the system determines whether or not the hash is a special hash. If the hash is not a special hash, the process ends in step 430. If the hash is a special hash, the system determines if the special hash already exists in the sketch of the corresponding volume in step 440. If the hash does not already exist in the sketch of the corresponding volume, the hash is added to the sketch of the corresponding volume in step 450. Otherwise, the system adds 1 to the hash reference count in step 560.);
remove, in response to removing the second data value, a second storage location indicator from the deduplication data store, the second storage location indicator indicating a storage location of the second data value (Harnik [0035]: Similarly, for data overwrites and deletions the system uses the process illustrated in FIG. 5. In step 510, the system identifies the hash of the deleted/freed chunk. In step 520, the system determines if the hash is a special hash. If the hash is not a special hash, the process ends in step 530. If the hash is a special hash, the system reduces the reference count of the hash in the corresponding volume by 1 in step 540. In step 550, the system checks if the reference count is zero. If the reference count is zero, the system deletes the corresponding volume sketch and the process terminates in step 570. Otherwise, the process terminates in step 560.);
Harnik does not clearly teach: compute respective first and second values based on applying a function on the first and second storage location indicators. However, Dangi [Col. 7 lines 20 – 43] teaches, “After the data stream is stored by storage deduplication server 102, the stored data stream may be restored. (For example, restoring a stored data stream includes reconstructing the data stream using the stored data blocks and/or references to stored data blocks associated with the data stream.) … In various embodiments, the restored data stream itself can be used to verify its correctness without requiring a master copy of the original data stream and/or the parameters used to generate the original data stream, thereby eliminating the need to maintain a master copy of the data stream for verification purposes. In various embodiments, a data stream can be verified in the same manner regardless if the data stream is compressible or non-compressible.”;
wherein each respective storage location indicator of the first and second storage location indicators represents presence of a unique data value in the deduplication data store (Dangi [Col. 24 lines 14 – 21]:  The blocks that are duplicates (of previously stored data) are detected and only the unique blocks are stored in the deduplication storage. Instead of storing a duplicate block multiple times, a reference to the previously stored block is stored. The reference requires significantly less storage space than the duplicate data blocks would have required.);
update a first entry of an estimator based on the first output value, to reflect the addition of the first storage location indicator to the deduplication data store (Dangi [Col. 5 lines 7 – 26]: “Embodiments of modifying a data stream with a predictable change rate are described herein. In various embodiments, a change rate parameter comprising a specified change rate value is received. For example, the change rate comprises a desired percentage by which to modify a data stream. In various embodiments, the data stream to be modified by an amount determined based on a received change rate can be either a compressible or non-compressible data stream. For example, if the change rate value were R percent, then the modified data stream would have data that is R percent different from the original data stream and also have data that is (100−R) percent in common with the original data stream. For example, the data stream and the modified data stream can be used together in a quality assurance and/or storage deduplication testing system to test a deduplication technique in which the modified data stream is compared to the original data stream to determine whether the deduplication technique can determine the correct amount of data by which the modified data stream differs from the original data stream.” Here, updating an estimator is similar to modifying a data stream based on the predictable change rate.);
update a second entry of the estimator based on the second output value, to reflect the removal of the second storage location indicator from the deduplication data store (Dangi [Col. 36 lines 48 – 67]: Each instance of such a modified data stream associated with a particular change rate parameter is generated using an additional parameter that is referred to as a "change rate revision parameter," in some embodiments.  Note that the "change rate revision parameter" used with a change rate parameter, as will be described below, is different from the "revision parameter" corresponding to a seed value that was used to select at least two prime numbers to use to generate a merged data stream, as described above. As shown earlier, a data stream can be seen as a sequence of blocks.  Any minute change (a "corruption") within a block of the data stream can result in a modified block.  Even if one bit is changed, the modified block is determined to be a new block to a deduplicating storage server.  A block can be modified in various ways.  In the examples described below, the block size of each data block of a data stream is 8 KiB (e.g., because the average block size is 8 KiB (8,192 bytes) in a deduplicating storage server) and the data stream comprises of alternating 32-bit (4 byte) values each from one of two different sequences.); and 
compute, using the updated estimator, a parameter relating to data deduplication at the deduplication data store. (Dangi [Col. 7 line 44 – Col. 8 line 3]: To test the quality and/or effectiveness of the storage deduplication techniques used by storage deduplication server 102, quality assurance server 106 is configured to generate a modified version of the previously generated data stream that was sent to storage deduplication server 102 over network 104. In some embodiments, quality assurance server 106 is configured to generate a modified data stream based on the parameters associated with the original data stream (e.g., the seed value, the revision value, and/or the two prime numbers) and an additional parameter such as a change rate parameter. For example, a change rate value of a change rate parameter comprises a percentage by which to modify the original data stream to generate the modified data stream. Put another way, a modified data stream differs from the original data stream by the percentage specified by the change rate. In some embodiments, this modified data stream is sent by quality assurance server 106 over network 104 to storage deduplication server 102 (e.g., as part of the same or a different test backup operation) for storage. Storage deduplication server 102 is configured to segment the modified data stream into data blocks (e.g., of variable sizes) and store only the new data blocks (e.g., data blocks that have not already been stored at storage device 108). Given the data blocks stored at storage device 108 for the original data stream, storage deduplication server 102 should store only those new data blocks from the modified data stream that differ from the original data stream.). 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to incorporate the teaching of Harnik into Dangi’s system by adding the feature of a deduplication ratio. The references (Harnik and Dangi) are analogous art directed to the same field of endeavor, namely data storage. An ordinarily skilled artisan would have been motivated to do so to provide Harnik’s system with enhanced data. (See Dangi [Col. 7 lines 20 – 43], [Col. 7 line 44 – Col. 8 line 3], [Col. 36 lines 48 – 67]). One of the biggest advantages of network machine learning algorithms is their ability to improve over time. Machine learning technology typically improves efficiency and accuracy thanks to the ever-increasing amounts of data that are processed.
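For illustration only, the change-rate behavior quoted from Dangi above (a modified stream that differs from the original by R percent, of which a deduplicating store should keep only the new blocks) can be sketched as follows. The 8 KiB block size comes from the Dangi passage quoted above; the function names, the stream generator, and the use of SHA-256 fingerprints are assumptions made for this sketch and are not taken from either reference.

```python
import hashlib

BLOCK_SIZE = 8192  # 8 KiB, the block size used in the quoted Dangi examples


def make_stream(num_blocks):
    """Build a stream of distinct blocks.

    Hypothetical generator for this sketch; it is NOT Dangi's
    prime-number-based sequence generation.
    """
    return [i.to_bytes(8, "big") * (BLOCK_SIZE // 8) for i in range(num_blocks)]


def modify_stream(blocks, change_rate_percent):
    """Flip the first byte of the first R percent of blocks."""
    n_changed = len(blocks) * change_rate_percent // 100
    modified = []
    for i, block in enumerate(blocks):
        if i < n_changed:
            # A single changed byte makes the whole block "new" to a deduplicator.
            modified.append(bytes([block[0] ^ 0xFF]) + block[1:])
        else:
            modified.append(block)
    return modified


def new_blocks_stored(original, modified):
    """Count blocks of the modified stream not already present in the store."""
    stored = {hashlib.sha256(b).digest() for b in original}
    incoming = {hashlib.sha256(b).digest() for b in modified}
    return len(incoming - stored)


original = make_stream(100)
changed = modify_stream(original, 10)
print(new_blocks_stored(original, changed))  # 10 of 100 blocks are new
```

Under these assumptions, a change rate of 10 percent over 100 blocks yields exactly 10 blocks that a deduplicating store would need to keep.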
Regarding claim 16, the system of claim 15, wherein the estimator comprises an array of entries, each respective entry of the array of entries comprising a respective counter, and wherein the instructions are executable on the processor to: 
increment a counter in the first entry of the array of entries in response to the addition of the first storage location indicator, wherein the first output value computed for the first storage location indicator maps to the first entry; and decrement the counter in the second entry in response to the removal of the second storage location indicator, wherein the second output value computed for the second storage location indicator maps to the second entry  (Dangi, [Col. 3 lines 47 – 66]: In some embodiments, a first sequence is generated using a first prime number and the initialization parameter.  In some embodiments, a second sequence is generated using a second prime number and the initialization parameter.  In some embodiments, the first prime number and the second prime number are selected based on a revision parameter that is received.  In some embodiments, each of the first prime number and the second prime number is selected from a constrained modified set of prime numbers.  A data stream is generated by merging (e.g., interleaving) the first sequence and the second sequence.  In various embodiments, a "data stream" refers to a sequence of values that is determined by the merging (e.g., interleaving) of at least two sequences.  In some embodiments, a data stream can be referred to as a "merged sequence." In some embodiments, a data stream is not deduplicatable.  In various embodiments, a non-deduplicatable data stream comprises a data stream that does not include duplicate blocks of data (e.g., than can be identified by a deduplication system for a block size recognized by the deduplication system).).
Regarding claim 17, the system of claim 16, wherein the first output value comprises a first portion and a second portion, the first portion mapping to a column of the array of entries, and a quantity of trailing zeros in the second portion mapping to a row of the array of entries (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. …  If the same hash value appears in more than one sketch then it is united into a single hash value in step 630.  In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes.  In the example above (of 16 leading zeros), the multiplier would be 65,536.  In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8.  Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
Regarding claim 18, the system of claim 16, wherein the computing of the parameter relating to data deduplication at the deduplication data store is based on non-zero counts maintained by the counters in the array of entries (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment.  At each point, the deduplicated storage size for one or more volumes can be estimated by the following process. …  In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes.). 
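For reference, the structure recited in claims 16 to 18 (an array of counters, a first hash portion selecting a column, the trailing-zero count of a second portion selecting a row, increments on addition, decrements on removal, and a parameter computed from the non-zero counts) can be sketched as below. This is a generic HyperLogLog-style counting sketch written for illustration; it is not asserted to be the applicant's implementation or a method disclosed in Harnik or Dangi. The class name, the 6-bit column width, the SHA-256 hash, and the HyperLogLog estimate formula are all assumptions of this sketch, and removals are assumed to match earlier additions.

```python
import hashlib
import math

COL_BITS = 6                      # assumed: 2**6 = 64 columns
NUM_COLS = 1 << COL_BITS
REST_BITS = 64 - COL_BITS         # width of the second hash portion
NUM_ROWS = REST_BITS + 1          # one row per possible trailing-zero count


class CountingEstimator:
    """Hypothetical counter array indexed by (trailing zeros, column)."""

    def __init__(self):
        self.counts = [[0] * NUM_COLS for _ in range(NUM_ROWS)]

    @staticmethod
    def _locate(indicator):
        h = int.from_bytes(hashlib.sha256(indicator).digest()[:8], "big")
        col = h >> REST_BITS                   # first portion -> column
        rest = h & ((1 << REST_BITS) - 1)      # second portion
        if rest == 0:
            row = REST_BITS                    # all bits zero
        else:
            row = (rest & -rest).bit_length() - 1  # number of trailing zeros
        return row, col

    def add(self, indicator):
        row, col = self._locate(indicator)
        self.counts[row][col] += 1

    def remove(self, indicator):
        row, col = self._locate(indicator)
        self.counts[row][col] -= 1

    def estimate(self):
        """HyperLogLog-style estimate from the highest non-zero row per column."""
        registers = []
        for col in range(NUM_COLS):
            top = 0
            for row in range(NUM_ROWS):
                if self.counts[row][col] > 0:
                    top = row + 1
            registers.append(top)
        alpha = 0.709                          # HyperLogLog constant for 64 registers
        raw = alpha * NUM_COLS ** 2 / sum(2.0 ** -r for r in registers)
        empty = registers.count(0)
        if raw <= 2.5 * NUM_COLS and empty:
            # Small-range correction (linear counting).
            return NUM_COLS * math.log(NUM_COLS / empty)
        return raw
```

Keeping a counter per entry, rather than a single maximum per column, is what lets a removal undo an earlier addition: once a counter returns to zero, the next lower non-zero row in that column determines the estimate again.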
Regarding claim 19, Harnik teaches, a method of a system comprising a hardware processor, comprising:
receiving write requests to add a first data value to and remove a second data value from a deduplication data store in a storage system comprising persistent storage (Harnik [0034]: To maintain the sketches on every write, the system uses the process illustrated in FIG. 4. In step 410, the system computes a hash value. In step 420, the system determines whether or not the hash is a special hash. If the hash is not a special hash, the process ends in step 430. If the hash is a special hash, the system determines if the special hash already exists in the sketch of the corresponding volume in step 440. If the hash does not already exist in the sketch of the corresponding volume, the hash is added to the sketch of the corresponding volume in step 450. Otherwise, the system adds 1 to the hash reference count in step 560.);
performing deduplication for the first data value (Harnik [0013]: In embodiments, systems comprises a plurality of logical volumes, and a storage controller coupled to the plurality of logical volumes, where the storage controller comprises a software application comprising a deduplication process that generates a set of hash values corresponding to the data chunks on the plurality of logical volumes, and is configured to maintain a volume sketch table to track relationships between underlying data stored on the plurality of volumes and determine an estimated deduplication ratio for the plurality of volumes based on the corresponding volume sketches, where the volume sketch table comprises at least one special hash, and where the at least one special hash comprises a subset of the set of hash values.);
adding, in response to adding the first data value, a first storage location indicator to the deduplication data store, the first storage location indicator indicating a storage location of the first data value (Harnik [0034]: To maintain the sketches on every write, the system uses the process illustrated in FIG. 4. In step 410, the system computes a hash value. In step 420, the system determines whether or not the hash is a special hash. If the hash is not a special hash, the process ends in step 430. If the hash is a special hash, the system determines if the special hash already exists in the sketch of the corresponding volume in step 440. If the hash does not already exist in the sketch of the corresponding volume, the hash is added to the sketch of the corresponding volume in step 450. Otherwise, the system adds 1 to the hash reference count in step 560.);
removing, in response to removing the second data value, a second storage location indicator from the deduplication data store, the second storage location indicator indicating a storage location of the second data value (Harnik [0035]: Similarly, for data overwrites and deletions the system uses the process illustrated in FIG. 5. In step 510, the system identifies the hash of the deleted/freed chunk. In step 520, the system determines if the hash is a special hash. If the hash is not a special hash, the process ends in step 530. If the hash is a special hash, the system reduces the reference count of the hash in the corresponding volume by 1 in step 540. In step 550, the system checks if the reference count is zero. If the reference count is zero, the system deletes the corresponding volume sketch and the process terminates in step 570. Otherwise, the process terminates in step 560.);
Harnik does not clearly teach, computing respective first and second hash values based on content of the first and second storage location indicators added to and removed from the deduplication data store (Dangi [Col. 1 line 5]: Conventionally, testing data is generated by hashing and/or cryptography techniques.), wherein each respective storage location indicator of the first and second storage location indicators (Dangi [Col. 7 lines 20 – 43]: “After the data stream is stored by storage deduplication server 102, the stored data stream may be restored. (For example, restoring a stored data stream includes reconstructing the data stream using the stored data blocks and/or references to stored data blocks associated with the data stream.) … In various embodiments, the restored data stream itself can be used to verify its correctness without requiring a master copy of the original data stream and/or the parameters used to generate the original data stream, thereby eliminating the need to maintain a master copy of the data stream for verification purposes. In various embodiments, a data stream can be verified in the same manner regardless if the data stream is compressible or non-compressible.”) represents presence of a unique data value in the deduplication data store (Dangi [Col. 24 lines 14 – 21]:  The blocks that are duplicates (of previously stored data) are detected and only the unique blocks are stored in the deduplication storage. Instead of storing a duplicate block multiple times, a reference to the previously stored block is stored. The reference requires significantly less storage space than the duplicate data blocks would have required.); 
updating a first entry of a probabilistic cardinality estimator based on the first hash value, to reflect the addition of the first storage location indicator to the deduplication data store (Dangi [Col. 5 lines 7 – 26]: “Embodiments of modifying a data stream with a predictable change rate are described herein. In various embodiments, a change rate parameter comprising a specified change rate value is received. For example, the change rate comprises a desired percentage by which to modify a data stream. In various embodiments, the data stream to be modified by an amount determined based on a received change rate can be either a compressible or non-compressible data stream. For example, if the change rate value were R percent, then the modified data stream would have data that is R percent different from the original data stream and also have data that is (100−R) percent in common with the original data stream. For example, the data stream and the modified data stream can be used together in a quality assurance and/or storage deduplication testing system to test a deduplication technique in which the modified data stream is compared to the original data stream to determine whether the deduplication technique can determine the correct amount of data by which the modified data stream differs from the original data stream.” Here, updating of an estimator is similar to the modifying based on the predictable change rate.); 
updating a second entry of the probabilistic cardinality estimator based on the second hash value to reflect the removal of the second storage location indicator from the deduplication data store (Dangi [Col. 36 lines 48 – 67]: Each instance of such a modified data stream associated with a particular change rate parameter is generated using an additional parameter that is referred to as a "change rate revision parameter," in some embodiments.  Note that the "change rate revision parameter" used with a change rate parameter, as will be described below, is different from the "revision parameter" corresponding to a seed value that was used to select at least two prime numbers to use to generate a merged data stream, as described above. As shown earlier, a data stream can be seen as a sequence of blocks.  Any minute change (a "corruption") within a block of the data stream can result in a modified block.  Even if one bit is changed, the modified block is determined to be a new block to a deduplicating storage server.  A block can be modified in various ways.  In the examples described below, the block size of each data block of a data stream is 8 KiB (e.g., because the average block size is 8 KiB (8,192 bytes) in a deduplicating storage server) and the data stream comprises of alternating 32-bit (4 byte) values each from one of two different sequences.), 
wherein the probabilistic cardinality estimator comprises counters, and the updating of the first entry and the second entry comprises advancing a first counter in the first entry, and reversing a second counter in the second entry (Dangi, [Col. 3 lines 47 – 66]: In some embodiments, a first sequence is generated using a first prime number and the initialization parameter.  In some embodiments, a second sequence is generated using a second prime number and the initialization parameter.  In some embodiments, the first prime number and the second prime number are selected based on a revision parameter that is received.  In some embodiments, each of the first prime number and the second prime number is selected from a constrained modified set of prime numbers.  A data stream is generated by merging (e.g., interleaving) the first sequence and the second sequence.  In various embodiments, a "data stream" refers to a sequence of values that is determined by the merging (e.g., interleaving) of at least two sequences.  In some embodiments, a data stream can be referred to as a "merged sequence." In some embodiments, a data stream is not deduplicatable.  In various embodiments, a non-deduplicatable data stream comprises a data stream that does not include duplicate blocks of data (e.g., than can be identified by a deduplication system for a block size recognized by the deduplication system).); and
computing, using the updated probabilistic cardinality estimator, a parameter relating to data deduplication at the deduplication data store (Dangi [Col. 7 line 44 – Col. 8 line 3]: To test the quality and/or effectiveness of the storage deduplication techniques used by storage deduplication server 102, quality assurance server 106 is configured to generate a modified version of the previously generated data stream that was sent to storage deduplication server 102 over network 104. In some embodiments, quality assurance server 106 is configured to generate a modified data stream based on the parameters associated with the original data stream (e.g., the seed value, the revision value, and/or the two prime numbers) and an additional parameter such as a change rate parameter. For example, a change rate value of a change rate parameter comprises a percentage by which to modify the original data stream to generate the modified data stream. Put another way, a modified data stream differs from the original data stream by the percentage specified by the change rate. In some embodiments, this modified data stream is sent by quality assurance server 106 over network 104 to storage deduplication server 102 (e.g., as part of the same or a different test backup operation) for storage. Storage deduplication server 102 is configured to segment the modified data stream into data blocks (e.g., of variable sizes) and store only the new data blocks (e.g., data blocks that have not already been stored at storage device 108). Given the data blocks stored at storage device 108 for the original data stream, storage deduplication server 102 should store only those new data blocks from the modified data stream that differ from the original data stream.). 
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to incorporate the teaching of Harnik into Dangi’s system by adding the feature of a deduplication ratio. The references (Harnik and Dangi) are analogous art directed to the same field of endeavor, namely data storage. An ordinarily skilled artisan would have been motivated to do so to provide Harnik’s system with enhanced data. (See Dangi [Col. 7 lines 20 – 43], [Col. 7 line 44 – Col. 8 line 3], [Col. 36 lines 48 – 67]). One of the biggest advantages of network machine learning algorithms is their ability to improve over time. Machine learning technology typically improves efficiency and accuracy thanks to the ever-increasing amounts of data that are processed.
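As an aid to reading the Harnik passages cited above for the write and delete limitations ([0034] and [0035]), the per-volume sketch maintenance they describe can be approximated as follows. This is one reading of the quoted steps, not Harnik's code: the function names, the dictionary representation of a volume sketch, the SHA-256 hash, and the 3-bit "special hash" test are assumptions of this sketch, and the delete path here drops only the zero-count entry.

```python
import hashlib

SPECIAL_BITS = 3  # assumed: a hash is "special" when its top 3 bits are zero


def chunk_hash(chunk):
    """64-bit hash of a data chunk (SHA-256 prefix, an assumption of this sketch)."""
    return int.from_bytes(hashlib.sha256(chunk).digest()[:8], "big")


def is_special(h):
    """True for roughly 1 in 2**SPECIAL_BITS hashes."""
    return h >> (64 - SPECIAL_BITS) == 0


def on_write(sketch, chunk):
    """Write path: add a special hash to the sketch or bump its reference count."""
    h = chunk_hash(chunk)
    if not is_special(h):
        return                            # non-special hashes are ignored
    sketch[h] = sketch.get(h, 0) + 1      # new entry, or reference count + 1


def on_delete(sketch, chunk):
    """Delete/overwrite path: decrement the count and drop the entry at zero."""
    h = chunk_hash(chunk)
    if not is_special(h) or h not in sketch:
        return
    sketch[h] -= 1
    if sketch[h] == 0:
        del sketch[h]
```

For example, writing the same special chunk twice leaves one sketch entry with a reference count of 2, and two matching deletes return the sketch to empty.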
Regarding claim 20, the method of claim 19, further comprising:
computing a further probabilistic cardinality estimator updated responsive to hash values computed based on data values stored in the deduplication data store, the further probabilistic cardinality estimator when compared to another probabilistic cardinality estimator based on data values stored in another deduplication data store providing an indication of similarity between the deduplication data store and the another deduplication data store (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. …  If the same hash value appears in more than one sketch then it is united into a single hash value in step 630.  In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes.  In the example above (of 16 leading zeros), the multiplier would be 65,536.  In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8.  Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
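The arithmetic in the Harnik [0037] passage quoted above can be checked with a short sketch. The function names and the representation of a sketch as a set of hash values are assumptions made here, and the estimate is expressed in chunks on the simplifying assumption that all chunks are the same size.

```python
def merge_sketches(*sketches):
    """Combine volume sketches; a hash present in several sketches is kept once."""
    combined = set()
    for sketch in sketches:
        combined |= sketch
    return combined


def estimate_chunks(combined_sketch, leading_zero_bits):
    """Scale the number of special hashes by 2**leading_zero_bits."""
    multiplier = 2 ** leading_zero_bits
    return len(combined_sketch) * multiplier


# FIG. 3 example: two special hashes, ratio 1/2^3 = 1/8.
print(estimate_chunks({0x01, 0x02}, 3))       # 2 * 8 = 16
# 16 leading zeros give the multiplier stated in the passage.
print(2 ** 16)                                # 65536
# Two volumes sharing one special hash contribute it only once.
print(len(merge_sketches({0x01, 0x02}, {0x02, 0x03})))  # 3
```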
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Theimer, US 2020/0104175, Parameter variations for computations using a remote repository
Jain, US 2019/0340261, Policy based Data Deduplication
Wu, US 2018/0314435, Deduplication Processing Method, and Storage Device
Voruganti, US 2015/0066873, Policy Based Deduplication Techniques
Leppard, US 2011/0238635, Combining Hash based Duplication with sub-block differencing to deduplicate data
Hu, US 10,037,336, Performing block deduplication using block sequence classifications
Klose, US 9,535,776, Dataflow alerts for an information management system
Johnston, US 9,152,333, System and Method for estimating storage savings from deduplication

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SABA AHMED whose telephone number is (571)270-0236.  The examiner can normally be reached on MON – FRI: 9AM – 5PM EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam, can be reached at 571-272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SABA AHMED/
Examiner, Art Unit 2154

/HOSAIN T ALAM/
Supervisory Patent Examiner, Art Unit 2154