DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This communication is responsive to the original application filed on 5/7/2020. This action is Non-Final. Claims 1-20 are pending and have been examined.  
Drawings
The applicant’s drawings submitted are acceptable for examination purposes. 
Specification
The applicant’s specification submitted is acceptable for examination purposes. 
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1 – 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-patentable subject matter, namely an abstract idea without significantly more. The judicial exception is not integrated into a practical application. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The eligibility analysis in support of these findings is provided below, in accordance with the “2019 Revised Patent Subject Matter Eligibility Guidance” (published on 1/7/2019 in Fed. Register, Vol. 84, No. 4 at pgs. 50 – 57, hereinafter referred to as the “2019 PEG”).
Step 1. In accordance with Step 1 of the eligibility inquiry (as explained in MPEP 2106), it is first noted that the claimed storage medium (claims 1 – 14), system (claims 15 – 18) and method (claims 19 – 20) are each directed to one of the eligible categories of subject matter and therefore satisfy Step 1.
Step 2A. In accordance with Step 2A Prong One of the 2019 PEG, it is noted that the claims recite an abstract idea by reciting a method of organizing human activity, which falls into the “software per se” group within the enumerated groupings of abstract ideas set forth in the 2019 PEG. The claims recite the abstract idea of an estimator, which falls within the abstract idea of a mental process. It is noted that the cited abstract idea also falls within the mental processes group within the enumerated groupings of abstract ideas set forth in the 2019 PEG. The recitation of generic computer components does not negate the abstractness of a given limitation. The limitations reciting the abstract idea are highlighted in italics and the limitations directed to additional elements are highlighted in bold, as set forth in exemplary claim 1:
A non-transitory machine-readable storage medium comprising instructions that upon execution cause a system to: compute respective values for corresponding data value indicators added to and removed from a deduplication data store in which duplicated data values have been eliminated, wherein each respective data value indicator of the data value indicators represents presence of a unique data value in the deduplication data store; update an estimator based on the respective values, to reflect an addition of a first data value indicator to the deduplication data store and a removal of a second data value indicator from the deduplication data store; and compute, using the updated estimator, a parameter relating to data deduplication at the deduplication data store.

With respect to Step 2A Prong Two of the 2019 PEG, the judicial exception is not integrated into a practical application. The additional elements are directed to first and second data value indicators (claim 1). However, these elements fail to integrate the abstract idea into a practical application because they fail to provide an improvement to the functioning of a computer or to any other technology or technical field, fail to apply the exception with a particular machine, fail to apply the judicial exception to effect a particular treatment or prophylaxis for a disease or medical condition, fail to effect a transformation of a particular article to a different state or thing, and fail to apply or use the abstract idea in a meaningful way beyond generally linking the use of the judicial exception to a particular technological environment. Furthermore, these elements have been fully considered; however, they are directed to the use of generic computing elements to perform the abstract idea, which is not sufficient to amount to a practical application (as noted in the 2019 PEG). This is tantamount to simply saying “apply it” using a general purpose computer, which merely serves to tie the abstract idea to a particular technological environment (a computer-based operating environment) by using the computer as a tool to perform the abstract idea.
Accordingly, because the Step 2A Prong One and Prong Two analysis resulted in the conclusion that the claims are directed to an abstract idea, additional analysis under Step 2B of the eligibility inquiry must be conducted in order to determine whether any claim element or combination of elements amounts to significantly more than the judicial exception.
Step 2B. It has been determined that the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional limitations are directed to first and second data value indicators, though at a very high level of generality and without imposing meaningful limitation on the scope of the claim. Such generic, high-level, and nominal involvement of a computer or computer-based elements for carrying out the invention merely serves to tie the abstract idea to a particular technological environment, which is not enough to render the claims patent-eligible, as noted at pg. 74624 of Federal Register/ Vol. 79, No. 241, citing Alice, which in turn cites Mayo. See, e.g., Alice Corp. Pty. Ltd. v. CLS Bank Int'l, 134 S. Ct. 2347, 2359-60, 110 USPQ2d 1976, 1984 (2014). See also OIP Techs. v. Amazon.com, 788 F.3d 1359, 1364, 115 USPQ2d 1090, 1093-94 (Fed. Cir. 2015) ("Just as Diehr could not save the claims in Alice, which were directed to 'implement[ing] the abstract idea of intermediated settlement on a generic computer', it cannot save OIP's claims directed to implementing the abstract idea of price optimization on a generic computer.") (citations omitted). See also Affinity Labs of Texas LLC v. DirecTV LLC, 838 F.3d 1253, 1257-1258 (Fed. Cir. 2016) (mere recitation of a GUI does not make a claim patent-eligible); Intellectual Ventures I LLC v. Capital One Bank, 792 F.3d 1363, 1370 (Fed. Cir. 2015) ("the interactive interface limitation is a generic computer element").
The additional elements are broadly applied to the abstract idea(s) at a high level of generality ("similar to how the recitation of the computer in the claims in Alice amounted to mere instructions to apply the abstract idea of intermediated settlement on a generic computer," as explained in MPEP § 2106.05(f)) and they operate in well-understood, routine, and conventional manners. Furthermore, generally transmitting, analyzing, and outputting (e.g., displaying) data are examples of insignificant extra-solution activity. The recitation that operations on the first and second datasets are performed by an apparatus/device is the epitome of "mere instructions to implement an abstract idea on a computer".
MPEP § 2106.05(d)(II) sets forth the following:
The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity.
• Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec ... ; TLI Communications LLC v. AV Auto. LLC ... ; OIP Techs., Inc., v. Amazon.com, Inc ... ; buySAFE, Inc. v. Google, Inc ... ;
• Performing repetitive calculations, Flook ... ; Bancorp Services v. Sun Life ... ;
• Electronic recordkeeping, Alice Corp ... ; Ultramercial ... ;
• Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc ... ;
• Electronically scanning or extracting data from a physical document, Content Extraction and Transmission, LLC v. Wells Fargo Bank ... ; and
• A web browser's back and forward button functionality, Internet Patent Corp. v. Active Network, Inc. ...

. . . Courts have held computer-implemented processes not to be significantly more than an abstract idea (and thus ineligible) where the claim as a whole amounts to nothing more than generic computer functions merely used to implement an abstract idea, such as an idea that could be done by a human analog (i.e., by hand or by merely thinking) ...
In addition, when taken as an ordered combination, the elements add nothing that is not already present when the elements are taken individually. There is no indication that the combination of elements integrates the abstract idea into a practical application. Their collective functions merely provide conventional computer implementation. Therefore, when viewed as a whole, these additional claim elements do not provide meaningful limitations that transform the abstract idea into a practical application of the abstract idea, nor does the ordered combination amount to significantly more than the abstract idea itself.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 – 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dangi et al., U.S. Patent No.: 10,853,324 (Hereinafter “Dangi”), and further in view of Harnik, U.S. Patent Application Publication No. 2018/0074745 (Hereinafter “Harnik”).
Regarding claim 1, Dangi teaches, a non-transitory machine-readable storage medium comprising instructions that upon execution cause a system to:
compute respective values for corresponding data value indicators added to and removed from a deduplication data store in which duplicated data values have been eliminated, wherein each respective data value indicator of the data value indicators represents presence of a unique data value in the deduplication data store (Dangi [Col. 7 lines 20 – 43]: “After the data stream is stored by storage deduplication server 102, the stored data stream may be restored. (For example, restoring a stored data stream includes reconstructing the data stream using the stored data blocks and/or references to stored data blocks associated with the data stream.) … In various embodiments, the restored data stream itself can be used to verify its correctness without requiring a master copy of the original data stream and/or the parameters used to generate the original data stream, thereby eliminating the need to maintain a master copy of the data stream for verification purposes. In various embodiments, a data stream can be verified in the same manner regardless if the data stream is compressible or non-compressible.”).
update an estimator based on the respective values, to reflect an addition of a first data value indicator to the deduplication data store and a removal of a second data value indicator from the deduplication data store (Dangi [Col. 36 lines 48 – 67]: Each instance of such a modified data stream associated with a particular change rate parameter is generated using an additional parameter that is referred to as a "change rate revision parameter," in some embodiments.  Note that the "change rate revision parameter" used with a change rate parameter, as will be described below, is different from the "revision parameter" corresponding to a seed value that was used to select at least two prime numbers to use to generate a merged data stream, as described above. As shown earlier, a data stream can be seen as a sequence of blocks.  Any minute change (a "corruption") within a block of the data stream can result in a modified block.  Even if one bit is changed, the modified block is determined to be a new block to a deduplicating storage server.  A block can be modified in various ways.  In the examples described below, the block size of each data block of a data stream is 8 KiB (e.g., because the average block size is 8 KiB (8,192 bytes) in a deduplicating storage server) and the data stream comprises of alternating 32-bit (4 byte) values each from one of two different sequences.); and
Dangi does not clearly teach, compute, using the updated estimator, a parameter relating to data deduplication at the deduplication data store. However, Harnik [0014] teaches, “The storage controller may be optionally configured to estimate a combined compression and deduplication ratio by recording a compression ratio of the data chunks with hashes corresponding to the hashes in the at least one special hashes, calculating a space required to store the chunks corresponding to the at least one special hash, and producing a data reduction estimate in embodiments.”
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to incorporate the teaching of Harnik into Dangi’s system by adding the feature of a deduplication ratio. The references (Dangi and Harnik) are analogous art directed to the same field of endeavor, namely data storage. An ordinarily skilled artisan would have been motivated to do so to provide Dangi’s system with enhanced data (see Harnik [Abstract], [0014], [0030 – 32], [0041]). One of the biggest advantages of network machine learning algorithms is their ability to improve over time. Machine learning technology typically improves efficiency and accuracy thanks to the ever-increasing amounts of data that are processed.
Regarding claim 2, the non-transitory machine-readable storage medium of claim 1, wherein the instructions upon execution cause the system to:
compute a first value based on the first data value indicator; update an entry of the estimator based on the first value, to reflect the addition of the first data value indicator to the deduplication data store; compute a second value based on the second data value indicator; and update the entry of the estimator based on the second value, to reflect the removal of the second data value indicator from the deduplication data store (Dangi, [Col. 3 lines 47 – 66]: In some embodiments, a first sequence is generated using a first prime number and the initialization parameter. In some embodiments, a second sequence is generated using a second prime number and the initialization parameter. In some embodiments, the first prime number and the second prime number are selected based on a revision parameter that is received. In some embodiments, each of the first prime number and the second prime number is selected from a constrained modified set of prime numbers. A data stream is generated by merging (e.g., interleaving) the first sequence and the second sequence. In various embodiments, a "data stream" refers to a sequence of values that is determined by the merging (e.g., interleaving) of at least two sequences. In some embodiments, a data stream can be referred to as a "merged sequence." In some embodiments, a data stream is not deduplicatable. In various embodiments, a non-deduplicatable data stream comprises a data stream that does not include duplicate blocks of data (e.g., than can be identified by a deduplication system for a block size recognized by the deduplication system).).
Regarding claim 3, the non-transitory machine-readable storage medium of claim 2, wherein the updating of the entry of the estimator based on the first value comprises advancing a count in the entry, and wherein the updating of the entry of the estimator based on the second value comprises reversing the count in the entry (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment.  At each point, the deduplicated storage size for one or more volumes can be estimated by the following process. …  In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes.).
Regarding claim 4, the non-transitory machine-readable storage medium of claim 1, wherein the parameter represents how many unique data values are present in a portion of the deduplication data store (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. … If the same hash value appears in more than one sketch then it is united into a single hash value in step 630. In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes. In the example above (of 16 leading zeros), the multiplier would be 65,536. In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8. Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
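For illustration only, the arithmetic in the passage quoted from Harnik [0037] (a sketch holding two special hashes, a sampling ratio of 1/8, and an estimate of 16 data chunks) can be reproduced with a short Python sketch. The function names, the use of SHA-256, and the treatment of a "special" hash as one with at least k leading zero bits are assumptions made for this example; they are not taken from Harnik, Dangi, or the application.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a digest (hypothetical helper)."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def scale_sketch(special_hash_count: int, k: int) -> int:
    """Scale a count of special hashes by the inverse sampling ratio 2**k."""
    return special_hash_count * (2 ** k)


def estimate_unique_chunks(chunks, k: int) -> int:
    """Estimate distinct chunks by sampling hashes with >= k leading zero bits.

    Assumption: SHA-256 stands in for whatever hash a real system would use.
    """
    special = set()
    for chunk in chunks:
        digest = hashlib.sha256(chunk).digest()
        if leading_zero_bits(digest) >= k:
            special.add(digest)  # duplicates collapse into a single hash
    return scale_sketch(len(special), k)


# Two special hashes sampled at a ratio of 1/2**3 scale to 2 * 8 data chunks.
print(scale_sketch(2, 3))
```

With k = 16 the same scaling gives the multiplier of 65,536 mentioned in the quoted passage.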
Regarding claim 5, the non-transitory machine-readable storage medium of claim 1, wherein the parameter represents a similarity between a plurality of data volumes of the deduplication data store (Dangi [Col. 45 lines 11 – 24]: At 2106, a set of new data blocks of a plurality of data blocks associated with the modified data stream is identified relative to a plurality of data blocks associated with the data stream. In the deduplication process, only the data blocks of the modified data stream that are identified as not having been previously stored (e.g., at the test backup storage location) and are therefore a set of new data blocks, are stored. (For example, in the deduplication process, the modified data stream is segmented into data blocks of variable block sizes and each data block is compared against previously stored data and only those data blocks that are not duplicates of previously stored data are determined as new data to be stored).).
Regarding claim 6, the non-transitory machine-readable storage medium of claim 1, wherein the parameter represents how many unique data values are present in a first portion of the deduplication data store and not present in any other portion of the deduplication data store (Dangi [Col. 8 lines 60 – 62]: In some embodiments, a revision parameter is a revision value associated with a given "seed value" that uniquely maps to at least two prime numbers.).
Regarding claim 7, the non-transitory machine-readable storage medium of claim 1, wherein the parameter represents a deduplication ratio of a portion of the deduplication data store (Harnik [0014]: The storage controller may be optionally configured to estimate a combined compression and deduplication ratio by recording a compression ratio of the data chunks with hashes corresponding to the hashes in the at least one special hashes, calculating a space required to store the chunks corresponding to the at least one special hash, and producing a data reduction estimate in embodiments.).         
Regarding claim 8, the non-transitory machine-readable storage medium of claim 1, wherein the parameter represents how many physical blocks of the deduplication data store would be reclaimed responsive to a deletion of a portion of the deduplication data store (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. … If the same hash value appears in more than one sketch then it is united into a single hash value in step 630. In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes. In the example above (of 16 leading zeros), the multiplier would be 65,536. In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8. Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
Regarding claim 9, the non-transitory machine-readable storage medium of claim 1, wherein the estimator comprises an array of entries, and wherein each corresponding value of the respective values maps to a corresponding entry of the array of entries, and
wherein updating the estimator comprises updating a count in the corresponding entry based on the corresponding value, the count representing how many of the respective values map to the corresponding entry (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment.  At each point, the deduplicated storage size for one or more volumes can be estimated by the following process. …  In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes.).
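For illustration only, the claim 9 language (an array of entries in which each computed value maps to an entry whose count is updated) can be pictured with the following minimal Python sketch. The class name, the array size, and the use of SHA-256 to map an indicator to an entry are hypothetical choices for this example and do not come from the application or from the cited references.

```python
import hashlib


class CountingEstimator:
    """Hypothetical array of counts, one count per entry."""

    def __init__(self, num_entries: int = 16):
        self.counts = [0] * num_entries

    def entry_for(self, indicator: bytes) -> int:
        """Map an indicator to an entry index via a hash (an assumption)."""
        digest = hashlib.sha256(indicator).digest()
        return int.from_bytes(digest[:8], "big") % len(self.counts)

    def add(self, indicator: bytes) -> None:
        """Advance the count of the entry the indicator maps to."""
        self.counts[self.entry_for(indicator)] += 1

    def remove(self, indicator: bytes) -> None:
        """Reverse the count of the entry the indicator maps to."""
        index = self.entry_for(indicator)
        if self.counts[index] == 0:
            raise ValueError("removal without a matching addition")
        self.counts[index] -= 1


estimator = CountingEstimator()
estimator.add(b"indicator-1")
estimator.add(b"indicator-2")
estimator.remove(b"indicator-1")
print(sum(estimator.counts))  # total indicators still reflected in the array
```

Keeping a count per entry, rather than a single bit, is what lets a removal be reflected without rebuilding the array.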
Regarding claim 10, the non-transitory machine-readable storage medium of claim 9, wherein the corresponding value comprises a first value portion and a second value portion, the first value portion mapping to a column of the array of entries, and the second value portion mapping to a row of the array of entries (Dangi, [Col. 3 lines 47 – 66]: In some embodiments, a first sequence is generated using a first prime number and the initialization parameter. In some embodiments, a second sequence is generated using a second prime number and the initialization parameter. In some embodiments, the first prime number and the second prime number are selected based on a revision parameter that is received. In some embodiments, each of the first prime number and the second prime number is selected from a constrained modified set of prime numbers. A data stream is generated by merging (e.g., interleaving) the first sequence and the second sequence. In various embodiments, a "data stream" refers to a sequence of values that is determined by the merging (e.g., interleaving) of at least two sequences. In some embodiments, a data stream can be referred to as a "merged sequence." In some embodiments, a data stream is not deduplicatable. In various embodiments, a non-deduplicatable data stream comprises a data stream that does not include duplicate blocks of data (e.g., than can be identified by a deduplication system for a block size recognized by the deduplication system).).
Regarding claim 11, the non-transitory machine-readable storage medium of claim 10, wherein a position of a first entry containing a non-zero count in the array of entries indicates a first estimate of how many unique data values are in a portion of the deduplication data store (Dangi [Col. 8 lines 60 – 62]: In some embodiments, a revision parameter is a revision value associated with a given "seed value" that uniquely maps to at least two prime numbers.).
Regarding claim 12, the non-transitory machine-readable storage medium of claim 11, wherein a position of a second entry containing a non-zero count in the array of entries indicates a second estimate of how many unique data values are in the portion of the deduplication data store, wherein the computing of the parameter is based on the first estimate and the second estimate (Dangi [Col. 8 lines 60 – 62]: In some embodiments, a revision parameter is a revision value associated with a given "seed value" that uniquely maps to at least two prime numbers.).

Regarding claim 13, the non-transitory machine-readable storage medium of claim 1, wherein the respective values comprise hash values, and wherein a quantity of trailing zeros in each hash value of the hash values indicates an estimate of how many unique data values are in a portion of the deduplication data store (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. … If the same hash value appears in more than one sketch then it is united into a single hash value in step 630. In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes. In the example above (of 16 leading zeros), the multiplier would be 65,536. In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8. Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).

Regarding claim 14, the non-transitory machine-readable storage medium of claim 13, wherein the estimator tracks hash values with quantities of trailing zeros within a specified range, and the estimator does not track hash values with quantities of trailing zeros outside the specified range (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. … If the same hash value appears in more than one sketch then it is united into a single hash value in step 630. In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes. In the example above (of 16 leading zeros), the multiplier would be 65,536. In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8. Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
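For illustration only, the trailing-zero limitations recited in claims 13 and 14 can be pictured with a short Python sketch. It assumes a uniformly distributed hash, for which a value with r trailing zero bits is expected about once in every 2^(r+1) values, so observing one loosely suggests on the order of 2^r distinct inputs. The function names, the 64-bit truncated SHA-256 hash, and the range bounds are hypothetical and are not taken from the application or the cited references.

```python
import hashlib


def hash_value(data: bytes) -> int:
    """64-bit hash of the input (truncated SHA-256; an assumption)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def trailing_zeros(value: int, width: int = 64) -> int:
    """Number of trailing zero bits; an all-zero value counts as `width`."""
    if value == 0:
        return width
    # value & -value isolates the lowest set bit.
    return (value & -value).bit_length() - 1


def tracked_row(value: int, low: int, high: int):
    """Return the trailing-zero count if it lies in [low, high], else None.

    Values outside the specified range are simply not tracked.
    """
    zeros = trailing_zeros(value)
    return zeros if low <= zeros <= high else None


print(trailing_zeros(0b101000))      # three trailing zero bits
print(tracked_row(0b101000, 2, 5))   # inside the range, so tracked
print(tracked_row(0b101001, 2, 5))   # zero trailing zeros, so not tracked
```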
Regarding claim 15, Dangi teaches, a system comprising: 
a processor (Dangi [Col. 3 line 11]: processor); and 
a non-transitory storage medium storing instructions executable on the processor to: compute respective values for corresponding storage location indicators added to and removed from a deduplication data store in which duplicated data values have been eliminated, wherein each respective storage location indicator of the storage location indicators represents a respective storage location of a unique data value in the deduplication data store (Dangi [Col. 7 lines 20 – 43]: “After the data stream is stored by storage deduplication server 102, the stored data stream may be restored. (For example, restoring a stored data stream includes reconstructing the data stream using the stored data blocks and/or references to stored data blocks associated with the data stream.) … In various embodiments, the restored data stream itself can be used to verify its correctness without requiring a master copy of the original data stream and/or the parameters used to generate the original data stream, thereby eliminating the need to maintain a master copy of the data stream for verification purposes. In various embodiments, a data stream can be verified in the same manner regardless if the data stream is compressible or non-compressible.”); 
update an estimator based on the respective values, to reflect an addition of a first storage location indicator to the deduplication data store and a removal of a second storage location indicator from the deduplication data store (Dangi [Col. 36 lines 48 – 67]: Each instance of such a modified data stream associated with a particular change rate parameter is generated using an additional parameter that is referred to as a "change rate revision parameter," in some embodiments.  Note that the "change rate revision parameter" used with a change rate parameter, as will be described below, is different from the "revision parameter" corresponding to a seed value that was used to select at least two prime numbers to use to generate a merged data stream, as described above. As shown earlier, a data stream can be seen as a sequence of blocks.  Any minute change (a "corruption") within a block of the data stream can result in a modified block.  Even if one bit is changed, the modified block is determined to be a new block to a deduplicating storage server.  A block can be modified in various ways.  In the examples described below, the block size of each data block of a data stream is 8 KiB (e.g., because the average block size is 8 KiB (8,192 bytes) in a deduplicating storage server) and the data stream comprises of alternating 32-bit (4 byte) values each from one of two different sequences.); and 
Dangi does not clearly teach, compute, using the updated estimator, a parameter relating to data deduplication at the deduplication data store. However, Harnik [0014] teaches, “The storage controller may be optionally configured to estimate a combined compression and deduplication ratio by recording a compression ratio of the data chunks with hashes corresponding to the hashes in the at least one special hashes, calculating a space required to store the chunks corresponding to the at least one special hash, and producing a data reduction estimate in embodiments.”
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to incorporate the teaching of Harnik into Dangi’s system by adding the feature of a deduplication ratio. The references (Dangi and Harnik) are analogous art directed to the same field of endeavor, namely data storage. An ordinarily skilled artisan would have been motivated to do so to provide Dangi’s system with enhanced data (see Harnik [Abstract], [0014], [0030 – 32], [0041]). One of the biggest advantages of network machine learning algorithms is their ability to improve over time. Machine learning technology typically improves efficiency and accuracy thanks to the ever-increasing amounts of data that are processed.
Regarding claim 16, the system of claim 15, wherein the estimator comprises an array of entries, each respective entry of the array of entries comprising a respective counter, and wherein the instructions are executable on the processor to: 
increment a counter in a given entry of the array of entries in response to the addition of the first storage location indicator, wherein a first value computed for the first storage location indicator maps to the given entry; and decrement the counter in the given entry in response to the removal of the second storage location indicator, wherein a second value computed for the second storage location indicator maps to the given entry  (Dangi, [Col. 3 lines 47 – 66]: In some embodiments, a first sequence is generated using a first prime number and the initialization parameter.  In some embodiments, a second sequence is generated using a second prime number and the initialization parameter.  In some embodiments, the first prime number and the second prime number are selected based on a revision parameter that is received.  In some embodiments, each of the first prime number and the second prime number is selected from a constrained modified set of prime numbers.  A data stream is generated by merging (e.g., interleaving) the first sequence and the second sequence.  In various embodiments, a "data stream" refers to a sequence of values that is determined by the merging (e.g., interleaving) of at least two sequences.  In some embodiments, a data stream can be referred to as a "merged sequence." In some embodiments, a data stream is not deduplicatable.  In various embodiments, a non-deduplicatable data stream comprises a data stream that does not include duplicate blocks of data (e.g., than can be identified by a deduplication system for a block size recognized by the deduplication system).
Regarding claim 17, the system of claim 16, wherein the first value comprises a first portion and a second portion, the first portion mapping to a column of the array of entries, and a quantity of trailing zeros in the second portion mapping to a row of the array of entries (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. …  If the same hash value appears in more than one sketch then it is united into a single hash value in step 630.  In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes.  In the example above (of 16 leading zeros), the multiplier would be 65,536.  In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8.  Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
Regarding claim 18, the system of claim 16, wherein the computing of the parameter relating to data deduplication at the deduplication data store is based on non-zero counts maintained by the counters in the array of entries (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment.  At each point, the deduplicated storage size for one or more volumes can be estimated by the following process. …  In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes.). 
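For illustration only, the counter arrangement recited in claims 16 – 18 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application, Dangi, or Harnik: the class name CountingEstimator, the use of SHA-256, the choice of the low-order bits as the "first portion", and the array dimensions are all hypothetical.

```python
import hashlib


class CountingEstimator:
    """Hypothetical array of counters indexed by (row, column)."""

    def __init__(self, column_bits=4, rows=32):
        # column_bits and rows are illustrative sizes, not from the record.
        self.column_bits = column_bits
        self.rows = rows
        self.counters = [[0] * (1 << column_bits) for _ in range(rows)]

    def _entry(self, indicator: bytes):
        # Value computed for the storage location indicator (assumed: SHA-256).
        h = int.from_bytes(hashlib.sha256(indicator).digest()[:8], "big")
        # First portion of the value selects the column.
        column = h & ((1 << self.column_bits) - 1)
        # Second portion: its count of trailing zeros selects the row.
        rest = h >> self.column_bits
        if rest == 0:
            row = self.rows - 1
        else:
            trailing_zeros = (rest & -rest).bit_length() - 1
            row = min(trailing_zeros, self.rows - 1)
        return row, column

    def add(self, indicator: bytes):
        # Addition of an indicator increments the mapped counter.
        row, column = self._entry(indicator)
        self.counters[row][column] += 1

    def remove(self, indicator: bytes):
        # Removal of an indicator decrements the mapped counter.
        row, column = self._entry(indicator)
        self.counters[row][column] -= 1

    def nonzero_entries(self):
        # Count of entries whose counter is non-zero (cf. claim 18).
        return sum(1 for r in self.counters for v in r if v != 0)
```

In this sketch, adding an indicator and later removing the same indicator returns the mapped counter to its prior value, so the set of non-zero counters reflects only indicators still present.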
Regarding claim 19, Dangi teaches a method of a system comprising a hardware processor, comprising:
computing respective hash values based on content of corresponding storage location indicators added to and removed from a deduplication data store in which duplicated data values have been eliminated (Dangi [Col. 1 line 5]: Conventionally, testing data is generated by hashing and/or cryptography techniques.), wherein each respective storage location indicator of the storage location indicators represents a respective storage location of a unique data value in the deduplication data store (Dangi [Col. 7 lines 20 – 43]: “After the data stream is stored by storage deduplication server 102, the stored data stream may be restored. (For example, restoring a stored data stream includes reconstructing the data stream using the stored data blocks and/or references to stored data blocks associated with the data stream.) … In various embodiments, the restored data stream itself can be used to verify its correctness without requiring a master copy of the original data stream and/or the parameters used to generate the original data stream, thereby eliminating the need to maintain a master copy of the data stream for verification purposes. In various embodiments, a data stream can be verified in the same manner regardless if the data stream is compressible or non-compressible.”); 
updating a probabilistic cardinality estimator based on the respective hash values, to reflect an addition of a first storage location indicator to the deduplication data store and a removal of a second storage location indicator from the deduplication data store (Dangi [Col. 36 lines 48 – 67]: Each instance of such a modified data stream associated with a particular change rate parameter is generated using an additional parameter that is referred to as a "change rate revision parameter," in some embodiments.  Note that the "change rate revision parameter" used with a change rate parameter, as will be described below, is different from the "revision parameter" corresponding to a seed value that was used to select at least two prime numbers to use to generate a merged data stream, as described above. As shown earlier, a data stream can be seen as a sequence of blocks.  Any minute change (a "corruption") within a block of the data stream can result in a modified block.  Even if one bit is changed, the modified block is determined to be a new block to a deduplicating storage server.  A block can be modified in various ways.  In the examples described below, the block size of each data block of a data stream is 8 KiB (e.g., because the average block size is 8 KiB (8,192 bytes) in a deduplicating storage server) and the data stream comprises of alternating 32-bit (4 byte) values each from one of two different sequences.), 
wherein the probabilistic cardinality estimator comprises counters, and the updating comprises advancing a first counter of the counters responsive to the addition of the first storage location indicator, and reversing the first counter responsive to the removal of the second storage location indicator (Dangi, [Col. 3 lines 47 – 66]: In some embodiments, a first sequence is generated using a first prime number and the initialization parameter.  In some embodiments, a second sequence is generated using a second prime number and the initialization parameter.  In some embodiments, the first prime number and the second prime number are selected based on a revision parameter that is received.  In some embodiments, each of the first prime number and the second prime number is selected from a constrained modified set of prime numbers.  A data stream is generated by merging (e.g., interleaving) the first sequence and the second sequence.  In various embodiments, a "data stream" refers to a sequence of values that is determined by the merging (e.g., interleaving) of at least two sequences.  In some embodiments, a data stream can be referred to as a "merged sequence." In some embodiments, a data stream is not deduplicatable.  In various embodiments, a non-deduplicatable data stream comprises a data stream that does not include duplicate blocks of data (e.g., than can be identified by a deduplication system for a block size recognized by the deduplication system)); and
Dangi does not clearly teach computing, using the updated probabilistic cardinality estimator, a parameter relating to data deduplication at the deduplication data store. However, Harnik [0014] teaches, “The storage controller may be optionally configured to estimate a combined compression and deduplication ratio by recording a compression ratio of the data chunks with hashes corresponding to the hashes in the at least one special hashes, calculating a space required to store the chunks corresponding to the at least one special hash, and producing a data reduction estimate in embodiments.”
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to incorporate the teaching of Harnik into Dangi’s system by adding the feature of computing a deduplication ratio. The references (Dangi and Harnik) are analogous art because they are directed to the same field of endeavor, namely data storage. An ordinarily skilled artisan would have been motivated to do so to provide Dangi’s system with enhanced data. (See Harnik [Abstract], [0014], [0030 – 32], [0041]). One of the biggest advantages of network machine learning algorithms is their ability to improve over time. Machine learning technology typically improves efficiency and accuracy thanks to the ever-increasing amounts of data that are processed.
Regarding claim 20, the method of claim 19, further comprising:
computing a further probabilistic cardinality estimator updated responsive to hash values computed based on data values stored in the deduplication data store, the further probabilistic cardinality estimator when compared to another probabilistic cardinality estimator based on data values stored in another deduplication data store providing an indication of similarity between the deduplication data store and the another deduplication data store (Harnik [0037]: FIG. 6 illustrates a program flow for estimating the deduplicated storage size, according to an embodiment. …  If the same hash value appears in more than one sketch then it is united into a single hash value in step 630.  In step 640, the system counts the space it takes to store the data chunks of all the special hashes seen in the combined sketch, and multiplies this space by the ratio between all possible hashes and the set of all possible special hashes.  In the example above (of 16 leading zeros), the multiplier would be 65,536.  In the example of FIG. 3, the sketch holds two values (i.e., two hash values corresponding to special hashes) and the ratio is 1/2^3 = 1/8.  Thus the estimated physical size to store the entire volume hashes would be 2*8=16 data chunks.).
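For illustration only, the arithmetic in the cited passage of Harnik [0037] (steps 630 and 640) can be reproduced with the short sketch below. The function name estimate_chunks and its parameters are hypothetical, and the sketch assumes equal-sized data chunks so that space can be counted in chunks. It is not presented as Harnik's implementation.

```python
def estimate_chunks(sketches, zero_bits):
    """Estimate the deduplicated size, in data chunks, from sketches.

    sketches:  iterable of sets of special hash values (one set per volume).
    zero_bits: number of fixed zero bits that defines a special hash,
               so special hashes are 1/2**zero_bits of all possible hashes.
    """
    # Step 630 (as quoted): a hash appearing in several sketches is
    # united into a single hash value.
    combined = set().union(*sketches)
    # Step 640 (as quoted): scale by the ratio between all possible
    # hashes and all possible special hashes.
    multiplier = 2 ** zero_bits
    return len(combined) * multiplier


# FIG. 3 example from the quoted text: two special hashes, ratio 1/2^3.
print(estimate_chunks([{0x01, 0x02}], zero_bits=3))  # prints 16

# Example of 16 leading zeros from the quoted text: multiplier 65,536.
print(2 ** 16)  # prints 65536
```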
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Theimer, US 2020/0104175, Parameter variations for computations using a remote repository
Jain, US 2019/0340261, Policy based Data Deduplication
Wu, US 2018/0314435, Deduplication Processing Method, and Storage Device
Voruganti, US 2015/0066873, Policy Based Deduplication Techniques
Leppard, US 2011/0238635, Combining Hash based Duplication with sub-block differencing to deduplicate data
Hu, US 10,037,336, Performing block deduplication using block sequence classifications
Klose, US 9,535,776, Dataflow alerts for an information management system
Johnston, US 9,152,333, System and Method for estimating storage savings from deduplication

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SABA AHMED whose telephone number is (571)270-0236.  The examiner can normally be reached on MON – FRI: 9AM – 5PM EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam, can be reached at 571-272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SABA AHMED/
Examiner, Art Unit 2154

/HOSAIN T ALAM/
Supervisory Patent Examiner, Art Unit 2154