DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on January 2, 2018 and May 10, 2019 are being considered by the examiner.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1-3, 6, 7, 9, 10, 12-16 and 18-20 are rejected under 35 U.S.C. 102(a)(1)/(a)(2) as being anticipated by Rhodes (U.S. Publication No. 2015/0269178 A1, hereinafter referred to as “Rhodes”).

Regarding claim 1, Rhodes discloses a computer system configured to correct bias when estimating small numbers of distinct values in a multiset data set, the computer system comprising: (e.g., abstract, figure 8 and paragraph [0020]) one or more processors; one or more computer readable media coupled to the one or more processors, wherein the one or more processors and one or more computer readable media are configured to implement: (e.g., figure 8 and paragraphs [0064]-[0066])
a distinct value estimator, wherein the distinct value estimator is configured to estimate a number of distinct values in a multiset data set; (HLL and fixed-size bucket techniques are used in correlation with other techniques for different ranges of cardinality to yield an acceptable error rate and error variation to produce an enhanced accuracy cardinality estimation for a large data set – sketches are used for estimating the cardinality of unique values for big data)(e.g., abstract and paragraphs [0019] and [0021])
a bias table, corresponding specifically to the distinct value estimator with values in the bias table derived from a specific configuration for the distinct value estimator, the bias table having entries with values corresponding to biases caused by the specific configuration of the distinct value estimator correlated to values corresponding to numbers estimated by the distinct value estimator; wherein the entries in the table are optimized by having a set of entries with an optimized number of biases in the entries and where the biases in the entries are associated with a predetermined confidence interval; (another HLL enhancement technique involves employing a table lookup and regression between empty bin estimate range and high estimate range. In the adjustment range, an initial cardinality estimate is used as a lookup value into a lookup table. The lookup table may be any memory storage that a computer system may store in or retrieve data from. The lookup table may contain a set of lookup values and a set of mapped values, where one or more lookup values correspond to one or more mapped values. In an embodiment, the lookup table may not actually store the mapped values but rather derive them from an index of each stored lookup value. - “lookup table” is considered to be an optimized bias table with an optimized number of biases in the entries of the table, which is derived from HLL, being a specific configuration of a distinct value estimator. Estimated result, along with upper and lower bounds of that estimate may be reported to the user)(e.g., paragraphs [0019] and [0028])
a bias corrector coupled to the distinct value estimator and the bias table, wherein the bias corrector is configured to correct the number of distinct values in the multiset data set estimated by the distinct value estimator using values from the bias table to produce a corrected value; and (cardinality estimator calculates an enhanced accuracy cardinality for a large data set based on a fixed-size bucket or a generated sketch, such as a bin array. Enhanced accuracy is considered as a corrected value)(e.g., figure 2 and paragraphs [0037]-[0038])
a user interface coupled to the bias corrector, wherein the user interface is configured to output the corrected value to a user. (user interface includes display of the enhanced accuracy.)(e.g., figures 1C, 5 and 6 and paragraph [0032])
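
For illustration only, the relationship among the recited estimator output, bias table and bias corrector can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Rhodes or from the application: the function name correct_estimate, the table layout (pairs of raw estimate and bias) and the numeric values in HYPOTHETICAL_BIAS_TABLE are all hypothetical, and linear interpolation between entries is only one possible way to use such a table.

```python
from bisect import bisect_left

# Hypothetical table: (raw estimate, bias at that raw estimate), sorted by raw estimate.
# The numbers are invented for illustration and do not come from any reference.
HYPOTHETICAL_BIAS_TABLE = [(10.0, 4.0), (20.0, 3.0), (40.0, 1.0), (80.0, 0.0)]


def correct_estimate(raw, table):
    """Subtract a table-derived bias from a raw distinct-value estimate."""
    keys = [k for k, _ in table]
    if raw <= keys[0]:
        bias = table[0][1]           # clamp below the first entry
    elif raw >= keys[-1]:
        bias = table[-1][1]          # clamp above the last entry
    else:
        i = bisect_left(keys, raw)   # first entry with key >= raw
        (x0, b0), (x1, b1) = table[i - 1], table[i]
        # Linear interpolation between the two neighbouring entries.
        bias = b0 + (b1 - b0) * (raw - x0) / (x1 - x0)
    return raw - bias


corrected = correct_estimate(30.0, HYPOTHETICAL_BIAS_TABLE)
```

Under these assumed values, a raw estimate of 30 falls midway between the entries at 20 and 40, so the interpolated bias is 2 and the corrected value is 28.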

Regarding claim 2, Rhodes discloses the computer system of claim 1. Rhodes further discloses wherein the user interface comprises a user interface of a database system. (e.g., figure 1C and paragraph [0032])


Regarding claim 3, Rhodes further discloses the computer system further comprising: one or more additional distinct value estimators, wherein each of the additional distinct value estimators is configured to estimate a number of distinct values in a different multiset data set; and (HLL sketches exist for different sizes)(e.g., paragraphs [0019], [0021], [0054]-[0055])
one or more additional bias tables, each corresponding specifically to one of the additional distinct value estimators with values in the bias table derived from a specific configuration for the distinct value estimator. (one or more lookup tables for enhancing an accuracy of a cardinality estimate corresponding to different size bin arrays)(e.g., paragraphs [0054]-[0055])

Regarding claim 6, Rhodes discloses the computer system of claim 3. Rhodes further discloses wherein characteristics of the plurality of distinct value estimators are based on a number of multiset data sets for which estimations of a number of distinct values needs to be performed. (HLL sketches with a bin array are generated. A hash value, a transformed value from a large data set, with a size of B bits, is received.)(e.g., figure 4 and paragraphs [0023], [0045] and [0049])

Regarding claim 7, Rhodes discloses the computer system of claim 3. Rhodes further discloses wherein characteristics of the plurality of distinct value estimators are based on resources available at the computer system. (bounded size is important to plan for a maximum memory or disk storage capacity independent of how large the input data stream from the big data becomes.)(e.g., paragraph [0019]).

Regarding claim 9, Rhodes further discloses wherein the distinct value estimator is a HyperLogLog (HLL) estimator. (HLL estimator is used as the distinct value estimator)(e.g., paragraphs [0021], [0023] and [0024])
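
For context, a generic HyperLogLog estimator can be sketched as below. This is a textbook-style sketch and not the implementation of Rhodes or of the application: the class name, the use of SHA-1 as the hash, and the default of 2^10 registers are assumptions made for illustration. The sketch returns only the raw (uncorrected) estimate, which is the quantity a bias table would then adjust.

```python
import hashlib


class HyperLogLog:
    """Minimal HyperLogLog sketch with m = 2**p registers (illustrative only)."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash taken from SHA-1 (an assumption; any well-mixed hash would do).
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                  # first p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits (1-based).
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def raw_estimate(self):
        # Standard bias constant, valid for m >= 128.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        return alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)


sketch = HyperLogLog(p=10)
for value in [1, 2, 2, 3, 3, 3]:   # a multiset: duplicates do not change the sketch
    sketch.add(value)
raw = sketch.raw_estimate()
```

Note that with p = 10 an empty sketch already yields a raw estimate of roughly 738 rather than 0, which illustrates why the uncorrected estimate is unreliable for small numbers of distinct values.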

Regarding claim 10, Rhodes discloses the computer system of claim 1. Rhodes further discloses wherein the set of entries with an optimized number of biases in the entries is optimized by having entries that result in a minimization of differences between adjacent biases in the entries. (lookup values or the mapped values may be based on a lookup table where the spacing of the values of adjustment range are such that the ratio of adjacent values in the table are a constant)(e.g., paragraph [0028]).
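
The cited spacing, in which the ratio of adjacent lookup values is a constant, corresponds to a geometric grid. The following is a minimal sketch of that idea only; the function names and the parameters start, ratio and count are hypothetical and are not taken from Rhodes. It also shows how an index can be derived from a value by a logarithm rather than by search.

```python
import math


def geometric_lookup_values(start, ratio, count):
    """Lookup values whose adjacent entries differ by a constant ratio."""
    return [start * ratio ** i for i in range(count)]


def index_for(estimate, start, ratio, count):
    """Index of the largest lookup value <= estimate, clamped to the table.

    Exactly on a grid point, floating-point rounding of the logarithm may
    select the neighbouring index; a production table would guard for that.
    """
    if estimate <= start:
        return 0
    i = int(math.floor(math.log(estimate / start) / math.log(ratio)))
    return min(i, count - 1)


values = geometric_lookup_values(100, 2, 5)   # [100, 200, 400, 800, 1600]
slot = index_for(500, 100, 2, 5)              # 500 lies between 400 and 800
```

Because the index is computable from the value, such a table need not be searched entry by entry.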

Regarding claim 12, Rhodes discloses the computer system of claim 1. Rhodes further discloses wherein the distinct value estimator is implemented in system memory of the computer system. (e.g., paragraphs [0065], [0068] and [0069])

Regarding claim 13, Rhodes discloses a method of making a computer system configured to correct bias when estimating small numbers of distinct values in multiset data sets, the method comprising: (e.g., abstract, figure 8 and paragraphs [0020] and [0064]-[0066])
implementing a distinct value estimator, wherein the distinct value estimator is configured to estimate a number of distinct values in a multiset data set; (HLL and fixed-size bucket techniques are used in correlation with other techniques for different ranges of cardinality to yield an acceptable error rate and error variation to produce an enhanced accuracy cardinality estimation for a large data set – sketches are used for estimating the cardinality of unique values for big data)(e.g., abstract and paragraphs [0019] and [0021])
coupling a bias table to the distinct value estimator, wherein the bias table corresponds specifically to the distinct value estimator with values in the bias table derived from a specific configuration for the distinct value estimator, the bias table having entries with values corresponding to biases caused by the specific configuration of the distinct value estimator correlated to values corresponding to numbers estimated by the distinct value estimator; wherein the entries in the table are optimized by having a set of entries with an optimized number of biases in the entries and where the biases in the entries are associated with a predetermined confidence interval; (another HLL enhancement technique involves employing a table lookup and regression between empty bin estimate range and high estimate range. In the adjustment range, an initial cardinality estimate is used as a lookup value into a lookup table. The lookup table may be any memory storage that a computer system may store in or retrieve data from. The lookup table may contain a set of lookup values and a set of mapped values, where one or more lookup values correspond to one or more mapped values. In an embodiment, the lookup table may not actually store the mapped values but rather derive them from an index of each stored lookup value. - “lookup table” is considered to be an optimized bias table with an optimized number of biases in the entries of the table, which is derived from HLL, being a specific configuration of a distinct value estimator. Estimated result, along with upper and lower bounds of that estimate may be reported to the user)(e.g., paragraphs [0019] and [0028])
coupling a bias corrector to the distinct value estimator and the bias table, wherein the bias corrector is configured to correct the number of distinct values in the multiset data set using values from the bias table to produce a corrected value; and (cardinality estimator calculates an enhanced accuracy cardinality for a large data set based on a fixed-size bucket or a generated sketch, such as a bin array. Enhanced accuracy is considered as a corrected value)(e.g., figure 2 and paragraphs [0037]-[0038])
coupling a user interface to the bias corrector, wherein the user interface is configured to output the corrected value to a user. (user interface includes display of the enhanced accuracy.)(e.g., figures 1C, 5 and 6 and paragraph [0032])

Regarding claim 14, Rhodes discloses the method of claim 13. Rhodes further discloses wherein coupling a user interface to the bias corrector comprises coupling a user interface of a database system to the bias corrector to allow a database to provide unbiased estimates of small numbers of distinct values. (initial estimation is provided)(e.g., figure 1C and paragraphs [0032] and [0051]).

Claims 15, 16 and 18 recite limitations substantially similar to those of claims 9, 10 and 12, respectively; therefore, they are rejected for the same reasons.

Regarding claim 19, Rhodes discloses a method of outputting estimates of small numbers of distinct values from a database system, the method comprising: (an accurate cardinality estimate for low cardinality counts is produced)(e.g., paragraphs [0020]-[0021])
implementing a sketch by applying a multiset data set to a distinct value estimator stored in memory of a computing system, wherein the distinct value estimator is configured to estimate a number of distinct values in the multiset data set; (e.g., abstract and paragraphs [0035]-[0040])
receiving an estimate of a number of distinct values in the multiset data set from the distinct value estimator; (initial cardinality estimate value is retrieved. The initial value may be based on a cardinality estimate produced by an HLL sketch)(e.g., paragraph [0060])
obtaining a bias corresponding to the estimate from a bias table, the bias table corresponding specifically to the distinct value estimator with values in the bias table derived from a specific configuration for the distinct value estimator, the bias table having entries with values corresponding to biases caused by the specific configuration of the distinct value estimator correlated to values corresponding to numbers estimated by the distinct value estimator; wherein the entries in the table are optimized by having a set of entries with an optimized number of biases in the entries and where the biases in the entries are associated with predetermined confidence intervals; (HLL enhancement technique involves employing a table lookup and regression between empty bin estimate range and high estimate range. In the adjustment range, an initial cardinality estimate is used as a lookup value into a lookup table. The lookup table may contain a set of lookup values and a set of mapped values. The lookup table is considered as an optimized bias table with an optimized number of biases in the entries of the table, which is derived from the HLL, being a specific configuration of a distinct value estimator. With a bounded, well defined, and queryable error distribution, it is possible to report to the user not only any estimated result, but the upper and lower bounds of that estimate, based on a confidence interval.)(e.g., paragraphs [0019] and [0028])
correcting the estimate of some of the distinct values using the bias corresponding to the estimate; and (cardinality estimator calculates an enhanced accuracy cardinality for a large data set based on a fixed-size bucket or a generated sketch, such as bin array. Enhanced accuracy is considered as a corrected value)(e.g., figure 2 and paragraphs [0037] and [0038])
outputting the corrected estimate at a user interface of the database system. (enhanced accuracy is output at a user interface)(e.g., figure 1C and paragraph [0032]).

Regarding claim 20, Rhodes discloses the method of claim 19. Rhodes further discloses further comprising receiving user input at the user interface requesting a distinct value count, and wherein the method acts are performed in response to receiving the user input. (initial estimation is provided)(e.g., figure 1C and paragraphs [0032] and [0051]).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.

Claims 4-5 are rejected under 35 U.S.C. 103 as being unpatentable over Rhodes in view of McElhinney et al. (U.S. Publication No. 2018/0039956 A1, hereinafter referred to as “McElhinney”).
Regarding claim 4, Rhodes discloses the computer system of claim 3. Rhodes discloses error distribution, error rate and error variation (e.g., paragraphs [0020]-[0021] and [0032]); however, Rhodes does not explicitly disclose wherein each of the distinct value estimators have the same precision, such that distinct values are estimated with the same precision for each of the different multiset data sets.
On the other hand, McElhinney, which relates to recommending asset repairs (title), does disclose wherein each of the distinct value estimators have the same precision, such that distinct values are estimated with the same precision for each of the different multiset data sets. (recommendation or count can be the same level of precision)(e.g., paragraphs [0006], [0009], [0019], [0113] and [0116]).
Rhodes relates to techniques for improving accuracy of analytics on big data using sketches and fixed-sized buckets. E.g., abstract. In Rhodes, different estimators are used in calculating the distinct values and calculating a cardinality estimate. However, Rhodes does not appear to specifically disclose that the estimators have the same precision. On the other hand, McElhinney provides that it is known in the art of statistics to consider precision when calculating and making recommendations. McElhinney provides that different recommendations 

Regarding claim 5, Rhodes discloses the computer system of claim 3. Rhodes discloses error distribution, error rate and error variation (e.g., paragraphs [0020]-[0021] and [0032]); however, Rhodes does not explicitly disclose wherein at least two of the distinct value estimators have different precisions. 
On the other hand, McElhinney, which relates to recommending asset repairs (title), does disclose wherein at least two of the distinct value estimators have different precisions. (recommendation or count can be different levels of precision having different precisions)(e.g., paragraphs [0006], [0009], [0019], [0113] and [0116]).
It would have been obvious to combine McElhinney with Rhodes for the same reasons as set forth in claim 4, above.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Rhodes in view of Kipnis et al. (U.S. Publication No. 2013/0254441 A1, hereinafter referred to as “Kipnis”).
Regarding claim 8, Rhodes discloses the computer system of claim 3. However, Rhodes does not appear to specifically disclose wherein characteristics of the plurality of distinct value estimators are based on an ability to compress one or more sketches created using one or more distinct value estimators. 
On the other hand, Kipnis, which relates to processing data based upon estimated compressibility of the data (title), does disclose wherein characteristics of the plurality of distinct value estimators are based on an ability to compress one or more sketches created using one or more distinct value estimators. (distinct hash values are related to the estimated compression ratio.)(e.g., abstract and paragraphs [0029], [0030], [0032] and [0034]).
Rhodes relates to techniques for improving accuracy of analytics on big data using sketches and fixed-sized buckets. E.g., abstract. In Rhodes, different estimators are used in calculating the distinct values and calculating a cardinality estimate. However, Rhodes does not appear to specifically disclose that characteristics of the plurality of distinct value estimators are based on an ability to compress one or more sketches created using one or more distinct value estimators. On the other hand, Kipnis, which relates to processing data based upon estimated compressibility of the data (title), does disclose that it is known for characteristics of distinct values to be based on compressibility. Kipnis provides that compression of a large data set can be time-intensive and can consume significant bandwidth. E.g., paragraph [0004]. Kipnis further provides that compression may not be warranted when the data set has only a small amount of redundancy. Therefore, it would have been obvious to incorporate the compressibility as disclosed in Kipnis into Rhodes to improve the manner in which the estimators of Rhodes determine the distinct values.

Claims 11 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Rhodes in view of Cohen et al. (U.S. Patent No. 8,140,539 B1, hereinafter referred to as “Cohen”).
Regarding claim 11, Rhodes discloses the computer system of claim 1. Rhodes discloses a confidence interval (e.g., paragraphs [0019] and [0056]); however, Rhodes does not appear to specifically disclose wherein the predetermined confidence interval is a maximized confidence interval. 
On the other hand, Cohen, which relates to determining dataset estimators (e.g., title), does disclose wherein the predetermined confidence interval is a maximized confidence interval. (an upper bound and a lower bound of each confidence interval can be calculated based on expectation of a sum of independent Poisson trials.)(col 32 lines 10-19).
Rhodes relates to techniques for improving accuracy of analytics on big data using sketches and fixed-sized buckets. E.g., abstract. In Rhodes, different estimators are used in calculating the distinct values and calculating a cardinality estimate. Rhodes further discloses a confidence interval (e.g., paragraphs [0019] and [0056]); however, Rhodes does not appear to specifically disclose that the predetermined confidence interval is a maximized confidence interval. On the other hand, Cohen, which also relates to estimators (title), does disclose that the predetermined confidence interval is maximized based on Poisson trials. This provides an enhanced manner to ensure the estimation is performed to optimize the system. Therefore, it would have been obvious to incorporate the maximized confidence interval as disclosed in Cohen to Rhodes to further enhance Rhodes by ensuring that the interval is optimized to improve estimation.

Regarding claim 17, Rhodes does not appear to specifically disclose further comprising performing a determined number of experiments using the distinct value estimator to maximize confidence intervals of biases in the bias table.
On the other hand, Cohen, which relates to determining dataset estimators (e.g., title), does disclose further comprising performing a determined number of experiments using the distinct value estimator to maximize confidence intervals of biases in the bias table. (an upper bound and a lower bound of each confidence interval can be calculated based on expectation of a sum of independent Poisson trials.)(col 32 lines 10-19).
It would have been obvious to combine Cohen with Rhodes for the same reasons as stated in claim 11, above.
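
For illustration only, the notion of performing a determined number of experiments with an estimator to populate a bias table, with a confidence interval attached to each bias, can be sketched as follows. This is one possible reading and is not taken from Cohen, Rhodes or the application: the function name build_bias_table, its parameters, the keying of the table by true cardinality, and the normal-approximation 95% interval are all assumptions.

```python
import random
import statistics


def build_bias_table(estimate, cardinalities, trials=30, seed=0):
    """Return {cardinality: (mean_bias, ci_half_width)} from repeated experiments.

    `estimate` is any callable that maps a list of items to a float estimate
    of the number of distinct items (a hypothetical interface).
    """
    rng = random.Random(seed)
    table = {}
    for n in cardinalities:
        biases = []
        for _ in range(trials):
            # Fresh set of n distinct random values for each experiment.
            items = rng.sample(range(10**9), n)
            biases.append(estimate(items) - n)
        mean_bias = statistics.mean(biases)
        # Normal-approximation 95% half-width; it narrows as trials increases.
        half_width = 1.96 * statistics.stdev(biases) / trials ** 0.5
        table[n] = (mean_bias, half_width)
    return table


# Toy estimator with a known constant bias of +5, used only to exercise the code.
def toy_estimator(items):
    return len(set(items)) + 5.0


table = build_bias_table(toy_estimator, [10, 100], trials=5)
```

With the toy estimator every experiment overshoots by exactly 5, so each entry records a bias of 5 with a zero-width interval; a real estimator would show a spread that depends on the number of experiments performed.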

Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon is considered pertinent to applicant's disclosure. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RICHARD L BOWEN whose telephone number is (571)270-5982. The examiner can normally be reached on Monday through Friday 7:30AM - 4:00PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksandr Kerzhner, can be reached on (571)270-1760. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RICHARD L BOWEN/
Primary Examiner, Art Unit 2165