DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 31-60 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Claims 31, 50 & 56 recite, in part, a method comprising: scanning a subset of a plurality of values included in a dataset; generating, based on the scanning: a value count indicator indicative of a quantity of the subset of the plurality of values; and a probabilistic estimator buffer representative of an intermediate probabilistic estimation state, the probabilistic estimator buffer based on one or more hash values; estimating a number of distinct values in the dataset based on the value count indicator and the probabilistic estimator buffer; and generating a query plan based on the estimated number of distinct values in the dataset.
As set forth in MPEP 2106.04(a)(2)(III) The courts consider a mental process (thinking) that "can be performed in the human mind, or by a human using a pen and paper" to be an abstract idea. CyberSource Corp. v. Retail Decisions, Inc., 654 F.3d 1366, 1372, 99 USPQ2d 1690, 1695 (Fed. Cir. 2011). Accordingly, the "mental processes" abstract idea grouping is defined as concepts performed in the human mind, and examples of mental processes include observations, evaluations, judgments, and opinions. 
The examiner respectfully points out that the method as recited in claim 31 comprises mental processes that can be performed in the human mind with pen and paper.
For example, the human mind can scan the set of names in the table shown below and determine that:
(a) “4” is a value count indicator representing the 4 rows in the table;
(b) “1” is a probabilistic estimator buffer representing that each row will be scanned based on the leftmost value of “1000”; and
(c) “3” is the number of distinct name values, represented by “Smith”, “Jones” & “Dole”, based on the “4” rows and each scan represented by “1”.
To query the salary of “Smith”, a query plan can be generated in the human mind, wherein the plan is to scan each row of the table for “Smith” based on the 3 distinct name values, e.g., “Smith”, “Jones” & “Dole”.

NAME     SALARY
Smith    200,000
Jones    150,000
Dole     80,000
Smith    100,000
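The pen-and-paper tally described for this table can be sketched in a few lines of Python. This is only an illustration of the example as worded in this action: the names `rows` and `scan_names` are hypothetical, and the sketch counts distinct names exactly rather than with any probabilistic estimator.

```python
# Hypothetical illustration of the four-row NAME/SALARY example.
rows = [
    ("Smith", 200_000),
    ("Jones", 150_000),
    ("Dole", 80_000),
    ("Smith", 100_000),
]


def scan_names(table):
    """Scan the NAME column once and return (row_count, distinct_names)."""
    row_count = 0
    distinct_names = []  # kept in first-seen order
    for name, _salary in table:
        row_count += 1  # one tick per scanned row
        if name not in distinct_names:
            distinct_names.append(name)
    return row_count, distinct_names


row_count, distinct_names = scan_names(rows)
print(row_count, len(distinct_names), distinct_names)
```

The scan yields a row count of 4 and the 3 distinct names listed in item (c).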


The claims further recite a processor and a memory. Generic computer components recited as performing generic computer functions that are well-understood, routine and conventional activities amount to no more than implementing the abstract idea with a computerized system. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology.  Their collective functions merely provide conventional computer implementation.  
Claims 31, 50 & 56 are therefore not drawn to eligible subject matter as they are directed to an abstract idea without significantly more.

Dependent claims 32-49, 51-55 & 57-60 do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements when considered both individually and as an ordered combination do not amount to significantly more than the abstract idea.  

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 34, 51 & 57 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Regarding claim 34, it is unclear whether the step of estimating a number of distinct values in the dataset is based on the value count indicator and the probabilistic estimator buffer as recited in claim 31, or on the incremented value count indicator and the updated probabilistic estimator buffer as recited in claim 34.

Claims 51 & 57 include features analogous to claim 34. Claims 51 & 57 are rejected for at least the reasons as noted with regard to claim 34. 

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
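The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.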


Claims 31, 34-36, 44-46, 49-51, 56 & 57 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by RJAIBI et al. [US 2003/0229617 A1], hereinafter referred to as RJAIBI.

Regarding claim 31, RJAIBI teaches a system comprising a processor (RJAIBI, [0034]); and a memory coupled to the processor, the memory including instructions stored thereon, which when executed by the processor cause the system to perform (RJAIBI, Claim 46) a method. The method as taught in RJAIBI reads on claim 31 as shown below.

CLAIMS 31, 50 & 56 compared with RJAIBI et al.

Claims 31, 50 & 56: A method comprising:
RJAIBI et al.: A method comprising:

Claims 31, 50 & 56: scanning a subset of a plurality of values included in a dataset;
RJAIBI et al.: names included in a table, e.g., Employee Table of FIGS. 2, storing names & salary, are scanned (RJAIBI, FIG. 3 & [0037]);

Claims 31, 50 & 56: generating based on the scanning:
RJAIBI et al.: based on the scanning:

Claims 31, 50 & 56: a value count indicator indicative of a quantity of the subset of the plurality of values; and
RJAIBI et al.: a linear count to indicate a quantity of names is generated in column cardinality (RJAIBI, FIGS. 2E & 2F & [0045]-[0046]); and

Claims 31, 50 & 56: a probabilistic estimator buffer representative of an intermediate probabilistic estimation state, the probabilistic estimator buffer based on one or more hash values;
RJAIBI et al.: a summary statistic is derived from the linear count, wherein the summary statistic represents a relative position number n, wherein the summary statistic is based on at least one hash value (RJAIBI, [0048]);

Claims 31, 50 & 56: estimating a number of distinct values in the dataset based on the value count indicator and the probabilistic estimator buffer; and
RJAIBI et al.: an estimated cardinality value is calculated based on the linear count and the relative position number n (RJAIBI, [0048] & [0050]); and

Claims 31, 50 & 56: generating a query plan based on the estimated number of distinct values in the dataset.
RJAIBI et al.: a query plan based on the estimated cardinality value is built (RJAIBI, [0036]).

Regarding claim 34, RJAIBI further teaches that in response to scanning a particular value of the plurality of values: incrementing the value count indicator; and updating the probabilistic estimator buffer based on the particular value (RJAIBI, FIGS. 2E & 2F & [0045]-[0046] & [0048]).

Regarding claim 35, RJAIBI further teaches that updating the probabilistic estimator buffer based on the particular value includes processing the particular value using a probabilistic estimator algorithm (RJAIBI, FIGS. 2E & 2F & [0045]-[0046] & [0048]).

Regarding claim 36, RJAIBI further teaches that processing the particular value using a probabilistic estimator algorithm includes: applying a hash function to the particular value to generate a particular hash value of the one or more hash values; and analyzing the one or more hash values to update an intermediate distinct value estimate (RJAIBI, FIGS. 2E & 2F & [0045]-[0046] & [0048]).

Regarding claim 44, RJAIBI further teaches that the one or more hash values are fixed-length hash values, and wherein each of the one or more fixed-length hash values is based on a hash function applied to a different one of the subset of the plurality of values (RJAIBI, [0048]-[0049]).

Regarding claim 45, RJAIBI further teaches that the dataset is in the form of a table that includes values for various attributes arranged in rows and columns and wherein the value count indicator is a row count indicator (RJAIBI, FIGS. 2 & [0037] & [0045]-[0046]).

Regarding claim 46, RJAIBI further teaches that the query plan is generated by a query planner in the distributed computing cluster (RJAIBI, [0035]-[0036]).

Regarding claim 49, RJAIBI further teaches that scanning the subset of the plurality of values included in the dataset includes scanning less than all of the plurality of values included in the dataset (RJAIBI, [0037]).

Regarding claims 51 & 57, RJAIBI further teaches that in response to scanning a particular value of the plurality of values: incrementing the value count indicator; and updating the probabilistic estimator buffer by: applying a hash function to the particular value to generate a particular hash value of the one or more hash values; and analyzing the one or more hash values to update an intermediate distinct value estimate (RJAIBI, FIGS. 2E & 2F & [0045]-[0046] & [0048]).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claim 32 is rejected under 35 U.S.C. 103 as being unpatentable over RJAIBI et al. [US 2003/0229617 A1], hereinafter referred to as RJAIBI, in view of ABDO [US 2003/0126127 A1].

Regarding claim 32, RJAIBI does not explicitly teach that linear count and summary statistic (i.e., the value count indicator and the probabilistic estimator buffer) are associated with a particular bucket of a plurality of buckets.
ABDO teaches a method for processing a query based on histogram and statistics. ABDO further discloses that a particular bucket of a plurality of buckets is associated with linear count (ABDO, [0056]-[0058]).
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in ABDO into RJAIBI in order to manage distinct values.

Claim 33 is rejected under 35 U.S.C. 103 as being unpatentable over RJAIBI et al. [US 2003/0229617 A1], hereinafter referred to as RJAIBI, in view of ABDO [US 2003/0126127 A1], and further in view of MIECZNIK [US 2018/0089843 A1].

Regarding claim 33, RJAIBI & ABDO do not explicitly teach that each of the plurality of buckets has a fixed memory length.
MIECZNIK teaches that each of the plurality of buckets has a fixed memory length (MIECZNIK, [0071]).
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in MIECZNIK into RJAIBI & ABDO in order to manage the plurality of bins.

Claim 37 is rejected under 35 U.S.C. 103 as being unpatentable over RJAIBI et al. [US 2003/0229617 A1], hereinafter referred to as RJAIBI, in view of QIN et al. [US 2017/0300528 A1], hereinafter referred to as QIN.

Regarding claim 37, RJAIBI does not explicitly teach that the probabilistic estimator algorithm is HyperLogLog.
QIN teaches that HyperLogLog is used to determine distinct values (QIN, [0022]).
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in QIN into RJAIBI in order to manage distinct values.  
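For orientation only, the kind of HyperLogLog-style estimation named in claim 37 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the application, RJAIBI, or QIN: the class name `HyperLogLogSketch`, the register count (2^12), the use of SHA-1 as the hash function, and the small-range linear-counting correction are all choices made here for illustration.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Illustrative HyperLogLog-style estimator (assumed parameters)."""

    def __init__(self, p=12):
        self.p = p                       # number of index bits (assumption)
        self.m = 1 << p                  # number of registers
        self.registers = [0] * self.m    # the "estimator buffer"
        self.row_count = 0               # the "value count indicator"

    def _hash64(self, value):
        # Fixed-length 64-bit hash taken from the first 8 bytes of SHA-1.
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, value):
        self.row_count += 1
        h = self._hash64(value)
        idx = h >> (64 - self.p)                 # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)         # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)       # small-range correction
        return raw


sketch = HyperLogLogSketch()
for i in range(20_000):
    sketch.add(f"name-{i % 5_000}")   # 20,000 rows, 5,000 distinct values
print(sketch.row_count, round(sketch.estimate()))
```

The row count is exact, while the distinct-value figure is an estimate whose error shrinks as the number of registers grows.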

Claims 38, 42, 52, 54, 58 & 60 are rejected under 35 U.S.C. 103 as being unpatentable over RJAIBI et al. [US 2003/0229617 A1], hereinafter referred to as RJAIBI, in view of CHIANG [US 2019/0197162 A1].

Regarding claims 38, 52 & 58, RJAIBI does not explicitly teach the steps of generating a plurality of data points, wherein a particular data point of the plurality of data points includes: the value count indicator as an x-value; and an intermediate distinct value estimate based on the probabilistic estimator buffer as a y-value.
CHIANG teaches a method of estimating cardinality value (CHIANG, Abstract). CHIANG further discloses the steps of generating a plurality of data points, wherein a particular data point of the plurality of data points includes: the value count indicator as an x-value; and an intermediate distinct value estimate based on the probabilistic estimator buffer as a y-value (CHIANG, [0032]). 
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in CHIANG into RJAIBI in order to manage the linear count and summary statistic.

Regarding claims 42, 54 & 60, CHIANG further discloses the steps of:
plotting the plurality of data points (CHIANG, [0077]);
fitting an objective function to the plotted plurality of data points (CHIANG, [0059] & [0076]-[0077]); and
extrapolating the objective function to an estimated or known total number of values in the dataset (CHIANG, [0076]).
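The three steps above (data points, fitted objective function, extrapolation) can be illustrated with a short sketch. The power-law form y = a * x^b, the least-squares fit in log-log space, and the function names `fit_power_law` and `extrapolate` are assumptions chosen for illustration; neither the claims as quoted here nor the cited CHIANG passages are stated in this action to use this particular function.

```python
import math


def fit_power_law(points):
    """Least-squares fit of y = a * x**b, done on log-transformed data.

    points: iterable of (rows_scanned, intermediate_distinct_estimate),
    all values strictly positive.
    """
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # intercept mapped back
    return a, b


def extrapolate(a, b, total_rows):
    """Evaluate the fitted curve at the total number of rows."""
    return a * total_rows ** b


# Hypothetical data points: x = rows scanned, y = distinct estimate so far.
data_points = [(100, 30.0), (400, 60.0), (900, 90.0)]
a, b = fit_power_law(data_points)
projected = extrapolate(a, b, 10_000)
print(a, b, projected)
```

Here the sample points lie exactly on y = 3 * x^0.5, so the fit recovers those parameters and projects about 300 distinct values at 10,000 rows.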

Claims 39, 41, 53 & 59 are rejected under 35 U.S.C. 103 as being unpatentable over RJAIBI et al. [US 2003/0229617 A1], hereinafter referred to as RJAIBI, in view of CHIANG [US 2019/0197162 A1], and further in view of AUSIN et al. [US 10,157,213 B1], hereinafter referred to as AUSIN.

Regarding claims 39, 53 & 59, CHIANG further teaches the step of generating a data point for each of a plurality of buckets (CHIANG, [0032]).
RJAIBI & CHIANG do not explicitly teach the step of merging two or more of the plurality of buckets; and generating additional data points based on the merging. 
AUSIN teaches a method for updating histogram. AUSIN further discloses the steps of: 
merging two or more of the plurality of buckets (AUSIN, Col. 12, Lines 16-35); and
generating additional data points based on the merging (AUSIN, Col. 12, Lines 16-35).
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in AUSIN into RJAIBI & CHIANG in order to manage a histogram.

Regarding claim 41, AUSIN further teaches that the two or more buckets are merged by successively merging various combinations of the plurality of buckets in a rolling window based on a quantity of the plurality of buckets (AUSIN, Col. 12, Lines 16-35).
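As one possible reading of bucket merging in a rolling window, the sketch below merges per-bucket state over every window width and start position, wrapping around the end of the bucket list. Everything here is an assumption for illustration: the bucket layout (a row count plus a list of estimator registers), the element-wise maximum used to combine registers (the usual way HyperLogLog-style buffers are merged), and the wrap-around enumeration. This action does not state how the application or AUSIN forms its windows.

```python
def merge_buckets(a, b):
    """Merge two buckets of the form (row_count, registers).

    Row counts add (assuming the buckets cover disjoint rows) and
    registers combine by element-wise maximum.
    """
    count_a, regs_a = a
    count_b, regs_b = b
    return count_a + count_b, [max(x, y) for x, y in zip(regs_a, regs_b)]


def rolling_window_merges(buckets):
    """Merge buckets over every window width (1..n) and start (0..n-1)."""
    n = len(buckets)
    merged = []
    for width in range(1, n + 1):
        for start in range(n):
            acc = buckets[start]
            for offset in range(1, width):
                # Wrap around so every start position has a full window.
                acc = merge_buckets(acc, buckets[(start + offset) % n])
            merged.append(acc)
    return merged


# Three hypothetical buckets with tiny three-register buffers.
buckets = [(10, [1, 0, 2]), (20, [0, 3, 1]), (30, [2, 1, 0])]
windows = rolling_window_merges(buckets)
print(len(windows), windows[3], windows[-1])
```

With n buckets this enumeration yields n * n merged results, each of which could serve as one (row count, estimator state) data point; the full-width windows all coincide, since they cover every bucket.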

Claim 40 is rejected under 35 U.S.C. 103 as being unpatentable over RJAIBI et al. [US 2003/0229617 A1], hereinafter referred to as RJAIBI, in view of CHIANG [US 2019/0197162 A1], and further in view of AUSIN et al. [US 10,157,213 B1], hereinafter referred to as AUSIN, and FRASER et al. [US 2008/0059125 A1], hereinafter referred to as FRASER.

Regarding claim 40, RJAIBI, CHIANG & AUSIN do not explicitly teach that a sum of the data points and additional data points generated is equal to a quantity of the plurality of buckets, squared.
FRASER teaches that a sum of the data points and additional data points generated is equal to a quantity of the plurality of buckets, squared (FRASER, FIG. 4 & [0041]-[0042]).
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in FRASER into RJAIBI, CHIANG & AUSIN in order to manage the histogram. 

Claim 43 is rejected under 35 U.S.C. 103 as being unpatentable over RJAIBI et al. [US 2003/0229617 A1], hereinafter referred to as RJAIBI, in view of CHIANG [US 2019/0197162 A1], and further in view of SAWAFTA [US 2004/0010515 A1].

Regarding claim 43, RJAIBI & CHIANG do not explicitly teach the step of applying a curve fitting process to set parameters for a plurality of different objective functions to best fit the plotted data points; and selecting the one of the plurality of different objective functions that, based on a statistical analysis, best fits the plotted data points.
SAWAFTA teaches a curve fitting technique. SAWAFTA further discloses the step of applying a curve fitting process to set parameters for a plurality of different objective functions to best fit the plotted data points; and selecting the one of the plurality of different objective functions that, based on a statistical analysis, best fits the plotted data points (SAWAFTA, FIG. 2F, [0052]).
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in SAWAFTA into RJAIBI & CHIANG in order to manage a query plan.

Claims 47 & 48 are rejected under 35 U.S.C. 103 as being unpatentable over RJAIBI et al. [US 2003/0229617 A1], hereinafter referred to as RJAIBI, in view of COLE et al. [US 2015/0154255 A1], hereinafter referred to as COLE.

Regarding claim 47, RJAIBI does not explicitly teach the steps of generating a plurality of query plan fragments; and distributing the query plan fragments to a plurality of data nodes in the distributed computing cluster for execution.
COLE teaches a method for processing queries. COLE further teaches the steps of:
generating a plurality of query plan fragments (COLE, [0031]-[0032] & [0042]); and
distributing the query plan fragments to a plurality of data nodes in the distributed computing cluster for execution (COLE, [0042]).
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in COLE into RJAIBI in order to process a query.

Regarding claim 48, RJAIBI further discloses the step of executing the query plan using the dataset (RJAIBI, [0036]). RJAIBI does not explicitly teach the step of outputting results corresponding to the execution of the query plan.
COLE teaches a method for processing queries. COLE further teaches the step of outputting results corresponding to the execution of the query plan (COLE, [0031] & [0043]).
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in COLE into RJAIBI in order to process a query.

Claim 55 is rejected under 35 U.S.C. 103 as being unpatentable over RJAIBI et al. [US 2003/0229617 A1], hereinafter referred to as RJAIBI, in view of ABDO [US 2003/0126127 A1], and further in view of MIECZNIK [US 2018/0089843 A1].

Regarding claim 55, RJAIBI does not explicitly teach that linear count and summary statistic (i.e., the value count indicator and the probabilistic estimator buffer) are associated with a particular bucket of a plurality of buckets.
ABDO teaches a method for processing a query based on histogram and statistics. ABDO further discloses that a particular bucket of a plurality of buckets is associated with linear count (ABDO, [0056]-[0058]).
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in ABDO into RJAIBI in order to manage distinct values.
RJAIBI & ABDO do not explicitly teach that each of the plurality of buckets has a fixed memory length.
MIECZNIK teaches that each of the plurality of buckets has a fixed memory length (MIECZNIK, [0071]).
It would have been obvious for one of ordinary skill in the art at the time the invention was filed to incorporate the teaching in MIECZNIK into RJAIBI & ABDO in order to manage the plurality of bins.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUNG Q. PHAM whose telephone number is (571)272-4040. The examiner can normally be reached Monday-Friday 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela D. Reyes, can be reached at 571-270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

HUNG Q. PHAM
Primary Examiner
Art Unit 2159

/HUNG Q PHAM/Primary Examiner, Art Unit 2159
July 27, 2022