DETAILED ACTION
This Office Action is in response to the original application filed on 11/30/2018. Claims 1-20 are pending, of which claims 1, 8, and 15 are presented in independent form.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 03/30/2020 and 07/23/2020 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Drawings
The drawings submitted on 01/31/2018 are accepted.

Specification
The disclosure is objected to because of the following informalities: 
In [0079], line 10, remove the extra period.  
Appropriate correction is required.

Claim Objections
Claim 5 is objected to because of the following informalities:  
In claim 5, “sort the dataset based at least on a comparison of the ratio to a threshold value” should read as “wherein said sort of the dataset is based at least on a comparison of the ratio to a threshold value” to be consistent with the language of claims 12 and 19.  
Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 

Claim 1 recites “sort the dataset according to the identifiers that are associated with a field of the plurality of fields for the dataset to generate a sorted dataset; divide the sorted dataset into a plurality of partitions, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; determine a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions; aggregate a number of instances of the determined presence of the key; and generate a distinct count of values for the identifiers associated with the key.”

See MPEP 2106.05(g) “data gathering and outputting”). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of a processing system that includes one or more processors and a memory configured to store program code to be executed by the processing system amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The additional element of “access a dataset that includes a plurality of fields” amounts to no more than insignificant extra-solution activity that the courts have recognized to be well-understood, routine, and conventional (See MPEP 2106.05(d)(II) “Storing and retrieving information in memory”). Insignificant extra-solution activity that the courts have recognized to be well-understood, routine, conventional activity cannot provide an inventive concept. Taking the elements both individually and as a whole, the claim does not amount to significantly more than the abstract idea itself. The claim is not patent eligible.
Claims 8 and 15 recite substantially the same limitations as claim 1, and follow substantially the same analysis.

Claim 2 recites divide the sorted dataset into the plurality of partitions at respective ones of the plurality of distributed servers according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers, and wherein at each one of the plurality of distributed servers the presence of the key of the keys is determined for a corresponding one of the plurality of partitions. As discussed above, the limitations of divide the sorted dataset into the plurality of partitions at respective ones of the plurality of distributed servers according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers, and wherein at each one of the plurality of distributed servers the presence of the key of the keys is determined for a corresponding one of the plurality of partitions, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. For example, “divide” in the context of this claim encompasses the user dividing the data into groups/partitions based on data values being the same. Similarly, “determine a presence” in the context of this claim encompasses the user mentally reviewing the data to determine that a key exists in the 
This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element of a plurality of distributed servers. The distributed servers are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
In Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of a system comprising a plurality of distributed servers amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. Taking the elements both individually and as a whole, the claim does not amount to significantly more than the abstract idea itself. The claim is not patent eligible.
Claims 9, 10, 16, and 17 recite substantially the same limitations as claim 2, and follow substantially the same analysis.

Claim 3 recites divide the sorted dataset into the plurality of partitions according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers, and determine the presence of the key of the keys for each one of the plurality of partitions via a corresponding separate logical processing space of the processing system. As discussed above, the limitations of divide the sorted dataset into the plurality of partitions according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers; and determine the presence of the key of the keys for each one of the plurality of partitions via a corresponding separate logical processing space of the processing system, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. For example, “divide” in the context of this claim encompasses the user dividing the data into groups/partitions based on data values being the same. Similarly, “determine a presence” in the context of this claim encompasses the user mentally reviewing the data to determine that a key exists in the data. This can be demonstrated by mentally performing the above limitations on Fig. 4 of the applicant’s filed drawings. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim does not recite additional elements and therefore does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract 

Claim 4 recites receive an instruction for determining an exact, distinct count for the identifiers associated with the key, wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least one of a hosted web service, a hosted web application, or a search term. The limitations of receive an instruction for determining an exact, distinct count for the identifiers associated with the key, wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least one of a hosted web service, a hosted web application, or a search term, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. For example, “receive an instruction for determining” in the context of this claim encompasses the user being told to perform the determining of distinct counts based on certain types of identifiers and keys. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim does not recite additional elements and therefore does not impose any 
Claims 11 and 18 recite substantially the same limitations as claim 4, and follow substantially the same analysis.

Claim 5 recites determine a ratio of the keys to the identifiers for the dataset; and sort the dataset based at least on a comparison of the ratio to a threshold value. The limitations of determine a ratio of the keys to the identifiers for the dataset; and sort the dataset based at least on a comparison of the ratio to a threshold value, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. For example, “determine a ratio” in the context of this claim encompasses the user mentally calculating a ratio. Similarly, “sort” in the context of this claim encompasses the user sorting and organizing the data in the dataset based on the calculated ratio for certain data using a threshold. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim does not recite additional elements and therefore does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract 
Claims 12 and 19 recite substantially the same limitations as claim 5, and follow substantially the same analysis.

Claims 6, 13, and 20 recite additional elements that relate to the dataset comprising log entries having data for at least one of a hosted web service or a hosted web application. These additional elements amount to no more than insignificant extra-solution activities that the courts have recognized to be well-understood, routine, conventional activity (See MPEP 2106.05(d)(II) “Receiving or transmitting data over a network”). As discussed above, insignificant extra-solution activities that the courts have recognized to be well-understood, routine, conventional activity cannot provide an inventive concept. Therefore, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Taking the elements both individually and as a whole, the claims do not amount to significantly more than the abstract idea itself. The claims are not patent eligible.

Claims 7, 14, and 20 recite additional elements that relate to a system that is cloud-based and that hosts big data storage for the dataset. These additional elements amount to no more than insignificant extra-solution activities that the courts have recognized to be well-understood, routine, conventional activity (See MPEP 2106.05(d)(II) “Receiving or transmitting data over a network” and “Storing and retrieving information in memory”). As discussed above, insignificant extra-solution activities that the courts have recognized to be well-understood, routine, conventional activity cannot provide an inventive concept. Therefore, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Taking the elements both individually and as a whole, the claims do not amount to significantly more than the abstract idea itself. The claims are not patent eligible.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 8 and 9 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Dyskant (U.S. Pub. No. 2007/0239663, cited in IDS).
 
Regarding independent claim 8, Dyskant teaches a computer-implemented method for determining a distinct count for identifiers on keys, the method comprising: (Dyskant, Fig. 1 and [0001], discloses "The invention relates to a system and method for parallel processing of large amounts of data in order to count distinct values".)
sorting a dataset according to the identifiers to generate a sorted dataset, the identifiers being values for a field of the dataset; (Dyskant, Fig. 2 and [0014]-[0015], discloses splitting a data set into a number of chunks based on the values in a column and sorting them to be distinctly counted.)
dividing the sorted dataset into a plurality of partitions, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; (Dyskant, Fig. 2 and [0014], discloses "splits a data set into a number of chunks based on the values in a column upon which a count distinct function is to be performed (analytic column), such that no value appears in two or more chunks.") 
determining, for each partition of the plurality of partitions, a presence of a key of the keys in the at least one subset; aggregating a number of instances of the determined presence of the key; and generating a distinct count for the identifiers associated with the key. (Dyskant, [0012], discloses "Since no value of the analytic column appears in more than one chunk, each chunk can be treated as a separate problem and the system can use parallel processing to perform a count distinct function on each chunk, and then simply add the results together." Dyskant, Fig. 2 and [0024]-[0025], discloses analyzing the chunks of data to perform distinct counts for values of each chunk and then adding the results to obtain overall count distinct values. The values in the chunks are grouped and distinctly counted, where each key appears once.)
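For illustration only, the approach quoted from Dyskant [0012] (split the data so that no value of the analytic column appears in more than one chunk, count distinct values within each chunk, then add the results) can be sketched as follows. This sketch is not drawn from Dyskant or from the application as filed; the function name, the (identifier, key) row layout, and the even range split over sorted identifier values are assumptions made for the example.

```python
from collections import defaultdict


def chunked_distinct_count(rows, num_chunks):
    """Count distinct identifiers per key by chunking on identifier value.

    rows: iterable of (identifier, key) pairs (hypothetical layout).
    num_chunks: desired number of chunks (hypothetical parameter).
    """
    # Sort by identifier so that equal identifiers are adjacent.
    ordered = sorted(rows, key=lambda r: r[0])

    # Assign each distinct identifier value to exactly one chunk by range,
    # so no identifier value appears in two chunks.
    values = sorted({ident for ident, _ in ordered})
    per_chunk = max(1, -(-len(values) // num_chunks))  # ceiling division
    chunk_of = {v: i // per_chunk for i, v in enumerate(values)}

    chunks = defaultdict(list)
    for ident, key in ordered:
        chunks[chunk_of[ident]].append((ident, key))

    # Each chunk is an independent problem; the per-chunk results are
    # simply added together.
    totals = defaultdict(int)
    for chunk in chunks.values():
        for _, key in set(chunk):  # distinct (identifier, key) pairs
            totals[key] += 1
    return dict(totals)
```

Because an identifier value is confined to a single chunk, a given (identifier, key) pair is counted at most once across all chunks, which is why the per-chunk counts can be summed without double counting.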
 
Regarding claim 9, Dyskant teaches the computer-implemented method of claim 8, wherein each partition of the plurality of partitions is provided to respective logically separate portions of a processing system for said determining subsequent to said dividing, said dividing further comprising: dividing the sorted dataset into the plurality of partitions according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers; and wherein said determining the presence of the key of the keys comprises determining the presence of the key of the keys for each of the plurality of partitions at a corresponding one of the logically separate portions. (Dyskant, Fig. 2 and [0014], discloses splitting the data based on a range partition based on how many values fit into a query processor's memory. Examiner interprets that the subchunks are balanced between the query processors based on their sizes and that, in order to group by a field, a query processor must determine the "presence of the key". Dyskant, [0020], discloses that each query processor has its own memory. Examiner interprets that the query processors could be distributed servers and that the memory of the query processors constitutes separate logical processing spaces.)
 
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 6-11, 13-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dyskant, in view of Croft et al. (U.S. Pub. No. 2007/0106643), hereinafter Croft.
 
Regarding independent claim 1, Dyskant teaches a system for determining a distinct count for identifiers based on keys, the system comprising: (Dyskant, Fig. 1 and [0001], discloses "The invention relates to a system and method for parallel processing of large amounts of data in order to count distinct values".)
access a dataset that includes a plurality of fields; (Dyskant, Fig. 2 and [0021], discloses accessing source data with columns.)
sort the dataset according to the identifiers that are associated with a field of the plurality of fields for the dataset to generate a sorted dataset; (Dyskant, Fig. 2 and [0014]-[0015], discloses splitting a data set into a number of chunks based on the values in a column and sorting them to be distinctly counted.)
divide the sorted dataset into a plurality of partitions, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; (Dyskant, Fig. 2 and [0014], discloses "splits a data set into a number of chunks based on the values in a column upon which a count distinct function is to be performed (analytic column), such that no value appears in two or more chunks.")
determine a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions; aggregate a number of instances of the determined presence of the key; and generate a distinct count of values for the identifiers associated with the key.  (Dyskant, [0012], discloses "Since no value of the analytic column appears in more than one chunk, each chunk can be treated as a separate problem and the system can use parallel processing to perform a count distinct function on each chunk, and then simply add the results together." Dyskant, Fig. 2 and [0024]-[0025], discloses analyzing the chunks of data to perform distinct counts for values of each chunk and then adding the results to obtain overall count distinct values. The values in the chunks are grouped and distinctly counted, where each key appears once.)
However, Dyskant does not explicitly teach a processing system that includes one or more processors; and a memory configured to store program code to be executed by the processing system, the program code configured to: 
On the other hand, Croft teaches a processing system that includes one or more processors; and a memory configured to store program code to be executed by the processing system, the program code configured to: (Croft, [0145]-[0146] and [0149]-[0150], discloses that the system may be distributed across multiple computers, which are general purpose computers with computer-readable media that contain instructions for use in execution by a processor, in a networked system communicating via the internet. Examiner interprets this to be distributed servers in a cloud environment.)
Croft also teaches a system for determining a distinct count for identifiers based on keys (Croft, Abstract, discloses "calculating a distinct count value from data stored in a hierarchical database. A counting measure may be defined in the Examiner interprets the category/column of data as a key and search terms in a query are used to identify the category/columns.)
The system that provides a distinct count value from data stored in a hierarchical database of Croft can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the distinct count value calculation system of Croft because both address the same field of distinct count value systems, and incorporating Croft into Dyskant enables the distinct count value system to be used in a cloud environment.
One of ordinary skill in the art would be motivated to do so in order to provide less processor-intensive operation and significantly reduced processing time when processing a reasonably sized data set, as taught by Croft [0002].
 
Regarding claim 2, Dyskant, in view of Croft, teaches the system of claim 1, comprising a plurality of distributed servers; (Croft, [0145]-[0146] and [0150], discloses that the system may be distributed across multiple computers in a networked system communicating via the internet. Examiner interprets this to be distributed servers in a cloud environment.) 
wherein the program code is configured to: divide the sorted dataset into the plurality of partitions at respective ones of the plurality of distributed servers according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers; and wherein at each one of the plurality of distributed servers the presence of the key of the keys is determined for a corresponding one of the plurality of partitions. (Dyskant, Fig. 2 and [0014], discloses splitting the data based on a range partition based on how many values fit into a query processor's memory. Examiner interprets that the subchunks are balanced between the query processors based on their sizes and that, in order to group by a field, a query processor must determine the "presence of the key". Dyskant, [0020], discloses that each query processor has its own memory. Examiner interprets that the query processors could be distributed servers and that the memory of the query processors constitutes separate logical processing spaces.)
 
Regarding claim 3, Dyskant, in view of Croft, teaches the system of claim 1, wherein the program code is configured to: divide the sorted dataset into the plurality of partitions according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers; and determine the presence of the key of the keys for each one of the plurality of partitions via a corresponding separate logical processing space of the processing system. (Dyskant, Fig. 2 and [0014], discloses splitting the data based on a range partition based on how many values fit into a query processor's memory. Examiner interprets that the subchunks are balanced between the query processors based on their sizes and that, in order to group by a field, a query processor must determine the "presence of the key". Dyskant, [0020], discloses that each query processor has its own memory. Examiner interprets that the query processors could be distributed servers and that the memory of the query processors constitutes separate logical processing spaces.)
 
Regarding claim 4, Dyskant, in view of Croft, teaches the system of claim 1, wherein the program code is configured to: receive an instruction for determining an exact, distinct count for the identifiers associated with the key, (Dyskant, [0012], discloses "Since no value of the analytic column appears in more than one chunk, each chunk can be treated as a separate problem and the system can use parallel processing to perform a count distinct function on each chunk, and then simply add the results together." Dyskant, Fig. 2 and [0024]-[0025], discloses analyzing the chunks of data to perform distinct counts for values of each chunk and then adding the results to obtain overall count distinct values. The values in the chunks are grouped and distinctly counted, where each key appears once.) wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least one of a hosted web service, a hosted web application, or a search term. (Croft, Fig. 4 and [0032], discloses dataset with a column with buyer information. Examiner interprets buyer information as user identifiers. Croft, [0039], discloses receiving a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. Examiner interprets the category/column of data as a key and a search term in a query is used to identify the category/column.)
 
Claim 18 recites substantially the same limitations as claim 4, and is rejected for substantially the same reasons.
 
Regarding claim 6, Dyskant, in view of Croft, teaches the system of claim 1, wherein the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application. (Croft, [0026], discloses host applications accessing data warehousing with a multidimensional database. Croft, [0145]-[0146] and [0150], discloses that the system may be distributed across multiple computers in a networked system communicating via the internet. Examiner interprets this to be data for hosted web application/service.) 
Examiner notes that “the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application” is intended use and is nonfunctional descriptive material and does not hold any patentable weight. See MPEP 2111.05, specifically “…If a new and unobvious functional relationship between the 
 
Regarding claim 7, Dyskant, in view of Croft, teaches the system of claim 1, wherein the system is a cloud-based system that hosts big data storage for the dataset. (Dyskant, [0001], discloses "The invention relates to a system and method for parallel processing of large amounts of data in order to count distinct values". In combination, Croft, [0026], discloses data warehousing with multidimensional database. Examiner interprets data warehousing as big data storage. Croft, [0145]-[0146] and [0150], discloses that the system may be distributed across multiple computers in a networked system communicating via the internet. Examiner interprets this to be distributed servers in a cloud environment.) 
Examiner notes that “system is a cloud-based system that hosts big data storage for the dataset” is an intended field of use and is nonfunctional descriptive material and does not hold any patentable weight. See MPEP 2111.05, specifically “…If a new and unobvious functional relationship between the printed matter and the substrate does not exist, USPTO personnel need not give patentable weight to printed matter. See In re Lowry, 32 F.3d 1579, 1583-84, 32 USPQ2d 1031, 1035 (Fed. Cir. 1994); In re Ngai, 367 F.3d 1336, 70 USPQ2d 1862 (Fed. Cir. 2004).”
 
Regarding claim 10, Dyskant teaches the computer-implemented method of claim 9, wherein the logically separate portions of the processing system (Dyskant, [0020], discloses that each query processor has its own memory. Examiner interprets query processors with their own memory as separate logical processing spaces.) 
However, Dyskant does not explicitly teach wherein the logically separate portions of the processing system comprise a plurality of distributed servers.
On the other hand, Croft teaches wherein the logically separate portions of the processing system comprise a plurality of distributed servers. (Croft, [0145]-[0146] and [0150], discloses that the system may be distributed across multiple computers in a networked system communicating via the internet. Examiner interprets this to be distributed servers in a cloud environment.) 
Croft, Abstract and [0039], teaches calculating a distinct count value from data stored in a hierarchical database using a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. The system that provides a distinct count value from data stored in a hierarchical database of Croft can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the distinct count value calculation system of Croft because both address the same field of distinct count value systems, and incorporating Croft into Dyskant enables the distinct count value system to be used in a cloud environment.

Claim 17 recites substantially the same limitations as claim 10, and is rejected for substantially the same reasons.
 
Regarding claim 11, Dyskant teaches all the limitations as set forth in the rejection of claim 8 above. Dyskant further teaches the computer-implemented method of claim 8, further comprising: receiving an instruction for determining an exact, distinct count for the identifiers associated with the key, (Dyskant, [0012], discloses "Since no value of the analytic column appears in more than one chunk, each chunk can be treated as a separate problem and the system can use parallel processing to perform a count distinct function on each chunk, and then simply add the results together." Dyskant, Fig. 2 and [0024]-[0025], discloses analyzing the chunks of data to perform distinct counts for values of each chunk and then adding the results to obtain overall count distinct values. The values in the chunks are grouped and distinctly counted, where each key appears once.) 
However, Dyskant does not explicitly teach wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least one of a hosted web service, a hosted web application, or a search term.
On the other hand, Croft teaches wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least one of a hosted web service, a hosted web application, or a search term. (Croft, Fig. 4 and [0032], discloses dataset with a column with buyer information. Examiner interprets buyer information as user identifiers. Croft, [0039], discloses receiving a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. Examiner interprets the category/column of data as a key and a search term in a query is used to identify the category/column.)
Croft, Abstract and [0039], teaches calculating a distinct count value from data stored in a hierarchical database using a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. The system that provides a distinct count value from data stored in a hierarchical database of Croft can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the distinct count value calculation system of Croft because both address the same field of distinct count value systems, and incorporating Croft into Dyskant enables the distinct count value system to be used in a cloud environment.
One of ordinary skill in the art would be motivated to do so as to provide a less processor intensive and significantly reduced processing time to process a reasonably sized data set, as taught by Croft [0002].
Examiner notes that “wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least  
 
Regarding claim 13, Dyskant teaches all the limitations as set forth in the rejection of claim 8 above. However, Dyskant does not explicitly teach the computer-implemented method of claim 8, wherein the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application. 
On the other hand, Croft teaches wherein the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application. (Croft, [0026], discloses host applications accessing data warehousing with a multidimensional database. Croft, [0145]-[0146] and [0150], discloses that the system may be distributed across multiple computers in a networked system communicating via the internet. Examiner interprets this to be data for hosted web application/service.) 
Croft, Abstract and [0039], teaches calculating a distinct count value from data stored in a hierarchical database using a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. The system that provides distinct count value from data stored in a hierarchical database of Croft can be the system for parallel processing of large 
One of ordinary skill in the art would be motivated to do so as to provide a less processor intensive and significantly reduced processing time to process a reasonably sized data set, as taught by Croft [0002].
Examiner notes that “the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application” is intended use, is nonfunctional descriptive material, and does not hold any patentable weight. See MPEP 2111.05, specifically “…If a new and unobvious functional relationship between the printed matter and the substrate does not exist, USPTO personnel need not give patentable weight to printed matter. See In re Lowry, 32 F.3d 1579, 1583-84, 32 USPQ2d 1031, 1035 (Fed. Cir. 1994); In re Ngai, 367 F.3d 1336, 70 USPQ2d 1862 (Fed. Cir. 2004).”
 
Regarding claim 14, Dyskant teaches all the limitations as set forth in the rejection of claim 8 above. Dyskant further teaches the computer-implemented method of claim 8, wherein the computer-implemented method is implemented in a system that hosts big data storage for the dataset. (Dyskant, [0001], discloses 
However, Dyskant does not explicitly teach a system that is cloud-based.
On the other hand, Croft teaches a system that is cloud-based. (Croft, [0026], discloses data warehousing with a multidimensional database. Examiner interprets data warehousing as big data storage. Croft, [0145]-[0146] and [0150], discloses that the system may be distributed across multiple computers in a networked system communicating via the internet. Examiner interprets this to be distributed servers in a cloud environment.)
Croft, Abstract and [0039], teaches calculating a distinct count value from data stored in a hierarchical database using a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. The system that provides a distinct count value from data stored in a hierarchical database of Croft can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the distinct count value calculation system of Croft because both address the same field of distinct count value systems, and incorporating Croft into Dyskant enables the distinct count value system to be used in a cloud environment.
One of ordinary skill in the art would be motivated to do so as to provide a less processor intensive and significantly reduced processing time to process a reasonably sized data set, as taught by Croft [0002].

 
Regarding independent claim 15, Dyskant teaches perform a method for determining a distinct count for identifiers on keys, the method comprising: (Dyskant, Fig. 1 and [0001], discloses "The invention relates to a system and method for parallel processing of large amounts of data in order to count distinct values".)
sorting a dataset according to the identifiers in a field of the dataset to generate a sorted dataset; (Dyskant, Fig. 2 and [0014]-[0015], discloses splitting a data set into a number of chunks based on the values in a column and sorting them to be distinctly counted.)
dividing the sorted dataset into a plurality of partitions, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; (Dyskant, Fig. 2 and [0014], discloses "splits a data set into a number of chunks based on the values in a column upon which a count distinct function is to be performed (analytic column), such that no value appears in two or more chunks.")
providing each partition of the plurality of partitions to respective logically separate portions of a processing system; determining at ones of the respective logically separate portions a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions; aggregating a number of instances of the determined presence of the key; and generating a distinct count for the identifiers associated with the key. (Dyskant, [0012], discloses "Since no value of the analytic column appears in more than one chunk, each chunk can be treated as a separate problem and the system can use parallel processing to perform a count distinct function on each chunk, and then simply add the results together." Dyskant, Fig. 2 and [0024]-[0025], discloses providing chunks to the multiple query processors to analyze the chunks of data to perform distinct counts for values of each chunk and then adding the results to obtain overall count distinct values. The values in the chunks are grouped and distinctly counted, where each key appears once.)
However, Dyskant does not explicitly teach a computer-readable storage medium having program instructions recorded thereon that, when executed by one or more processors, 
On the other hand, Croft teaches a computer-readable storage medium having program instructions recorded thereon that, when executed by one or more processors, (Croft, [0145]-[0146] and [0149]-[0150], discloses that the system may be distributed across multiple computers, which are general purpose computers with computer-readable media that contain instructions for use in execution by a processor, in a networked system communicating via the internet. Examiner interprets this to be distributed servers in a cloud environment.)
method for determining a distinct count for identifiers based on keys (Croft, Abstract, discloses "calculating a distinct count value from data stored in a hierarchical database. A counting measure may be defined in the hierarchical database such that the counting measure is associated with members of a first category of data to be stored in the hierarchical database." Croft, [0039], discloses receiving a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. Examiner interprets the category/column of data as a key, and search terms in a query are used to identify the category/column.)
The system that provides a distinct count value from data stored in a hierarchical database of Croft can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the distinct count value calculation system of Croft because both address the same field of distinct count value systems, and incorporating Croft into Dyskant enables the distinct count value system to be used in a cloud environment.
One of ordinary skill in the art would be motivated to do so as to provide a less processor intensive and significantly reduced processing time to process a reasonably sized data set, as taught by Croft [0002].
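For illustration only, the chunked count-distinct approach quoted from Dyskant [0012] and [0014] above can be sketched in Python as follows. The sketch sorts the values, cuts the sorted list into chunks so that no value appears in two chunks, counts distinct values per chunk in parallel, and adds the results. The function names, the chunk-sizing rule, and the use of a thread pool are assumptions made for this sketch; they are not taken from Dyskant, Croft, or the claims.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_sorted(values, num_chunks):
    """Sort values and cut them into chunks so that no value spans two chunks.

    Hypothetical helper: the chunk-sizing rule is an assumption of this sketch.
    """
    ordered = sorted(values)
    if not ordered:
        return []
    target = max(1, len(ordered) // num_chunks)
    chunks, start = [], 0
    while start < len(ordered):
        end = min(start + target, len(ordered))
        # Extend the cut until the value changes, keeping equal values together.
        while end < len(ordered) and ordered[end] == ordered[end - 1]:
            end += 1
        chunks.append(ordered[start:end])
        start = end
    return chunks


def count_distinct_chunk(chunk):
    """Count the distinct values within a single chunk."""
    return len(set(chunk))


def parallel_count_distinct(values, num_chunks=4):
    """Count distinct values per chunk in parallel and add the per-chunk results."""
    chunks = partition_sorted(values, num_chunks)
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(count_distinct_chunk, chunks))
```

Because each value is confined to exactly one chunk, the per-chunk counts can be added without any cross-chunk deduplication, which is the property the quoted passage relies on.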
 
Regarding claim 16, Dyskant, in view of Croft, teaches the computer-readable storage medium of claim 15, wherein said dividing comprises: dividing the sorted dataset into the plurality of partitions according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers. (Dyskant, Fig. 2 and [0014], discloses splitting the data based on a range partition based on how many values fit into a query processor's memory. Examiner interprets that the subchunks are balanced between the query processors based on their sizes.)
 
Regarding claim 20, Dyskant, in view of Croft, teaches the computer-readable storage medium of claim 15, wherein the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application; or wherein the one or more processors are of a system that is a cloud-based system that hosts big data storage for the dataset. (Dyskant, [0001], discloses "The invention relates to a system and method for parallel processing of large amounts of data in order to count distinct values". In combination, Croft, [0026], discloses host applications accessing data warehousing with a multidimensional database. Examiner interprets data warehousing as big data storage. Croft, [0145]-[0146] and [0150], discloses that the system may be distributed across multiple computers in a networked system communicating via the internet. Examiner interprets this to be distributed servers in a cloud environment and interprets this to be data for hosted web application/service.) 
Examiner notes that “the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application” and “system is a cloud-based system that hosts big data storage for the dataset” are intended fields of use, are nonfunctional descriptive material, and do not hold any patentable weight. See MPEP 
 
 
 
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Dyskant, in view of Fricke et al. (U.S. Pub. No. 2014/0351254, cited in IDS), hereinafter Fricke.
 
Regarding claim 12, Dyskant teaches all the limitations as set forth in the rejection of claim 8 above. However, Dyskant does not explicitly teach the computer-implemented method of claim 8, further comprising: determining a ratio of the keys to the identifiers for the dataset; wherein said sorting the dataset comprises sorting the dataset based at least on a comparison of the ratio to a threshold value.
On the other hand, Fricke teaches determining a ratio of the keys to the identifiers for the dataset; wherein said sorting the dataset comprises sorting the dataset based at least on a comparison of the ratio to a threshold value. (Fricke, [0007] and [0009], discloses generating a uniqueness metric representative of data in a database table column that is split across a plurality of data partitions for classifying the column, where the uniqueness metric is a calculated ratio of the total number of unique values to a number of rows of the column. Fricke, [0033], discloses range partitioning on data that is amenable to being ordered. Fricke, [0025], discloses comparing the uniqueness metric to a threshold to classify the data. Examiner interprets a calculated ratio of the total number of unique values to a number of rows of the column as a ratio of the keys to the identifiers for the dataset and classifying the data using the ratio and a threshold as sorting a dataset based at least on a comparison of the ratio to a threshold value.)
The system that provides unique counts of values for data stored in a hierarchical table column of Fricke can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the use of uniqueness metrics of Fricke because both address the same field of distinct count value systems, and incorporating Fricke into Dyskant enables the distinct count value system to classify data based on a comparison of a uniqueness ratio against a threshold.
One of ordinary skill in the art would be motivated to do so as to provide column-based architectures enriched by additional mechanisms aimed at minimizing the need for access to compressed data without resulting in lower compression efficiency and/or increased processing requirements to access the compressed data, as taught by Fricke [0005].
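For illustration only, the uniqueness metric attributed to Fricke above (a ratio of the number of unique values to the number of rows of a column, compared against a threshold) can be sketched in Python as follows. The function names, the class labels, and the threshold value of 0.5 are assumptions made for this sketch; they are not taken from Fricke or from the claims.

```python
def uniqueness_ratio(column):
    """Return the number of unique values divided by the number of rows."""
    if not column:
        return 0.0
    return len(set(column)) / len(column)


def classify_column(column, threshold=0.5):
    """Classify a column by comparing its uniqueness ratio to a threshold.

    The labels and the default threshold are hypothetical.
    """
    if uniqueness_ratio(column) >= threshold:
        return "high_uniqueness"
    return "low_uniqueness"
```

A column such as ["a", "a", "a", "a"] has one unique value over four rows, so its ratio falls below the assumed threshold, while a column of all-distinct values has a ratio of 1.0.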
 
 
 
Claims 5 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Dyskant, in view of Croft, and further in view of Fricke.
 
Regarding claim 5, Dyskant, in view of Croft, teaches all the limitations as set forth in the rejection of claim 1 above. However, Dyskant, in view of Croft, does not explicitly teach the system of claim 1, wherein the program code is configured to: determine a ratio of the keys to the identifiers for the dataset; and sort the dataset based at least on a comparison of the ratio to a threshold value.
On the other hand, Fricke teaches determine a ratio of the keys to the identifiers for the dataset; and sort the dataset based at least on a comparison of the ratio to a threshold value. (Fricke, [0007] and [0009], discloses generating a uniqueness metric representative of data in a database table column that is split across a plurality of data partitions for classifying the column, where the uniqueness metric is a calculated ratio of the total number of unique values to a number of rows of the column. Fricke, [0033], discloses range partitioning on data that is amenable to being ordered. Fricke, [0025], discloses comparing the uniqueness metric to a threshold to classify the data. Examiner interprets a calculated ratio of the total number of unique values to a number of rows of the column as a ratio of the keys to the identifiers for the dataset and classifying the data using the ratio and a threshold as sorting a dataset based at least on a comparison of the ratio to a threshold value.)
The system that provides unique counts of values for data stored in a hierarchical table column of Fricke can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of 
One of ordinary skill in the art would be motivated to do so as to provide column-based architectures enriched by additional mechanisms aimed at minimizing the need for access to compressed data without resulting in lower compression efficiency and/or increased processing requirements to access the compressed data, as taught by Fricke [0005].
Claim 19 recites substantially the same limitations as claim 5, and is rejected for substantially the same reasons.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDDY CHEUNG whose telephone number is (571)272-9785.  The examiner can normally be reached on MON-TH 8:00AM-4:00PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksandr Kerzhner can be reached on (571)270-1760.  The fax phone 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/EC/Examiner, Art Unit 2165
/ALEKSANDR KERZHNER/Supervisory Patent Examiner, Art Unit 2165