DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This Office Action is in response to the Applicant’s response filed on 05/16/2022.
 
Claims 1, 8, and 15 are amended; and claims 2-7, 9-14, and 16-20 are unchanged; therefore, claims 1-20 are pending in the application, of which, claims 1, 8, and 15 are presented in independent form.

Response to Arguments
Applicant's arguments filed 05/16/2022 have been fully considered but they are not persuasive. 

35 U.S.C. 101
Applicant first argues that making the same 101 abstract idea rejection after it was withdrawn is against the principles of compact prosecution (pg. 1 of Response). The examiner respectfully disagrees. Applicant's argument is not responsive to the prima facie case. The 101 abstract idea rejection is proper and is therefore maintained.

Applicant then argues that the claims are not directed to mere abstract ideas because the claims cannot “reasonably” or “practically” be performed as a mental process, stating that the operations occur during query execution using a uniquely configured and capable system tailored for distributed computing of large databases, which improves computer functionality (i.e. resource utilization) (pgs. 5-7 of Response). The examiner respectfully disagrees. First, the examiner notes that the claims do not specifically recite or claim the argued uniquely configured and capable system tailored for distributed computing of large databases. The limitations only recite a system capable of executing a distinct count query against a dataset; therefore, the applicant’s argument regarding a uniquely configured and capable system tailored for distributed computing of large databases is moot. Second, the examiner notes that the execution of the query limitation is recited at a high level of generality (i.e., as a generic processor performing a generic computer function of executing stored instructions [e.g. query for distinct counts]) such that it amounts to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Third, the examiner notes MPEP 2106.05(a)(I), which states that for a “claim to improve computer functionality, the broadest reasonable interpretation of the claim must be limited to computer implementation. That is, a claim whose entire scope can be performed mentally, cannot be said to improve computer technology. Synopsys, Inc. v. Mentor Graphics Corp., 839 F.3d 1138, 120 USPQ2d 1473 (Fed. Cir. 2016).” and MPEP 2106.05(a)(I)(iii), which states that courts have indicated that “Mere automation of manual processes” may not be sufficient to show an improvement in computer functionality. The examiner notes that because the applicant only claims a dataset but does not claim its size, under BRI the dataset could be a very small dataset of only a few records (e.g. 5 records), in which case the claimed steps could reasonably be performed by a mental/manual process and therefore cannot be said to improve computer technology. See 101 abstract idea rejection for detailed analysis.

Applicant next argues that the claims were eligible for streamlined analysis (pgs. 7-8 of Response). The examiner respectfully disagrees. The examiner notes that a streamlined eligibility analysis is appropriate when the eligibility of the claim is self-evident; however, when there is doubt, it is proper to conduct a full eligibility analysis (See MPEP 2106.06). The examiner notes that the meaningful limitations that applicant argues provide a practical application or significantly more were not self-evident; therefore, a full eligibility analysis was proper.
 
Applicant next argues that the claims provided a “practical application” (pgs. 8-10 of Response). The examiner respectfully disagrees. Examiner notes that the claimed additional elements, alone or in combination, amounted to no more than generic computer components executing instructions and the access/output of datasets. Mere instructions to apply the exception using generic computer components and insignificant extra-solution activities do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea. See 101 abstract idea rejection for detailed analysis. Further, the applicant points to limitations that are all part of the abstract idea, and not to the additional elements, as reasons for integration into a practical application/improvements. The examiner notes that the abstract ideas themselves can't provide integration into a practical application/improvements (See MPEP 2106.05(a) which states “It is important to note, the judicial exception alone cannot provide the improvement. The improvement can be provided by one or more additional elements.” and MPEP 2106.05(a)(II) which states “…it is important to keep in mind that an improvement in the abstract idea itself (e.g. a recited fundamental economic concept) is not an improvement in technology.”). The applicant provides no argument beyond reciting the abstract ideas themselves as reasons for integration into a practical application/improvements, with no relevant considerations given, and the argument is therefore not persuasive.

Applicant next argues that the claims provided an “innovative concept” (pgs. 10-13 of Response). The examiner respectfully disagrees. Examiner notes that the claimed additional elements, alone or in combination, amounted to no more than generic computer components executing instructions and the access/output of datasets. As discussed above with respect to integration of the abstract idea into a practical application, mere instructions to apply an exception using generic computer components and insignificant extra-solution activities that the courts have recognized to be well-understood, routine, conventional activity cannot provide an inventive concept. Taking the elements both individually and as a whole, the claim does not amount to significantly more than the abstract idea itself. See 101 abstract idea rejection for detailed analysis. Further, the applicant points to limitations that are all part of the abstract idea, and not to the additional elements, as reciting “innovative concepts”. The examiner notes that the abstract ideas themselves can't provide the “innovative concept” or significantly more than the abstract idea itself (See MPEP 2106.05(I) which states “an "inventive concept" is furnished by an element or combination of elements that is recited in the claim in addition to (beyond) the judicial exception, and is sufficient to ensure that the claim as a whole amounts to significantly more than the judicial exception itself.”). The applicant provides no argument beyond reciting the abstract ideas themselves as reasons for “innovative concepts” and/or significantly more than the abstract idea itself, with no relevant considerations given, and the argument is therefore not persuasive.

35 U.S.C. 103
Applicant first argues that the combination of Dyskant and Anderson does not teach "determine, respectively by the plurality of distributed servers, a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions, each key corresponding to a respective identifier of another field of the plurality of fields;" arguing that the examiner admits that Dyskant does not teach determining the presence of keys, and that the keys of Anderson, which are used for partitioning data for processing, are different from what is claimed in the instant application and do not teach determining distinct counts in any way (pgs. 14-16 of Response). The examiner respectfully disagrees. Examiner does not depend on Anderson to teach determining the presence of a key for the purpose of determining distinct counts; rather, the examiner cites Dyskant to teach that limitation. While the Examiner did admit that Dyskant did not teach "determine, respectively by the plurality of distributed servers, a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions, each key corresponding to a respective identifier of another field of the plurality of fields;" (emphasis added), the Examiner did indicate that Dyskant teaches "determine a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions". Examiner cites that Dyskant, Fig. 2 and [0012], [0014], and [0024]-[0025], teaches splitting a data set into a number of chunks (i.e. partitions) for parallel processing to perform a count distinct function on each chunk, and then simply adding the results together (i.e. aggregating the distinct counts for a value) to obtain overall count distinct values. The values in the chunks are grouped and then distinctly counted, where each key appears once (i.e. distinct count for each value grouped with a specific key). The Examiner cites Anderson to teach data processing (e.g. which can include determining and partitioning keys) being done by distributed servers and that each key corresponds to another field of a plurality of fields. Anderson, [0084]-[0085], [0089]-[0090], and [0154]-[0156], discloses dividing a dataset into segments (i.e. partitions/chunks) based on a segment identifier (i.e. key) with a segment value (i.e. a specific key among a plurality of keys). Data may be segmented/partitioned first by one identifier (e.g. the segment identifier/key, first field), then data clusters within each segment may be further segmented/grouped by another identifier/field (i.e. each key corresponding to a respective identifier of another field). Examiner interprets the segmenting/grouping of data using another/second identifier/field, within a segment that was segmented/divided using a first identifier/field, to be the Applicant’s process illustrated in 406 of Fig. 4 in the instant application. Records may then be partitioned/grouped based on the multiple segment identifiers to be sent to different recipient processing entities/processing nodes (i.e. distributed servers), with every record having the same segment identifier value grouped for processing. Examiner interprets that the processing of data segments of Anderson could be the process of determining distinct counts for each segment/partition of Dyskant. Applicant’s argument that the keys of Anderson are different from what is claimed in the instant application and do not teach using keys in determining distinct counts in any way is moot, as the Examiner notes that the term “key” claimed in the limitations is undefined and therefore broad in the claims. The limitation in question only determines the presence of the key and does not perform the process of generating distinct counts; the Examiner depends upon Dyskant, and not Anderson, to teach that feature. 
Therefore, the combination of Dyskant and Anderson does teach "determine, respectively by the plurality of distributed servers, a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions, each key corresponding to a respective identifier of another field of the plurality of fields;" for the purpose of determining distinct count of values.

Applicant then argues that the combination of Dyskant and Anderson does not teach "aggregate a number of instances of the determined presence of the key; and generate a distinct count of values for the identifiers associated with the key." arguing that the combination fails to teach determining distinct counts using a different key/field from the one used for dividing the data into partitions (pgs. 16-17 of Response). The examiner respectfully disagrees. As demonstrated above, Anderson teaches partitioning using a first field/key and further grouping data in the segment using a second field/key for processing. Dyskant, Fig. 2 and [0012], [0014], and [0024]-[0025], discloses analyzing the chunks/partitions of data to perform distinct counts for values in a column of each chunk and then adding the results to obtain overall count distinct values. The values in the chunks are grouped and distinctly counted, where each key appears once. Examiner interprets that the column on which the distinct count is performed does not need to be the column used for chunking the data (e.g. first field/key used in partitioning of Anderson); rather, it can be another column of the chunk (e.g. second field/key for grouping data within the partition of Anderson). Therefore, the combination of Dyskant and Anderson does teach "aggregate a number of instances of the determined presence of the key; and generate a distinct count of values for the identifiers associated with the key."
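For illustration only, the operation described above (partition on a first field, count per partition on a second field, then add the partial results) may be sketched as follows. This sketch is not taken from Dyskant, Anderson, or the instant application; the record layout of (identifier, key) pairs, the function name distinct_count_per_key, and the contiguous balanced split are assumptions made solely for the example.

```python
from collections import Counter
from itertools import groupby


def distinct_count_per_key(rows, num_partitions):
    """Count distinct identifiers per key using identifier-disjoint partitions.

    rows: iterable of (identifier, key) pairs (hypothetical layout).
    num_partitions: desired number of partitions, assumed to be >= 1.
    """
    # Sort by identifier so that all rows for one identifier are adjacent.
    ordered = sorted(rows)

    # One group per identifier, holding the set of keys present for it.
    groups = [
        (ident, {key for _, key in grp})
        for ident, grp in groupby(ordered, key=lambda r: r[0])
    ]
    if not groups:
        return {}

    # Split whole groups into contiguous ranges, so each identifier
    # lives in exactly one partition (ceil division for the range size).
    size = -(-len(groups) // num_partitions)
    partitions = [groups[i:i + size] for i in range(0, len(groups), size)]

    # Per-partition work: each key is counted once per identifier.
    partials = []
    for part in partitions:
        local = Counter()
        for _ident, keys in part:
            local.update(keys)
        partials.append(local)

    # Because identifiers never span partitions, a plain sum is exact.
    total = Counter()
    for local in partials:
        total.update(local)
    return dict(total)
```

Because each identifier is confined to a single partition in this sketch, adding the per-partition counts cannot count the same identifier twice for a given key, which is why the simple sum yields the overall distinct count regardless of the number of partitions chosen.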

Applicant then argues that Examiner’s argument that the processing of data of Anderson could be the process of determining distinct counts for each segment/partition of Dyskant is irrelevant, as the keys are different from those claimed in the instant application, and reiterates the argument that Anderson does not teach distinct counts using keys (pgs. 17-18 of Response). The examiner respectfully disagrees. As indicated above, Applicant’s argument is moot, as the Examiner notes that the term “key” claimed in the limitations is undefined and therefore broad in the claims. The limitation in question only determines the presence of the key and does not perform the process of generating distinct counts; the Examiner depends upon Dyskant, and not Anderson, to teach that feature. As demonstrated above, the combination of Dyskant and Anderson does teach "determine, respectively by the plurality of distributed servers, a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions, each key corresponding to a respective identifier of another field of the plurality of fields;" for the purpose of determining distinct count of values.

Applicant next argues that there is no rational reason to combine Anderson with Dyskant to arrive at the claimed features of the instant application (pgs. 18-19 of Response). The examiner respectfully disagrees. As demonstrated above, the combination of Dyskant and Anderson does teach the claimed limitation. Both references address the same field of data partitioning systems; therefore, they are analogous art and can be combined. The motivation for one of ordinary skill in the art to combine Anderson with Dyskant would be to provide a way to perform the number of computations that have to be made between records to determine which are close under a suitable distance measure without limiting performance and scalability when clustering large volumes of data, as taught by Anderson [0028]. Therefore, there is a rational reason to combine Anderson with Dyskant to arrive at the claimed features of the instant application.

Applicant next argues that Croft fails to remedy the deficiencies of the combination of Dyskant and Anderson (pgs. 19-20 of Response). The examiner respectfully disagrees. As demonstrated above, the combination of Dyskant and Anderson does teach the claimed limitations that Applicant argues are deficiencies. Therefore, Applicant’s argument for Croft is moot.

Applicant then argues that Fricke fails to remedy the deficiencies of the combination of Dyskant and Anderson (pg. 20 of Response). The examiner respectfully disagrees. As demonstrated above, the combination of Dyskant and Anderson does teach the claimed limitations that Applicant argues are deficiencies. Therefore, Applicant’s argument for Fricke is moot.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 

Claim 1 is directed towards a machine or apparatus and recites the limitations of “sort the dataset according to the identifiers that are associated with a field of the plurality of fields for the dataset to generate a sorted dataset; divide the sorted dataset into a plurality of partitions for respective distribution to a plurality of distributed servers, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; determine a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions, each key corresponding to a respective identifier of another field of the plurality of fields; aggregate a number of instances of the determined presence of the key; and generate and return as a result of the query the distinct count of values for the identifiers associated with the key”, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. For example, “sort” or sorting is considered to be a mental process because sorting is based on an evaluation of the data, in the context of this claim encompasses the user sorting and organizing the data in the dataset using the identifiers/values of the fields of the dataset. Similarly, “divide” or dividing is considered to be a mental process because dividing is based on an evaluation of the data, in the context of this claim encompasses the user manually dividing the data into groups/partitions based on data values being the same. Similarly, “determine a presence” or determining is considered to be a mental process because determining is based on an observation of the data, in the context of this claim encompasses the user mentally reviewing the data to determine that a key exists in the data. 
Similarly, “aggregate” or aggregating is considered to be a mental process because aggregating is based on an evaluation or judgement of the data, in the context of this claim encompasses the user manually counting and summing up the number of times a value appears in the data. Similarly, “generate and return” or generating and returning is considered to be a mental process because the generating is based on an evaluation or judgement of the data, in the context of this claim encompasses the user manually determining and recording the number of times a distinct value appears in the data. This can be demonstrated by mentally performing the above limitations on Fig. 4 of the applicant’s filed drawings. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
	This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements – a processing system that includes one or more processors, a memory configured to store program code to be executed by the processing system, “execute a query, which includes an instruction to determine a distinct count of values, against a dataset”, access the dataset that includes a plurality of fields, and a plurality of distributed servers. The processors, memory, and distributed servers are recited at a high level of generality (i.e., as a generic processor performing a generic computer function of executing stored instructions, and servers set up in a distributed manner) such that they amount to no more than mere instructions to apply the exception using generic computer components. The limitation “execute a query, which includes an instruction to determine a distinct count of values, against a dataset” is recited at a high level of generality (i.e., as a generic processor performing a generic computer function of executing stored instructions) such that it amounts to no more than mere instructions to apply the exception using generic computer components. The limitation to access the dataset that includes a plurality of fields amounts to no more than retrieving a dataset. These additional elements amount to no more than insignificant extra-solution activities (See MPEP 2106.05(g) “data gathering and outputting”). Further, the limitation “generate and return as a result of the query the distinct count of values for the identifiers associated with the key”, if not analyzed as an abstract idea, would also amount to no more than gathering/receiving, manipulating, and outputting of data which, as indicated, amounts to no more than insignificant extra-solution activities (See MPEP 2106.05(g) “data gathering and outputting”). 
Accordingly, the additional elements do not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
	In step 2b, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of a processing system that includes one or more processors, a memory configured to store program code to be executed by the processing system, a plurality of distributed servers, and executing a query which includes an instruction to determine a distinct count of values against a dataset amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The additional element of access the dataset that includes a plurality of fields amounts to no more than insignificant extra-solution activities that the courts have recognized to be well-understood, routine, conventional activity (See MPEP 2106.05(d)(II) “Storing and retrieving information in memory”). Further, the limitation “generate and return as a result of the query the distinct count of values for the identifiers associated with the key”, if not analyzed as an abstract idea, would amount to no more than generating/receiving a distinct count and sending/storing the result, which amounts to no more than insignificant extra-solution activities that the courts have recognized to be well-understood, routine, conventional activity (See MPEP 2106.05(d)(II) “Receiving or transmitting data over a network” or “Storing and retrieving information in memory”). Insignificant extra-solution activities that the courts have recognized to be well-understood, routine, conventional activity cannot provide an inventive concept. Taking the elements both individually and as a whole, the claim does not amount to significantly more than the abstract idea itself. The claim is not patent eligible.
	Claim 8 is directed towards a process and claim 15 is directed towards an article of manufacture. Claims 8 and 15 recite substantially the same limitations as claim 1, absent the limitation relating to “access a dataset”, and follow substantially the same analysis.
	 
	Claim 2 recites divide the sorted dataset into the plurality of partitions at respective ones of the plurality of distributed servers according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers, and wherein at each one of the plurality of distributed servers the presence of the key of the keys is determined for a corresponding one of the plurality of partitions. As discussed above, the limitations of divide the sorted dataset into the plurality of partitions at respective ones of the plurality of distributed servers according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers, and wherein at each one of the plurality of distributed servers the presence of the key of the keys is determined for a corresponding one of the plurality of partitions, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. For example, “divide” or dividing is considered to be a mental process because dividing is based on an evaluation of the data, in the context of this claim encompasses the user manually dividing the data into groups/partitions based on data values being the same. Similarly, “determine a presence” or determining is considered to be a mental process because determining is based on an observation of the data, in the context of this claim encompasses the user mentally reviewing the data to determine that a key exists in the data. This can be demonstrated by mentally performing the above limitations on Fig. 4 of the applicant’s filed drawings. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
	This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements – comprising a plurality of distributed servers. As discussed above in claim 1, the additional elements do not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
	In step 2b, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception as discussed above in claim 1. Taking the elements both individually and as a whole, the claim does not amount to significantly more than the abstract idea itself. The claim is not patent eligible.
	Claims 9, 10, 16, and 17 recite substantially the same limitations as claim 2, and follow substantially the same analysis.
	 
	Claim 3 recites divide the sorted dataset into the plurality of partitions according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers, and determine the presence of the key of the keys for each one of the plurality of partitions via a corresponding separate logical processing space of the processing system. As discussed above, the limitations of divide the sorted dataset into the plurality of partitions according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers; and determine the presence of the key of the keys for each one of the plurality of partitions via a corresponding separate logical processing space of the processing system, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. For example, “divide” or dividing is considered to be a mental process because dividing is based on an evaluation of the data, in the context of this claim encompasses the user dividing the data into groups/partitions based on data values being the same. Similarly, “determine a presence” or determining is considered to be a mental process because determining is based on an observation of the data, in the context of this claim encompasses the user mentally reviewing the data to determine that a key exists in the data. This can be demonstrated by mentally performing the above limitations on Fig. 4 of the applicant’s filed drawings. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
	This judicial exception is not integrated into a practical application. In particular, the claim does not recite additional elements and therefore does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea. In step 2b, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Taking the elements both individually and as a whole, the claim does not amount to significantly more than the abstract idea itself. The claim is not patent eligible.
	 
	Claim 4 recites determining an exact, distinct count for the identifiers associated with the key, wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least one of a hosted web service, a hosted web application, or a search term. The limitations of receive an instruction for determining an exact, distinct count for the identifiers associated with the key, wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least one of a hosted web service, a hosted web application, or a search term, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. For example, “determining” or determining is considered to be a mental process because determining is based on an observation of the data, in the context of this claim encompasses the user manually determining of distinct counts based on certain types of identifiers and keys. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
	This judicial exception is not integrated into a practical application. In particular, the claim recites the additional element – receive an instruction. The receiving of an instruction is recited at a high level of generality (i.e., as a generic processor performing a generic computer function of receiving and executing instructions) such that it amounts to no more than mere instructions to apply the exception using generic computer components. Accordingly, the additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
	In step 2b, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, receiving and executing instructions amounts to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. Taking the elements both individually and as a whole, the claim does not amount to significantly more than the abstract idea itself. The claim is not patent eligible.
	Claims 11 and 18 recite substantially the same limitations as claim 4, and follow substantially the same analysis.
	 
	Claim 5 recites determine a ratio of the keys to the identifiers for the dataset; and sort the dataset based at least on a comparison of the ratio to a threshold value. The limitations of determine a ratio of the keys to the identifiers for the dataset; and sort the dataset based at least on a comparison of the ratio to a threshold value, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. For example, “determine a ratio” or determining is considered to be a mental process because determining is based on an observation of the data, in the context of this claim encompasses the user manually calculating a ratio. Similarly, “sort” or sorting is considered to be a mental process because sorting is based on an evaluation of the data, in the context of this claim encompasses the user manually sorting and organizing the data in the dataset based on the calculated ratio for certain data using a threshold. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
	This judicial exception is not integrated into a practical application. In particular, the claim does not recite additional elements and therefore does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea. In step 2b, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Taking the elements both individually and as a whole, the claim does not amount to significantly more than the abstract idea itself. The claim is not patent eligible.
	Claims 12 and 19 recite substantially the same limitations as claim 5, and follow substantially the same analysis.
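	For illustration only, the two limitations of claim 5 discussed above can be sketched in Python. This is a minimal sketch under stated assumptions, not the applicant's implementation: the claim does not define how the ratio is computed or what the comparison triggers, so the sketch assumes the ratio is the number of distinct keys divided by the number of distinct identifiers, and assumes the dataset is sorted by identifier only when the ratio falls below the threshold. The function names and field names are hypothetical.

```python
def key_to_identifier_ratio(records, key_field, id_field):
    """Ratio of distinct keys to distinct identifiers (assumed definition)."""
    keys = {rec[key_field] for rec in records}
    ids = {rec[id_field] for rec in records}
    if not ids:
        raise ValueError("dataset contains no identifiers")
    return len(keys) / len(ids)


def sort_based_on_ratio(records, key_field, id_field, threshold):
    """Sort by identifier only when the ratio is below the threshold.

    The direction of the comparison is an assumption; the claim only
    requires that the sort be based on a comparison to a threshold value.
    """
    ratio = key_to_identifier_ratio(records, key_field, id_field)
    if ratio < threshold:
        return sorted(records, key=lambda rec: rec[id_field])
    return list(records)


# Hypothetical log entries: the key is a hosted application, the identifier a user.
log_entries = [
    {"app": "mail", "user": "u3"},
    {"app": "docs", "user": "u1"},
    {"app": "mail", "user": "u4"},
    {"app": "mail", "user": "u2"},
]
sorted_entries = sort_based_on_ratio(log_entries, "app", "user", threshold=1.0)
```

	In this sketch two distinct keys and four distinct identifiers give a ratio of 0.5, which is below the assumed threshold of 1.0, so the entries are returned sorted by user.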
	 
	Claims 6, 13, and 20 recite additional elements specifying that the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application. These additional elements amount to no more than insignificant extra-solution activities that the courts have recognized to be well-understood, routine, conventional activity (See MPEP 2106.05(d)(II) “Receiving or transmitting data over a network”). As discussed above, insignificant extra-solution activities that the courts have recognized to be well-understood, routine, conventional activity cannot provide an inventive concept; therefore, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Taking the elements both individually and as a whole, the claims do not amount to significantly more than the abstract idea itself. The claims are not patent eligible.
	 
Claims 7, 14, and 20 recite additional elements that relate to a system that is cloud-based and that hosts big data storage for the dataset. The cloud-based storage and big data storage are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer components. As discussed above, mere instructions to apply an exception using generic computer components cannot provide an inventive concept; therefore, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Taking the elements both individually and as a whole, the claims do not amount to significantly more than the abstract idea itself. The claims are not patent eligible.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 8-10, and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Dyskant (U.S. Pub. No. 2007/0239663, previously cited in IDS), in view of Anderson et al.  (U.S. Pub. No. 2016/0283574, previously cited), hereinafter Anderson.

Regarding independent claim 1, Dyskant teaches a system for determining a distinct count for identifiers based on keys, the system comprising: (Dyskant, Fig. 1 and [0001], discloses "The invention relates to a system and method for parallel processing of large amounts of data in order to count distinct values".)
execute a query, which includes an instruction to determine a distinct count of values, against a dataset, (Dyskant, [0021] and [0030], discloses query processors performing a count distinct function on the values within any, all, or selected columns of tables with a data source and returning the query results (e.g., count distinct results) to be added to summary cells.) said execute including:
access the dataset that includes a plurality of fields; (Dyskant, Fig. 2 and [0021], discloses accessing source data with columns.)
sort the dataset according to the identifiers that are associated with a field of the plurality of fields for the dataset to generate a sorted dataset; (Dyskant, Fig. 2 and [0014]-[0015], discloses splitting a data set into a number of chunks based on the values in a column and sorting them to be distinctly counted.)
divide the sorted dataset into a plurality of partitions, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; (Dyskant, Fig. 2 and [0014], discloses "splits a data set into a number of chunks based on the values in a column upon which a count distinct function is to be performed (analytic column), such that no value appears in two or more chunks.")
determine a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions; aggregate a number of instances of the determined presence of the key; and generate and return as a result of the query the distinct count for the identifiers associated with the key. (Dyskant, [0012], discloses "Since no value of the analytic column appears in more than one chunk, each chunk can be treated as a separate problem and the system can use parallel processing to perform a count distinct function on each chunk, and then simply add the results together." Dyskant, Fig. 2 and [0024]-[0025], discloses analyzing the chunks of data to perform distinct counts for values of each chunk and then adding the results to obtain overall count distinct values. The values in the chunks are grouped and distinctly counted, where each key appears once. Dyskant, [0021] and [0030], discloses query processors performing a count distinct function on the values within any, all, or selected columns of tables with a data source and returning the query results (e.g., count distinct results) to be added to summary cells.)
However, Dyskant does not explicitly teach a processing system that includes one or more processors; and 
a memory configured to store program code to be executed by the processing system, the program code configured to: 
divide the sorted dataset into a plurality of partitions for respective distribution to a plurality of distributed servers, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value;
determine, respectively by the plurality of distributed servers, a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions, each key corresponding to a respective identifier of another field of the plurality of fields; 
On the other hand, Anderson teaches a processing system that includes one or more processors; and a memory configured to store program code to be executed by the processing system, the program code configured to: (Anderson, [0284]-[0285], discloses software/computer programs that execute on one or more computer systems each including at least one processor, at least one data storage system (including volatile and non-volatile memory and/or storage elements). The software may be provided on a storage medium.)
divide the sorted dataset into a plurality of partitions for respective distribution to a plurality of distributed servers, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; determine, respectively by the plurality of distributed servers, a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions, each key corresponding to a respective identifier of another field of the plurality of fields; (Anderson, [0084]-[0085], discloses the placement of records of a dataset into segments based on a value expected to be common across a cluster like a classifying characteristic that divides the collection of records into disjoint sets, like a product identifier, a geographic quantity like zip code or country of origin, or other criteria, for example, an assigned identifier. Multiple levels of segmentation are possible, for example, data may be segmented first by one identifier, and then data clusters within each segment may be further segmented by another identifier/field. Each segment may be passed to separate processing partitions. Anderson, [0089]-[0090] and [0154]-[0155], discloses a set of data source records or tokenized records are sorted and read/provided to the clustering engine for processing, where a segmentation engine assigns a segment identifier to a data record based on a value, called the segment value. Records may then be partitioned by a parallel partitioner based on the segment identifiers to be sent to different recipient processing entities/processing nodes (i.e. distributed servers), where every record having the same segment identifier value is sent to the same processing entity. 
The segment value can be derived and records with identical segment values receive the same segment identifier, but records with different segment values may receive different segment identifiers. Anderson, [0156], discloses a partitioner may also partition by a multipart key consisting of one or more approximately matched fields along with one or more exactly matched fields, to reduce potential skew.)
The clustering and partitioning of data records for parallel processing of Anderson can be applied to the system of Dyskant for parallel processing of large amounts of data by chunking. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the clustering and partitioning data distribution system of Anderson because both address the same field of data partitioning systems, and incorporating Anderson into Dyskant enables the distinct count value system to be used in a distributed computing environment.
One of ordinary skill in the art would be motivated to do so as to provide a way to perform a number of computations that have to be made between records to determine which are close under a suitable distance measure without limiting performance and scalability when clustering large volumes of data, as taught by Anderson [0028].
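For illustration only, the combination relied upon above can be sketched in Python: records are divided so that no value of the counted column appears in more than one partition (as in the chunking of Dyskant), each partition is handed to its own worker (standing in for the recipient processing nodes of Anderson), and the per-partition distinct counts are simply added. This is a minimal sketch under stated assumptions, not code from either reference: the CRC32-based routing, the thread pool standing in for distributed servers, and all function and field names are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def route(value, num_nodes):
    """Deterministically map a column value to a node index (assumed scheme)."""
    return zlib.crc32(str(value).encode("utf-8")) % num_nodes


def partition(records, column, num_nodes):
    """Sort by the column, then send all records sharing a value to one partition."""
    parts = [[] for _ in range(num_nodes)]
    for rec in sorted(records, key=lambda r: r[column]):
        parts[route(rec[column], num_nodes)].append(rec)
    return parts


def local_distinct_count(part, column):
    """Distinct count within one partition; safe because values never span partitions."""
    return len({rec[column] for rec in part})


def distinct_count(records, column, num_nodes=3):
    """Count distinct values by summing independent per-partition counts."""
    parts = partition(records, column, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        counts = list(pool.map(lambda p: local_distinct_count(p, column), parts))
    return sum(counts)


# Hypothetical dataset: which users accessed which hosted application.
accesses = [
    {"user": "u1", "app": "mail"},
    {"user": "u2", "app": "mail"},
    {"user": "u1", "app": "docs"},
    {"user": "u3", "app": "mail"},
    {"user": "u2", "app": "docs"},
    {"user": "u4", "app": "mail"},
]
total_users = distinct_count(accesses, "user")
```

Because every record with a given value lands in exactly one partition, adding the local counts cannot double count a value, which is the property both the claim language and Dyskant [0012] depend on.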
 
Regarding claim 2, Dyskant, in view of Anderson, teaches the system of claim 1, comprising the plurality of distributed servers; (Anderson, [0284]-[0285], discloses software/computer programs that execute on one or more computer systems (which may be of various architectures such as distributed servers) each including at least one processor, at least one data storage system (including volatile and non-volatile memory and/or storage elements). The software may be provided on a storage medium.) wherein the program code is configured to: divide the sorted dataset into the plurality of partitions at respective ones of the plurality of distributed servers according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers; and wherein at each one of the plurality of distributed servers the presence of the key of the keys is determined for a corresponding one of the plurality of partitions. (Dyskant, Fig. 2 and [0014], discloses splitting the data according to a range partition based on how many values fit into a query processor's memory. Examiner interprets that the subchunks are balanced between the query processors based on their sizes and that, in order to group by a field, a query processor must determine the "presence of the key". Dyskant, [0020], discloses that each query processor has its own memory. Examiner interprets that the query processors could be distributed servers and that the memories of the query processors are separate logical processing spaces.)

Regarding claim 3, Dyskant, in view of Anderson, teaches the system of claim 1, wherein the program code is configured to: divide the sorted dataset into the plurality of partitions according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers; and determine the presence of the key of the keys for each one of the plurality of partitions via a corresponding separate logical processing space of the processing system. (Dyskant, Fig. 2 and [0014], discloses splitting the data according to a range partition based on how many values fit into a query processor's memory. Examiner interprets that the subchunks are balanced between the query processors based on their sizes and that, in order to group by a field, a query processor must determine the "presence of the key". Dyskant, [0020], discloses that each query processor has its own memory. Examiner interprets that the query processors could be distributed servers and that the memories of the query processors are separate logical processing spaces.)
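For illustration only, a range partition operation that balances partitions by the number of identifiers, as recited in claims 2 and 3, can be sketched in Python. This is a minimal sketch under stated assumptions and is not taken from Dyskant or Anderson: it assumes "balanced" means each partition receives roughly the same number of distinct identifiers, and the function names are hypothetical.

```python
import bisect


def range_boundaries(identifiers, num_partitions):
    """Pick split points so each range covers about the same number of distinct identifiers."""
    distinct = sorted(set(identifiers))
    # Ceiling division; at least 1 so the range step below is never zero.
    step = max(1, -(-len(distinct) // num_partitions))
    # Each boundary is the first identifier belonging to the next partition.
    return [distinct[i] for i in range(step, len(distinct), step)]


def partition_index(identifier, boundaries):
    """Return the index of the range that contains the identifier."""
    return bisect.bisect_right(boundaries, identifier)


def range_partition(records, id_field, num_partitions):
    """Divide a dataset, sorted by identifier, into contiguous identifier ranges."""
    ordered = sorted(records, key=lambda rec: rec[id_field])
    boundaries = range_boundaries([rec[id_field] for rec in ordered], num_partitions)
    partitions = [[] for _ in range(num_partitions)]
    for rec in ordered:
        partitions[partition_index(rec[id_field], boundaries)].append(rec)
    return partitions


# Hypothetical identifiers; duplicates of an identifier always share a range.
rows = [{"user": u} for u in ["a", "a", "b", "c", "d", "d", "e"]]
ranges = range_partition(rows, "user", num_partitions=2)
```

Because the ranges are contiguous and non-overlapping, every occurrence of a given identifier falls in the same partition, so each partition can be processed in its own logical processing space.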
 
Regarding independent claim 8, Dyskant teaches a computer-implemented method for determining a distinct count for identifiers on keys, the method comprising:  (Dyskant, Fig. 1 and [0001], discloses "The invention relates to a system and method for parallel processing of large amounts of data in order to count distinct values".)
executing a query, which includes an instruction to determine a distinct count of values, against a dataset, (Dyskant, [0021] and [0030], discloses query processors performing a count distinct function on the values within any, all, or selected columns of tables with a data source and returning the query results (e.g., count distinct results) to be added to summary cells.) said executing including:
sorting the dataset according to the identifiers to generate a sorted dataset, the identifiers being values for a field of the dataset; (Dyskant, Fig. 2 and [0014]-[0015], discloses splitting a data set into a number of chunks based on the values in a column and sorting them to be distinctly counted.)
dividing the sorted dataset into a plurality of partitions, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; (Dyskant, Fig. 2 and [0014], discloses "splits a data set into a number of chunks based on the values in a column upon which a count distinct function is to be performed (analytic column), such that no value appears in two or more chunks.") 
determining, for each partition of the plurality of partitions, a presence of a key of the keys in the at least one subset; aggregating a number of instances of the determined presence of the key; and generating and returning as a result of the query the distinct count for the identifiers associated with the key. (Dyskant, [0012], discloses "Since no value of the analytic column appears in more than one chunk, each chunk can be treated as a separate problem and the system can use parallel processing to perform a count distinct function on each chunk, and then simply add the results together." Dyskant, Fig. 2 and [0024]-[0025], discloses analyzing the chunks of data to perform distinct counts for values of each chunk and then adding the results to obtain overall count distinct values. The values in the chunks are grouped and distinctly counted, where each key appears once. Dyskant, [0021] and [0030], discloses query processors performing a count distinct function on the values within any, all, or selected columns of tables with a data source and returning the query results (e.g., count distinct results) to be added to summary cells.)
However, Dyskant does not explicitly teach dividing the sorted dataset into a plurality of partitions for respective distribution to logically separate portions of a processing system, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; 
determining, for each partition of the plurality of partitions respectively by the logically separate portions, a presence of a key of the keys in the at least one subset, each key corresponding to a respective identifier of a second field of the dataset; 
On the other hand, Anderson teaches dividing the sorted dataset into a plurality of partitions for respective distribution to logically separate portions of a processing system, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; determining, for each partition of the plurality of partitions respectively by the logically separate portions, a presence of a key of the keys in the at least one subset, each key corresponding to a respective identifier of a second field of the dataset; (Anderson, [0084]-[0085], discloses the placement of records of a dataset into segments based on a value expected to be common across a cluster like a classifying characteristic that divides the collection of records into disjoint sets, like a product identifier, a geographic quantity like zip code or country of origin, or other criteria, for example, an assigned identifier. Multiple levels of segmentation are possible, for example, data may be segmented first by one identifier, and then data clusters within each segment may be further segmented by another identifier/field. Each segment may be passed to separate processing partitions. Anderson, [0089]-[0090] and [0154]-[0155], discloses a set of data source records or tokenized records are sorted and read/provided to the clustering engine for processing, where a segmentation engine assigns a segment identifier to a data record based on a value, called the segment value. Records may then be partitioned by a parallel partitioner based on the segment identifiers to be sent to different recipient processing entities/processing nodes (i.e. distributed servers), where every record having the same segment identifier value is sent to the same processing entity. 
The segment value can be derived and records with identical segment values receive the same segment identifier, but records with different segment values may receive different segment identifiers. Anderson, [0156], discloses a partitioner may also partition by a multipart key consisting of one or more approximately matched fields along with one or more exactly matched fields, to reduce potential skew.)
The clustering and partitioning of data records for parallel processing of Anderson can be applied to the system of Dyskant for parallel processing of large amounts of data by chunking. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the clustering and partitioning data distribution system of Anderson because both address the same field of data partitioning systems, and incorporating Anderson into Dyskant enables the distinct count value system to be used in a distributed computing environment.
One of ordinary skill in the art would be motivated to do so as to provide a way to perform a number of computations that have to be made between records to determine which are close under a suitable distance measure without limiting performance and scalability when clustering large volumes of data, as taught by Anderson [0028].
 
Regarding claim 9, Dyskant, in view of Anderson, teaches the computer-implemented method of claim 8, wherein each partition of the plurality of partitions is respectively provided to the logically separate portions of the processing system for said determining subsequent to said dividing, said dividing further comprising: dividing the sorted dataset into the plurality of partitions according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers; and wherein said determining the presence of the key of the keys comprises determining the presence of the key of the keys for each of the plurality of partitions at a corresponding one of the logically separate portions. (Dyskant, Fig. 2 and [0014], discloses splitting the data according to a range partition based on how many values fit into a query processor's memory. Examiner interprets that the subchunks are balanced between the query processors based on their sizes and that, in order to group by a field, a query processor must determine the "presence of the key". Dyskant, [0020], discloses that each query processor has its own memory. Examiner interprets that the query processors could be distributed servers and that the memories of the query processors are separate logical processing spaces. In combination, Anderson, [0084]-[0085], discloses the placement of records of a dataset into segments based on a value expected to be common across a cluster like a classifying characteristic that divides the collection of records into disjoint sets. Each segment may be passed to separate processing partitions. Anderson, [0089]-[0090] and [0154]-[0155], discloses a set of data source records or tokenized records are sorted and read/provided to the clustering engine for processing, where a segmentation engine assigns a segment identifier to a data record based on a value, called the segment value. 
Records may then be partitioned by a parallel partitioner based on the segment identifiers to be sent to different recipient processing entities/processing nodes (i.e. distributed servers), where every record having the same segment identifier value is sent to the same processing entity. The segment value can be derived and records with identical segment values receive the same segment identifier, but records with different segment values may receive different segment identifiers.)
 
Regarding claim 10, Dyskant, in view of Anderson, teaches the computer-implemented method of claim 9, wherein the logically separate portions of the processing system comprise a plurality of distributed servers. (Dyskant, [0020], discloses that each query processor has its own memory. Examiner interprets query processors with their own memory as separate logical processing spaces. In combination, Anderson, [0084]-[0085], discloses the placement of records of a dataset into segments based on a value expected to be common across a cluster like a classifying characteristic that divides the collection of records into disjoint sets. Each segment may be passed to separate processing partitions. Anderson, [0089]-[0090] and [0154]-[0155], discloses a set of data source records or tokenized records are sorted and read/provided to the clustering engine for processing, where a segmentation engine assigns a segment identifier to a data record based on a value, called the segment value. Records may then be partitioned by a parallel partitioner based on the segment identifiers to be sent to different recipient processing entities/processing nodes (i.e. distributed servers), where every record having the same segment identifier value is sent to the same processing entity. The segment value can be derived and records with identical segment values receive the same segment identifier, but records with different segment values may receive different segment identifiers.)
Claim 17 recites substantially the same limitations as claim 10, and is rejected for substantially the same reasons.
 
Regarding independent claim 15, Dyskant teaches perform a method for determining a distinct count for identifiers on keys, the method comprising: (Dyskant, Fig. 1 and [0001], discloses "The invention relates to a system and method for parallel processing of large amounts of data in order to count distinct values".)
executing a query, which includes an instruction to determine a distinct count of values, against a dataset, (Dyskant, [0021] and [0030], discloses query processors performing a count distinct function on the values within any, all, or selected columns of tables with a data source and returning the query results (e.g., count distinct results) to be added to summary cells.) said executing including:
sorting the dataset according to the identifiers in a field of the dataset to generate a sorted dataset; (Dyskant, Fig. 2 and [0014]-[0015], discloses splitting a data set into a number of chunks based on the values in a column and sorting them to be distinctly counted.)
dividing the sorted dataset into a plurality of partitions, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; (Dyskant, Fig. 2 and [0014], discloses "splits a data set into a number of chunks based on the values in a column upon which a count distinct function is to be performed (analytic column), such that no value appears in two or more chunks.")
providing each partition of the plurality of partitions respectively to the logically separate portions of a processing system; determining at ones of the respective logically separate portions a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions; aggregating a number of instances of the determined presence of the key; and generating and returning as a result of the query the distinct count for the identifiers associated with the key. (Dyskant, [0012], discloses "Since no value of the analytic column appears in more than one chunk, each chunk can be treated as a separate problem and the system can use parallel processing to perform a count distinct function on each chunk, and then simply add the results together." Dyskant, Fig. 2 and [0024]-[0025], discloses analyzing the chunks of data to perform distinct counts for values of each chunk and then adding the results to obtain overall count distinct values. The values in the chunks are grouped and distinctly counted, where each key appears once. Dyskant, [0021] and [0030], discloses query processors performing a count distinct function on the values within any, all, or selected columns of tables with a data source and returning the query results (e.g., count distinct results) to be added to summary cells.)
However, Dyskant does not explicitly teach a computer-readable storage medium having program instructions recorded thereon that, when executed by one or more processors, 
dividing the sorted dataset into a plurality of partitions for respective distribution to logically separate portions of a processing system, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value;
providing each partition of the plurality of partitions respectively to the logically separate portions of a processing system; 
determining at ones of the respective logically separate portions a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions, each key corresponding to a respective identifier of a second field of the dataset;
On the other hand, Anderson teaches a computer-readable storage medium having program instructions recorded thereon that, when executed by one or more processors, (Anderson, [0284]-[0285], discloses software/computer programs that execute on one or more computer systems each including at least one processor, at least one data storage system (including volatile and non-volatile memory and/or storage elements). The software may be provided on a storage medium.)
dividing the sorted dataset into a plurality of partitions for respective distribution to logically separate portions of a processing system, each partition of the plurality of partitions being the only partition to include a respective portion of the dataset having at least one subset of identifiers of a first field of the dataset with a same value; providing each partition of the plurality of partitions respectively to the logically separate portions of a processing system; determining at ones of the respective logically separate portions a presence of a key of the keys in the at least one subset on each partition of the plurality of partitions, each key corresponding to a respective identifier of a second field of the dataset; (Anderson, [0084]-[0085], discloses the placement of records of a dataset into segments based on a value expected to be common across a cluster like a classifying characteristic that divides the collection of records into disjoint sets, like a product identifier, a geographic quantity like zip code or country of origin, or other criteria, for example, an assigned identifier. Multiple levels of segmentation are possible, for example, data may be segmented first by one identifier, and then data clusters within each segment may be further segmented by another identifier/field. Each segment may be passed to separate processing partitions. Anderson, [0089]-[0090] and [0154]-[0155], discloses a set of data source records or tokenized records are sorted and read/provided to the clustering engine for processing, where a segmentation engine assigns a segment identifier to a data record based on a value, called the segment value. Records may then be partitioned by a parallel partitioner based on the segment identifiers to be sent to different recipient processing entities/processing nodes (i.e. distributed servers), where every record having the same segment identifier value is sent to the same processing entity. 
The segment value can be derived and records with identical segment values receive the same segment identifier, but records with different segment values may receive different segment identifiers. Anderson, [0156], discloses a partitioner may also partition by a multipart key consisting of one or more approximately matched fields along with one or more exactly matched fields, to reduce potential skew.)
The clustering and partitioning of data records for parallel processing of Anderson can be applied to the system of Dyskant for parallel processing of large amounts of data by chunking. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the clustering and partitioning data distribution system of Anderson because both address the same field of data partitioning systems, and incorporating Anderson into Dyskant enables the distinct count value system to be used in a distributed computing environment.
One of ordinary skill in the art would be motivated to do so as to provide a way to perform a number of computations that have to be made between records to determine which are close under a suitable distance measure without limiting performance and scalability when clustering large volumes of data, as taught by Anderson [0028].
 
Regarding claim 16, Dyskant, in view of Anderson, teaches the computer-readable storage medium of claim 15, wherein said dividing comprises: dividing the sorted dataset into the plurality of partitions according to a range partition operation, wherein the partitions are balanced according to numbers of the identifiers. (Dyskant, Fig. 2 and [0014], discloses splitting the data based on a range partition based on how many values fit into a query processor's memory. Examiner interprets that the subchunks are balanced between the query processors based on their sizes. In combination, Anderson, [0089]-[0090] and [0154]-[0155], discloses a set of data source records or tokenized records are sorted and read/provided to the clustering engine for processing, where a segmentation engine assigns a segment identifier to a data record based on a value, called the segment value. Records may then be partitioned by a parallel partitioner based on the segment identifiers to be sent to different recipient processing entities/processing nodes (i.e. distributed servers), where every record having the same segment identifier value is sent to the same processing entity. The segment value can be derived and records with identical segment values receive the same segment identifier, but records with different segment values may receive different segment identifiers. Anderson, [0156], discloses a partitioner may also partition by a multipart key consisting of one or more approximately matched fields along with one or more exactly matched fields, to reduce potential skew.)
 
 
 
Claims 4, 6, 7, 11, 13, 14, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dyskant, in view of Anderson, and further in view of Croft et al. (U.S. Pub. No. 2007/0106643, previously cited), hereinafter Croft.
 
Regarding claim 4, Dyskant, in view of Anderson, teaches all the limitations as set forth in the rejection of claim 1 above. Dyskant, in view of Anderson, further teaches the system of claim 1, wherein the program code is configured to: receive an instruction for determining an exact, distinct count for the identifiers associated with the key, (Dyskant, [0012], discloses "Since no value of the analytic column appears in more than one chunk, each chunk can be treated as a separate problem and the system can use parallel processing to perform a count distinct function on each chunk, and then simply add the results together." Dyskant, Fig. 2 and [0024]-[0025], discloses analyzing the chunks of data to perform distinct counts for values of each chunk and then adding the results to obtain overall count distinct values. The values in the chunks are grouped and distinctly counted, where each key appears once.) 
However, Dyskant, in view of Anderson, does not explicitly teach wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least one of a hosted web service, a hosted web application, or a search term. 
On the other hand, Croft teaches wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least one of a hosted web service, a hosted web application, or a search term. (Croft, Fig. 4 and [0032], discloses a dataset with a column with buyer information. Examiner interprets buyer information as user identifiers. Croft, [0039], discloses receiving a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. Examiner interprets the category/column of data as a key, and a search term in a query is used to identify the category/column.)
Examiner notes that the limitation “wherein the identifiers are at least one of user identifiers, tenant identifiers, numbers of accesses, or access times, and wherein the key is at least one of a hosted web service, a hosted web application, or a search term” indicates the content of the keys/identifiers, which is nonfunctional descriptive material and does not hold any patentable weight. See MPEP 2111.05, specifically “…If a new and unobvious functional relationship between the printed matter and the substrate does not exist, USPTO personnel need not give patentable weight to printed matter. See In re Lowry, 32 F.3d 1579, 1583-84, 32 USPQ2d 1031, 1035 (Fed. Cir. 1994); In re Ngai, 367 F.3d 1336, 70 USPQ2d 1862 (Fed. Cir. 2004).”
Croft, Abstract and [0039], teaches calculating a distinct count value from data stored in a hierarchical database using a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. The system that provides distinct count value from data stored in a hierarchical database of Croft can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the distinct count value calculation system of Croft because both address the same field of distinct count value systems, and incorporating Croft into Dyskant enables the distinct count value system to be used in a cloud environment.
One of ordinary skill in the art would be motivated to do so in order to provide less processor-intensive processing and a significantly reduced processing time for a reasonably sized data set, as taught by Croft [0002].
Claims 11 and 18 recite substantially the same limitations as claim 4, and are rejected for substantially the same reasons.
 
Regarding claim 6, Dyskant, in view of Anderson, teaches all the limitations as set forth in the rejection of claim 1 above. However, Dyskant, in view of Anderson, does not explicitly teach the system of claim 1, wherein the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application. 
On the other hand, Croft teaches wherein the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application. (Croft, [0026], discloses host applications accessing data warehousing with a multidimensional database. Croft, [0145]-[0146] and [0150], discloses that the system may be distributed across multiple computers in a networked system communicating via the internet. Examiner interprets this to be data for hosted web application/service.) 
Examiner notes that “the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application” is intended use and nonfunctional descriptive material, and does not hold any patentable weight. See MPEP 2111.05, specifically “…If a new and unobvious functional relationship between the printed matter and the substrate does not exist, USPTO personnel need not give patentable weight to printed matter. See In re Lowry, 32 F.3d 1579, 1583-84, 32 USPQ2d 1031, 1035 (Fed. Cir. 1994); In re Ngai, 367 F.3d 1336, 70 USPQ2d 1862 (Fed. Cir. 2004).”
Croft, Abstract and [0039], teaches calculating a distinct count value from data stored in a hierarchical database using a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. The system that provides distinct count value from data stored in a hierarchical database of Croft can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the distinct count value calculation system of Croft because both address the same field of distinct count value systems, and incorporating Croft into Dyskant enables the distinct count value system to be used in a cloud environment.
One of ordinary skill in the art would be motivated to do so in order to provide less processor-intensive processing and a significantly reduced processing time for a reasonably sized data set, as taught by Croft [0002].
Claim 13 recites substantially the same limitations as claim 6, and is rejected for substantially the same reasons.
 
Regarding claim 7, Dyskant, in view of Anderson, teaches all the limitations as set forth in the rejection of claim 1 above. Dyskant, in view of Anderson, further teaches the system of claim 1, wherein the system hosts big data storage for the dataset. (Dyskant, [0001], discloses "The invention relates to a system and method for parallel processing of large amounts of data in order to count distinct values".)
However, Dyskant, in view of Anderson, does not explicitly teach wherein the system is a cloud-based system that hosts big data storage for the dataset. 
On the other hand, Croft teaches wherein the system is a cloud-based system that hosts big data storage for the dataset. (Croft, [0026], discloses data warehousing with multidimensional database. Examiner interprets data warehousing as big data storage. Croft, [0145]-[0146] and [0150], discloses that the system may be distributed across multiple computers in a networked system communicating via the internet. Examiner interprets this to be distributed servers in a cloud environment.) 
Examiner notes that “system is a cloud-based system that hosts big data storage for the dataset” is an intended field of use and nonfunctional descriptive material, and does not hold any patentable weight. See MPEP 2111.05, specifically “…If a new and unobvious functional relationship between the printed matter and the substrate does not exist, USPTO personnel need not give patentable weight to printed matter. See In re Lowry, 32 F.3d 1579, 1583-84, 32 USPQ2d 1031, 1035 (Fed. Cir. 1994); In re Ngai, 367 F.3d 1336, 70 USPQ2d 1862 (Fed. Cir. 2004).”
Croft, Abstract and [0039], teaches calculating a distinct count value from data stored in a hierarchical database using a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. The system that provides distinct count value from data stored in a hierarchical database of Croft can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the distinct count value calculation system of Croft because both address the same field of distinct count value systems, and incorporating Croft into Dyskant enables the distinct count value system to be used in a cloud environment.
One of ordinary skill in the art would be motivated to do so in order to provide less processor-intensive processing and a significantly reduced processing time for a reasonably sized data set, as taught by Croft [0002].
Claim 14 recites substantially the same limitations as claim 7, and is rejected for substantially the same reasons.
 
Regarding claim 20, Dyskant, in view of Anderson, teaches all the limitations as set forth in the rejection of claim 15 above. Dyskant, in view of Anderson, further teaches the computer-readable storage medium of claim 15, wherein the one or more processors are of a system that hosts big data storage for the dataset. (Dyskant, [0001], discloses "The invention relates to a system and method for parallel processing of large amounts of data in order to count distinct values".)
However, Dyskant, in view of Anderson, does not explicitly teach wherein the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application; or wherein the one or more processors are of a system that is a cloud-based system that hosts big data storage for the dataset.
On the other hand, Croft teaches wherein the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application; or wherein the one or more processors are of a system that is a cloud-based system that hosts big data storage for the dataset. (Croft, [0026], discloses data warehousing with multidimensional database. Examiner interprets data warehousing as big data storage. Croft, [0145]-[0146] and [0150], discloses that the system may be distributed across multiple computers in a networked system communicating via the internet. Examiner interprets this to be distributed servers in a cloud environment.)
Examiner notes that “the dataset comprises log entries having data for at least one of a hosted web service or a hosted web application” and “system is a cloud-based system that hosts big data storage for the dataset” are intended fields of use and nonfunctional descriptive material, and do not hold any patentable weight. See MPEP 2111.05, specifically “…If a new and unobvious functional relationship between the printed matter and the substrate does not exist, USPTO personnel need not give patentable weight to printed matter. See In re Lowry, 32 F.3d 1579, 1583-84, 32 USPQ2d 1031, 1035 (Fed. Cir. 1994); In re Ngai, 367 F.3d 1336, 70 USPQ2d 1862 (Fed. Cir. 2004).”
Croft, Abstract and [0039], teaches calculating a distinct count value from data stored in a hierarchical database using a query that identifies one or more categories of data from the database and identifies the pre-defined counting measures for calculating distinct count values. The system that provides distinct count value from data stored in a hierarchical database of Croft can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the distinct count value calculation system of Croft because both address the same field of distinct count value systems, and incorporating Croft into Dyskant enables the distinct count value system to be used in a cloud environment.
One of ordinary skill in the art would be motivated to do so in order to provide less processor-intensive processing and a significantly reduced processing time for a reasonably sized data set, as taught by Croft [0002].
 
 
 
Claims 5, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Dyskant, in view of Anderson, and further in view of Fricke et al. (U.S. Pub. No. 2014/0351254, previously cited in IDS), hereinafter Fricke.
 
Regarding claim 5, Dyskant, in view of Anderson, teaches all the limitations as set forth in the rejection of claim 1 above. However, Dyskant, in view of Anderson, does not explicitly teach the system of claim 1, wherein the program code is configured to: determine a ratio of the keys to the identifiers for the dataset; and wherein said sort the dataset is based at least on a comparison of the ratio to a threshold value.  
On the other hand, Fricke teaches determine a ratio of the keys to the identifiers for the dataset; and wherein said sort the dataset is based at least on a comparison of the ratio to a threshold value. (Fricke, [0007] and [0009], discloses generating a uniqueness metric representative of data in a database table column that is split across a plurality of data partitions for classifying the column, where the uniqueness metric is a calculated ratio of the total number of unique values to a number of rows of the column. Fricke, [0033], discloses range partitioning on data that is amenable to being ordered. Fricke, [0025], discloses comparing the uniqueness metric to a threshold to classify the data. Examiner interprets a calculated ratio of the total number of unique values to a number of rows of the column as a ratio of the keys to the identifiers for the dataset, and classifying the data using the ratio and a threshold as sorting a dataset based at least on a comparison of the ratio to a threshold value.)
The system that provides unique counts of values for data stored in a hierarchical table column of Fricke can be the system for parallel processing of large amounts of data in order to count distinct values of Dyskant. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dyskant to incorporate the teachings of the use of uniqueness metrics of Fricke because both address the same field of distinct count value systems, and incorporating Fricke into Dyskant enables the distinct count value system to classify data based on a comparison of a uniqueness ratio against a threshold.
One of ordinary skill in the art would be motivated to do so in order to provide column-based architectures enriched by additional mechanisms aimed at minimizing the need for access to compressed data without resulting in lower compression efficiency and/or increased processing requirements to access the compressed data, as taught by Fricke [0005].
Claims 12 and 19 recite substantially the same limitations as claim 5, and are rejected for substantially the same reasons.
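For illustration only, the comparison of a uniqueness ratio to a threshold that the Examiner attributes to Fricke can be sketched as follows. This is a minimal sketch under stated assumptions and is not code drawn from Fricke: the function names, the class labels, and the default threshold of 0.5 are hypothetical. The ratio is computed as the number of unique values divided by the number of rows of the column, and the column is then classified by comparing that ratio to the threshold.

```python
def uniqueness_ratio(column_values):
    """Ratio of the number of unique values to the number of rows."""
    if not column_values:
        return 0.0
    return len(set(column_values)) / len(column_values)


def classify_column(column_values, threshold=0.5):
    """Classify a column by comparing its uniqueness ratio to a threshold.

    The labels and the default threshold are hypothetical placeholders.
    """
    if uniqueness_ratio(column_values) >= threshold:
        return "high-uniqueness"
    return "low-uniqueness"


print(uniqueness_ratio(["a", "a", "b", "c"]))
print(classify_column(["a", "a", "a", "a"]))
```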

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Point of Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDDY CHEUNG whose telephone number is (571)272-9785. The examiner can normally be reached MON-TH 8:00AM-4:00PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aleksandr Kerzhner, can be reached at (571)270-1760. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Eddy Cheung/
Primary Examiner, Art Unit 2165