Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement filed May 19, 2021 fails to comply with the provisions of 37 CFR 1.97, 1.98 and MPEP § 609 because a listed patent publication does not exist.  It has been placed in the application file, but certain information referred to therein has not been considered as to the merits.  Applicant is advised that the date of any re-submission of any item of information contained in this information disclosure statement or the submission of any missing element(s) will be the date of submission for purposes of determining compliance with the requirements based on the time of filing the statement, including all certification requirements for statements under 37 CFR 1.97(e).  See MPEP § 609.05(a).

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because reference character “24” has been used to designate both external data stores and applications.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they do not include the following reference sign mentioned in the description: 10.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Objections
Claim 7 is objected to because of the following informalities: Claim 7 recites “aggregating records of second particular partition,” which is missing an article preceding “second particular partition.” Examiner suggests revising the claim to recite “aggregating records of the second particular partition.” Appropriate correction is required.
Claim 8 is objected to because of the following informalities: Claim 8 recites “the plurality of record,” which lacks antecedent basis. Examiner suggests revising the claim to recite “the plurality of records.” Appropriate correction is required.
Claim 9 is objected to because of the following informalities: Claim 9 recites “the plurality of record,” which lacks antecedent basis. Examiner suggests revising the claim to recite “the plurality of records.” Appropriate correction is required.
Claim 20 is objected to because of the following informalities: Claim 20 recites “aggregating records of additional partition,” which is missing an article preceding “additional partition.” Examiner suggests revising the claim to recite “aggregating records of the additional partition.” Appropriate correction is required.
Claim 23 is objected to because of the following informalities: Claim 23 recites “selecting the particular partition for aggregation based a number of records,” which is grammatically incorrect. Examiner suggests revising the claim to recite “selecting the particular partition for aggregation based on a number of records.” Appropriate correction is required.
Claim 24 is objected to because of the following informalities: Claim 24 recites “selecting the particular partition for aggregation based the particular partition having a minimum number of records,” which is grammatically incorrect. Examiner suggests revising the claim to recite “selecting the particular partition for aggregation based on the particular partition having a minimum number of records.” Appropriate correction is required.
Claim 25 is objected to because of the following informalities: Claim 25 recites “selecting the additional partition for aggregation based the additional partition having a highest number of records,” which is grammatically incorrect. Examiner suggests revising the claim to recite “selecting the additional partition for aggregation based on the additional partition having a highest number of records.”  Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-4, 7-16, 21, 22, 26, 28, and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Gould et al. (US Publication No. 2005/0102325) in view of Skjolsvold et al. (US Publication No. 2013/0204991).

As to claim 1, Gould teaches a computer-implemented method comprising:
obtaining, at a worker node [profiling and processing subsystem] of a distributed query execution environment, a chunk of data [initial portion of a data set], wherein the chunk of data comprises a plurality of records associated with a query (see e.g., [0072] for a data processing system 10 including a profiling and processing subsystem 20, which is used to process data from data sources 30, [0076] for the profiling and processing subsystem 20 including a profiling module 100, which reads data directly from a data source without necessarily landing a complete copy of the data to a storage medium before profiling in units of discrete work elements such as individual records, [0079] for the profiling module 100 reading records from a data source, [0087] for the profiling module 100 obtaining an initial portion of the data set, [0093] for the runtime environment providing for the profiling module 100 to execute as a parallel process and parallel processing systems including any configuration of computer systems using multiple central processing units (CPUs), either locally distributed or remotely distributed, and [0203] for profiling module 100 generating a "virtual table" that includes fields from the multiple sources and the virtual table being generated, for example, by performing a join operation on the sources using a key field that is common to the sources. The profiling and processing subsystem obtains an initial portion of a data set comprising records. A distributed profiling module of the profiling and processing subsystem performs a join query on the records.);
assigning records of the plurality of records to individual partitions of a set of data partitions at the worker node (see e.g., [0125] for the partition by round-robin component 612 taking records from the single or multiple partitions of the input data set 402 and re-partitioning the records among a number of parallel processors and/or computers (e.g., as selected by the user) in order to balance the work load among the processors and/or computers. The profiling and processing subsystem assigns the records to partitions.);
combining records [census elements] across partitions within the set of partitions, wherein combining records across partitions within the set of partitions combines records sharing a field value in a particular partition (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences and [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value. Census elements across the partitions that share a field value are combined in a particular partition.); 
combining the records sharing the field value in the particular partition into a single record [census element] having the field value (see e.g., [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value so that the rollup process performed in the global rollup field/value component 626 can add the occurrences calculated in different partitions to produce a total occurrences count in a single census element for each unique field/value pair contained within the profiled records. Census elements sharing the field value in the particular partition are combined into a single census element having the field value.); and
reducing a number of partitions in the set of partitions by aggregating records of the particular partition with records of an additional partition (see e.g., [0128] for the global rollup field/value component 626 processing these census elements in potentially multiple partitions for a potentially parallel file represented by the census file component 410 and [0129] for a partition by field component 632 reading a flow of census elements from the census file component 410 and re-partitioning the census elements according to a hash value based on the field such that census records with the same field (but different values) are in the same partition. Census elements of the particular partition are aggregated with census elements of an additional partition having the same field but different value. This reduces the number of partitions since there are fewer fields than values per field.).
Gould does not specifically disclose based on a number of data partitions exceeding a threshold value, combining records across partitions within the set of partitions; and reducing a number of partitions in the set of partitions by removing the particular partition from the worker node. However, Skjolsvold teaches
based on a number of data partitions exceeding a threshold value [value approaching the upper limit], combining records across partitions within the set of partitions (see e.g., [0063] for merging of partitions allowing partitions that have lower amounts of activity to be combined, this reducing the overhead required to track and maintain the various partitions for a data set, optionally, a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured and [0080] for it being often desirable to avoid having too many partitions, as the maximum number of partitions is approached for a server, the likelihood of merging partitions increasing, and as an example, it being desirable to maintain between 5 and 8 partitions per server. Based on the number of data partitions exceeding a value approaching the upper limit, partitions may be combined.); and
reducing a number of partitions in the set of partitions by removing the particular partition [first partition] from the worker node [current server] (see e.g., [0083] for when partitions are merged, as an initial step the partitions for merger being unassigned from the current server, for example, a first partition on server S2 having a low key value of K and a high key value of M, in this example, the epoch number for the first partition being 7, a second partition on server S4 having a low key value of M and a high key value of N, the epoch value for the second partition being 9 in this example, as an initial step, the partitions being unassigned from their respective servers, so that the partition table shows a non-assigned value for the server, the two partition entries being then replaced with a single entry having a low key of K and a high key of N, the epoch number assigned to this partition being one greater than the highest value of the merged partitions, which corresponds to 10 in this example, and the new partition then being assigned to a server. The number of partitions is reduced by removing the first partition from the current server.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to based on a number of data partitions exceeding a threshold value, combine records across partitions within the set of partitions; and reduce a number of partitions in the set of partitions by removing the particular partition from the worker node, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
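For illustration only, the Gould processing relied upon above ([0125], [0126], and [0128]) may be pictured with the following minimal sketch. It is not taken from Gould; the function names `round_robin`, `local_rollup`, and `global_rollup`, the sample records, and the use of Python's built-in `hash` for re-partitioning are assumptions made solely to show how records sharing a field value end up combined into a single counted census element.

```python
from collections import Counter

def round_robin(records, n_partitions):
    """Distribute records across partitions in turn (cf. Gould [0125])."""
    partitions = [[] for _ in range(n_partitions)]
    for i, record in enumerate(records):
        partitions[i % n_partitions].append(record)
    return partitions

def local_rollup(partition):
    """Within one partition, emit one (field, value) census element per field
    and combine repeats into a single element with a count (cf. [0126])."""
    census = Counter()
    for record in partition:
        for field, value in record.items():
            census[(field, value)] += 1
    return census

def global_rollup(local_censuses, n_partitions):
    """Re-partition census elements by field/value and add the counts that
    were calculated in different partitions (cf. [0128])."""
    repartitioned = [Counter() for _ in range(n_partitions)]
    for census in local_censuses:
        for key, count in census.items():
            # Hypothetical routing rule: same field/value always lands together.
            repartitioned[hash(key) % n_partitions][key] += count
    return repartitioned

# Hypothetical sample data, not from either reference.
records = [
    {"state": "CA", "zip": "94105"},
    {"state": "CA", "zip": "90210"},
    {"state": "NY", "zip": "10001"},
]
local = [local_rollup(p) for p in round_robin(records, 2)]
final = global_rollup(local, 2)
totals = sum(final, Counter())
```

In this sketch the two "CA" records start in different partitions, and after the global rollup a single census element for ("state", "CA") carries the total count.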

As to claim 2, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose wherein the set of data partitions is a first group of partitions, and wherein the worker node maintains a plurality of groups of partitions, each group of partitions associated with a subset of potential values of the field. However, Skjolsvold teaches
wherein the set of data partitions is a first group of partitions, and wherein the worker node [partition master] maintains a plurality of groups of partitions [partitions assigned to partition servers], each group of partitions associated with a subset of potential values [keys] of the field [identifier] (see e.g., [0017] for a key being a value from a namespace or domain, an example of a namespace being an identifier corresponding to all storage accounts in a cloud computing environment, and in such an example, a key corresponding to an account name, account number, or another identifier that allows a specific account to be referenced, [0018] for a "partition" being a range defined by a low (inclusive) and high (exclusive) key, [0019] for a "partition server" being a virtual machine within a cloud computing environment that corresponds to a role instance for serving zero or more partitions, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0029] for based on a namespace, a computation being partitioned so that multiple partition servers handle or perform different portions of processing for the namespace, each partition corresponding to a range of key values, and when a partition is assigned to a partition server, the server performing the desired computation for any requests that contain a key value within the range corresponding to an assigned partition. The partition master maintains a plurality of groups of partitions assigned to partition servers. Each group of partitions assigned to a partition server is associated with a subset of keys of an identifier.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the set of data partitions is a first group of partitions, and wherein the worker node maintains a plurality of groups of partitions, each group of partitions associated with a subset of potential values of the field, as taught by Skjolsvold, for the benefit of providing a framework for handling features such as scalability, fault tolerance, and/or availability while reducing or minimizing the amount of effort required to address these features (see e.g., Skjolsvold, [0016]).
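For illustration only, the Skjolsvold arrangement cited above ([0018] and [0029]), in which a partition is a key range with an inclusive low key and an exclusive high key and the assigned partition server handles requests whose key falls in that range, may be sketched as follows. The class `Partition`, the function `route`, and the sample keys and server names are hypothetical and are not taken from Skjolsvold.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Partition:
    low: str               # inclusive low key
    high: str              # exclusive high key
    server: Optional[str]  # assigned partition server, or None if unassigned

    def contains(self, key: str) -> bool:
        return self.low <= key < self.high

def route(table: List[Partition], key: str) -> Optional[str]:
    """Return the server whose assigned partition covers the key, if any."""
    for partition in table:
        if partition.contains(key):
            return partition.server
    return None

# Hypothetical partition table: each server is associated with a subset of keys.
table = [Partition("K", "M", "S2"), Partition("M", "N", "S4")]
```

Under these assumptions, `route(table, "Kate")` yields "S2", while `route(table, "M")` yields "S4" because the high key of the first range is exclusive.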

As to claim 3, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose wherein the set of data partitions is a first group of partitions, wherein the worker node maintains a plurality of groups of partitions, and wherein a number of the groups is equal to a number of processor cores of the worker node. However, Skjolsvold teaches
wherein the set of data partitions is a first group of partitions, wherein the worker node [partition master] maintains a plurality of groups of partitions [partitions assigned to partition servers], and wherein a number of the groups is equal to a number of processor cores [partition servers] of the worker node (see e.g., [0017] for a key being a value from a namespace or domain, an example of a namespace being an identifier corresponding to all storage accounts in a cloud computing environment, and in such an example, a key corresponding to an account name, account number, or another identifier that allows a specific account to be referenced, [0018] for a "partition" being a range defined by a low (inclusive) and high (exclusive) key, [0019] for a "partition server" being a virtual machine within a cloud computing environment that corresponds to a role instance for serving zero or more partitions, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0029] for based on a namespace, a computation being partitioned so that multiple partition servers handle or perform different portions of processing for the namespace, each partition corresponding to a range of key values, and when a partition is assigned to a partition server, the server performing the desired computation for any requests that contain a key value within the range corresponding to an assigned partition. The partition master maintains a plurality of groups of partitions assigned to partition servers. The number of groups of partitions assigned to partition servers is equal to the number of partition servers.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the set of data partitions is a first group of partitions, wherein the worker node maintains a plurality of groups of partitions, and wherein a number of the groups is equal to a number of processor cores of the worker node, as taught by Skjolsvold, for the benefit of providing a framework for handling features such as scalability, fault tolerance, and/or availability while reducing or minimizing the amount of effort required to address these features (see e.g., Skjolsvold, [0016]).

As to claim 4, the limitations of parent claim 1 have been discussed above. Gould teaches wherein the set of data partitions is a first group of partitions, and wherein the method further comprises:
assigning one or more additional records of the plurality of records to individual data partitions of a second group of data partitions at the worker node (see e.g., [0094] for a flow control mechanism being implemented using input queues for the links entering a component, [0095] for when two components are connected by a flow, the upstream component sending work elements to the downstream component as long as the downstream component keeps consuming the work elements and if the downstream component falls behind, the upstream component filling up the input queue of the downstream component and stopping work until the input queue clears out again, and [0125] for the partition by round-robin component 612 taking records from the single or multiple partitions of the input data set 402 and re-partitioning the records among a number of parallel processors and/or computers (e.g., as selected by the user) in order to balance the work load among the processors and/or computers. The profiling and processing subsystem assigns additional input records to a second group of partitions.);
combining records [census elements] across partitions within the second group of partitions, wherein combining records across partitions within the second group of partitions combines records sharing a second field value in a particular partition of the second group (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences and [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value. Census elements across the second group of partitions that share a field value are combined in a particular partition.); and
reducing the second group of data partitions by aggregating records of the particular partition of the second group with records of an additional partition of the second group (see e.g., [0128] for the global rollup field/value component 626 processing these census elements in potentially multiple partitions for a potentially parallel file represented by the census file component 410 and [0129] for a partition by field component 632 reading a flow of census elements from the census file component 410 and re-partitioning the census elements according to a hash value based on the field such that census records with the same field (but different values) are in the same partition. Census elements of the particular partition are aggregated with census elements of an additional partition having the same field but different value. This reduces the number of partitions since there are fewer fields than values per field.).
Gould does not specifically disclose based on the number of data partitions satisfying a threshold value, combining records across partitions within the second group of partitions; reducing the second group of data partitions by removing the particular partition of the second group from the worker node; and wherein operations related to the second group of data partitions occur concurrently with operations related to the first group of data partitions. However, Skjolsvold teaches
based on the number of data partitions satisfying a threshold value [value approaching the upper limit], combining records across partitions within the second group of partitions (see e.g., [0063] for merging of partitions allowing partitions that have lower amounts of activity to be combined, this reducing the overhead required to track and maintain the various partitions for a data set, optionally, a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured and [0080] for it being often desirable to avoid having too many partitions, as the maximum number of partitions is approached for a server, the likelihood of merging partitions increasing, and as an example, it being desirable to maintain between 5 and 8 partitions per server. Based on the number of data partitions exceeding a value approaching the upper limit, partitions may be combined.);
reducing the second group of data partitions by removing the particular partition [first partition] of the second group from the worker node [current server] (see e.g., [0083] for when partitions are merged, as an initial step the partitions for merger being unassigned from the current server, for example, a first partition on server S2 having a low key value of K and a high key value of M, in this example, the epoch number for the first partition being 7, a second partition on server S4 having a low key value of M and a high key value of N, the epoch value for the second partition being 9 in this example, as an initial step, the partitions being unassigned from their respective servers, so that the partition table shows a non-assigned value for the server, the two partition entries being then replaced with a single entry having a low key of K and a high key of N, the epoch number assigned to this partition being one greater than the highest value of the merged partitions, which corresponds to 10 in this example, and the new partition then being assigned to a server. The number of partitions is reduced by removing the first partition from the current server.); and
wherein operations related to the second group of data partitions [partitions assigned to the second partition server] occur concurrently with operations related to the first group of data partitions [partitions assigned to the first partition server] (see e.g., [0017] for a key being a value from a namespace or domain, an example of a namespace being an identifier corresponding to all storage accounts in a cloud computing environment, and in such an example, a key corresponding to an account name, account number, or another identifier that allows a specific account to be referenced, [0018] for a "partition" being a range defined by a low (inclusive) and high (exclusive) key, [0019] for a "partition server" being a virtual machine within a cloud computing environment that corresponds to a role instance for serving zero or more partitions, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, [0029] for based on a namespace, a computation being partitioned so that multiple partition servers handle or perform different portions of processing for the namespace, each partition corresponding to a range of key values, and when a partition is assigned to a partition server, the server performing the desired computation for any requests that contain a key value within the range corresponding to an assigned partition, and [0061] for a heartbeat or another type of message being used to query each server regarding the server's current partition assignments, this query including a query for the name of the storage object for a server, and the tasks proceeding in parallel. Operations related to the partitions assigned to the first partition server occur concurrently with operations related to the partitions assigned to the second partition server.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to based on the number of data partitions satisfying a threshold value, combine records across partitions within the second group of partitions; reduce the second group of data partitions by removing the particular partition of the second group from the worker node; and wherein operations related to the second group of data partitions occur concurrently with operations related to the first group of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
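For illustration only, the merge example of Skjolsvold [0083] relied upon above (a first partition K to M with epoch 7 on server S2 and a second partition M to N with epoch 9 on server S4 being unassigned, replaced with a single entry K to N having epoch 10, and then assigned to a server) may be sketched as follows. The names `PartitionEntry`, `merge_adjacent`, and `merge_if_over_limit` are hypothetical, and the simple count-over-limit trigger is an assumed simplification of the thresholds discussed in [0063] and [0080], not Skjolsvold's actual merge criteria.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PartitionEntry:
    low: str               # inclusive low key
    high: str              # exclusive high key
    epoch: int
    server: Optional[str]  # None models the "non-assigned" table value

def merge_adjacent(table: List[PartitionEntry], index: int,
                   new_server: str) -> PartitionEntry:
    """Merge table[index] and table[index + 1] into one entry (cf. [0083])."""
    first, second = table[index], table[index + 1]
    if first.high != second.low:
        raise ValueError("partitions are not adjacent")
    # Initial step: unassign both partitions from their current servers.
    first.server = None
    second.server = None
    # Replace the two entries with one; epoch is one greater than the highest.
    merged = PartitionEntry(first.low, second.high,
                            max(first.epoch, second.epoch) + 1, None)
    table[index:index + 2] = [merged]
    # The new partition is then assigned to a server.
    merged.server = new_server
    return merged

def merge_if_over_limit(table: List[PartitionEntry], upper_limit: int,
                        new_server: str) -> Optional[PartitionEntry]:
    """Assumed trigger: merge the first pair when the count exceeds the limit."""
    if len(table) > upper_limit and len(table) >= 2:
        return merge_adjacent(table, 0, new_server)
    return None

table = [PartitionEntry("K", "M", 7, "S2"), PartitionEntry("M", "N", 9, "S4")]
merged = merge_if_over_limit(table, upper_limit=1, new_server="S2")
```

With the values from the cited example, the table shrinks from two entries to one entry spanning K to N with epoch 10.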

As to claim 7, the limitations of parent claim 1 have been discussed above. Gould teaches
obtaining one or more additional chunks of data, the additional chunks comprising a second plurality of records associated with the query (see e.g., [0094] for a flow control mechanism being implemented using input queues for the links entering a component, [0095] for when two components are connected by a flow, the upstream component sending work elements to the downstream component as long as the downstream component keeps consuming the work elements and if the downstream component falls behind, the upstream component filling up the input queue of the downstream component and stopping work until the input queue clears out again, and [0203] for profiling module 100 generating a "virtual table" that includes fields from the multiple sources and the virtual table being generated, for example, by performing a join operation on the sources using a key field that is common to the sources. Records that are part of the join operation continue being input into queues of the profiling and processing subsystem 20 after the method is implemented for the initial portion of the data set.);
assigning records of the second plurality of records to individual data partitions of the set of data partitions at the worker node (see e.g., [0125] for the partition by round-robin component 612 taking records from the single or multiple partitions of the input data set 402 and re-partitioning the records among a number of parallel processors and/or computers (e.g., as selected by the user) in order to balance the work load among the processors and/or computers. The profiling and processing subsystem assigns the records to partitions.);
combining records [census elements] across partitions within the set of partitions, wherein combining records across partitions within the set of partitions combines records sharing a second field value in a second particular partition (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences and [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value. Census elements across the partitions that share a field value are combined in a particular partition.);
combining the records sharing the second field value in the second particular partition into an individual record [census element] having the second field value (see e.g., [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value so that the rollup process performed in the global rollup field/value component 626 can add the occurrences calculated in different partitions to produce a total occurrences count in a single census element for each unique field/value pair contained within the profiled records. Census elements sharing the field value in the particular partition are combined into a single census element having the field value.); and
reducing the set of data partitions by aggregating records of second particular partition with records of another partition (see e.g., [0128] for the global rollup field/value component 626 processing these census elements in potentially multiple partitions for a potentially parallel file represented by the census file component 410 and [0129] for a partition by field component 632 reading a flow of census elements from the census file component 410 and re-partitioning the census elements according to a hash value based on the field such that census records with the same field (but different values) are in the same partition. Census elements of the particular partition are aggregated with census elements of an additional partition having the same field but different value. This reduces the number of partitions since there are fewer fields than values per field.).
Gould does not specifically disclose based on the number of data partitions satisfying the threshold value, combining records across partitions within the set of partitions; and reducing the set of data partitions by removing the second particular partition from the worker node. However, Skjolsvold teaches
based on the number of data partitions satisfying the threshold value, combining records across partitions within the set of partitions (see e.g., [0063] for merging of partitions allowing partitions that have lower amounts of activity to be combined, this reducing the overhead required to track and maintain the various partitions for a data set, optionally, a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured and [0080] for it being often desirable to avoid having too many partitions, as the maximum number of partitions is approached for a server, the likelihood of merging partitions increasing, and as an example, it being desirable to maintain between 5 and 8 partitions per server. Based on the number of data partitions exceeding a value approaching the upper limit, partitions may be combined.); and
reducing the set of data partitions by removing the second particular partition from the worker node (see e.g., [0083] for when partitions are merged, as an initial step the partitions for merger being unassigned from the current server, for example, a first partition on server S2 having a low key value of K and a high key value of M, in this example, the epoch number for the first partition being 7, a second partition on server S4 having a low key value of M and a high key value of N, the epoch value for the second partition being 9 in this example, as an initial step, the partitions being unassigned from their respective servers, so that the partition table shows a non-assigned value for the server, the two partition entries being then replaced with a single entry having a low key of K and a high key of N, the epoch number assigned to this partition being one greater than the highest value of the merged partitions, which corresponds to 10 in this example, and the new partition then being assigned to a server. The number of partitions is reduced by removing the first partition from the current server.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to, based on the number of data partitions satisfying the threshold value, combine records across partitions within the set of partitions, and to reduce the set of data partitions by removing the second particular partition from the worker node, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
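For illustration only, the merge example cited from Skjolsvold [0083] can be sketched as follows. This is a minimal sketch and not an implementation from either reference: the `PartitionEntry` class, the function names, the target server "S1", and the rule that the threshold is satisfied when the partition count reaches it are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PartitionEntry:
    # Hypothetical partition-table row: key range, epoch number, assigned server.
    low_key: str
    high_key: str
    epoch: int
    server: Optional[str]  # None models the "non-assigned" value


def merge_adjacent(table: List[PartitionEntry], i: int, new_server: str) -> List[PartitionEntry]:
    """Replace entries i and i+1 with a single merged entry assigned to new_server."""
    first, second = table[i], table[i + 1]
    if first.high_key != second.low_key:
        raise ValueError("partitions must cover adjacent key ranges")
    # The merged entry starts unassigned and takes an epoch one greater than
    # the highest epoch of the merged partitions.
    merged = PartitionEntry(
        low_key=first.low_key,
        high_key=second.high_key,
        epoch=max(first.epoch, second.epoch) + 1,
        server=None,
    )
    merged.server = new_server  # the new partition is then assigned to a server
    return table[:i] + [merged] + table[i + 2:]


def merge_if_over_threshold(table: List[PartitionEntry], threshold: int, new_server: str) -> List[PartitionEntry]:
    # Assumption: the threshold is "satisfied" when the partition count reaches it.
    if len(table) >= max(threshold, 2):
        return merge_adjacent(table, 0, new_server)
    return list(table)


# Values from the cited example: K..M at epoch 7 on S2, M..N at epoch 9 on S4.
table = [PartitionEntry("K", "M", 7, "S2"), PartitionEntry("M", "N", 9, "S4")]
result = merge_if_over_threshold(table, threshold=2, new_server="S1")
print(result)
```

With the values from the cited example, the two entries are replaced by one entry spanning K to N with epoch number 10.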

As to claim 8, the limitations of parent claim 1 have been discussed above. Gould teaches
wherein each record of the plurality of record reflects one or more events [occurrence of a field attaining a particular value] detected within raw machine data (see e.g., [0113] for reading individual work elements (e.g., individual records) from raw data in a data system, the runtime environment providing access to a physical computer-readable storage medium (e.g., a magnetic, optical, or magneto-optical medium) as a string of raw data bits (e.g., mounted in a file system or flowing over a network connection), and the import component accessing a DML file to determine how to read and interpret the raw data in order to generate a flow of work elements and [0116] for a component interpreting a block of raw data to extract values for all of the fields of a record. Each record reflects occurrences of fields attaining particular values detected within raw machine data.).

As to claim 9, the limitations of parent claim 1 have been discussed above. Gould teaches
wherein each record of the plurality of record reflects one or more events [occurrence of a field attaining a particular value] detected within raw machine data, and wherein the chunk is obtained from an indexer device [import component] configured to generate the record from the one or more events (see e.g., [0106] for an import component implementing the portion of the profiling module 100 that can interpret the data format of a wide variety of data systems, [0113] for reading individual work elements (e.g., individual records) from raw data in a data system, the runtime environment providing access to a physical computer-readable storage medium (e.g., a magnetic, optical, or magneto-optical medium) as a string of raw data bits (e.g., mounted in a file system or flowing over a network connection), and the import component accessing a DML file to determine how to read and interpret the raw data in order to generate a flow of work elements and [0116] for a component interpreting a block of raw data to extract values for all of the fields of a record. Each record reflects occurrences of fields attaining particular values detected within raw machine data. The initial portion of the data set is obtained from the import component, which is configured to generate the record from occurrences of fields attaining particular values.).

As to claim 10, the limitations of parent claim 1 have been discussed above. Gould teaches 
wherein the particular partition includes records obtained from multiple different chunks [data sources] (see e.g., [0072] for a data processing system 10 including a profiling and processing subsystem 20, which is used to process data from data sources 30, [0076] for the profiling and processing subsystem 20 including a profiling module 100, which reads data directly from a data source without necessarily landing a complete copy of the data to a storage medium before profiling in units of discrete work elements such as individual records, [0073] for data sources 30 in general including a variety of individual data sources, each of which may have unique storage formats and interfaces (for example, database tables, spreadsheet files, flat text files, or a native format used by a mainframe 110), and [0079] for the profiling module 100 reading records from a data source. The particular partition includes census elements obtained from multiple different data sources.).

As to claim 11, the limitations of parent claim 1 have been discussed above. Gould teaches 
prior to combining records across partitions within the set of partitions, combining records in each partition that have shared field values (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences. Records in each partition that have shared field values are combined into one census element.). 
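For illustration only, the sequence cited from Gould [0125], [0126], and [0128] (round-robin partitioning, a per-partition rollup of field/value pairs, then a rollup across partitions) can be sketched as follows. This is a simplified sketch under stated assumptions: the function names and sample records are hypothetical, and the re-partitioning by field/value that precedes the global rollup in Gould is collapsed into a single merge step.

```python
from collections import Counter
from typing import Dict, List

Record = Dict[str, str]


def round_robin(records: List[Record], n: int) -> List[List[Record]]:
    """Distribute records across n partitions in turn to balance the load."""
    partitions: List[List[Record]] = [[] for _ in range(n)]
    for i, record in enumerate(records):
        partitions[i % n].append(record)
    return partitions


def local_rollup(partition: List[Record]) -> Counter:
    """Within one partition, count occurrences of each (field, value) pair."""
    counts: Counter = Counter()
    for record in partition:
        for field, value in record.items():
            counts[(field, value)] += 1
    return counts


def global_rollup(local_counts: List[Counter]) -> Counter:
    """Add the per-partition counts into one total per unique (field, value) pair."""
    total: Counter = Counter()
    for counts in local_counts:
        total.update(counts)
    return total


# Hypothetical sample records, not taken from either reference.
records = [
    {"city": "A", "zip": "1"},
    {"city": "A", "zip": "2"},
    {"city": "B", "zip": "1"},
]
partitions = round_robin(records, 2)
census = global_rollup([local_rollup(p) for p in partitions])
print(census)
```

In this sketch each entry of `census` plays the role of a single census element carrying a total occurrence count for one field/value pair.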

As to claim 12, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose wherein the number of data partitions is a number of data partitions at the worker node. However, Skjolsvold teaches
wherein the number of data partitions is a number of data partitions at the worker node (see e.g., [0018] for the union of all partitions spanning the entire domain or namespace, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0063] for a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured. The number of partitions refers to the number of partitions managed by the partition master.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the number of data partitions is a number of data partitions at the worker node, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 13, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose wherein the worker node is one of a plurality of worker nodes within the distributed query execution environment, and wherein the number of data partitions is a number of data partitions across the plurality of worker nodes. However, Skjolsvold teaches
wherein the worker node [dictator] is one of a plurality of worker nodes within the distributed query execution environment (see e.g., [0024] for partition masters for a given type of role being preferably redundant, so that at least one additional partition master is available if a failure occurs and a "dictator" being defined as the partition master that currently performs the partition master functions for a given type of role. The dictator is one of a plurality of partition masters.), and
wherein the number of data partitions is a number of data partitions across the plurality of worker nodes (see e.g., [0018] for the union of all partitions spanning the entire domain or namespace, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0063] for a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured. The number of partitions refers to the number of partitions across the dictator and additional partition masters provided for redundancy.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the worker node is one of a plurality of worker nodes within the distributed query execution environment, and wherein the number of data partitions is a number of data partitions across the plurality of worker nodes, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 14, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises obtaining the number of data partitions from the search master. However, Skjolsvold teaches
wherein the distributed query execution environment includes a search master [partition table] configured to track the number of data partitions, and wherein the method further comprises obtaining the number of data partitions from the search master (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. 
The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. Prior to merging, the highest epoch number of the partitions indicates the number of partitions. The partition master obtains the number of partitions from the partition table in order to determine if the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises obtaining the number of data partitions from the search master, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 15, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master. However, Skjolsvold teaches
wherein the distributed query execution environment includes a search master [partition table] configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. 
The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. Prior to merging, the highest epoch number of the partitions indicates the number of partitions. The partition master reports each partition and partition epoch number to the partition table.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 16, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master and obtaining the number of data partitions from the search master in response to the reporting. However, Skjolsvold teaches
wherein the distributed query execution environment includes a search master [partition table] configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master and obtaining the number of data partitions from the search master in response to the reporting (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit.
The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. Prior to merging, the highest epoch number of the partitions indicates the number of partitions. The partition master reports each partition and partition epoch number to the partition table. The partition master obtains, in response to reporting additional partitions, the number of partitions from the partition table in order to determine if the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the distributed query execution environment includes a search master configured to track the number of data partitions, and wherein the method further comprises reporting the number of data partitions to the search master and obtaining the number of data partitions from the search master in response to the reporting, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 21, the limitations of parent claim 1 have been discussed above. Gould teaches
wherein the query is associated with multiple chunks, and wherein the method is implemented prior to one or more additional chunks being obtained at the worker node (see e.g., [0094] for a flow control mechanism being implemented using input queues for the links entering a component, [0095] for when two components are connected by a flow, the upstream component sending work elements to the downstream component as long as the downstream component keeps consuming the work elements and if the downstream component falls behind, the upstream component filling up the input queue of the downstream component and stop working until the input queue clears out again, and [0203] for profiling module 100 generating a "virtual table" that includes fields from the multiple sources and the virtual table being generated, for example, by performing a join operation on the sources using a key field that is common to the sources. Records that are part of the join operation continue being input into queues of the profiling and processing subsystem 20 after the method is implemented for the initial portion of the data set.). 

As to claim 22, the limitations of parent claim 1 have been discussed above. Gould teaches
wherein the field value is derived from a combination of fields of the plurality of records (see e.g., [0180] for a functional dependency relationship existing among a subset of fields where the value associated with one field of a record can be uniquely determined by the values associated with other fields of the record and for example, the value of the Zip Code field being uniquely determined by the values of a City field and a Street field. The Zip Code field is derived from a combination of City and Street fields.).
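For illustration only, the functional dependency described in Gould [0180] can be checked as sketched below. This is a minimal sketch: the function name, field names, and sample records are hypothetical and are not taken from the reference.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, str]


def is_functionally_dependent(records: List[Record],
                              determinants: Sequence[str],
                              dependent: str) -> bool:
    """Return True if the dependent field is uniquely determined by the determinant fields."""
    seen: Dict[Tuple[str, ...], str] = {}
    for record in records:
        key = tuple(record[field] for field in determinants)
        value = record[dependent]
        # A second, different dependent value for the same key breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample records.
records = [
    {"city": "Springfield", "street": "Main St", "zip": "11111"},
    {"city": "Springfield", "street": "Main St", "zip": "11111"},
    {"city": "Springfield", "street": "Oak Ave", "zip": "22222"},
]
zip_depends_on_city_and_street = is_functionally_dependent(records, ["city", "street"], "zip")
zip_depends_on_city_alone = is_functionally_dependent(records, ["city"], "zip")
```

In this sample the Zip Code value is determined by the City and Street values together but not by the City value alone.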

As to claim 26, Gould teaches a system implementing a worker node [profiling and processing subsystem] of a distributed query execution environment, the system comprising:
a data store [data storage system] including computer-executable instructions [procedures] (see e.g., [0207] for the software forming procedures in one or more computer programs that execute on one or more programmed or programmable computer systems (which may be of various architectures, such as distributed, client/server, or grid) each including at least one processor, at least one data storage system (for example, volatile and non-volatile memory and/or storage elements), at least one input device or port, and at least one output device or port); and 
a processor in communication with the data store and configured to execute the computer-executable instructions (see e.g., [0207] for the software forming procedures in one or more computer programs that execute on one or more programmed or programmable computer systems (which may be of various architectures, such as distributed, client/server, or grid) each including at least one processor, at least one data storage system (for example, volatile and non-volatile memory and/or storage elements), at least one input device or port, and at least one output device or port) to:
obtain a chunk of data [initial portion of a data set], wherein the chunk of data comprises a plurality of records associated with a query (see e.g., [0072] for a data processing system 10 including a profiling and processing subsystem 20, which is used to process data from data sources 30, [0076] for the profiling and processing subsystem 20 including a profiling module 100, which reads data directly from a data source without necessarily landing a complete copy of the data to a storage medium before profiling in units of discrete work elements such as individual records, [0079] for the profiling module 100 reading records from a data source, [0087] for the profiling module 100 obtaining an initial portion of the data set, [0093] for the runtime environment providing for the profiling module 100 to execute as a parallel process and parallel processing systems including any configuration of computer systems using multiple central processing units (CPUs), either locally distributed or remotely distributed, and [0203] for profiling module 100 generating a "virtual table" that includes fields from the multiple sources and the virtual table being generated, for example, by performing a join operation on the sources using a key field that is common to the sources. The profiling and processing subsystem obtains an initial portion of a data set comprising records. A distributed profiling module of the profiling and processing subsystem performs a join query on the records.);
assign records of the plurality of records to individual data partitions of a set of data partitions at the worker node (see e.g., [0125] for the partition by round-robin component 612 taking records from the single or multiple partitions of the input data set 402 and re-partitioning the records among a number of parallel processors and/or computers (e.g., as selected by the user) in order to balance the work load among the processors and/or computers. The profiling and processing subsystem assigns the records to partitions.);
combine records [census elements] across partitions within the set of partitions, wherein combining records across partitions within the set of partitions combines records sharing a field value in a particular partition (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences and [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value. Census elements across the partitions that share a field value are combined in a particular partition.);
combine the records sharing the field value in the particular partition into a single record [census element] having the field value (see e.g., [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value so that the rollup process performed in the global rollup field/value component 626 can add the occurrences calculated in different partitions to produce a total occurrences count in a single census element for each unique field/value pair contained within the profiled records. Census elements sharing the field value in the particular partition are combined into a single census element having the field value.); and
reduce the set of data partitions by aggregating records of the particular partition with records of an additional partition (see e.g., [0128] for the global rollup field/value component 626 processing these census elements in potentially multiple partitions for a potentially parallel file represented by the census file component 410 and [0129] for a partition by field component 632 reading a flow of census elements from the census file component 410 and re-partitioning the census elements according to a hash value based on the field such that census records with the same field (but different values) are in the same partition. Census elements of the particular partition are aggregated with census elements of an additional partition having the same field but different value. This reduces the number of partitions since there are fewer fields than values per field.).
Gould does not specifically disclose based on a number of data partitions satisfying a threshold value, combine records across partitions within the set of partitions; and reduce the set of data partitions by removing the particular partition from the worker node. However, Skjolsvold teaches
based on a number of data partitions satisfying a threshold value [value approaching the upper limit], combine records across partitions within the set of partitions (see e.g., [0063] for merging of partitions allowing partitions that have lower amounts of activity to be combined, this reducing the overhead required to track and maintain the various partitions for a data set, optionally, a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured and [0080] for it being often desirable to avoid having too many partitions, as the maximum number of partitions is approached for a server, the likelihood of merging partitions increasing, and as an example, it being desirable to maintain between 5 and 8 partitions per server. Based on the number of data partitions exceeding a value approaching the upper limit, partitions may be combined.); and 
reduce the set of data partitions by removing the particular partition [first partition] from the worker node [current server] (see e.g., [0083] for when partitions are merged, as an initial step the partitions for merger being unassigned from the current server, for example, a first partition on server S2 having a low key value of K and a high key value of M, in this example, the epoch number for the first partition being 7, a second partition on server S4 having a low key value of M and a high key value of N, the epoch value for the second partition being 9 in this example, as an initial step, the partitions being unassigned from their respective servers, so that the partition table shows a non-assigned value for the server, the two partition entries being then replaced with a single entry having a low key of K and a high key of N, the epoch number assigned to this partition being one greater than the highest value of the merged partitions, which corresponds to 10 in this example, and the new partition then being assigned to a server. The number of partitions is reduced by removing the first partition from the current server.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to, based on a number of data partitions satisfying a threshold value, combine records across partitions within the set of partitions; and reduce the set of data partitions by removing the particular partition from the worker node, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
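By way of illustration only, and not as part of the cited disclosure, the merge example of Skjolsvold [0083] relied on above may be sketched as follows. The names PartitionEntry, merge_adjacent, and maybe_merge, the ">=" form of the threshold comparison, and the choice of server for the merged partition are assumptions made for the sketch; Skjolsvold states only that the new partition is then assigned to a server.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PartitionEntry:
    """One row of a hypothetical partition table (key range, epoch, server)."""
    low_key: str
    high_key: str
    epoch: int
    server: Optional[str]  # None models the non-assigned value


def merge_adjacent(table: List[PartitionEntry], i: int, new_server: str) -> None:
    """Merge entries i and i+1 of a key-ordered partition table in place."""
    first, second = table[i], table[i + 1]
    if first.high_key != second.low_key:
        raise ValueError("partitions are not adjacent in the key space")
    # Initial step: unassign both partitions from their current servers.
    first.server = None
    second.server = None
    # Replace the two entries with a single entry whose epoch is one greater
    # than the highest epoch of the merged partitions.
    merged = PartitionEntry(
        low_key=first.low_key,
        high_key=second.high_key,
        epoch=max(first.epoch, second.epoch) + 1,
        server=None,
    )
    table[i:i + 2] = [merged]
    # Final step: assign the new partition to a server (assumed choice).
    merged.server = new_server


def maybe_merge(table: List[PartitionEntry], threshold: int, new_server: str) -> bool:
    """Merge the first adjacent pair only when the partition count meets the threshold."""
    if len(table) >= 2 and len(table) >= threshold:
        merge_adjacent(table, 0, new_server)
        return True
    return False
```

With the values of the [0083] example (a first partition K to M at epoch 7 on S2 and a second partition M to N at epoch 9 on S4), a call such as maybe_merge(table, threshold=2, new_server="S2") leaves a single entry K to N at epoch 10.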

As to claim 28, the limitations of parent claim 26 have been discussed above. Gould does not specifically disclose wherein the worker node is one of a plurality of worker nodes within the distributed query execution environment, and wherein the number of data partitions is a number of data partitions across the plurality of worker nodes. However, Skjolsvold teaches
wherein the worker node [dictator] is one of a plurality of worker nodes within the distributed query execution environment (see e.g., [0024] for partition masters for a given type of role being preferably redundant, so that at least one additional partition master is available if a failure occurs and a "dictator" being defined as the partition master that currently performs the partition master functions for a given type of role. The dictator is one of a plurality of partition masters.), and
wherein the number of data partitions is a number of data partitions across the plurality of worker nodes (see e.g., [0018] for the union of all partitions spanning the entire domain or namespace, [0020] for a "partition master" being a role that manages partition servers for a given type of role, such as by assigning and unassigning partitions to partition servers, and [0063] for a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured. The number of partitions refers to the number of partitions across the dictator and additional partition masters provided for redundancy.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein the worker node is one of a plurality of worker nodes within the distributed query execution environment, and wherein the number of data partitions is a number of data partitions across the plurality of worker nodes, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).

As to claim 29, Gould teaches non-transitory computer-readable media comprising computer-executable instructions (see e.g., [0208] for each such computer program being preferably stored on or downloaded to a storage media or device (e.g., solid state memory or media, or magnetic or optical media) readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage media or device is read by the computer system to perform the procedures described herein) that, when executed by a worker node [profiling and processing subsystem] of a distributed query execution environment, cause the worker node to:
obtain a chunk of data [initial portion of a data set], wherein the chunk of data comprises a plurality of records associated with a query (see e.g., [0072] for a data processing system 10 including a profiling and processing subsystem 20, which is used to process data from data sources 30, [0076] for the profiling and processing subsystem 20 including a profiling module 100, which reads data directly from a data source without necessarily landing a complete copy of the data to a storage medium before profiling in units of discrete work elements such as individual records, [0079] for the profiling module 100 reading records from a data source, [0087] for the profiling module 100 obtaining an initial portion of the data set, [0093] for the runtime environment providing for the profiling module 100 to execute as a parallel process and parallel processing systems including any configuration of computer systems using multiple central processing units (CPUs), either locally distributed or remotely distributed, and [0203] for profiling module 100 generating a "virtual table" that includes fields from the multiple sources and the virtual table being generated, for example, by performing a join operation on the sources using a key field that is common to the sources. The profiling and processing subsystem obtains an initial portion of a data set comprising records. A distributed profiling module of the profiling and processing subsystem performs a join query on the records.);
assign records of the plurality of records to individual data partitions of a set of data partitions at the worker node (see e.g., [0125] for the partition by round-robin component 612 taking records from the single or multiple partitions of the input data set 402 and re-partitioning the records among a number of parallel processors and/or computers (e.g., as selected by the user) in order to balance the work load among the processors and/or computers. The profiling and processing subsystem assigns the records to partitions.);
combine records [census elements] across partitions within the set of partitions, wherein combining records across partitions within the set of partitions combines records sharing a field value in a particular partition (see e.g., [0126] for the canonicalize component 616 taking in a flow of records and sending out a flow of census elements containing a field/value pair representing values for each field in an input record, an input record with ten fields yielding a flow of ten census elements and the census elements flowing into a local rollup field/value component which (for each partition) takes occurrences of the same value for the same field and combines them into one census element including a count of the number of occurrences and [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value. Census elements across the partitions that share a field value are combined in a particular partition.);
combine the records sharing the field value in the particular partition into a single record [census element] having the field value (see e.g., [0128] for the partition by field/value component 624 re-partitioning the census elements by field and value so that the rollup process performed in the global rollup field/value component 626 can add the occurrences calculated in different partitions to produce a total occurrences count in a single census element for each unique field/value pair contained within the profiled records. Census elements sharing the field value in the particular partition are combined into a single census element having the field value.); and
reduce the set of data partitions by aggregating records of the particular partition with records of an additional partition (see e.g., [0128] for the global rollup field/value component 626 processing these census elements in potentially multiple partitions for a potentially parallel file represented by the census file component 410 and [0129] for a partition by field component 632 reading a flow of census elements from the census file component 410 and re-partitioning the census elements according to a hash value based on the field such that census records with the same field (but different values) are in the same partition. Census elements of the particular partition are aggregated with census elements of an additional partition having the same field but different value. This reduces the number of partitions since there are fewer fields than values per field.).
Gould does not specifically disclose based on a number of data partitions satisfying a threshold value, combine records across partitions within the set of partitions; and reduce the set of data partitions by removing the particular partition from the worker node. However, Skjolsvold teaches
based on a number of data partitions satisfying a threshold value [value approaching the upper limit], combine records across partitions within the set of partitions (see e.g., [0063] for merging of partitions allowing partitions that have lower amounts of activity to be combined, this reducing the overhead required to track and maintain the various partitions for a data set, optionally, a user defining an upper limit on the number of partitions for a namespace, the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit, and the upper limit for number of partitions being dynamically configured and [0080] for it being often desirable to avoid having too many partitions, as the maximum number of partitions is approached for a server, the likelihood of merging partitions increasing, and as an example, it being desirable to maintain between 5 and 8 partitions per server. Based on the number of data partitions exceeding a value approaching the upper limit, partitions may be combined.); and 
reduce the set of data partitions by removing the particular partition [first partition] from the worker node [current server] (see e.g., [0083] for when partitions are merged, as an initial step the partitions for merger being unassigned from the current server, for example, a first partition on server S2 having a low key value of K and a high key value of M, in this example, the epoch number for the first partition being 7, a second partition on server S4 having a low key value of M and a high key value of N, the epoch value for the second partition being 9 in this example, as an initial step, the partitions being unassigned from their respective servers, so that the partition table shows a non-assigned value for the server, the two partition entries being then replaced with a single entry having a low key of K and a high key of N, the epoch number assigned to this partition being one greater than the highest value of the merged partitions, which corresponds to 10 in this example, and the new partition then being assigned to a server. The number of partitions is reduced by removing the first partition from the current server.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to, based on a number of data partitions satisfying a threshold value, combine records across partitions within the set of partitions; and reduce the set of data partitions by removing the particular partition from the worker node, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
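By way of illustration only, the local and global rollup of census elements that Gould describes at [0126] and [0128], as relied on for claim 29, may be sketched as follows. The function names and the dictionary representation of a record are assumptions; the sketch also collapses the re-partitioning by field/value into a single process rather than distributing it across processors.

```python
from collections import Counter
from typing import Dict, List

Record = Dict[str, str]  # hypothetical record: field name -> value


def local_rollup(partition: List[Record]) -> Counter:
    """Canonicalize each record into (field, value) census elements and
    count occurrences of the same pair within one partition."""
    counts: Counter = Counter()
    for record in partition:
        for field, value in record.items():
            counts[(field, value)] += 1
    return counts


def global_rollup(partitions: List[List[Record]]) -> Counter:
    """Add the per-partition counts so that each unique (field, value) pair
    is represented by a single census element with a total count."""
    total: Counter = Counter()
    for partition in partitions:
        total.update(local_rollup(partition))  # Counter.update adds counts
    return total
```

An input record with ten fields contributes ten census elements to local_rollup, consistent with the description at [0126].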

Claims 5, 6, 20, and 23-25 are rejected under 35 U.S.C. 103 as being unpatentable over Gould et al. (US Publication No. 2005/0102325) in view of Skjolsvold et al. (US Publication No. 2013/0204991) as applied to claims 1-4, 7-16, 21, 22, 26, 28, and 29 above, and further in view of Kim (US Publication No. 2020/0057818).

As to claim 5, the limitations of parent claim 1 have been discussed above. Gould in view of Skjolsvold does not specifically disclose wherein each data partition of the set of data partitions contains records received at the worker node during a distinct time period. However, Kim teaches
wherein each data partition of the set of data partitions contains records received at the worker node during a distinct time period (see e.g., [0071] for referring to FIG. 3, tag names appearing repetitively in several partitions, with real-time input, sensor data occurring continuously with respect to time, however, as regards time values, data being typically inputted sequentially from the past to the present, therefore, if the minimum and maximum values of time for a partition are maintained in the memory, it being possible to forego reading several partitions based on the condition of input time, when merging indexes, the minimum and maximum values for the time values being obtained and recorded in the partition header, and such information being maintained. Partition 0 contains records received from 00:00 to 00:10. Partition 1 contains records received from 00:10 to 00:20. Partition 2 contains records received from 00:20 to 00:30. Partition 3 contains records received from 00:40 to 00:50.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein each data partition of the set of data partitions contains records received at the worker node during a distinct time period, as taught by Kim, for the benefit of reducing the computational cost of searching time series data (see e.g., Kim, [0078]-[0079]).

As to claim 6, the limitations of parent claim 1 have been discussed above. Gould in view of Skjolsvold does not specifically disclose wherein assigning records of the plurality of records to individual data partitions of the set of data partitions at the worker node comprises assigning records to an individual data partition of the set of data partitions until the individual data partition reaches a maximum number of records and then assigning records to a second individual data partition of the set of data partitions. However, Kim teaches
wherein assigning records of the plurality of records to individual data partitions of the set of data partitions at the worker node comprises assigning records to an individual data partition of the set of data partitions until the individual data partition reaches a maximum number of records and then assigning records to a second individual data partition of the set of data partitions (see e.g., [0040] for the records being simply stored until the number of inputted records reaches the partition's maximum count and FIG. 2 and [0048] for the data partition size at the initial level being 4. Records 0-3 are assigned to Partition 0 and Records 4-7 are assigned to Partition 1.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein assigning records of the plurality of records to individual data partitions of the set of data partitions at the worker node comprises assigning records to an individual data partition of the set of data partitions until the individual data partition reaches a maximum number of records and then assigning records to a second individual data partition of the set of data partitions, as taught by Kim, for the benefit of reducing the computational cost of searching time series data (see e.g., Kim, [0078]-[0079]).
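By way of illustration only, the fill-then-advance assignment attributed to Kim at [0040] and [0048] may be sketched as follows. The function name and the use of integers as stand-ins for records are assumptions made for the sketch.

```python
from typing import List


def assign_sequentially(records: List[int], max_count: int) -> List[List[int]]:
    """Assign records to the current partition until it holds max_count
    records, then continue with the next partition."""
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    partitions: List[List[int]] = [[]]
    for record in records:
        if len(partitions[-1]) == max_count:
            partitions.append([])  # current partition is full; open the next
        partitions[-1].append(record)
    return partitions
```

With a maximum count of 4, as in FIG. 2 of Kim, assign_sequentially(list(range(8)), 4) places Records 0-3 in the first partition and Records 4-7 in the second.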

As to claim 20, the limitations of parent claim 1 have been discussed above. Gould in view of Skjolsvold does not specifically disclose combining, within the additional partition, two or more records sharing the field value into an individual record having the field value; and reducing the set of data partitions by aggregating records of additional partition with records of another partition and removing the additional partition from the worker node. However, Kim teaches
combining, within the additional partition [Block 0-1], two or more records sharing the field [tag] value into an individual record having the field value (see e.g., [0049] for FIG. 2 showing how multiple blocks configured with multiple records having the fields <Tag, Time, Value> may be merged together, for example, Block 0 and Block 1 from before the merging being merged together to form Block 0-1, while Block 2 and Block 3 being merged to form Block 2-3, and the merged Block 0-1 and Block 2-3 being configured to have the fields <Tag, Count, Time, Value, Row ID>, here, the “count” field representing the number of records having the same tag, for example, for Block 0 and Block 1 from before the merging, the numbers of records of which the tag ID is 0 being two and one, respectively, and accordingly, from the value of the “count” field, it being seen that the merged Block 0-1 has three records of which the tag ID is 0. Records sharing the same tag value are combined within Block 0-1 into an individual record having the tag value.); and 
reducing the set of data partitions by aggregating records of additional partition with records of another partition [Block 2-3] and removing the additional partition from the worker node (see e.g., [0050] for Block 0-1 and Block 2-3, which have undergone a primary merging, being merged to generate Block 0-3 via a secondary merging, by repeating the merging steps in a leveled manner, including a primary merging process and a secondary merging process, there being the advantage that it is possible to generate one index file for a hundred million or more pieces of data, and also, when generating a second index file of a subsequent level from the partitioned first index files of a previous level is completed, then the completed status of the second index file being recorded in the head region and tail region of the second index file, and there being the advantage that the first index files can be deleted. The set of data partitions is reduced by aggregating records of Block 0-1 with records of Block 2-3 and removing Block 0-1.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold to combine, within the additional partition, two or more records sharing the field value into an individual record having the field value; and reduce the set of data partitions by aggregating records of additional partition with records of another partition and removing the additional partition from the worker node, as taught by Kim, for the benefit of reducing the computational cost of searching time series data (see e.g., Kim, [0078]-[0079]).
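By way of illustration only, the primary merging of Block 0 and Block 1 described in Kim at [0049] may be sketched as follows. The tuple layout of a record, the dictionary layout of the merged block, and the function name are assumptions; Kim's merged blocks also carry a Row ID field that is omitted here.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, int, float]  # hypothetical <Tag, Time, Value> record


def merge_blocks(block_a: List[Record], block_b: List[Record]) -> Dict[int, dict]:
    """Merge two blocks so that records sharing a tag are combined into one
    entry carrying a count of the records having that tag."""
    merged: Dict[int, dict] = {}
    for tag, time, value in block_a + block_b:
        entry = merged.setdefault(tag, {"count": 0, "entries": []})
        entry["count"] += 1
        entry["entries"].append((time, value))
    return merged
```

If Block 0 holds two records with tag ID 0 and Block 1 holds one, the merged entry for tag 0 has a count of three, matching the example at [0049].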

As to claim 23, the limitations of parent claim 1 have been discussed above. Gould in view of Skjolsvold does not specifically disclose wherein reducing the set of data partitions by aggregating records of the particular partition with records of an additional partition comprises selecting the particular partition for aggregation based a number of records within the particular partition. However, Kim teaches
wherein reducing the set of data partitions by aggregating records of the particular partition [Block 0] with records of an additional partition [Block 1] comprises selecting the particular partition for aggregation based a number of records within the particular partition (see e.g., [0049] for multiple blocks configured with multiple records having the fields <Tag, Time, Value> being merged together, for example, Block 0 and Block 1 from before the merging being merged together to form Block 0-1, while Block 2 and Block 3 being merged to form Block 2-3, and the merged Block 0-1 and Block 2-3 being configured to have the fields <Tag, Count, Time, Value, Row ID> and [0052] for the data stored in the memory 200 being stored as records configured as <time, tag name, value>, which may be stored for each partition until the number of stored records reaches the maximum count and here, for a partition for which the storage is completed, an index being generated with <tag name, time> as a key, and <value, RID> being recorded in the data region of the index. Block 0 is selected for indexing based on the number of records within Block 0 reaching the maximum count. Indexing Block 0 includes aggregating the records of Block 0 with the records of Block 1.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein reducing the set of data partitions by aggregating records of the particular partition with records of an additional partition comprises selecting the particular partition for aggregation based a number of records within the particular partition, as taught by Kim, for the benefit of reducing the computational cost of searching time series data (see e.g., Kim, [0078]-[0079]).

As to claim 24, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose wherein reducing the set of data partitions by aggregating records of the particular partition with records of an additional partition comprises selecting the particular partition for aggregation based the particular partition having a minimum amount of storage usage compared to other partitions of the set of data partitions. However, Skjolsvold teaches
wherein reducing the set of data partitions by aggregating records of the particular partition with records of an additional partition comprises selecting the particular partition for aggregation based the particular partition having a minimum amount of storage usage compared to other partitions of the set of data partitions (see e.g., [0063] for load balancing including merging multiple partitions into a single partition and merging of partitions allowing partitions that have lower amounts of activity to be combined, [0064] for as an example of determining when to split or move a partition, all partitions for a namespace being sorted based on load, the load referring to one or more metrics related to performing calculations for a partition, and the load referring to storage used for a partition, and [0069] for dimensions for triggering a partition split also being useful for identifying when to merge two partitions. A partition is selected for merging based on the partition having a minimum amount of storage usage compared to other partitions.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould wherein reducing the set of data partitions by aggregating records of the particular partition with records of an additional partition comprises selecting the particular partition for aggregation based the particular partition having a minimum amount of storage usage compared to other partitions of the set of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
Gould in view of Skjolsvold does not specifically disclose the amount of storage usage being the number of records. However, Kim teaches
the amount of storage usage being the number of records (see e.g., [0040] for, to remove the load for data input and index generation, inputted data being stored in the form of a record of <time, tagname, value>, with additional columns stored separately according to column and the records being simply stored until the number of inputted records reaches the partition's maximum count. The amount of storage a partition uses is determined by the number of records.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold to include the amount of storage usage being the number of records, as taught by Kim, for the benefit of reducing the computational cost of searching time series data (see e.g., Kim, [0078]-[0079]).

As to claim 25, the limitations of parent claim 1 have been discussed above. Gould in view of Skjolsvold does not specifically disclose wherein reducing the set of data partitions by aggregating records of the particular partition with records of an additional partition comprises selecting the additional partition for aggregation based the additional partition having a highest number of records, compared to other partitions of the set of data partitions, that does not exceed a maximum number of records allowable within the additional partition. However, Kim teaches
wherein reducing the set of data partitions by aggregating records of the particular partition [Block 0] with records of an additional partition [Block 1] comprises selecting the additional partition for aggregation based the additional partition having a highest number of records, compared to other partitions of the set of data partitions, that does not exceed a maximum number of records allowable within the additional partition (see e.g., [0049] for multiple blocks configured with multiple records having the fields <Tag, Time, Value> being merged together, for example, Block 0 and Block 1 from before the merging being merged together to form Block 0-1, while Block 2 and Block 3 being merged to form Block 2-3, and the merged Block 0-1 and Block 2-3 being configured to have the fields <Tag, Count, Time, Value, Row ID> and [0052] for the data stored in the memory 200 being stored as records configured as <time, tag name, value>, which may be stored for each partition until the number of stored records reaches the maximum count and here, for a partition for which the storage is completed, an index being generated with <tag name, time> as a key, and <value, RID> being recorded in the data region of the index. Block 1 is selected for indexing based on the number of records within Block 1 reaching the maximum count. Indexing Block 1 includes aggregating the records of Block 0 with the records of Block 1.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein reducing the set of data partitions by aggregating records of the particular partition with records of an additional partition comprises selecting the additional partition for aggregation based the additional partition having a highest number of records, compared to other partitions of the set of data partitions, that does not exceed a maximum number of records allowable within the additional partition, as taught by Kim, for the benefit of reducing the computational cost of searching time series data (see e.g., Kim, [0078]-[0079]).

Claims 17-19, 27, and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Gould et al. (US Publication No. 2005/0102325) in view of Skjolsvold et al. (US Publication No. 2013/0204991) as applied to claims 1-4, 7-16, 21, 22, 26, 28, and 29 above, and further in view of Chhabra et al. (US Publication No. 2019/0229924).

As to claim 17, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose a memory allocated to track the number of data partitions. However, Skjolsvold teaches
a memory [partition table] allocated to track the number of data partitions (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. 
The partition master obtains the number of partitions from the partition table in order to determine if the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to include a memory allocated to track the number of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
Gould in view of Skjolsvold does not specifically disclose wherein the threshold is set based on the memory allocated to track the number. However, Chhabra teaches
wherein the threshold is set based on the memory allocated [counter] to track the number (see e.g., [0068] for it being important for the counter to be large enough to prevent an overflow, wherein the counter reaches its maximum value, in its foreseeable lifetime and [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state. A threshold is set based on the size of the counter.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold wherein the threshold is set based on the memory allocated to track the number, as taught by Chhabra, for the benefit of reducing data processing overhead by reducing counter overflow (see e.g., Chhabra, [0078] and [0081]).
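By way of illustration only, a threshold set from the size of a counter, in the manner attributed to Chhabra at [0068] and [0081], may be sketched as follows. The function names, the unsigned fixed-width counter model, and the margin parameter are assumptions; Chhabra is cited only for setting a threshold that indicates when the counter is close to overflow.

```python
def overflow_threshold(bits: int, margin: int) -> int:
    """Return a threshold that sits `margin` counts below the maximum value
    representable by an unsigned counter of the given bit width."""
    if bits <= 0:
        raise ValueError("bits must be positive")
    max_value = (1 << bits) - 1
    if not 0 <= margin <= max_value:
        raise ValueError("margin must lie between 0 and the counter maximum")
    return max_value - margin


def near_overflow(count: int, bits: int, margin: int) -> bool:
    """Report whether the counter value has reached the threshold."""
    return count >= overflow_threshold(bits, margin)
```

Under these assumptions a wider counter yields a higher threshold, so the threshold depends directly on the memory allocated to the counter.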

As to claim 18, the limitations of parent claim 1 have been discussed above. Gould teaches
wherein the memory allocated [number of bytes] is determined from a data type of a variable allocated (see e.g., [0114] for referring to FIG. 4, a type object 502 being, for example, a base type 504 or a compound type 506, a base type object 504 specifying how to interpret a string of bits (of a given length) as a single value, the base type object 504 including a length specification indicating the number of raw data bits to be read and parsed, and a length specification indicating a fixed length, such as a specified number of bytes. The number of bytes allocated is determined from a data type of a variable.).
Gould does not specifically disclose a memory allocated to track the number of data partitions. However, Skjolsvold teaches
a memory [partition table] allocated to track the number of data partitions (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. 
The partition master obtains the number of partitions from the partition table in order to determine whether the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to include a memory allocated to track the number of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
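For illustration only, the partition table behavior cited above from Skjolsvold ([0030], [0039], [0063]) can be sketched as follows. This is a minimal sketch under stated assumptions, not Skjolsvold's implementation: the names `PartitionEntry`, `PartitionTable`, `partition_count`, `assign`, `split`, and `merge` are hypothetical, and deriving the partition count from the number of table entries reflects the reading given above rather than an explicit statement in the reference.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison, so list.remove() drops the exact entry
class PartitionEntry:
    """Hypothetical table entry: key range, serving instance, epoch number."""
    low_key: str
    high_key: str
    server: str
    epoch: int = 1


@dataclass
class PartitionTable:
    """Hypothetical partition table; one entry per partition."""
    entries: list = field(default_factory=list)

    def partition_count(self) -> int:
        # Assumption: the number of partitions is the number of entries.
        return len(self.entries)

    def assign(self, entry: PartitionEntry, new_server: str) -> None:
        # Assignment to a new server increases the epoch number by one.
        entry.server = new_server
        entry.epoch += 1

    def split(self, entry: PartitionEntry, split_key: str):
        # Each child receives the parent's epoch number incremented by one.
        self.entries.remove(entry)
        child_epoch = entry.epoch + 1
        left = PartitionEntry(entry.low_key, split_key, entry.server, child_epoch)
        right = PartitionEntry(split_key, entry.high_key, entry.server, child_epoch)
        self.entries.extend([left, right])
        return left, right

    def merge(self, a: PartitionEntry, b: PartitionEntry) -> PartitionEntry:
        # Merged epoch is the maximum prior epoch incremented by one.
        self.entries.remove(a)
        self.entries.remove(b)
        merged = PartitionEntry(
            min(a.low_key, b.low_key),
            max(a.high_key, b.high_key),
            a.server,
            max(a.epoch, b.epoch) + 1,
        )
        self.entries.append(merged)
        return merged
```

In this sketch a split raises the count returned by `partition_count` and a merge lowers it, which is the quantity a master would compare against an upper limit.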
Gould in view of Skjolsvold does not specifically disclose wherein the threshold is set based on the memory allocated to track the number; and a variable allocated to track the number. However, Chhabra teaches
wherein the threshold is set based on the memory allocated [counter] to track the number (see e.g., [0068] for it being important for the counter to be large enough to prevent an overflow, wherein the counter reaches its maximum value, in its foreseeable lifetime and [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state. A threshold is set based on the size of the counter.); and
a variable [counter] allocated to track the number (see e.g., [0056] for counters being incremented).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold such that the threshold is set based on the memory allocated to track the number, and to include a variable allocated to track the number, as taught by Chhabra, for the benefit of reducing data processing overhead by reducing counter overflow (see e.g., Chhabra, [0078] and [0081]).

As to claim 19, the limitations of parent claim 1 have been discussed above. Gould does not specifically disclose a memory allocated to track the number of data partitions. However, Skjolsvold teaches
a memory [partition table] allocated to track the number of data partitions (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. 
The partition master obtains the number of partitions from the partition table in order to determine whether the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to include a memory allocated to track the number of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
Gould in view of Skjolsvold does not specifically disclose wherein the threshold is set based on the memory allocated to track the number; and wherein the threshold is set to avoid an overflow error in the memory when the number satisfies the threshold value. However, Chhabra teaches
wherein the threshold is set based on the memory allocated [counter] to track the number (see e.g., [0068] for it being important for the counter to be large enough to prevent an overflow, wherein the counter reaches its maximum value, in its foreseeable lifetime and [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state. A threshold is set based on the size of the counter.); and
wherein the threshold is set to avoid an overflow error in the memory when the number satisfies the threshold value (see e.g., [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state, for example, if the counter value plus the threshold exceeds the maximum value of the counter, then remapping of the counter being necessary, if a counter is determined to be close to overflow (e.g., based on the threshold value), adaptive mapping promoting the data line to a larger counter, and promotion including, for example, searching counters with the next larger size to find the next-larger sized counter with the smallest value. The threshold is set to avoid an overflow error in the counter.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold such that the threshold is set based on the memory allocated to track the number, and such that the threshold is set to avoid an overflow error in the memory when the number satisfies the threshold value, as taught by Chhabra, for the benefit of reducing data processing overhead by reducing counter overflow (see e.g., Chhabra, [0078] and [0081]).
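For illustration only, the overflow check cited above from Chhabra ([0081]) can be sketched as follows. This is a minimal sketch under stated assumptions, not Chhabra's implementation: the function names are hypothetical, and the one-eighth margin in `margin_for` is an arbitrary illustrative choice that does not appear in the reference. The comparison in `near_overflow` follows the cited description that remapping is necessary if the counter value plus the threshold exceeds the maximum value of the counter.

```python
def max_counter_value(bits: int) -> int:
    """Largest value an unsigned counter of the given width can hold."""
    return (1 << bits) - 1


def margin_for(bits: int) -> int:
    """Hypothetical threshold derived from counter size (one eighth of capacity)."""
    return max(1, max_counter_value(bits) >> 3)


def near_overflow(value: int, threshold: int, bits: int) -> bool:
    """True when the counter value plus the threshold exceeds the counter maximum."""
    return value + threshold > max_counter_value(bits)


def pick_promotion_target(larger_counter_values: list) -> int:
    """Index of the smallest-valued counter among the next-larger-size counters."""
    return min(range(len(larger_counter_values)),
               key=lambda i: larger_counter_values[i])
```

Because the margin is computed from the counter width, a wider counter yields a larger absolute margin in this sketch, which is one way a threshold can depend on the memory allocated to the counter.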

As to claim 27, the limitations of parent claim 26 have been discussed above. Gould does not specifically disclose a memory allocated to track the number of data partitions. However, Skjolsvold teaches
a memory [partition table] allocated to track the number of data partitions (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. 
The partition master obtains the number of partitions from the partition table in order to determine whether the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to include a memory allocated to track the number of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
Gould in view of Skjolsvold does not specifically disclose wherein the threshold is set based on the memory allocated to track the number; and wherein the threshold is set to avoid an overflow error in the memory when the number satisfies the threshold value. However, Chhabra teaches
wherein the threshold is set based on the memory allocated [counter] to track the number (see e.g., [0068] for it being important for the counter to be large enough to prevent an overflow, wherein the counter reaches its maximum value, in its foreseeable lifetime and [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state. A threshold is set based on the size of the counter.); and
wherein the threshold is set to avoid an overflow error in the memory when the number satisfies the threshold value (see e.g., [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state, for example, if the counter value plus the threshold exceeds the maximum value of the counter, then remapping of the counter being necessary, if a counter is determined to be close to overflow (e.g., based on the threshold value), adaptive mapping promoting the data line to a larger counter, and promotion including, for example, searching counters with the next larger size to find the next-larger sized counter with the smallest value. The threshold is set to avoid an overflow error in the counter.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold such that the threshold is set based on the memory allocated to track the number, and such that the threshold is set to avoid an overflow error in the memory when the number satisfies the threshold value, as taught by Chhabra, for the benefit of reducing data processing overhead by reducing counter overflow (see e.g., Chhabra, [0078] and [0081]).

As to claim 30, the limitations of parent claim 29 have been discussed above. Gould does not specifically disclose a memory allocated to track the number of data partitions. However, Skjolsvold teaches
a memory [partition table] allocated to track the number of data partitions (see e.g., [0030] for a partition table being used to track the current assignments of partitions to partition servers, when an active master or dictator assigns a partition to a server, the partition table being updated first to reflect the assignment, the partition table then being used to determine the partition server that will handle a client request based on the key specified in the client request, as an example, an entry in a partition table including the low key for a range, the high key for the range, and the role instance or server instance that will perform a requested task on the data or state corresponding to requested key, and a partition table also including other data, such as an epoch number or version number, [0039] for each partition having a current epoch number that is updated by the master when a change occurs for the partition, examples of changes for a partition including assignment of a partition to a new server, splitting of a partition, and merging of two partitions, assignments of a partition to a new server causing the epoch number to increase by one, splitting of a partition into two or more new partitions causing each child partition to receive the parent's epoch number incremented by one, and when two partitions are merged, the epoch number for the merged partition being the maximum epoch number for any of the partitions prior to merge incremented by one, [0042] for the information in partition table 455 being populated based on the partition decisions made by partition master 460, and [0063] for the thresholds for initiating a merge of partitions being reduced as the number of partitions approaches the upper limit. The partition table tracks the number of partitions by tracking the current assignments of partitions to partition servers. 
The partition master obtains the number of partitions from the partition table in order to determine whether the number is approaching an upper limit.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould to include a memory allocated to track the number of data partitions, as taught by Skjolsvold, for the benefit of reducing the overhead required to track and maintain the various partitions for a data set (see e.g., Skjolsvold, [0063]).
Gould in view of Skjolsvold does not specifically disclose wherein the threshold is set based on the memory allocated to track the number; and wherein the threshold is set to avoid an overflow error in the memory when the number satisfies the threshold value. However, Chhabra teaches
wherein the threshold is set based on the memory allocated [counter] to track the number (see e.g., [0068] for it being important for the counter to be large enough to prevent an overflow, wherein the counter reaches its maximum value, in its foreseeable lifetime and [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state. A threshold is set based on the size of the counter.); and
wherein the threshold is set to avoid an overflow error in the memory when the number satisfies the threshold value (see e.g., [0081] for setting a threshold that indicates when the counter is getting close to entering the overflow state, for example, if the counter value plus the threshold exceeds the maximum value of the counter, then remapping of the counter being necessary, if a counter is determined to be close to overflow (e.g., based on the threshold value), adaptive mapping promoting the data line to a larger counter, and promotion including, for example, searching counters with the next larger size to find the next-larger sized counter with the smallest value. The threshold is set to avoid an overflow error in the counter.).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to modify the data processing system of Gould in view of Skjolsvold such that the threshold is set based on the memory allocated to track the number, and such that the threshold is set to avoid an overflow error in the memory when the number satisfies the threshold value, as taught by Chhabra, for the benefit of reducing data processing overhead by reducing counter overflow (see e.g., Chhabra, [0078] and [0081]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Goo (US Patent No. 9,672,274) for determining whether a number of records in the stored data satisfies a size threshold and in response to a determination that the number of records satisfies the size threshold, aggregating the stored data in the one or more partitions into a batch of data (see claim 1).
Applicant’s filed specification recites “[o]n detecting that the number of partitions has exceeded the threshold, the node 3306 can implement techniques for reducing a number of the partitions on the node 3306. Specifically, at point 7404, the node 3306 can shuffle records between partitions, such that records having the same key value are co-located within the same partition” (see [1320]).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DARA J GLASSER whose telephone number is (571)270-3666. The examiner can normally be reached Monday-Thursday, 10:00am-2:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Apu Mofiz, can be reached at (571)272-4080. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

12-08-22
/DARA J GLASSER/Examiner, Art Unit 2161

/APU M MOFIZ/Supervisory Patent Examiner, Art Unit 2161