Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

		Detailed Action
Remarks
This communication has been issued in response to Applicant’s amended language and submitted arguments filed 30 August 2022.  Claims 1, 3, 5, 7-11, 13, 15 & 17-20 remain pending in this application.  

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 5, 7-11, 13, 15 & 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Brewster et al (US Patent No. 11,461,304B2; Brewster hereinafter) in view of Mukherjee et al (USPG Pub No. 20160026667A1; Mukherjee hereinafter).
As for Claim 1, Brewster recites, A method for performing multi-caching on data sources of different types by using a cluster-based processing system, the method comprising:
“determining, in a case that a user query is acquired, whether a result set corresponding to a query result of the user query is present as first cache data in a master node or in a worker node, wherein the worker node is included in a same system with the master node and communicates with the master node” (see col. 7, lines 4-25, 46-67; e.g., the reference of Brewster provides signature-based cache optimization for data preparation, where the system of Brewster utilizes at least a “Caching Engine” that can take as input a set of sequenced operations received in a script generated from user input via user interfaces, derive an operation signature, and compare it to the signatures associated with existing cached results.  Existing cached results can be stored in one or more of a “cache layer”, in memory, or on a local or networked storage device {i.e. a disk or a storage server}. As taught within lines 46-67, a “Spark Master” {i.e. considered equivalent to Applicant’s “master node”} receives requests from external clients, allowing the Spark Master to break down and distribute work portions/“chunks” to various “Spark Workers”.  Both Spark Masters and workers use a companion application {i.e. “pipeline application”} to perform work, running on all machines that run a Spark process with both Masters and workers, and reading on Applicant’s claimed limitation as the Masters and workers are within the same system.  Column 13, lines 58-67 through column 14, lines 1-11 further provide teachings into the request and retrieval of cached data {i.e. cached data fragments/representations} cached at various stages of a pipeline within at least one or more of a “cache layer”, allowing for intermediate results at the various stages of a pipeline to be viewed).
“in a case that the determining indicates that at least a specific part of the result set is not present in both the master node and the worker node” (see col. 8, lines 53-67; col. 9, lines 1-17; e.g., the reference of Brewster teaches that a “master node” is configured to devise an intelligent strategy to partition a data set by taking into consideration various pieces of information, such as information about the data being operated on, data preparation operations to be performed, and performance statistics. A “Spark (pipeline) master” queries an input data set, such as that being obtained from a source location described in a received script.  Column 35, lines 1-53 provides a plurality of examples involving cached representations of operations/steps performed, along with attached signatures/fingerprints, that lead to corresponding cached results. Examples are given of checks being performed for matches of data between at least two compared tree representations of cached data, with one example being a determination that “no direct match is found”, considered equivalent to Applicant’s limitation that “...determining indicates that at least a specific part of the result set is not present in both the master node and the worker node”.  A match is taught to be an indication of the existence of at least a cached representation for some portion of a first/second sequenced set of operations): 
“(i) establishing an execution plan to sequentially execute a search operation, a data processing operation, and an output operation based on a result of parsing the user query” (see col. 35, lines 7-66; e.g., the reference of Brewster teaches of determining a match/non-match between a retrieved tree representation of a previously cached representation and a tree representation derived from the signature of a sequenced set of operations specified by a first/second user.  Earlier text of column 17, lines 48-66 of Brewster teaches of the utilization of one or more of a “data traversal program (DTP)” generated or cached, considered equivalent to Applicant’s “execution plan”, as the data traversal program includes a reference table and reference stack to provide information on how to read the state of a portion of the data as of a certain stage of a pipeline.  According to at least column 41, lines 34-67, a data traversal program can be cached to one or more of a “cache layer”, along with data pertaining to the data traversal program, such as a “references table”. The decision of whether to cache a representation is based on the data operation that was performed, and the respective complexity/computational cost of an operation/set of operations {i.e. join, sort, filter} that affect an entire data set, allowing the resulting data traversal program to be cached),
“(ii) acquiring a first subset, determined as present as second cache data in the master node or in the worker node, by instructing the master node or the worker node to execute the search operation according to the execution plan, wherein the second cache data represents elemental sets included in the result set” (see col. 14, lines 2-21; col. 17, lines 48-66; e.g., the reference of Brewster teaches of the utilization of one or more of a “data traversal program (DTP)”, considered equivalent to Applicant’s “execution plan”, as the data traversal program includes a reference table and reference stack to provide information on how to read the state of a portion of the data as of a certain stage of a pipeline.  Column 30, lines 55-67 through column 31, lines 1-62, discuss the performance of at least an “Append” operation, where, according to at least a data traversal program, multiple data sets having one or more partitions including rows of data are combined {i.e. “DS1” and “DS2”} from different pipelines to form new or updated representations of the combined data. At least column 30, lines 47-67 through column 31, lines 1-33 and column 32, lines 1-49 teach of operating on a compact representation of one or more data sets in the form of a “data traversal program” for the execution of data preparation operations such as “append” and “join” {i.e. considered equivalent to Applicant’s “JOIN” and “UNION” operations} on multiple data sets for combining); 
“(iii) acquiring a second subset, determined as present in none of the master node and the worker node, from the second cache data by instructing at least one external data source to execute the search operation according to the execution plan” (see col. 4, lines 49-67; e.g., the reference of Brewster teaches of connecting the computer system to an “external network” and transferring data according to standard protocols, sharing a portion of the processing with a remote processor.  Additionally, column 7, lines 46-66 teaches of the utilization of a “SPARK master”, equivalent to Applicant’s master node, for acquiring final results to one or more requesting clients.  Column 30, lines 41-62 teach of aggregating results from potentially a large consumption of resources, such as memory resources and disk resources, to accommodate an entire data set by operating on a compact representation of the dataset in the form of the “data traversal program”, equivalent to Applicant’s “execution plan”); and 
“(iv) applying at least part of a joint operation included in the search operation to the first subset and the second subset according to the execution plan, to acquire a result of the search operation, wherein the joint operation includes at least part of a JOIN operation and a UNION operation, and the second cache data is updated such that the second cache data includes the second subset” (see col. 14, lines 2-21; col. 17, lines 48-66; e.g., the reference of Brewster teaches of the utilization of one or more of a “data traversal program (DTP)”, considered equivalent to Applicant’s “execution plan”, as the data traversal program includes a reference table and reference stack to provide information on how to read the state of a portion of the data as of a certain stage of a pipeline.  Column 30, lines 55-67 through column 31, lines 1-62, discuss the performance of at least an “Append” operation, where, according to at least a data traversal program, multiple data sets having one or more partitions including rows of data are combined {i.e. “DS1” and “DS2”} from different pipelines to form new or updated representations of the combined data. At least column 30, lines 47-67 through column 31, lines 1-33 and column 32, lines 1-49 teach of operating on a compact representation of one or more data sets in the form of a “data traversal program” for the execution of data preparation operations such as “append” and “join” {i.e. considered equivalent to Applicant’s “JOIN” and “UNION” operations} on multiple data sets for combining.  A plurality of examples is provided within the cited column 31, further describing the “APPEND” operation being executed on at least two different data sets imported from their respective pipelines, and the updating of rows of data through appending for Data Set partitions {i.e. DS1/DS2}.  
Column 32, lines 1-49 further describes the appending of a first data set to a second data set and vice versa based on the number of determined partitions and further information associated with corresponding data traversal programs, creating new pipelines of appended data sets, and further appending existing data sets); 
	“at the step of (ii), in response to determining that a first data set to an n-th data set, to which at least part of a joint operation included in the execution plan is to be applied, are present as the first subset in the master node or in the worker node:

	(ii-1) instructing the master node or the worker node to execute the search operation according to the execution plan to acquire the first subset present as second cache data in the master node or the worker node” (see col. 35, lines 7-41, 60-66; e.g., the cited column 35, lines 7-41 provides teachings into the appending of cached data set partitions {i.e. DS1/DS2} from different pipelines at the request of a first and/or second user.  Performing the second sequenced set of operations specified by at least the second user can determine whether a previously cached representation can be leveraged to provide at least some or all of the results, thus, making a determination as to whether a subset is present within two or more compared tree representations derived from one or more corresponding signatures/fingerprints.  Trees can be compared to determine whether a graph/subgraph’s paths match between the two, with a match indicating that a cached representation for some portion of the second sequenced set of operations exists.  Lines 60-66 teach that the cached result associated with the signature representing tree can be obtained.  The cached result can be leveraged to reduce the amount of computation to perform at least the second sequenced set of operations.  As stated within rationale provided above, at least column 30, lines 47-67 through column 31, lines 1-33 and column 32, lines 1-49 teach of operating on a compact representation of one or more data sets in the form of a “data traversal program” for the execution of data preparation operations such as “append” and “join” {i.e. considered equivalent to Applicant’s “JOIN” and “UNION” operations} on multiple data sets for combining.  
A plurality of examples is provided within the cited column 31, further describing the “APPEND” operation being executed on at least two different data sets imported from their respective pipelines, and the updating of rows of data through appending for Data Set partitions {i.e. DS1/DS2}.  Column 32, lines 1-49 further describes the appending of a first data set to a second data set and vice versa based on the number of determined partitions and further information associated with corresponding data traversal programs, creating new pipelines of appended data sets, and further appending existing data sets), and

	“(ii-2) instructing the master node or the worker node to apply the joint operation to the first data set to the n-th data set, to acquire the first subset, to thereby acquire the first subset” (see col. 35, lines 54-67; col. 36, lines 1-61; e.g., the reference of Brewster teaches of applying joint operations in the form of “Append” and “Join” operations on at least two or more data sets having matched data, where execution of the “Append” and “Join” operations were based on a corresponding data traversal program including references table and corresponding reference stack.  A new pipeline is declared to represent the combined result of a join, including the same number of partitions as the aggregate number of partitions across the DS1 and DS2 pipeline spaces);

	“at the step of (iii), in response to determining that at least part of the first data set to the n-th data set, required for acquiring the second subset, is not present in both the master node and the worker node:

	(iii-1) instructing the master node or the worker node to execute the search operation according to the execution plan, to acquire at least one specific data set among the first data set to the n-th data set, wherein the specific data set is determined as present in the master node or in the worker node” (see col. 8, lines 53-67; col. 9, lines 1-17; e.g., as stated within rationale provided above, the reference of Brewster teaches that a “master node” is configured to devise an intelligent strategy to partition a data set by taking into consideration various pieces of information, such as information about the data being operated on, data preparation operations to be performed, and performance statistics. A “Spark (pipeline) master” queries an input data set, such as that being obtained from a source location described in a received script.  Column 35, lines 1-53 provides a plurality of examples involving cached representations of operations/steps performed, along with attached signatures/fingerprints, that lead to corresponding cached results. Examples are given of checks being performed for matches of data between at least two compared tree representations of cached data, with one example being a determination that “no direct match is found”, considered equivalent to Applicant’s limitation that “...determining indicates that at least a specific part of the result set is not present in both the master node and the worker node”.  A match is taught to be an indication of the existence of at least a cached representation for some portion of a first/second sequenced set of operations);  

	“(iii-2) allowing the external data source to execute the search operation according to the execution plan, to acquire a remaining data set among the first data set to the n-th data set, wherein the remaining data set is determined as not present in both the master node and the worker node” (see col. 4, lines 49-67; e.g., the reference of Brewster teaches of connecting the computer system to an “external network” and transferring data according to standard protocols, sharing a portion of the processing with a remote processor.  Additionally, column 7, lines 46-66 teaches of the utilization of a “SPARK master”, equivalent to Applicant’s master node, for acquiring final results to one or more requesting clients.  Column 30, lines 41-62 teach of aggregating results from potentially a large consumption of resources, such as memory resources and disk resources, to accommodate an entire data set by operating on a compact representation of the dataset in the form of the “data traversal program”, equivalent to Applicant’s “execution plan”);
	“the external data source is a data source of a different type than the master node and the worker node” (see col. 4, lines 49-67; e.g., the reference of Brewster teaches of utilizing a network interface allowing for coupling to another computer, computer network, or telecommunications network {i.e. external network} in conjunction with a remote processor that shares a portion of the processing), and

	“the result set is multi-cached on the master node and the worker node” (see col. 8, lines 25-67; col. 9, lines 1-64; e.g., the reference of Brewster teaches of result sets having one or more of a plurality of data sets from Spark master and worker pipelines having a plurality of rows of data within multiple partitions of data that have data preparation operations performed on it).

	The reference of Brewster does not appear to explicitly recite the amended limitations of, “(iii-3) instructing the master node to apply the joint operation to the specific data set and the remaining data set, to acquire the second subset, wherein the cluster-based processing system includes two or more master nodes so that structural redundancy of the master nodes is established”, “the master node is connected with both the external data source and the worker node and applies the joint operation to the specific data set and the remaining data set to acquire the second subset in a case that at least part of the first data set to the n-th data set is determined as being not present in both the master node and the worker node”, and “the master node stores the first cache data and the second cache data and executes operations, including the search operation, the data processing operation, and the output operation, on data included in the master node, the worker node and the external data source, according to the user query, and the worker node includes the first cache data and the second cache data and executes the operations on data included in the worker node according to the user query”.
	The reference of Mukherjee teaches, “(iii-3) instructing the master node to apply the joint operation to the specific data set and the remaining data set, to acquire the second subset, wherein the cluster-based processing system includes two or more master nodes so that structural redundancy of the master nodes is established” (see pp. [0046-0047]; e.g., the reference of Mukherjee serves as an enhancement to the teachings of Brewster, and teaches of a query initiating one or more of a join operation to be converted to multiple partition-wise join operations, where work granules that correspond to partition-wise join operations are generated. Partition-wise join operations are assigned to nodes by a query coordinator based on partition-to-node mappings.  Work granules are distributed so that partition-wise join operations are sent to nodes that have IMCs of the partitions involved in the partition-wise join operations so that the joins can be executed against IMCs of corresponding partitions.  The parallel query coordinator assigns partition-wise join operations to nodes with the necessary partitions currently residing in local volatile memory, apparently to derive a first and/or second set/subset of data.  Paragraph [0169] of Mukherjee teaches of utilizing an execution plan for the fetching/prefetching of remote files. Read data from jobs retrieved by slave nodes {i.e. worker nodes} is further processed according to the execution plan, where an example is provided, allowing cache evictions as the coordinator evicts replicas from a local storage by lowering the replication of a file, except for the provided replica.  Paragraph [0214] provides teachings into the utilization of at least multiple load operation masters, where “two separate load-operation masters” may determine how to distribute respective tables to nodes independently.  
Each load-operation master accesses the same dictionary of partitioning schemes and generates the same matrix to determine how to load data for each table),
	“the master node is connected with both the external data source and the worker node and applies the joint operation to the specific data set and the remaining data set to acquire the second subset in a case that at least part of the first data set to the n-th data set is determined as being not present in both the master node and the worker node” (see paragraphs [0062-0063], [0118], [0135-0136]; e.g., Mukherjee teaches of one or more of a “load-operation master” controlling the access of data segments.  The reference of Mukherjee teaches of the accessing of one or more of a plurality of requested chunks by “work granules” known as “parallel query slaves” assigned to database instances that generate “query execution plans” relating to “chunk-to-node-mapping”.  The results produced by the parallel query slaves are sent to and aggregated by the parallel query coordinator {i.e. “database server instance”}.  As stated within paragraphs [0145-0147], parallel query slaves can determine whether they are able to process a work granule entirely by accessing data in a local volatile memory, further accessing some or all the necessary data from disk if that determination is true.  Within the utilized system, mappings may specify a particular computing unit, and “...When the mapping for a particular sub-chunk specifies a particular computing unit within a node, the work granule is sent to the designated node for execution by the designated computing unit”.  According to earlier text of paragraph [0118], “if a database server instance determines that it itself was the old host node of the particular chunk, and that the particular chunk now maps to another node, then the database server instance can discard from its volatile memory the container that holds the data from the chunk”, thus providing another check performed to determine whether a sought after chunk is located within an instance of a node.  
Also stated, “database server instance 106 will update the entries associated with chunk 304 to indicate that node 102 is now the host for chunk 304. Database server instance 106 will then proceed to load chunk 304 into its volatile memory 104, thereby creating a new copy of IMC 324. The new copy of IMC 324 may be built with data from a snapshot that is different than the snapshot used to create the original copy of IMC 324. As a result, already existing IMCs in live nodes will be of earlier snapshots and the new ones of later snapshots” {i.e. pp. [0122]}.  The cited portion describes the accessing and loading of accessed chunks, after updating chunk-to-node mappings),

	“the master node stores the first cache data and the second cache data and executes operations, including the search operation, the data processing operation, and the output operation, on data included in the master node, the worker node and the external data source, according to the user query, and the worker node includes the first cache data and the second cache data and executes the operations on data included in the worker node according to the user query” (see pp. [0038-0042]; e.g., the reference of Mukherjee teaches of the querying and aggregation of “pre-loaded objects” distributed within one or more of a plurality of volatile memories of a plurality of host nodes, where “chunk-to-node mapping” is adhered to for the retrieval of objects according to one or more execution plans.  Accessing data within a chunk is executed in parallel by the various host nodes, and an execution plan is augmented for the access of copies of a chunk if a node fails, for example.  According to at least paragraphs [0231], [0238] and [0240], a “join” operation can be performed on rows of data satisfying a received query having join conditions.  The rows of data are retrieved from a plurality of hosted nodes {i.e. work granules}, and can be performed without requesting data from disk).
The combined references of Brewster and Mukherjee are considered analogous art for being within the same field of endeavor as the claimed invention, which is reliability and/or availability techniques used for database management systems utilizing caching optimization.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the querying and joining of distributed data contained within a plurality of nodes, as taught by Mukherjee, with the method of Brewster, in order to allow any node in a cluster to act as a parallel query coordinator when receiving a query that requires work to be performed against a pre-loaded partitioned object.  (Mukherjee; [0045])
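
For illustration only, the control flow recited in Claim 1 can be summarized in a minimal sketch. Every name below (Node, lookup, run_query, the fetched log) is hypothetical and is drawn neither from Brewster, Mukherjee, nor the application; the external data source is modeled as a plain dictionary, and the joint operation is assumed to be a simple UNION that removes duplicate rows.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[int, str]


class Node:
    """Hypothetical master or worker node holding two cache levels."""

    def __init__(self, name: str) -> None:
        self.name = name
        # First cache data: whole result sets, keyed by the query text.
        self.first_cache: Dict[str, List[Row]] = {}
        # Second cache data: elemental data sets, keyed by data set name.
        self.second_cache: Dict[str, List[Row]] = {}


def lookup(nodes: List[Node], level: str, key: str) -> Optional[List[Row]]:
    """Return the first cached entry for key found on any node, else None."""
    for node in nodes:
        hit = getattr(node, level).get(key)
        if hit is not None:
            return hit
    return None


def run_query(
    query: str,
    data_sets: List[str],
    nodes: List[Node],
    external: Dict[str, List[Row]],
    fetched: List[str],
) -> List[Row]:
    """Answer a query from cache where possible, else from the external source."""
    # Determining step: is the whole result set already cached on any node?
    cached = lookup(nodes, "first_cache", query)
    if cached is not None:
        return cached

    master = nodes[0]  # assumption: the first node acts as the master
    combined: List[Row] = []
    for name in data_sets:
        rows = lookup(nodes, "second_cache", name)  # first subset (cache hit)
        if rows is None:
            rows = external[name]  # second subset (external data source)
            fetched.append(name)   # record the external access
            master.second_cache[name] = rows  # update the second cache data
        combined.extend(rows)

    # Joint operation, assumed here to be a UNION that keeps first occurrences.
    result = list(dict.fromkeys(combined))
    master.first_cache[query] = result  # update the first cache data
    return result
```

In this sketch a repeated query is served from the first cache data without touching the external source, and a data set fetched externally once is afterwards available as second cache data.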

As for Claim 3, Brewster teaches, “further comprising: updating the second cache data such that the second cache data includes the first subset, wherein the cluster-based processing system includes two or more of the worker nodes so that a cluster comprised of the worker nodes is implemented” (see col. 7, lines 4-25, 46-67; e.g., the reference of Brewster teaches that existing cached results can be stored in one or more of a “cache layer”, in memory, or on a local or networked storage device {i.e. a disk or a storage server}. As taught within lines 46-67, a “Spark Master” {i.e. considered equivalent to Applicant’s “master node”} receives requests from external clients, allowing the Spark Master to break down and distribute work portions/“chunks” to various “Spark Workers”.  Both Spark Masters and workers use a companion application {i.e. “pipeline application”} to perform work, running on all machines that run a Spark process with both Masters and workers.  Column 13, lines 58-67 through column 14, lines 1-11 further provide teachings into the request and retrieval of cached data {i.e. cached data fragments/representations} cached at various stages of a pipeline within at least one or more of a “cache layer”, allowing for intermediate results at the various stages of a pipeline to be viewed).

As for Claim 5, Brewster teaches, “further comprising updating the second cache data such that the second cache data includes the remaining data set and the second subset” (see col. 14, lines 2-21; col. 17, lines 48-66; e.g., the reference of Brewster teaches of the utilization of one or more of a “data traversal program (DTP)”, considered equivalent to Applicant’s “execution plan”, as the data traversal program includes a reference table and reference stack to provide information on how to read the state of a portion of the data as of a certain stage of a pipeline.  Column 30, lines 55-67 through column 31, lines 1-62, discuss the performance of at least an “Append” operation, where, according to at least a data traversal program, multiple data sets having one or more partitions including rows of data are combined {i.e. “DS1” and “DS2”} from different pipelines to form new or updated representations of the combined data. At least column 30, lines 47-67 through column 31, lines 1-33 and column 32, lines 1-49 teach of operating on a compact representation of one or more data sets in the form of a “data traversal program” for the execution of data preparation operations such as “append” and “join” {i.e. considered equivalent to Applicant’s “JOIN” and “UNION” operations} on multiple data sets for combining).

As for Claim 7, the reference of Brewster provides signature-based cache optimization for data preparation.
Brewster does not recite the limitation of, “further comprising executing the search operation and the data processing operation in a file-based manner”.
Mukherjee teaches, “further comprising executing the search operation and the data processing operation in a file-based manner” (see pp. [0118], [0057], [0196-0203]; e.g., metadata operations within an “HDFS”, for example, as discussed within paragraph [0118].  Paragraphs [0196-0202] discuss the assignment of data from multiple partitioned tables to host nodes based on similar partitioning criteria, considered equivalent to the execution of a data processing operation in a file-based manner.  Subsequent paragraph [0203] teaches of at least a database maintaining a dictionary of partitioning schemes that stores a particular partitioning scheme ID for each partitioning scheme and stores the names of tables using those partitioning schemes, considered equivalent to the execution of at least Applicant’s “file storing operation”. Earlier text of at least paragraph [0057] teaches of sending queries to one or more of a cluster, where the queries target a memory-enabled segment and are sent to the database server instance on any cluster node, returning results).
The combined references of Brewster and Mukherjee are considered analogous art for being within the same field of endeavor as the claimed invention, which is reliability and/or availability techniques used for database management systems utilizing caching optimization.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the querying and joining of distributed data contained within a plurality of nodes, as taught by Mukherjee, with the method of Brewster, in order to allow any node in a cluster to act as a parallel query coordinator when receiving a query that requires work to be performed against a pre-loaded partitioned object.  (Mukherjee; [0045])

As for Claim 8, the reference of Brewster provides signature-based cache optimization for data preparation.
Brewster does not recite the limitation of, “wherein the data processing operation includes at least one of an aggregating operation, a data transforming operation, a filtering operation, a sorting operation, and a data truncating operation”.
Mukherjee teaches, “wherein the data processing operation includes at least one of an aggregating operation, a data transforming operation, a filtering operation, a sorting operation, and a data truncating operation” (see pp. [0136]; e.g., the reference of Mukherjee teaches of at least a parallel query coordinator having the ability to aggregate results produced by a plurality of “parallel query slaves”, with the parallel query coordinator performing any necessary processing on the data).
The combined references of Brewster and Mukherjee are considered analogous art for being within the same field of endeavor as the claimed invention, which is reliability and/or availability techniques used for database management systems utilizing caching optimization.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the querying and joining of distributed data contained within a plurality of nodes, as taught by Mukherjee, with the method of Brewster, in order to allow any node in a cluster to act as a parallel query coordinator when receiving a query that requires work to be performed against a pre-loaded partitioned object.  (Mukherjee; [0045])

As for Claim 9, the reference of Brewster provides signature-based cache optimization for data preparation.
Brewster does not recite the limitation of, “wherein applying the data processing operation includes updating the first cache data such that the first cache data include the query result”.
Mukherjee teaches, “wherein applying the data processing operation includes updating the first cache data such that the first cache data include the query result” (see pp. [0118]; e.g., the reference of Mukherjee teaches of updating entries of the one or more database server instances being utilized that correspond to particular chunks in its chunk-to-node mappings and its sub-chunk-to-node mappings whenever the database server instance receives a request that targets a particular chunk, reading on Applicant’s claimed limitation.  The database server instance accesses data items of a table that are cached in the volatile memory that resides on the node in which the database server instance is executing).
The combined references of Brewster and Mukherjee are considered analogous art for being within the same field of endeavor as the claimed invention, which is reliability and/or availability techniques used for database management systems utilizing caching optimization.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the querying and joining of distributed data contained within a plurality of nodes, as taught by Mukherjee, with the method of Brewster, in order to allow any node in a cluster to act as a parallel query coordinator when receiving a query that requires work to be performed against a pre-loaded partitioned object.  (Mukherjee; [0045])

As for Claim 10, the reference of Brewster provides signature-based cache optimization for data preparation.
Brewster does not recite the limitation of, “wherein the output operation includes at least one of a screen displaying operation, a remote RDB (relational database) storing operation, and a file storing operation”.
Mukherjee teaches, “wherein the output operation includes at least one of a screen displaying operation, a remote RDB (relational database) storing operation, and a file storing operation” (see pp. [0093], [0203], [0257]; e.g., the cited reference, at paragraph [0093], teaches of at least a database server instance within a cluster having the ability to store in its local memory metadata that reflects sub-chunk-to-node mappings.  The reference of Mukherjee teaches of displaying information to a computer user within the cited paragraph [0257].  Paragraph [0203] teaches of at least a database maintaining a dictionary of partitioning schemes that stores a particular partitioning scheme ID for each partitioning scheme and stores the names of tables using those partitioning schemes, considered equivalent to the execution of at least Applicant’s “file storing operation”).
The combined references of Brewster and Mukherjee are considered analogous art for being within the same field of endeavor as the claimed invention, which is reliability and/or availability techniques used for database management systems utilizing caching optimization.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the querying and joining of distributed data contained within a plurality of nodes, as taught by Mukherjee, with the method of Brewster, in order to allow any node in a cluster to act as a parallel query coordinator when receiving a query that requires work to be performed against a pre-loaded partitioned object.  (Mukherjee; [0045])

Claims 11, 13, 15 & 17-20 amount to a big data cluster management device comprising instructions that, when executed by one or more processors, perform the method of Claims 1, 3, 5 & 7-10, respectively.  Accordingly, Claims 11, 13, 15 & 17-20 are rejected for substantially the same reasons as presented above for Claims 1, 3, 5 & 7-10 and based on the references’ disclosure of the necessary supporting hardware and software (Mukherjee; see pp. [0253-0263]; e.g., method for implementation integrating hardware and software components).



Response to Arguments
Applicant's arguments and amendments, with respect to the rejection(s) of Claims 1-3, 5-13 & 15-20, and Mukherjee and Aranha’s alleged failure to teach the subject matter of at least amended Claims 1 and 11, have been fully considered and are persuasive in part, as the Aranha reference has been withdrawn from consideration.  The Mukherjee reference continues to be utilized for its respective teachings, together with the updated rationale regarding Applicant’s amended language discussed within this communication above.  
Upon further consideration and in direct response to Applicant’s claim limitations and arguments, a new ground(s) of rejection for Claims 1, 3, 5, 7-11, 13, 15 & 17-20 is made in view of Mukherjee et al (USPG Pub No. 20160026667A1).


Conclusion
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAHEEM HOFFLER whose telephone number is (571)270-1036. The examiner can normally be reached Monday-Friday: 10:00am-2:00pm; 6pm-10:00pm w/ flex.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached at (571)272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TAMARA T KYLE/
Supervisory Patent Examiner, Art Unit 2156
/RAHEEM HOFFLER/
Examiner
Art Unit 2156

							10/6/2022