Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

		Detailed Action
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Claims 1, 3, 5, 7-11, 13, 15 & 17-20 now remain pending in this application.  Applicant's submission filed on 21 December 2021 has been entered. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 5, 7-11, 13, 15 & 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Mukherjee et al (USPG Pub No. 20160026667A1; Mukherjee hereinafter) in view of Aranha et al (USPG Pub No. 20080222159A1; Aranha hereinafter).
As for Claim 1, Mukherjee recites, A method for performing multi-caching on data sources of different types by using a cluster-based processing system, the method comprising:
“in a case that the determining indicates that at least a specific part of the result set is present in none of the master node and the worker node: (i) establishing an execution plan to sequentially execute a search operation, a data processing operation, and an output operation based on a result of parsing the user query” (see pp. [0042], [0131-0137]; e.g., the reference of Mukherjee teaches of providing data storage and retrieval techniques in a database cluster, where if it is determined that one of the host nodes of a particular chunk being accessed has failed, a query execution plan is augmented for a query that requires access to a particular chunk in order to leverage the copy of the chunk that is in a host node of the chunk that has not failed, reading on Applicant’s claimed limitation.  The reference of Mukherjee describes a process in which one or more node failures are detected after work for a query has already been distributed across a cluster to a plurality of “parallel query slaves” through sub-chunk-to-node mappings, considered equivalent to Applicant’s “master node or worker node”.  A “parallel query coordinator” generates a “query execution plan” to specify how work required by the query is to be separated into work granules that perform the work.  The parallel query coordinator receives a message of the failure of one or more nodes, and restarts execution of the query from scratch, resulting in the creation of a new query execution plan leveraging/performing work on a set of IMCs {i.e. “In-Memory Copy”} of data),
“(ii) acquiring a first subset, determined as present as second cache data in the master node or in the worker node, by instructing the master node or the worker node to execute the search operation according to the execution plan, wherein the second cache data represents elemental sets included in the result set” (see pp. [0041-0042]; e.g., the reference of Mukherjee teaches of executing queries against pre-loaded objects, where, using sub-chunk-to-node mapping, any database instance in the cluster may generate a query execution plan for a query that targets a pre-loaded object, equivalent to the acquiring of a first subset.  First and second queries are sent to multiple host nodes to access data in a particular chunk); 
“(iii) acquiring a second subset, determined as present in none of the master node and the worker node, from the second cache data by instructing at least one external data source to execute the search operation according to the execution plan” (see pp. [0041-0042]; e.g., the reference of Mukherjee teaches of executing queries against pre-loaded objects, where, using sub-chunk-to-node mapping, any database instance in the cluster may generate a query execution plan for a query that targets a pre-loaded object, equivalent to the acquiring of a first subset.  First and second queries are sent to multiple host nodes to access data in a particular chunk); and 
“(iv) applying at least part of a joint operation included in the search operation to the first subset and the second subset according to the execution plan, to acquire a result of the search operation, wherein the joint operation includes at least part of a JOIN operation and a UNION operation, and the second cache data is updated such that the second cache data includes the second subset” (see pp. [0043-0047]; e.g. applying at least one or more of a join operation/ number of smaller partition-wise joins against at least two objects that have been partitioned into sub-objects.  The execution of in memory partition-wise joins takes at least one or more pre-loaded partitioned objects to be joined with any given partition distributed to at least the same host node.  This allows any node in a cluster to act as a parallel query coordinator when receiving a query that requires work to be performed against a pre-loaded partitioned object.  The parallel query coordinator preferably assigns the partition-wise join operations to nodes with the necessary partitions already residing in local volatile memory); 
	“in a case that at least part of a first data set to an n-th data set, to which at least part of the joint operation is to be applied, is determined as present in none of the master node and the worker node:”

	“(v) instructing the master node or the worker node to execute the search operation according to the execution plan, to acquire at least one specific data set among the first data set to the n-th data set, wherein the specific data set is determined as present in the master node or in the worker node” (see pp. [0131-0137], [0167-0174]; e.g., the reference of Mukherjee describes a process in which one or more node failures are detected after work for a query has already been distributed across a cluster to a plurality of “parallel query slaves” through sub-chunk-to-node mappings, considered equivalent to Applicant’s “master node or worker node”.  A “parallel query coordinator” generates a “query execution plan” to specify how work required by the query is to be separated into work granules that perform the work.  The parallel query coordinator receives a message of the failure of one or more nodes, and restarts execution of the query from scratch, resulting in the creation of a new query execution plan leveraging/performing work on a set of IMCs {i.e. “In-Memory Copy”} of data.  Primary and secondary IMC set data is utilized for data retrieval by the plurality of working nodes.  As stated within at least paragraphs [0167-0168], “...When a node fails in a single-hosts-per-chunk system, the chunks stored in the failed node's volatile memory cease to be available for query processing. The chunk may be re-hosted elsewhere...but in the meantime, query execution would normally require accessing some data from disk”, reading on the claimed limitation in which secondary data is retrieved if not found within at least a master or worker node);

	“(vi) allowing the external data source to execute the search operation according to the execution plan, to acquire a remaining data set among the first data set to the n-th data set, wherein the remaining data set is determined as present in none of the master node and the worker node” (see pp. [0167-0169], [0173-0174]; e.g., the cited paragraphs [0167-0169] teach of allowing the redistribution of work granules to working nodes based on a sub-chunk-to-node mapping corresponding to at least a secondary IMC set of data.  The reference of Mukherjee teaches of allowing the “parallel query coordinator” to identify the one or more chunks residing on a failed/inaccessible node, and resending granules in regards to the execution of those chunks to a different node {i.e. external data source} identified by chunk-to-node mapping of the identified chunk); and
	“(vii) instructing the master node to apply the joint operation to the specific data set and the remaining data set, to acquire the second subset” (see pp. [0046-0047]; e.g., the reference of Mukherjee teaches of a query initiating one or more of a join operation to be converted to multiple partition-wise join operations, where work granules that correspond to partition-wise join operations are generated. Partition-wise join operations are assigned to nodes by a query coordinator based on partition-to-node mappings.  Work granules are distributed so that partition-wise join operations are sent to nodes that have IMCs of the partitions involved in the partition-wise join operations so that the joins can be executed against IMCs of corresponding partitions.  The parallel query coordinator assigns partition-wise join operations to nodes with the necessary partitions currently residing in local volatile memory, apparently to derive a first and/or second set/subset of data);
	“in a case that the first data set to the n-th data set are determined as present in the master node or the worker node, the method further comprises instructing the master node or the worker node to apply the joint operation to the first data set to the n-th data set, to acquire the first subset” (see pp. [0046-0047]; e.g., the reference of Mukherjee teaches of a query initiating one or more of a join operation to be converted to multiple partition-wise join operations, where work granules that correspond to partition-wise join operations are generated. Partition-wise join operations are assigned to nodes by a query coordinator based on partition-to-node mappings.  Work granules are distributed so that partition-wise join operations are sent to nodes that have IMCs of the partitions involved in the partition-wise join operations so that the joins can be executed against IMCs of corresponding partitions.  The parallel query coordinator assigns partition-wise join operations to nodes with the necessary partitions currently residing {i.e. determined as present} in local volatile memory, apparently to derive a first and/or second set/subset of data);
	“applying the data processing operation and the output operation to the result of the search operation according to the execution plan, to acquire the result set as the query result” (see pp. [0169]; e.g. the reference of Mukherjee teaches of utilizing an execution plan for the fetching/prefetching of remote files. Further processing of read data from jobs retrieved by slave nodes, are processed according to the execution plan, where an example is provided, allowing cache evictions as the coordinator evicts replicas from a local storage by lowering the replication of a file, except for the provided replica); and  
“outputting result set, wherein the cluster-based processing system includes two or more master nodes so that structural redundancy of the master nodes is established” (see pp. [0118]; e.g. the reference of Mukherjee teaches of the coordination of cache activities, where a cluster-wide coordinator may perform appropriate metadata operations with a file system master to ensure file blocks can be cached.  As discussed within earlier text of paragraphs [0049-0054], a cache assignment determiner may implement a “linear program (LP)” to identify a prefetch dataset, thus, outputting a result set).
The reference of Mukherjee does not appear to explicitly recite the limitations of, “determining, in a case that a user query is acquired, whether a result set corresponding to a query result of the user query is present as first cache data in a master node or in a worker node, wherein the worker node is included in a same system with the master node and communicates with the master node”.
Aranha teaches, “determining, in a case that a user query is acquired, whether a result set corresponding to a query result of the user query is present as first cache data in a master node or in a worker node, wherein the worker node is included in a same system with the master node and communicates with the master node” (see pp. [0045], [0047], [0066-0070]; e.g., paragraph [0045] teaches of nodes, such as an active node, standby node, and replica node, considered equivalent to one or more of Applicant’s “worker nodes”.  As stated within the cited paragraph [0047], “...the roles of master node, active node, and standby node are...associated with particular portions of data, such as cache groups (or tables). For example, at the active node, a first cache group is updated using autorefresh from the backend database system (which thus acts as the master node for the first cache group), and a second cache group is updated by client updates (and thus the active node is the master node for the second cache group)”, reading on Applicant’s claimed limitation by providing for the designation of “master node” or “worker node” for a first cache group having first cache data.  Paragraph [0051] provides teachings into the use of a “MUL locator”, which is used to identify one or a group of updates stored in a “MUL”, where a “MUL” is a “master update list” serving as a “transaction log”/“change log table” including a plurality of entries {i.e. “list of updates (committed transaction)” applied to a master node, as updates are tracked while propagated to other nodes}.  Paragraph [0067] discusses clients sourcing updates to nodes, such as SQL queries that solely read data without modification, and the sourcing of updates by clients to nodes, where updates represent SQL queries which received and read data without modification to one or more nodes representing cache groups or tables that are autorefreshed {i.e. pp. [0069]}.
Paragraph [0067] clearly teaches of the receipt of one or more read-only requests {i.e. queries}, which may optionally utilize a “load balancer” for the distribution of read-only requests among the plurality of nodes).
The combined references of Mukherjee and Aranha are considered analogous art for being within the same field of endeavor as the claimed invention, which is reliability and/or availability techniques used for database management systems utilizing caching and distributed in a cluster.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the utilization of master nodes, active nodes and standby nodes for the caching of data, as taught by Aranha, with the method of Mukherjee, because it would be desirable to have techniques with both improved data persistence and improved data reliability so that a failure of a single in-memory database system does not cause a loss of access to data.  (Aranha; [0012])

As for Claim 3, Mukherjee relates to data storage and retrieval techniques in a database cluster, specifically to memory-aware joins in a database cluster.
Mukherjee does not appear to explicitly recite the limitation of, “further comprising: updating the second cache data such that the second cache data includes the first subset, wherein the cluster-based processing system includes two or more of the worker nodes so that a cluster comprised of the worker nodes is implemented”.
Aranha teaches, “further comprising: updating the second cache data such that the second cache data includes the first subset, wherein the cluster-based processing system includes two or more of the worker nodes so that a cluster comprised of the worker nodes is implemented” (see pp. [0047], [0057]; e.g., the reference of Aranha teaches of utilizing a plurality of nodes, such as a master node, active node and standby node, which are associated with particular portions of data {i.e. cache groups}.  Roles such as master node, active node and standby node are not fixed to specific ones of nodes, and as stated within the cited paragraph [0047], a “backend database system” can act as the master node for a first cache group, and an active node can be the master node for the second group, considered equivalent to a “device” {i.e. big data cluster management device} performing or supporting the updating process.  An example is provided where a second cache group is updated by client updates at an active node, where the second cache group is equivalent to Applicant’s second cache data, and the “client updates” of Aranha are considered equivalent to Applicant’s second subset being included within the second cache data).
The combined references of Mukherjee and Aranha are considered analogous art for being within the same field of endeavor as the claimed invention, which is reliability and/or availability techniques used for database management systems utilizing caching and distributed in a cluster.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the utilization of master nodes, active nodes and standby nodes for the caching of data, as taught by Aranha, with the method of Mukherjee, because it would be desirable to have techniques with both improved data persistence and improved data reliability so that a failure of a single in-memory database system does not cause a loss of access to data.  (Aranha; [0012])

As for Claim 5, Mukherjee relates to data storage and retrieval techniques in a database cluster, specifically to memory-aware joins in a database cluster.
Mukherjee does not appear to explicitly recite the limitation of, “further comprising updating the second cache data such that the second cache data includes the remaining data set and the second subset”.
Aranha teaches, “further comprising updating the second cache data such that the second cache data includes the remaining data set and the second subset” (see pp. [0047], [0057-0058]; e.g., the reference of Aranha teaches of utilizing a plurality of nodes, such as a master node, active node and standby node, which are associated with particular portions of data {i.e. cache groups}.  Roles such as master node, active node and standby node are not fixed to specific ones of nodes, and as stated within the cited paragraph [0047], a “backend database system” can act as the master node for a first cache group, and an active node can be the master node for the second group, considered equivalent to a “device” {i.e. big data cluster management device} performing or supporting the updating process.  An example is provided where a second cache group is updated by client updates at an active node, where the second cache group is equivalent to Applicant’s second cache data, and the “client updates” of Aranha are considered equivalent to Applicant’s second subset being included within the second cache data.  Paragraphs [0057-0058] teach of client updates being sent to an active node, and continuing to be applied to respective databases of the active node despite a failure to at least the backend database system, and as the backend database system is recovered, continuing/resuming the writing of client updates to the backend database system, therefore, including the client updates {i.e. autorefresh updates} with previously updated data of the backend database system.  This process is performed in the same fashion as “second cache data” including “the remaining data set and the second subset”).
The combined references of Mukherjee and Aranha are considered analogous art for being within the same field of endeavor as the claimed invention, which is reliability and/or availability techniques used for database management systems utilizing caching and distributed in a cluster.  Therefore, it would have been obvious to one of ordinary skill in the art by the effective filing date of the claimed invention to have combined the utilization of master nodes, active nodes and standby nodes for the caching of data, as taught by Aranha, with the method of Mukherjee, because it would be desirable to have techniques with both improved data persistence and improved data reliability so that a failure of a single in-memory database system does not cause a loss of access to data.  (Aranha; [0012])

As for Claim 7, Mukherjee teaches, “further comprising executing the search operation and the data processing operation in a file-based manner” (see pp. [0118], [0057], [0196-0203]; e.g., metadata operations within an “HDFS”, for example, as discussed within paragraph [0118].  Paragraphs [0196-0202] discuss the assignment of data from multiple partitioned tables to host nodes based on similar partitioning criteria, considered equivalent to the execution of a data processing operation in a file-based manner.  Subsequent paragraph [0203] teaches of at least a database maintaining a dictionary of partitioning schemes that stores a particular partitioning scheme ID for each partitioning scheme and stores the names of tables using those partitioning schemes, considered equivalent to the execution of at least Applicant’s “file storing operation”.  Earlier text of at least paragraph [0057] teaches of sending queries to one or more of a cluster, where the queries target a memory-enabled segment and are sent to the database server instance on any cluster node, returning results).

As for Claim 8, Mukherjee teaches, “wherein the data processing operation includes at least one of an aggregating operation, a data transforming operation, a filtering operation, a sorting operation, and a data truncating operation” (see pp. [0136]; e.g., the reference of Mukherjee teaches of at least a parallel query coordinator having the ability to aggregate results produced by a plurality of “parallel query slaves”, with the parallel query coordinator performing any necessary processing on the data).

As for Claim 9, Mukherjee teaches, “wherein applying the data processing operation includes updating the first cache data such that the first cache data include the query result” (see pp. [0118]; e.g., the reference of Mukherjee teaches of updating entries of the one or more database server instances being utilized that correspond to particular chunks in its chunk-to-node mappings and its sub-chunk-to-node mappings whenever the database server instance receives a request that targets a particular chunk, reading on Applicant’s claimed limitation.  The database server instance accesses data items of a table that are cached in the volatile memory that resides on the node in which the database server instance is executing).

As for Claim 10, Mukherjee teaches, “wherein the output operation includes at least one of a screen displaying operation, a remote RDB (relational database) storing operation, and a file storing operation” (see pp. [0093], [0257]; e.g., the cited reference, at paragraph [0093], teaches of at least a database server instance within a cluster having the ability to store in its local memory metadata that reflects sub-chunk-to-node mappings.  The reference of Mukherjee teaches of displaying information to a computer user within the cited paragraph [0257].  Paragraph [0203] teaches of at least a database maintaining a dictionary of partitioning schemes that stores a particular partitioning scheme ID for each partitioning scheme and stores the names of tables using those partitioning schemes, considered equivalent to the execution of at least Applicant’s “file storing operation”).

Claims 11, 13, 15 & 17-20 amount to a big data cluster management device comprising instructions that, when executed by one or more processors, perform the method of Claims 1, 3, 5 & 7-10, respectively.  Accordingly, Claims 11, 13, 15 & 17-20 are rejected for substantially the same reasons as presented above for Claims 1, 3, 5 & 7-10 and based on the references’ disclosure of the necessary supporting hardware and software (Mukherjee; see pp. [0253-0263]; e.g., method for implementation integrating hardware and software components).


Response to Arguments
Applicant's arguments and amendments, with respect to the rejection(s) of Claims 1-3, 5-13 & 15-20, and Ranganathan and Aranha’s alleged failure to teach the subject matter of at least amended Claims 1 and 11, have been fully considered and are persuasive, as the Ranganathan reference has been withdrawn from consideration.  The Aranha reference continues to be utilized for its applicable teachings, as discussed within the updated rationale above.
Upon further consideration and in direct response to Applicant’s claim limitations and arguments, a new ground(s) of rejection for Claims 1, 3, 5, 7-11, 13, 15 & 17-20 is made in view of Mukherjee et al (USPG Pub No. 20160026667A1).

Conclusion
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAHEEM HOFFLER whose telephone number is (571)270-1036. The examiner can normally be reached Monday-Friday: 10:00am-2:00pm; 6pm-10:00pm w/ flex.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached at (571)272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TAMARA T KYLE/
Supervisory Patent Examiner, Art Unit 2156
/RAHEEM HOFFLER/
Examiner
Art Unit 2156

								5/21/2022