DETAILED ACTION

Remarks
1.	Claims 1-8, 10-16, 18 and 21-25 are presently pending in the instant application, of which claims 1, 14 and 21 are presented in independent form.
2.	Applicants filed an Appeal Brief on 03/22/2022. Applicants’ arguments with respect to the rejected claims have been fully considered, but are moot in view of the new ground(s) of rejection, the details of which are provided below.
3.	In view of the appeal brief filed on 03/22/2022, PROSECUTION IS HEREBY REOPENED.  New grounds of rejection are set forth below.
To avoid abandonment of the application, appellant must exercise one of the following two options:
(1) file a reply under 37 CFR 1.111 (if this Office action is non-final) or a reply under 37 CFR 1.113 (if this Office action is final); or,
(2) initiate a new appeal by filing a notice of appeal under 37 CFR 41.31 followed by an appeal brief under 37 CFR 41.37.  The previously paid notice of appeal fee and appeal brief fee can be applied to the new appeal.  If, however, the appeal fees set forth in 37 CFR 41.20 have been increased since they were previously paid, then appellant must pay the difference between the increased fees and the amount previously paid.
A Supervisory Patent Examiner (SPE) has approved of reopening prosecution by signing below.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.

Claims 1-3, 10, 14 and 21 are rejected under 35 U.S.C. 103(a) as being unpatentable over Soundararajan et al., Pub. No.: US 2013/0132967 (Soundararajan) in view of Chen et al., “Integration of Workflow Partitioning and Resource Provisioning” (Chen). 

Claim 1.	Soundararajan teaches:
A method comprising:
receiving a query directed to database data stored across a plurality of shared storage devices; (¶¶ 36, 40, FIGs. 1 and 5-7, a received analytic job/query is directed to data located in the memory of data analytics processing nodes and storage server of FIGs. 1 and 5-7)
referencing the metadata store to determine whether the set of files is cached among execution nodes of an execution platform comprising a plurality of execution nodes, wherein the execution platform is separate from the metadata store and the plurality of shared storage devices; (¶¶ 50 and 66, metadata/location information in FIGs. 5-6B is referenced for assigning tasks to data analytics processing nodes; metadata is separate from data analytics processing nodes; ¶¶ 30-31, each data analytics processing node includes multiple execution nodes, e.g., processors/CPUs and caches)
in response to determining that at least a portion of the set of files is cached among the plurality of execution nodes, assigning by one or more processors, processing of one or more of the set of files to each of one or more execution nodes that have cached at least a portion of the set of files; (¶¶ 30-31, 50 and 66, metadata/location information in FIGs. 5-6B is referenced for assigning tasks to data analytics processing nodes including multiple execution nodes, e.g., processors/CPUs and caches where a requested data block is located)
for each of the one or more assigned execution nodes:
determining, by the assigned execution node, whether the assigned one or more files is stored at least in part in a cache of the assigned execution node; and (¶¶ 47-50, 57-58, 66, FIGs. 5-6B, job manager 540 distributes jobs to data analytics processing nodes including multiple execution nodes, e.g., processors/CPUs and caches in response to locating data in memory caches 514, 524, and 534 using metadata 544; a data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches performs the assigned task using the cached data and further communicates with storage server for additional data if needed)
in response to the assigned execution node determining the assigned one or more files is not entirely stored in the cache of the assigned execution node:
retrieving a missing portion of the assigned one or more files from one or more remote storage devices of the plurality of remote storage devices including the missing portion of the assigned one or more files, (¶¶ 47-50, 57-58, 66, FIGs. 5-6B, each data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches performs the assigned task using the cached data and further determines if additional data is needed; it retrieves additional data from the storage server)
wherein the plurality of execution nodes are organized into one or more virtual warehouses, and a virtual warehouse including the assigned execution node dynamically establishes a communication link with each of the one or more of the plurality of remote storage devices based at least in part on the query so that the assigned execution node may retrieve the missing portion; (¶¶ 30 and 70, data analytics processing nodes, each including multiple execution nodes, e.g., processors/CPUs, are implemented as virtual machines as in ¶ 70; a data analytics processing node is a virtual warehouse, e.g., is implemented as a virtual machine including multiple processors; a data analytics processing node retrieves data from any other data analytics processing nodes/memory caches and storage server as needed using I/O coordinators as in ¶¶ 62 and 69, which indicates that data analytics processing nodes are logically mapped to each other and to the storage server for accessing and retrieving data as needed)
storing, by the assigned execution node, the missing portion of the assigned one or more files in the cache of the assigned execution node so that the entire one or more files is stored in the cache of the assigned execution node; (¶¶ 41-42, 47-50, 57-58, 66, FIGs. 5-6B, a data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches retrieves needed data entirely from other processing nodes or storage server and stores the retrieved data in its cache for performing a task as described in ¶ 61)
processing the query using the assigned one or more files stored in the cache of the assigned execution node; and (FIG. 6B, ¶ 67, a task is processed by a given  data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches and the task result is returned to job manager)
updating the metadata store to indicate the entire assigned one or more files is now cached in the cache of the assigned execution node; (¶¶ 50 and 66, metadata in FIG. 5, 544 and FIG. 6B, 644 include updated location information)
wherein any of the set of files stored in the plurality of shared storage devices may be accessed by any of a plurality of execution nodes of the execution platform; (¶ 42, any data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches can access files in storage server)
wherein any of the set of files stored in the plurality of shared storage devices may be stored in a cache of any of the plurality of execution nodes of the execution platform; and (¶¶ 42, 50, 66, FIG. 5, 544 and FIG. 6B, 644, data analytics processing nodes including multiple execution nodes, e.g., processors/CPUs and caches store a retrieved file from storage server into their caches)
wherein any of the set of files stored in the plurality of shared storage devices may be stored in a cache of multiple execution nodes of the plurality of execution nodes of the execution platform at one point in time. (FIG. 6B, ¶ 61, any file stored in storage server  650 may be stored in data analytics processing nodes 610, 620 and 630 as needed)
Soundararajan did not specifically teach: wherein the plurality of execution nodes are organized into one or more virtual warehouses having one or more logical mappings between them.
Chen teaches: wherein the plurality of execution nodes are organized into one or more virtual warehouses having one or more logical mappings between them. (wherein multiple virtual clusters each include multiple virtual machines, and wherein the virtual machines of each cluster have a logical mapping between them for processing a partition of a workflow: p.1, sec. I, left col. “workflow partitioning [5] is an approach to divide a workflow into several sub-workflows and then submit these sub-workflows to different execution sites (virtual clusters). Workflow partitioning requires the sub-workflows to be suitable for execution in an execution site in terms of storage constraints, communication cost and computation cost… we are able to dynamically allocate resources into multiple execution sites or virtual clusters [7] and then execute the sub-workflows on these sites”; p.4, sec. B, “The XML format describes virtual clusters as a collection of several nodes, which correspond to virtual machines. Each node is defined with the characteristics of the virtual machine to be provisioned, such as the VM image to use and the hardware resource type (CPU, memory, disk, etc.)”)
Soundararajan implicitly taught the feature in ¶¶ 30-31 by disclosing a data analytics processing node including multiple processors and caches and in ¶ 70 by implementing the data analytics processing node as “one or more virtual machines”. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the applied references for disclosing wherein the plurality of execution nodes are organized into one or more virtual warehouses having one or more logical mappings between them because doing so would explicitly provide for launching VM instances and constructing virtual clusters dynamically as needed for executing jobs/sub-workflows related to a workflow.
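
For illustration only, the sequence of limitations mapped above for claim 1 (referencing metadata, assigning files to nodes that already cache them, retrieving a missing portion, caching it, and updating the metadata) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Soundararajan, Chen, or the instant application: the names MetadataStore, ExecutionNode and run_query are hypothetical, files are modeled as lists of blocks held in a plain dictionary standing in for shared storage, and "processing the query" is reduced to concatenating the blocks.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Hypothetical metadata store: maps a file name to the nodes known to cache it."""
    cached_on: dict = field(default_factory=dict)

    def nodes_caching(self, file_name):
        return self.cached_on.get(file_name, set())

    def mark_cached(self, file_name, node_id):
        self.cached_on.setdefault(file_name, set()).add(node_id)


@dataclass
class ExecutionNode:
    """Hypothetical execution node with a local cache of file blocks."""
    node_id: str
    cache: dict = field(default_factory=dict)  # file name -> {block index: data}

    def process(self, file_name, shared_storage, metadata):
        blocks = shared_storage[file_name]
        cached = self.cache.setdefault(file_name, {})
        # Determine which portion of the assigned file is missing locally.
        missing = [i for i in range(len(blocks)) if i not in cached]
        for i in missing:
            # Retrieve the missing portion from shared storage and cache it.
            cached[i] = blocks[i]
        if missing:
            # Record that the entire file is now cached on this node.
            metadata.mark_cached(file_name, self.node_id)
        # Stand-in for query processing: concatenate the cached blocks.
        return "".join(cached[i] for i in range(len(blocks)))


def run_query(file_names, nodes, shared_storage, metadata):
    """Assign each file to a node that already caches it, if any, then process it."""
    results = {}
    for name in file_names:
        holders = metadata.nodes_caching(name)
        # Prefer a node listed in the metadata; otherwise fall back to the first node.
        node = next((n for n in nodes if n.node_id in holders), nodes[0])
        results[name] = (node.node_id, node.process(name, shared_storage, metadata))
    return results
```

In this sketch the fallback to the first node when no node caches a file is an arbitrary simplification; the claim language does not specify how an uncached file is assigned.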

Claim 14.	Soundararajan teaches:
A system comprising:
a plurality of shared storage devices collectively storing database data; (¶¶ 24, “a shared storage system”, 33 and fig. 6A, 650)
a metadata store separate from the plurality of shared storage devices, the metadata store comprising metadata for the database data stored across the plurality of shared storage devices; and (fig. 5, figs. 6A-6B, ¶¶ 50 and 66, metadata 544 in fig. 5 and metadata in figs. 6A-6B include location information for requested data/file; ¶¶ 47-50, 57-58, 66, fig. 5, figs. 6A-6B, processing nodes 510, 520 and 530 are separate from metadata 544)
one or more processors operatively coupled to the metadata store, the one or more processors to:
receive a query directed to database data stored across a plurality of shared storage devices; (¶¶ 36, 40, FIGs. 1 and 5-7, a received analytic job/query is directed to data located in the memory of data analytics processing nodes and storage server of FIGs. 1 and 5-7)
reference the metadata store to locate a set of files that comprises data that needs to be processed to respond to the query; (¶¶ 50 and 66, metadata/location information in FIGs. 5-6B is referenced for assigning tasks to data analytics processing nodes; metadata is separate from data analytics processing nodes; ¶¶ 30-31, each data analytics processing node includes multiple execution nodes, e.g., processors/CPUs and caches)
reference the metadata store to determine whether the set of files is cached among execution nodes of an execution platform comprising a plurality of execution nodes, wherein the execution platform is separate from the metadata store and the plurality of shared storage devices; and (¶¶ 50 and 66, metadata/location information in FIGs. 5-6B is referenced for assigning tasks to data analytics processing nodes; metadata is separate from data analytics processing nodes; ¶¶ 30-31, each data analytics processing node includes multiple execution nodes, e.g., processors/CPUs and caches)
in response to determining that at least a portion of the set of files is cached among the plurality execution nodes, processing of one or more of the set of files to each of one or more execution nodes that have cached at least a portion of the set of files; (¶¶ 30-31, 50 and 66, metadata/location information in FIGs. 5-6B is referenced for assigning tasks to data analytics processing nodes including multiple execution nodes, e.g., processors/CPUs and caches where a requested data block is located)
for each of the one or more assigned execution nodes:
determining, by the assigned execution node, whether the assigned one or more files is stored at least in part in a cache of the assigned execution node; and (¶¶ 47-50, 57-58, 66, FIGs. 5-6B, job manager 540 distributes jobs to data analytics processing nodes including multiple execution nodes, e.g., processors/CPUs and caches in response to locating data in memory caches 514, 524, and 534 using metadata 544; a data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches performs the assigned task using the cached data and further communicates with storage server for additional data if needed)
in response to the assigned execution node determining the assigned one or more files is not entirely stored in the cache of the assigned execution node:
retrieve a missing portion of the assigned one or more files from one or more remote storage devices of the plurality of remote storage devices including the missing portion of the assigned one or more files, (¶¶ 47-50, 57-58, 66, FIGs. 5-6B, each data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches performs the assigned task using the cached data and further determines if additional data is needed; it retrieves additional data from the storage server)
wherein the plurality of execution nodes are organized into one or more virtual warehouses, and a virtual warehouse including the assigned execution node dynamically establishes a communication link with each of the one or more of the plurality of remote storage devices based at least in part on the query so that the assigned execution node may retrieve the missing portion; (¶¶ 30 and 70, data analytics processing nodes, each including multiple execution nodes, e.g., processors/CPUs, are implemented as virtual machines as in ¶ 70; a data analytics processing node is a virtual warehouse, e.g., is implemented as a virtual machine including multiple processors; a data analytics processing node retrieves data from any other data analytics processing nodes/memory caches and storage server as needed using I/O coordinators as in ¶¶ 62 and 69, which indicates that data analytics processing nodes are logically mapped to each other and to the storage server for accessing and retrieving data as needed)
store, by the assigned execution node, the entire assigned one or more files in the cache of the assigned execution node; (¶¶ 41-42, 47-50, 57-58, 66, FIGs. 5-6B, a data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches retrieves needed data entirely from other processing nodes or storage server and stores the retrieved data in its cache for performing a task as described in ¶ 61)
process the query using the assigned one or more files stored in the cache of the assigned execution node; and (FIG. 6B, ¶ 67, a task is processed by a given  data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches and the task result is returned to job manager)
update the metadata store to indicate the entire assigned one or more files is now cached in the cache of the assigned execution node; (¶¶ 50 and 66, metadata in FIG. 5, 544 and FIG. 6B, 644 include updated location information)
wherein any of the set of files stored in the plurality of shared storage devices may be accessed by any of a plurality of execution nodes of the execution platform; (¶ 42, any data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches can access files in storage server)
wherein any of the set of files stored in the plurality of shared storage devices may be stored in a cache of any of the plurality of execution nodes of the execution platform; and (¶¶ 42, 50, 66, FIG. 5, 544 and FIG. 6B, 644, data analytics processing nodes including multiple execution nodes, e.g., processors/CPUs and caches store a retrieved file from storage server into their caches)
wherein any of the set of files stored in the plurality of shared storage devices may be stored in a cache of multiple execution nodes of the plurality of execution nodes of the execution platform at one point in time. (FIG. 6B, ¶ 61, any file stored in storage server  650 may be stored in data analytics processing nodes 610, 620 and 630 as needed)
Soundararajan did not specifically teach: wherein the plurality of execution nodes are organized into one or more virtual warehouses having one or more logical mappings between them.
Chen teaches: wherein the plurality of execution nodes are organized into one or more virtual warehouses having one or more logical mappings between them. (wherein multiple virtual clusters each include multiple virtual machines, and wherein the virtual machines of each cluster have a logical mapping between them for processing a partition of a workflow: p.1, sec. I, left col. “workflow partitioning [5] is an approach to divide a workflow into several sub-workflows and then submit these sub-workflows to different execution sites (virtual clusters). Workflow partitioning requires the sub-workflows to be suitable for execution in an execution site in terms of storage constraints, communication cost and computation cost… we are able to dynamically allocate resources into multiple execution sites or virtual clusters [7] and then execute the sub-workflows on these sites”; p.4, sec. B, “The XML format describes virtual clusters as a collection of several nodes, which correspond to virtual machines. Each node is defined with the characteristics of the virtual machine to be provisioned, such as the VM image to use and the hardware resource type (CPU, memory, disk, etc.)”)
Soundararajan implicitly taught the feature in ¶¶ 30-31 by disclosing a data analytics processing node including multiple processors and caches and in ¶ 70 by implementing the data analytics processing node as “one or more virtual machines”. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the applied references for disclosing wherein the plurality of execution nodes are organized into one or more virtual warehouses having one or more logical mappings between them because doing so would explicitly provide for launching VM instances and constructing virtual clusters dynamically as needed for executing jobs/sub-workflows related to a workflow.

Claim 21.	Soundararajan teaches:
A non-transitory computer readable storage medium having instructions stored thereon that, when executed by one or more processors, cause the one or more processors to:
receive a query directed to database data stored across a plurality of shared storage devices; (¶¶ 36, 40, FIGs. 1 and 5-7, a received analytic job/query is directed to data located in the memory of data analytics processing nodes and storage server of FIGs. 1 and 5-7)
reference a metadata store to locate a set of files that comprises data that needs to be processed to respond to the query; (¶¶ 50 and 66, metadata/location information in FIGs. 5-6B is referenced for assigning tasks to data analytics processing nodes; metadata is separate from data analytics processing nodes; ¶¶ 30-31, each data analytics processing node includes multiple execution nodes, e.g., processors/CPUs and caches)
reference the metadata store to determine whether the set of files is cached among execution nodes of an execution platform comprising a plurality of execution nodes, wherein the execution platform is separate from the metadata store and the plurality of shared storage devices; and (¶¶ 50 and 66, metadata/location information in FIGs. 5-6B is referenced for assigning tasks to data analytics processing nodes; metadata is separate from data analytics processing nodes; ¶¶ 30-31, each data analytics processing node includes multiple execution nodes, e.g., processors/CPUs and caches)
in response to determining that at least a portion of the set of files is cached among the plurality of execution nodes, assigning by the one or more processors, processing of one or more of the set of files to each of one or more execution nodes that have cached at least a portion of the set of files; (¶¶ 30-31, 50 and 66, metadata/location information in FIGs. 5-6B is referenced for assigning tasks to data analytics processing nodes including multiple execution nodes, e.g., processors/CPUs and caches where a requested data block is located)
for each of the one or more assigned execution nodes:
determining, by the assigned execution node, whether the assigned one or more files is stored at least in part in a cache of the assigned execution node; and (¶¶ 47-50, 57-58, 66, FIGs. 5-6B, job manager 540 distributes jobs to data analytics processing nodes including multiple execution nodes, e.g., processors/CPUs and caches in response to locating data in memory caches 514, 524, and 534 using metadata 544; a data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches performs the assigned task using the cached data and further communicates with storage server for additional data if needed)
in response to the assigned execution node determining the assigned one or more files is not entirely stored in the cache of the assigned execution node:
retrieving a missing portion of the assigned one or more files from one or more remote storage devices of the plurality of remote storage devices including the missing portion, (¶¶ 47-50, 57-58, 66, FIGs. 5-6B, each data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches performs the assigned task using the cached data and further determines if additional data is needed; it retrieves additional data from the storage server)
wherein the plurality of execution nodes are organized into one or more virtual warehouses, and a virtual warehouse including the assigned execution node dynamically establishes a communication link with each of the one or more of the plurality of remote storage devices based at least in part on the query so that the assigned execution node may retrieve the missing portion; (¶¶ 30 and 70, data analytics processing nodes, each including multiple execution nodes, e.g., processors/CPUs, are implemented as virtual machines as in ¶ 70; a data analytics processing node is a virtual warehouse, e.g., is implemented as a virtual machine including multiple processors; a data analytics processing node retrieves data from any other data analytics processing nodes/memory caches and storage server as needed using I/O coordinators as in ¶¶ 62 and 69, which indicates that data analytics processing nodes are logically mapped to each other and to the storage server for accessing and retrieving data as needed)
storing, by the assigned execution node, the entire assigned one or more files in the cache of the assigned execution node; (¶¶ 41-42, 47-50, 57-58, 66, FIGs. 5-6B, a data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches retrieves needed data entirely from other processing nodes or storage server and stores the retrieved data in its cache for performing a task as described in ¶ 61)
processing the query using the assigned one or more files stored in the cache of the assigned execution node; and (FIG. 6B, ¶ 67, a task is processed by a given  data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches and the task result is returned to job manager)
updating the metadata store to indicate the entire assigned one or more files is now cached in the cache of the assigned execution node; (¶¶ 50 and 66, metadata in FIG. 5, 544 and FIG. 6B, 644 include updated location information)
wherein any of the set of files stored in the plurality of shared storage devices may be accessed by any of a plurality of execution nodes of the execution platform; (¶ 42, any data analytics processing node including multiple execution nodes, e.g., processors/CPUs and caches can access files in storage server)
wherein any of the set of files stored in the plurality of shared storage devices may be stored in a cache of any of the plurality of execution nodes of the execution platform; and (¶¶ 42, 50, 66, FIG. 5, 544 and FIG. 6B, 644, data analytics processing nodes including multiple execution nodes, e.g., processors/CPUs and caches store a retrieved file from storage server into their caches)
wherein any of the set of files stored in the plurality of shared storage devices may be stored in a cache of multiple execution nodes of the plurality of execution nodes of the execution platform at one point in time. (FIG. 6B, ¶ 61, any file stored in storage server  650 may be stored in data analytics processing nodes 610, 620 and 630 as needed)
Soundararajan did not specifically teach: wherein the plurality of execution nodes are organized into one or more virtual warehouses having one or more logical mappings between them.
Chen teaches: wherein the plurality of execution nodes are organized into one or more virtual warehouses having one or more logical mappings between them. (wherein multiple virtual clusters each include multiple virtual machines, and wherein the virtual machines of each cluster have a logical mapping between them for processing a partition of a workflow: p.1, sec. I, left col. “workflow partitioning [5] is an approach to divide a workflow into several sub-workflows and then submit these sub-workflows to different execution sites (virtual clusters). Workflow partitioning requires the sub-workflows to be suitable for execution in an execution site in terms of storage constraints, communication cost and computation cost… we are able to dynamically allocate resources into multiple execution sites or virtual clusters [7] and then execute the sub-workflows on these sites”; p.4, sec. B, “The XML format describes virtual clusters as a collection of several nodes, which correspond to virtual machines. Each node is defined with the characteristics of the virtual machine to be provisioned, such as the VM image to use and the hardware resource type (CPU, memory, disk, etc.)”)
Soundararajan implicitly taught the feature in ¶¶ 30-31 by disclosing a data analytics processing node including multiple processors and caches and in ¶ 70 by implementing the data analytics processing node as “one or more virtual machines”. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the applied references for disclosing wherein the plurality of execution nodes are organized into one or more virtual warehouses having one or more logical mappings between them because doing so would explicitly provide for launching VM instances and constructing virtual clusters dynamically as needed for executing jobs/sub-workflows related to a workflow.

Claim 2.	The method of claim 1, further comprising:
in response to the assigned execution node determining the assigned one or more files is entirely stored in the cache of the assigned node, processing, using one or more processors of the assigned execution node, the query using the assigned one or more files stored in the cache of the assigned execution node. (Soundararajan, ¶¶ 40, 47-50, 57-58, 66, fig. 5, figs. 6A-6B, processing node 514, 524, or 534 processes a task if the requested data exist in its cache as described in fig. 3)

Claim 3.	 The method of claim 1, wherein updating the metadata store to indicate the entire assigned one or more files is now cached in the assigned execution node comprises updating the metadata store to identify all files that are duplicated in the cache of the assigned execution node. (Soundararajan, fig. 640, metadata 644 includes copy/duplicate information; A2 in 610 is a duplicate of A2 in 620)
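
As an illustration of the claim 3 limitation as mapped above, metadata that identifies duplicated files can be derived from a per-node listing of cached files. The following is a minimal sketch under assumptions: the function name duplicated_files and the {node: set of files} index layout are hypothetical and are not taken from Soundararajan; the example values mirror the cited block A2 held by nodes 610 and 620.

```python
def duplicated_files(cache_index):
    """Return files cached on more than one node, given a {node: set of files} index."""
    holders = {}
    for node, files in cache_index.items():
        for name in files:
            holders.setdefault(name, set()).add(node)
    # Keep only the files held by at least two nodes.
    return {name: nodes for name, nodes in holders.items() if len(nodes) > 1}


# Hypothetical index in which block A2 is cached on both node 610 and node 620.
index = {"610": {"A1", "A2"}, "620": {"A2", "B1"}}
duplicates = duplicated_files(index)
```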

Claim 10.	The method of claim 1, wherein each execution node of the plurality of execution nodes comprises at least one processor and at least one local cache caching a copy of at least a portion of the database data. (Soundararajan, fig. 6A, processing nodes include processors and cached data)

Claims 4-5, 7, 15-16 and 23 are rejected under 35 U.S.C. 103(a) as being unpatentable over the combination of Soundararajan and Chen as applied to claims 1, 14 and 21 above in view of Trevathan, Patent No.: US 7,085,891 (Trevathan).

Claim 4.	Soundararajan as modified taught the method of claim 1 in which the analytics processing nodes 514, 524, or 534 in fig. 5 store data in their memories; Soundararajan as modified did not specifically teach whether to store the assigned one or more files in faster or slower memory by implementing a least recently used (LRU) algorithm.
Trevathan teaches whether to store the assigned one or more files in faster or slower memory by implementing a least recently used (LRU) algorithm in col. 2, ll. 20-34, wherein an LRU algorithm is used for managing a cached file in a caching system: “When a file enters the caching system…The caching algorithm with the most favorable figure of merit is selected as the preferred caching algorithm, and the entering file is managed according to that algorithm…the set of caching algorithms includes…a least-recently-used caching algorithm”.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the applied references for explicitly disclosing whether to store the assigned one or more files in faster or slower memory by implementing a least recently used (LRU) algorithm because the LRU algorithm is well known in the art and, further, doing so would increase the usability of Soundararajan as modified for cache management based on metrics associated with the file and the storage (Trevathan, col. 2, ll. 20-34).
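
For illustration only, one way an LRU policy can decide between faster and slower memory is sketched below. This is a minimal sketch under assumptions and is not the algorithm of Trevathan or of the instant application: the class name TieredLRUCache, the fixed fast_capacity parameter, and the choice to demote (rather than discard) the least recently used file are all hypothetical.

```python
from collections import OrderedDict


class TieredLRUCache:
    """Hypothetical two-tier cache: recently used files in fast memory, the rest in slow."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # ordered from least to most recently used
        self.slow = {}

    def put(self, name, data):
        self.slow.pop(name, None)
        self.fast[name] = data
        self.fast.move_to_end(name)  # mark as most recently used
        while len(self.fast) > self.fast_capacity:
            # Demote the least recently used file to the slower tier.
            old_name, old_data = self.fast.popitem(last=False)
            self.slow[old_name] = old_data

    def get(self, name):
        if name in self.fast:
            self.fast.move_to_end(name)
            return self.fast[name]
        if name in self.slow:
            # Promote on access; this may demote another file.
            data = self.slow[name]
            self.put(name, data)
            return data
        return None
```

A variant that removes the least recently used file from the cache altogether, instead of demoting it, would replace the assignment into the slow tier with a simple discard.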

Claim 5.	The method of claim 4, wherein implementing the LRU algorithm comprises identifying one or more copies of the assigned one or more files to be removed from the cache. (Soundararajan, ¶ 56, wherein a cached file in a processing node is a copy of a file; Trevathan, col. 1, ll. 51-55 and col. 2, ll. 20-34, wherein an LRU algorithm is used for managing a cached file in a caching system)

Claim 7.	The method of claim 1, wherein each execution node of the execution platform comprises a cache, wherein the cache includes a first storage portion and a second storage portion, wherein the first storage portion is significantly faster than the second storage portion. (Soundararajan, fig. 6A, each processing node comprises a cache; Trevathan, col. 1, ll. 38-46: wherein the cache includes an “extraordinarily fast” portion)

Claim 15.	The apparatus of claim 14, wherein each execution node of the execution platform comprises at least one local processor and at least one cache, wherein the at least one cache of each execution node comprises a memory device and a disk storage device. (Trevathan, col. 2, ll. 38-46, caching system comprises memory and disk storage: “A longstanding solution to this problem is to use relatively slow memory (disk storage) for storing information whose recall does not directly limit the computer's responsiveness, and to use relatively fast memory (memory device) for storing information that must be retrieved without perceptible delay. This principle is often carried one step further by dividing the fast memory itself into cache memory, which employs extraordinarily fast but expensive memory chips, and main memory, which uses somewhat slower but less expensive chips.”)
Claim 16 is rejected under the same rationale as claim 2.
Claim 23 is rejected under the same rationale as claim 15.
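
By way of illustration only, a cache having a faster portion and a slower portion, as recited in claims 7 and 15, can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the applied references: the class name `TieredCache` is hypothetical, an in-memory `OrderedDict` stands in for the memory device, a plain dictionary stands in for the disk storage device, and the demote-on-overflow and promote-on-access policy is an assumption.

```python
from collections import OrderedDict


class TieredCache:
    """Hypothetical two-portion cache: a small fast tier and a larger slow tier."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # stands in for the memory device
        self.slow = {}             # stands in for the disk storage device

    def put(self, name, data):
        """Place a file in the fast tier, demoting the oldest entries if it overflows."""
        self.slow.pop(name, None)
        self.fast[name] = data
        self.fast.move_to_end(name)
        while len(self.fast) > self.fast_capacity:
            old_name, old_data = self.fast.popitem(last=False)
            self.slow[old_name] = old_data  # demote rather than discard

    def get(self, name):
        """Look in the fast tier first, then the slow tier; None on a miss."""
        if name in self.fast:
            self.fast.move_to_end(name)
            return self.fast[name]
        if name in self.slow:
            data = self.slow.pop(name)
            self.put(name, data)  # promote on access
            return data
        return None
```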

Claim 6 is rejected under 35 U.S.C. 103(a) as being unpatentable over the combination of Soundararajan and Chen as applied to claim 1 above in view of Faibish et al., Patent No.: US 8,589,550 (Faibish).

Claim 6.	Soundararajan as modified teaches:
The method of claim 1, wherein the metadata store is separated, and wherein the metadata store comprises a complete metadata listing of the database data stored across the plurality of shared storage devices and a complete listing of files cached in the plurality execution nodes of the execution platform. (Soundararajan, ¶¶ 36, 49-50, metadata in fig. 5, 544 stores complete location information; furthermore, implementing metadata separately “in special-purpose hardware, programmable hardware, or some combination thereof” indicates that the metadata store is scalable independently)
Soundararajan as modified did not specifically teach the metadata store independently scalable from each of the resource manager, the plurality of shared storage devices, and the execution platform.
Faibish teaches the metadata store independently scalable from each of the resource manager, the plurality of shared storage devices, and the execution platform in col. 3, ll. 29-54 and col. 7, ll. 35-55 by disclosing a network architecture that “permits virtually seamless as well as almost infinite scalability by adding at any time additional servers when there is a need for more throughput to the clients to ensure a desired QoS”.
Soundararajan ¶¶ 40-50 implicitly discloses the feature by implementing metadata storage in a separate device. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the applied references for teaching a metadata store that is independently scalable from each of the resource manager, the plurality of shared storage devices, and the execution platform because doing so would explicitly disclose a scalable data processing system in which metadata servers can be scaled as needed.

Claims 8, 22 and 24 are rejected under 35 U.S.C. 103(a) as being unpatentable over the combination of Soundararajan and Chen as applied to claims 1 and 21 above in view of Ghosh et al., Pub. No.: US 2007/0038595 (Ghosh).

Claim 8.	Soundararajan as modified taught the method of claim 1; Soundararajan as modified did not teach wherein the database query comprises a single instruction that is applied by the execution platform to each of the set of files substantially simultaneously. 
Ghosh teaches wherein the database query comprises a single instruction that is applied by the execution platform to each of the set of files substantially simultaneously in ¶ 42, wherein a single cursor instruction is shared between several processes for parallel operations: “a parallel single cursor (PSC) model involves a system in which a single cursor is shared between several processes. For example, a cursor has been generated by server 102a based on database statement 101 received from a database application…the cursor includes the original statement of the database command…for which the cursor was generated, and an execution plan that describes a plan for accomplishing all of the operations specified by the original statement”.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the applied references for disclosing wherein the database query comprises a single instruction that is applied by the execution platform to each of the set of files substantially simultaneously because doing so would increase the usability of the applied references by allowing nodes to participate effectively in retrieving requested data in parallel according to a single instruction (Ghosh, ¶¶ 42-43).
Claim 22 is rejected under the same rationale as claim 8.
Claim 24 is rejected under the same rationale as claim 2.
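
By way of illustration only, applying a single instruction to each file of a set substantially simultaneously, as recited in claim 8, can be sketched as follows. This is a minimal sketch and is not Ghosh's parallel single cursor model: worker threads in one process stand in for separate processes or nodes, and the function names `apply_to_files` and `count_rows`, the `max_workers` parameter, and the representation of a file as a list of rows are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_to_files(instruction, files, max_workers=4):
    """Apply one instruction to every file concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(instruction, files))


def count_rows(file_rows):
    """Example instruction: count the rows held in one file."""
    return len(file_rows)
```

For example, `apply_to_files(count_rows, files)` submits the same `count_rows` instruction for every element of `files` and collects one result per file.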

Claims 11-13, 18 and 25 are rejected under 35 U.S.C. 103(a) as being unpatentable over the combination of Soundararajan and Chen as applied to claims 1, 14 and 21 above in view of Vinson et al., Patent No.: US 6,453,334 (Vinson).

Claim 11.	Soundararajan, ¶ 56, taught the method of claim 1, comprising in response to the assigned execution node determining that the assigned one or more files is not stored in the cache, retrieving the file from the database server; Soundararajan as modified did not disclose modifying, by the assigned execution node, a database data structure of the retrieved copy of the assigned one or more files prior to storing the retrieved copy in the cache.
Vinson teaches modifying, by the assigned execution node a database data structure of the retrieved copy of the assigned one or more files prior to storing the retrieved copy in the cache in col. 13, ll. 48-56, wherein data structure of downloaded data is modified by decompressing and decrypting data for caching: “after the data from the chunk file has been downloaded into a temporary buffer, decompress the temporary buffer into the 8 buffers allocated from the on-disk cache in step 6. Write the buffers back out to the cache file, and update the MRU and key map of the on-disk cache. Decrypt the data from the one on-disk cache buffer corresponding to the requested block and place it in the in-memory buffer allocated in step 4.”
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the applied references for disclosing modifying, by the assigned execution node, a database data structure of the retrieved copy of the assigned one or more files prior to storing the retrieved copy in the cache because doing so would provide for “improved network performance” (Vinson, col. 2, ll. 17-28).
Claim 25 is rejected under the same rationale as claim 11.

Claim 12.	The method of claim 11, wherein modifying the database data structure of the retrieved copy includes decrypting the retrieved copy. (Vinson, col. 13, ll. 49-56, wherein data is decompressed and decrypted for placing the data into cache: “after the data from the chunk file has been downloaded into a temporary buffer, decompress the temporary buffer into the 8 buffers allocated from the on-disk cache in step 6. Write the buffers back out to the cache file, and update the MRU and key map of the on-disk cache. Decrypt the data from the one on-disk cache buffer corresponding to the requested block and place it in the in-memory buffer allocated in step 4.”)

Claim 13.	The method of claim 11, wherein modifying the database data structure of the retrieved copy includes decompressing the retrieved copy. (Vinson, col. 13, ll. 48-56, wherein data structure is modified by decompressing and decrypting data for caching: “after the data from the chunk file has been downloaded into a temporary buffer, decompress the temporary buffer into the 8 buffers allocated from the on-disk cache in step 6. Write the buffers back out to the cache file, and update the MRU and key map of the on-disk cache. Decrypt the data from the one on-disk cache buffer corresponding to the requested block and place it in the in-memory buffer allocated in step 4.”)

Claim 18.	The apparatus of claim 14, wherein each node of the plurality of nodes is further programmed to modify a data structure associated with the copy prior to storing the copy in the at least one cache thereof. (Vinson, col. 13, ll. 48-56, wherein data structure of downloaded data is modified by decompressing and decrypting data for caching: “after the data from the chunk file has been downloaded into a temporary buffer, decompress the temporary buffer into the 8 buffers allocated from the on-disk cache in step 6. Write the buffers back out to the cache file, and update the MRU and key map of the on-disk cache. Decrypt the data from the one on-disk cache buffer corresponding to the requested block and place it in the in-memory buffer allocated in step 4.”)
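
By way of illustration only, retrieving a file on a cache miss and modifying the retrieved copy before storing it in the cache, as recited in claims 11-13 and 18, can be sketched as follows. This is a minimal sketch and does not reproduce the procedure of Vinson: the function name `fetch_and_cache`, the `fetch_remote` and `decrypt` callables, the use of zlib compression, and the decrypt-then-decompress ordering are all assumptions. The default `decrypt` is an identity placeholder and performs no decryption.

```python
import zlib


def fetch_and_cache(name, cache, fetch_remote, decrypt=lambda raw: raw):
    """Return the file from the cache, fetching and transforming it on a miss.

    cache        -- dict-like mapping of file name to the stored copy
    fetch_remote -- hypothetical callable returning the remote (compressed) bytes
    decrypt      -- hypothetical callable; the default is an identity placeholder
    """
    if name in cache:
        return cache[name]                # cache hit: no remote retrieval
    raw = fetch_remote(name)              # cache miss: retrieve the remote copy
    data = zlib.decompress(decrypt(raw))  # modify the copy before caching it
    cache[name] = data                    # store the modified copy
    return data
```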

Response to Amendment and Arguments
In light of applicant’s clarification, the claim objections are withdrawn.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHSEN ALMANI whose telephone number is (571)270-7722.  The examiner can normally be reached on M-F, 9:00 to 5:00.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela Reyes, can be reached on (571)270-1006.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MOHSEN ALMANI/Primary Examiner, Art Unit 2159
/Mariela Reyes/Supervisory Patent Examiner, Art Unit 2159