DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This communication is responsive to the amendment to the original application. This action is Final. Claims 1 – 3, 5 – 6, 8 – 11, 13 – 14, 16 – 18 and 20 are pending and have been examined.  
Response to Amendments
In the reply filed 5/31/22, claims 1, 9 and 17 were amended. Claims 4, 7, 12, 15 and 19 were cancelled. Accordingly, claims 1 – 3, 5 – 6, 8 – 11, 13 – 14, 16 – 18 and 20 are pending. 
Response to Arguments
Applicant's arguments with respect to claims 1 – 20 have been carefully considered but are moot and not deemed persuasive in view of the rejections below.
The double patenting rejection is withdrawn based on the amendments. However, the examiner respectfully disagrees with the applicant’s arguments on pages 10 – 11 that the prior art does not teach storing the storage metadata in an accelerator pool, wherein the accelerator pool comprises a plurality of high performance nodes. George [0020] teaches, “More specifically, the map can contain a list of rules that tells CRUSH how it should replicate data in a Ceph cluster's pool.  The rules can contain a replication factor for a particular pool of data to help determine how many times the data is to be replicated within the cluster and on which storage nodes the replicated data is to be stored.  A pool can comprise a collection of data, such as objects, and a replication factor can be assigned to each pool.  Typically, a pool can be shared across tenants.” Here, the cluster pool is the accelerator pool with high performance storage nodes.
storing, across a plurality of fault domains, the plurality of deduplicated data chunks and the at least one parity chunk (George [0017] teaches, “Object storage involves storing chunks of data in an object, with each object including metadata and a unique identifier.  Distributed storage systems can also be applied to other types of data storage such as block storage and file storage, for example.  In block storage, data can be stored in blocks (or volumes) where each block acts as an individual hard drive.  File storage is generally a hierarchical way of organizing files containing data such that an individual file can be located by a path to that file.  Certain metadata describing the file and its contents is also typically stored in the file system.  In distributed storage systems, multiple replicas of data in any suitable type of structure (e.g., objects, files, blocks, etc.) can be maintained in order to provide fault tolerance and high availability.”),
wherein a non-accelerator pool comprises a plurality of data nodes that provide lower storage performance than the plurality of high performance nodes, wherein the plurality of fault domains is implemented using the plurality of data nodes (George [0051] teaches, “In at least one implementation (e.g., in Ceph), data is added to a pool corresponding to the tenant.  At 404, gateway 80 may receive a request for data of the tenant to be stored in storage cluster 50 of distributed storage system 100.  In at least one embodiment, the request may be an indication that the authorized user (e.g., human user or application) has stored objects or other data in a pool corresponding to the tenant.  At 406, the tenant associated with the data can be identified based on the pool in which the data is stored.  At 408, a data storage module associated with the tenant is identified.  This identification may be made based on a mapping of a unique identifier of the data storage module to the tenant.”). Therefore, the examiner is not persuaded.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 – 3, 5 – 6, 8 – 11, 13 – 14, 16 – 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dirac et al., U.S. Patent Application Publication No.: 2015/0379430 (Hereinafter “Dirac”), and further in view of George et al., U.S. Patent Application Publication No.: 2016/0334998 (Hereinafter “George”).
Regarding claim 1, Dirac teaches a method for managing data, the method comprising:
obtaining the data from a host (Dirac [0097]: A given provider network may include numerous data centers hosting various resource pools, such as collections of physical and/or virtualized computer servers, storage devices, networking equipment and the like, needed to implement, configure and distribute the infrastructure and services offered by the provider.); 
applying an erasure coding procedure (Dirac [0105]: erasure coding) to the data to obtain a plurality of data chunks and at least one parity chunk (Dirac [0163]: The concatenated address space of data set 1804 may then be sub-divided into a plurality of contiguous chunks, as indicated in chunk mapping 1806.);
deduplicating (Dirac [Abstract]: At a machine learning service, a determination is made that an analysis to detect whether at least a portion of contents of one or more observation records of a first data set are duplicated in a second set of observation records is to be performed.  A duplication metric is obtained, indicative of a non-zero probability that one or more observation records of the second set are duplicates of respective observation records of the first set.) the plurality of data chunks to obtain a plurality of deduplicated data chunks (Dirac [0204]: In various embodiments, consistent splits may be performed at the chunk level, at the observation record level, or at some combination of chunk and record levels, using consistency metadata of the kind described above.  In at least one embodiment, after a chunk-level split is performed, the records of the individual chunks in the training set or the test set may be shuffled prior to use for training/evaluating a model.); 
generating storage metadata associated with the plurality of deduplicated data chunks and the at least one parity chunk (Dirac [0202]: The plan generator may determine a set of consistency metadata 3152, e.g., metadata that may be shared among related jobs that are inserted in the MLS job queue for the requested split iterations.  The metadata 3152 may comprise the client-provided seed values 3120, for example.); 
Dirac does not clearly teach storing the storage metadata in an accelerator pool, wherein the accelerator pool comprises a plurality of high performance nodes. However, George [0020] teaches, “More specifically, the map can contain a list of rules that tells CRUSH how it should replicate data in a Ceph cluster's pool.  The rules can contain a replication factor for a particular pool of data to help determine how many times the data is to be replicated within the cluster and on which storage nodes the replicated data is to be stored.  A pool can comprise a collection of data, such as objects, and a replication factor can be assigned to each pool.  Typically, a pool can be shared across tenants.” Here, the cluster pool is the accelerator pool with high performance storage nodes.
storing, across a plurality of fault domains, the plurality of deduplicated data chunks and the at least one parity chunk (George [0017] teaches, “Object storage involves storing chunks of data in an object, with each object including metadata and a unique identifier.  Distributed storage systems can also be applied to other types of data storage such as block storage and file storage, for example.  In block storage, data can be stored in blocks (or volumes) where each block acts as an individual hard drive.  File storage is generally a hierarchical way of organizing files containing data such that an individual file can be located by a path to that file.  Certain metadata describing the file and its contents is also typically stored in the file system.  In distributed storage systems, multiple replicas of data in any suitable type of structure (e.g., objects, files, blocks, etc.) can be maintained in order to provide fault tolerance and high availability.”),
wherein a non-accelerator pool comprises a plurality of data nodes that provide lower storage performance than the plurality of high performance nodes, wherein the plurality of fault domains is implemented using the plurality of data nodes (George [0051] teaches, “In at least one implementation (e.g., in Ceph), data is added to a pool corresponding to the tenant.  At 404, gateway 80 may receive a request for data of the tenant to be stored in storage cluster 50 of distributed storage system 100.  In at least one embodiment, the request may be an indication that the authorized user (e.g., human user or application) has stored objects or other data in a pool corresponding to the tenant.  At 406, the tenant associated with the data can be identified based on the pool in which the data is stored.  At 408, a data storage module associated with the tenant is identified.  This identification may be made based on a mapping of a unique identifier of the data storage module to the tenant.”); and 
initiating storage metadata distribution on the storage metadata across the plurality of fault domains (George [0018]: At least one Ceph metadata server can be provided for a storage cluster to store metadata associated with the objects (e.g., inodes, directories, etc.).  Ceph monitors are provided for monitoring active and failed storage nodes in the cluster.).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to incorporate the teaching of Dirac et al. into George’s system by adding the feature of storage metadata. The references (Dirac and George) are analogous art and are directed to the same field of endeavor, such as data storage. An ordinarily skilled artisan would have been motivated to do so to provide Dirac’s system with enhanced data storage (see George [Abstract], [0017 – 0020], [0028], [0051]). One of the biggest advantages of network machine learning database algorithms is their ability to improve over time. Machine learning technology typically improves efficiency and accuracy thanks to the ever-increasing amounts of data that are processed.
Regarding claim 2, the method of claim 1, further comprising: identifying a storage metadata failure of the storage metadata in the accelerator pool; sending a storage metadata request to at least one fault domain; obtaining a storage metadata response from the at least one fault domain; and performing a storage metadata reconstruction of storage metadata on the accelerator pool using at least the storage metadata response (George [0017]: Object storage involves storing chunks of data in an object, with each object including metadata and a unique identifier.  Distributed storage systems can also be applied to other types of data storage such as block storage and file storage, for example.  In block storage, data can be stored in blocks (or volumes) where each block acts as an individual hard drive.  File storage is generally a hierarchical way of organizing files containing data such that an individual file can be located by a path to that file.  Certain metadata describing the file and its contents is also typically stored in the file system.  In distributed storage systems, multiple replicas of data in any suitable type of structure (e.g., objects, files, blocks, etc.) can be maintained in order to provide fault tolerance and high availability.).
Regarding claim 3, the method of claim 2, wherein the storage metadata response includes at least a portion of the storage metadata (Dirac [0203]: In some implementations, a job object created by the plan generator 3180 may include a reference or pointer to the consistency metadata to be used for that job.  In another implementation, at least a portion of the consistency metadata 3152 may be included within a job object.  When a job is executed, the metadata 3152 may be used to ensure that the input data set is split consistently.). 
Regarding claim 5, the method of claim 1, wherein each deduplicated data chunk of the plurality of data chunks is stored in a unique fault domain of the plurality of fault domains, and wherein a copy of the storage metadata is stored in each fault domain of the plurality of fault domains (Dirac [0347]: FIG. 70 illustrates an example duplicate detector that may utilize space-efficient representations of machine learning data sets to determine whether one data set is likely to include duplicate observation records of another data set at a machine learning service, according to at least some embodiments.).
Regarding claim 6, the method of claim 1,
wherein storing the plurality of deduplicated data chunks and the at least one parity chunk comprises: storing a deduplicated data chunk of the plurality of deduplicated data chunks on a first data node in a fault domain of the plurality of fault domains (Dirac [0348]: In some embodiments, the alternate representation may be generated and stored in parallel with the training of the model, so that, for example, only a single pass through the training data set 7002 may be needed for both (a) training the model and (b) creating and storing the alternate representation 7030.  The alternate representation may require much less (e.g., orders of magnitude less) storage or memory than is occupied by the training data set itself in some implementations.),
wherein initiating storage metadata distribution on the storage metadata across the plurality of fault domains comprises: initiating storage of a copy of the storage metadata on a second data node in the fault domain (Dirac [0179]: In one embodiment, the OR extraction request 2401 may include compression metadata 2406, indicating for example the compression algorithm used for the data set, the sizes of the units or blocks in which the compressed data is stored (which may differ from the sizes of the chunks on which chunk-level in-memory filtering operations are to be performed), and other information that may be necessary to correctly de-compress the data set.  Decryption metadata 2408 such as keys, credentials, and/or an indication of the encryption algorithm used on the data set may be included in a request 2401 in some embodiments.  Authorization/authentication metadata 2410 to be used to be able to obtain read access to the data set may be provided by the client in request 2401 in some implementations and for certain types of data sources.  Such metadata may include, for example, an account name or user name and a corresponding set of credentials, or an identifier and password for a security container (similar to the security containers 390 shown in FIG. 3).).
Regarding claim 8, the method of claim 1, wherein the storage metadata includes at least location information of: at least one of the plurality of deduplicated data chunks and of the at least one parity chunk (Dirac [0185]: After the first filtering operation of the sequence is performed in memory at the MLS servers, the remaining filtering operations (if any) may be performed in place in the depicted embodiment, e.g., without copying the chunks to persistent storage or re-reading the chunks for their original source locations (element 2519).).
Regarding claim 9, Dirac teaches a non-transitory computer readable medium comprising computer readable program code, which when executed by a computer processor enables the computer processor to perform a method for managing data, the method comprising:
obtaining the data from a host (Dirac [0097]: A given provider network may include numerous data centers hosting various resource pools, such as collections of physical and/or virtualized computer servers, storage devices, networking equipment and the like, needed to implement, configure and distribute the infrastructure and services offered by the provider.); 
applying an erasure coding procedure (Dirac [0105]: erasure coding) to the data to obtain a plurality of data chunks and at least one parity chunk (Dirac [0163]: The concatenated address space of data set 1804 may then be sub-divided into a plurality of contiguous chunks, as indicated in chunk mapping 1806.);
deduplicating (Dirac [Abstract]: At a machine learning service, a determination is made that an analysis to detect whether at least a portion of contents of one or more observation records of a first data set are duplicated in a second set of observation records is to be performed.  A duplication metric is obtained, indicative of a non-zero probability that one or more observation records of the second set are duplicates of respective observation records of the first set.) the plurality of data chunks to obtain a plurality of deduplicated data chunks (Dirac [0204]: In various embodiments, consistent splits may be performed at the chunk level, at the observation record level, or at some combination of chunk and record levels, using consistency metadata of the kind described above.  In at least one embodiment, after a chunk-level split is performed, the records of the individual chunks in the training set or the test set may be shuffled prior to use for training/evaluating a model.);
generating storage metadata associated with the plurality of deduplicated data chunks and the at least one parity chunk (Dirac [0202]: The plan generator may determine a set of consistency metadata 3152, e.g., metadata that may be shared among related jobs that are inserted in the MLS job queue for the requested split iterations.  The metadata 3152 may comprise the client-provided seed values 3120, for example.);
Dirac does not clearly teach storing the storage metadata in an accelerator pool, wherein the accelerator pool comprises a plurality of high performance nodes. However, George [0020] teaches, “More specifically, the map can contain a list of rules that tells CRUSH how it should replicate data in a Ceph cluster's pool.  The rules can contain a replication factor for a particular pool of data to help determine how many times the data is to be replicated within the cluster and on which storage nodes the replicated data is to be stored.  A pool can comprise a collection of data, such as objects, and a replication factor can be assigned to each pool.  Typically, a pool can be shared across tenants.” Here, the cluster pool is the accelerator pool with high performance storage nodes.
storing, across a plurality of fault domains, the plurality of deduplicated data chunks and the at least one parity chunk (George [0017] teaches, “Object storage involves storing chunks of data in an object, with each object including metadata and a unique identifier.  Distributed storage systems can also be applied to other types of data storage such as block storage and file storage, for example.  In block storage, data can be stored in blocks (or volumes) where each block acts as an individual hard drive.  File storage is generally a hierarchical way of organizing files containing data such that an individual file can be located by a path to that file.  Certain metadata describing the file and its contents is also typically stored in the file system.  In distributed storage systems, multiple replicas of data in any suitable type of structure (e.g., objects, files, blocks, etc.) can be maintained in order to provide fault tolerance and high availability.”),
wherein a non-accelerator pool comprises a plurality of data nodes that provide lower storage performance than the plurality of high performance nodes, wherein the plurality of fault domains is implemented using the plurality of data nodes (George [0051] teaches, “In at least one implementation (e.g., in Ceph), data is added to a pool corresponding to the tenant.  At 404, gateway 80 may receive a request for data of the tenant to be stored in storage cluster 50 of distributed storage system 100.  In at least one embodiment, the request may be an indication that the authorized user (e.g., human user or application) has stored objects or other data in a pool corresponding to the tenant.  At 406, the tenant associated with the data can be identified based on the pool in which the data is stored.  At 408, a data storage module associated with the tenant is identified.  This identification may be made based on a mapping of a unique identifier of the data storage module to the tenant.”); and 
initiating storage metadata distribution on the storage metadata across the plurality of fault domains (George [0018]: At least one Ceph metadata server can be provided for a storage cluster to store metadata associated with the objects (e.g., inodes, directories, etc.).  Ceph monitors are provided for monitoring active and failed storage nodes in the cluster.).
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to incorporate the teaching of Dirac et al. into George’s system by adding the feature of storage metadata. The references (Dirac and George) are analogous art and are directed to the same field of endeavor, such as data storage. An ordinarily skilled artisan would have been motivated to do so to provide Dirac’s system with enhanced data storage (see George [Abstract], [0017 – 0020], [0028], [0051]). One of the biggest advantages of network machine learning database algorithms is their ability to improve over time. Machine learning technology typically improves efficiency and accuracy thanks to the ever-increasing amounts of data that are processed.
Regarding claim 10, the non-transitory computer readable medium of claim 9, the method further comprising:
identifying a storage metadata failure of the storage metadata in the accelerator pool; sending a storage metadata request to at least one fault domain; obtaining a storage metadata response from the at least one fault domain; and performing a storage metadata reconstruction of storage metadata on the accelerator pool using at least the storage metadata response (George [0017]: Object storage involves storing chunks of data in an object, with each object including metadata and a unique identifier.  Distributed storage systems can also be applied to other types of data storage such as block storage and file storage, for example.  In block storage, data can be stored in blocks (or volumes) where each block acts as an individual hard drive.  File storage is generally a hierarchical way of organizing files containing data such that an individual file can be located by a path to that file.  Certain metadata describing the file and its contents is also typically stored in the file system.  In distributed storage systems, multiple replicas of data in any suitable type of structure (e.g., objects, files, blocks, etc.) can be maintained in order to provide fault tolerance and high availability.).
Regarding claim 11, the non-transitory computer readable medium of claim 10, wherein the storage metadata response includes at least a portion of the storage metadata (Dirac [0203]: In some implementations, a job object created by the plan generator 3180 may include a reference or pointer to the consistency metadata to be used for that job.  In another implementation, at least a portion of the consistency metadata 3152 may be included within a job object.  When a job is executed, the metadata 3152 may be used to ensure that the input data set is split consistently.).
Regarding claim 13, the non-transitory computer readable medium of claim 9, wherein each deduplicated data chunk of the plurality of data chunks is stored in a unique fault domain of the plurality of fault domains, and wherein a copy of the storage metadata is stored in a second data node of each fault domain of the plurality of fault domains (Dirac [0347]: FIG. 70 illustrates an example duplicate detector that may utilize space-efficient representations of machine learning data sets to determine whether one data set is likely to include duplicate observation records of another data set at a machine learning service, according to at least some embodiments.).
Regarding claim 14, the non-transitory computer readable medium of claim 9, 
wherein storing the plurality of deduplicated data chunks and the at least one parity chunk comprises: storing a deduplicated data chunk of the plurality of deduplicated data chunks on a first data node in a fault domain of the plurality of fault domains (Dirac [0348]: In some embodiments, the alternate representation may be generated and stored in parallel with the training of the model, so that, for example, only a single pass through the training data set 7002 may be needed for both (a) training the model and (b) creating and storing the alternate representation 7030.  The alternate representation may require much less (e.g., orders of magnitude less) storage or memory than is occupied by the training data set itself in some implementations.), 
wherein initiating storage metadata distribution on the storage metadata across the plurality of fault domains comprises: initiating storage of a copy of the storage metadata on a second data node in the fault domain (Dirac [0179]: In one embodiment, the OR extraction request 2401 may include compression metadata 2406, indicating for example the compression algorithm used for the data set, the sizes of the units or blocks in which the compressed data is stored (which may differ from the sizes of the chunks on which chunk-level in-memory filtering operations are to be performed), and other information that may be necessary to correctly de-compress the data set.  Decryption metadata 2408 such as keys, credentials, and/or an indication of the encryption algorithm used on the data set may be included in a request 2401 in some embodiments.  Authorization/authentication metadata 2410 to be used to be able to obtain read access to the data set may be provided by the client in request 2401 in some implementations and for certain types of data sources.  Such metadata may include, for example, an account name or user name and a corresponding set of credentials, or an identifier and password for a security container (similar to the security containers 390 shown in FIG. 3).).
Regarding claim 16, the non-transitory computer readable medium of claim 9, wherein the storage metadata includes at least location information of: at least one of the plurality of deduplicated data chunks and of the at least one parity chunk (Dirac [0185]: After the first filtering operation of the sequence is performed in memory at the MLS servers, the remaining filtering operations (if any) may be performed in place in the depicted embodiment, e.g., without copying the chunks to persistent storage or re-reading the chunks for their original source locations (element 2519).).
Regarding claim 17, Dirac teaches a data cluster, comprising: 
a host; a non-accelerator pool comprising a plurality of low performance nodes wherein a plurality of fault domains is implemented using the plurality of low performance nodes; and an accelerator pool comprising (Dirac [0097]: A given provider network may include numerous data centers hosting various resource pools, such as collections of physical and/or virtualized computer servers, storage devices, networking equipment and the like, needed to implement, configure and distribute the infrastructure and services offered by the provider.) a plurality of data nodes, wherein the plurality of data nodes provides higher storage performance than the plurality of low performance nodes (Dirac [0216]: In addition to the predicates to be evaluated at each node, a respective predictive utility metric (PUM) 3434 may also be generated for some or all of the nodes of tree 3433 in the depicted embodiment and stored in persistent storage--e.g., PUM 3434A may be computed and stored for node N1, PUM 3434B for node N2, and so on.  Generally speaking, the PUM of a given node may be indicative of the relative contribution or usefulness of that node with respect to the predictions that can be made using all the nodes.), 
wherein a data node of the plurality of data nodes comprises a processor and memory comprising instructions, which when executed by the processor perform a method, the method comprising (Dirac [0372]: In the illustrated embodiment, computing device 9000 includes one or more processors 9010 coupled to a system memory 9020 (which may comprise both non-volatile and volatile memory modules) via an input/output (I/O) interface 9030.): 
obtaining data from the host (Dirac [0097]: A given provider network may include numerous data centers hosting various resource pools, such as collections of physical and/or virtualized computer servers, storage devices, networking equipment and the like, needed to implement, configure and distribute the infrastructure and services offered by the provider.); 
applying an erasure coding procedure (Dirac [0105]: erasure coding) to the data to obtain a plurality of data chunks and at least one parity chunk (Dirac [0163]: The concatenated address space of data set 1804 may then be sub-divided into a plurality of contiguous chunks, as indicated in chunk mapping 1806.);
deduplicating (Dirac [Abstract]: At a machine learning service, a determination is made that an analysis to detect whether at least a portion of contents of one or more observation records of a first data set are duplicated in a second set of observation records is to be performed.  A duplication metric is obtained, indicative of a non-zero probability that one or more observation records of the second set are duplicates of respective observation records of the first set.) the plurality of data chunks to obtain a plurality of deduplicated data chunks (Dirac [0204]: In various embodiments, consistent splits may be performed at the chunk level, at the observation record level, or at some combination of chunk and record levels, using consistency metadata of the kind described above.  In at least one embodiment, after a chunk-level split is performed, the records of the individual chunks in the training set or the test set may be shuffled prior to use for training/evaluating a model.);
generating storage metadata associated with the plurality of deduplicated data chunks and the at least one parity chunk (Dirac [0202]: The plan generator may determine a set of consistency metadata 3152, e.g., metadata that may be shared among related jobs that are inserted in the MLS job queue for the requested split iterations.  The metadata 3152 may comprise the client-provided seed values 3120, for example.);
Dirac does not clearly teach, storing the storage metadata in the accelerator pool; storing, across the plurality of fault domains, the plurality of deduplicated data chunks and the at least one parity chunk; and initiating storage metadata distribution on the storage metadata across the plurality of fault domains. However, George [0017] teaches, “Object storage involves storing chunks of data in an object, with each object including metadata and a unique identifier.  Distributed storage systems can also be applied to other types of data storage such as block storage and file storage, for example.  In block storage, data can be stored in blocks (or volumes) where each block acts as an individual hard drive.  File storage is generally a hierarchical way of organizing files containing data such that an individual file can be located by a path to that file.  Certain metadata describing the file and its contents is also typically stored in the file system.  In distributed storage systems, multiple replicas of data in any suitable type of structure (e.g., objects, files, blocks, etc.) can be maintained in order to provide fault tolerance and high availability.”
It would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the teachings of George into Dirac’s system by adding the feature of storage metadata. A person of ordinary skill in the art would have been motivated to do so to provide Dirac’s system with enhanced data storage (see George [Abstract], [0017], [0028]). In addition, the references (Dirac and George) are analogous art directed to the same field of endeavor, such as data storage. This close relation suggests a high expectation of success when combined.
Regarding claim 18, the data cluster of claim 17, the method further comprising: identifying a storage metadata failure of the storage metadata in the accelerator pool; sending a storage metadata request to at least one fault domain; obtaining a storage metadata response from the at least one fault domain; and performing a storage metadata reconstruction of storage metadata on the accelerator pool using at least the storage metadata response (George [0017]: Object storage involves storing chunks of data in an object, with each object including metadata and a unique identifier.  Distributed storage systems can also be applied to other types of data storage such as block storage and file storage, for example.  In block storage, data can be stored in blocks (or volumes) where each block acts as an individual hard drive.  File storage is generally a hierarchical way of organizing files containing data such that an individual file can be located by a path to that file.  Certain metadata describing the file and its contents is also typically stored in the file system.  In distributed storage systems, multiple replicas of data in any suitable type of structure (e.g., objects, files, blocks, etc.) can be maintained in order to provide fault tolerance and high availability.).
Regarding claim 20, the data cluster of claim 17, 
wherein storing the plurality of deduplicated data chunks and the at least one parity chunk comprises: storing a deduplicated data chunk of the plurality of deduplicated data chunks on a first data node in a fault domain of the plurality of fault domains (Dirac [0348]: In some embodiments, the alternate representation may be generated and stored in parallel with the training of the model, so that, for example, only a single pass through the training data set 7002 may be needed for both (a) training the model and (b) creating and storing the alternate representation 7030.  The alternate representation may require much less (e.g., orders of magnitude less) storage or memory than is occupied by the training data set itself in some implementations.), 
wherein initiating storage metadata distribution on the storage metadata across the plurality of fault domains comprises: initiating storage of a copy of the storage metadata on a second data node in the fault domain (Dirac [0179]: In one embodiment, the OR extraction request 2401 may include compression metadata 2406, indicating for example the compression algorithm used for the data set, the sizes of the units or blocks in which the compressed data is stored (which may differ from the sizes of the chunks on which chunk-level in-memory filtering operations are to be performed), and other information that may be necessary to correctly de-compress the data set.  Decryption metadata 2408 such as keys, credentials, and/or an indication of the encryption algorithm used on the data set may be included in a request 2401 in some embodiments.  Authorization/authentication metadata 2410 to be used to be able to obtain read access to the data set may be provided by the client in request 2401 in some implementations and for certain types of data sources.  Such metadata may include, for example, an account name or user name and a corresponding set of credentials, or an identifier and password for a security container (similar to the security containers 390 shown in FIG. 3).). 
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Redlich, US 2009/0254572, Digital Information Infrastructure and Method 
Becker-Szendy, US 2011/0302446, Monitoring lost data in a storage system
Deenadhayalan, US 2008/0282105, Data Integrity Validation in Storage Systems

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SABA AHMED whose telephone number is (571)270-0236.  The examiner can normally be reached on MON – FRI: 9AM – 5PM EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam, can be reached at 571-272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SABA AHMED/
Examiner, Art Unit 2154

/HOSAIN T ALAM/
Supervisory Patent Examiner, Art Unit 2154