DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is responsive to the reply filed 24 October 2022.
Claims 1-5, 7-18 and 20-28 are pending and have been presented for examination.
Claims 6 and 19 have been cancelled.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 24 October 2022 has been entered.

Response to Arguments
Applicant's arguments filed 24 October 2022 have been fully considered but they are not persuasive.
Applicant argues (see page 13):
Upon the incorporations, Applicant respectfully submits that amended claims 1 and 14 are patentable for at least the reason that the subject matters of claims 9 and 22 are admitted by the Office as allowable subject matters over the arts of record (see page 17 of the Final Office action).
The Examiner respectfully disagrees.  Because only a small portion of claims 9 and 22 was added to claims 1 and 14, respectively, claims 1 and 14 are not patentable.  Claims 9 and 22 contain additional steps regarding a second distribution calculation and dividing of the erasure data that are not disclosed by the prior art.  The Examiner has modified the rejection below to show how GERSHANECK discloses erasure coding and a key associated with the erasure data.
 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 5, 8, 14-16, 18, 21 and 27-28 are rejected under 35 U.S.C. 103 as being unpatentable over GERSHANECK (U.S. Patent Application Publication #2020/0034339) in view of EFSTATHOPOULOS (U.S. Patent #8,898,120) and CHO (U.S. Patent Application Publication #2018/0143780).

1. GERSHANECK discloses A data storage system, comprising: a plurality of storage devices (see [0028]-[0030]: distributed storage topography includes a plurality of systems that include one or more storage devices), suitable for coupling to a communication network (see [0031]: the systems communicate over a network with each other, as well as client devices); and a dispatch device, suitable for receiving a data writing request (see [0034]: one system, among the multiple systems, receives a write request from a client), wherein the dispatch device is configured to divide an original data corresponding to the data writing request into at least one data chunk (see [0036]: object that is received from the client is divided into a number of chunks), the dispatch device performs a summary calculation on a current data chunk in the at least one data chunk, so as to generate a representative value corresponding to the current data chunk (see [0038]: a hash value of the object is calculated), the dispatch device performs a first distribution calculation on the representative value, so as to determine a destination location corresponding to the representative value (see [0039]-[0040]: a modulo operation is performed on the hash value to determine a target system to send the object), and the dispatch device transmits the current data chunk and the representative value to at least one destination storage device among the storage devices through the communication network according to the destination location (see [0042]: the first system transmits the chunks over the one or more networks to the target system; [0044]: the object metadata is also transmitted, the object metadata includes the hash that was calculated {representative value}), wherein the dispatch device comprises an erasure coding device (see [0036]-[0037]: parity data is generated; [0029]: each system includes a service computing device, this would be the erasure coding device since it executes the program that generates the parity), and the erasure coding device is configured to generate at least one erasure data (see [0036]-[0037]: generate parity data) and generate at least one key value corresponding to the at least one erasure data by performing an erasure coding calculation on the original data (see [0037]-[0041]: parity data is generated using the original data, in the given example, this results in three data chunks and one parity chunk.  Then a hash is computed for the object data, the hash is used to determine where each chunk is stored.  A modulo four algorithm is used to distribute the three data chunks and one parity chunk to the systems.  Mod4 of the hash is the key value corresponding to the erasure data), wherein the at least one destination storage device checks the representative value, so as to determine whether to store the current data chunk in a storage space of the at least one destination storage device (see EFSTATHOPOULOS and CHO below).
EFSTATHOPOULOS and CHO disclose the following limitation that is not disclosed by GERSHANECK: wherein the at least one destination storage device checks the representative value, so as to determine whether to store the current data chunk in a storage space of the at least one destination storage device (see EFSTATHOPOULOS column 10, lines 15-35: the data object and the fingerprint can be transmitted to the target storage device, each storage device can perform deduplication locally; CHO [0035], [0042]: the index, which is a hash, is transmitted along with the data to the storage node, the storage node either stores the data if the data is not present, or updates a reference count in the mapping if the data is present).  Allowing deduplication to be performed locally on the storage node can achieve a substantial rate of deduplication while improving the scalability of a deduplicated data system (see EFSTATHOPOULOS, column 10, lines 30-35).  Since GERSHANECK is directed to a distributed storage system, the ability to perform deduplication while improving scalability would be beneficial.  CHO discloses one way to implement the deduplication disclosed by EFSTATHOPOULOS, by using the hash to determine whether the data object is already present.  This results in a deduplication system that efficiently uses storage space (see CHO [0003]).
	It would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art to which said subject matter pertains to modify GERSHANECK to decide to store a data chunk based on the representative value, as disclosed by EFSTATHOPOULOS and CHO.  One of ordinary skill in the art would have been motivated to make such a modification to achieve a substantial rate of deduplication and utilize the storage space efficiently, as taught by EFSTATHOPOULOS and CHO.  GERSHANECK, EFSTATHOPOULOS and CHO are analogous/in the same field of endeavor as the references are directed to distributed storage systems and the calculation of hash values when storing data objects in a distributed storage system.
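
For illustration only, the chunking, parity generation, hashing and modulo-four distribution described in the mapping of claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from GERSHANECK: the function names, the use of SHA-256, the zero padding, the XOR parity, and the rule that piece i goes to system (start + i) mod 4 are all hypothetical choices made for the example.

```python
import hashlib


def split_with_parity(obj: bytes, n_data: int = 3) -> list[bytes]:
    """Split obj into n_data equal-size chunks (zero-padded) plus one XOR parity chunk."""
    size = max(1, -(-len(obj) // n_data))  # ceiling division, at least one byte
    padded = obj.ljust(size * n_data, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(n_data)]
    parity = bytearray(size)
    for chunk in chunks:
        for j, byte in enumerate(chunk):
            parity[j] ^= byte
    return chunks + [bytes(parity)]


def place_chunks(obj: bytes, n_systems: int = 4) -> dict[int, bytes]:
    """Map each piece to a system index derived from the object's hash modulo n_systems."""
    pieces = split_with_parity(obj, n_systems - 1)
    digest = hashlib.sha256(obj).digest()  # assumed hash function
    start = int.from_bytes(digest, "big") % n_systems
    # Assumed rotation rule: piece i is sent to system (start + i) mod n_systems.
    return {(start + i) % n_systems: piece for i, piece in enumerate(pieces)}
```

With three data chunks and one parity chunk, any single missing piece can be rebuilt by XOR-ing the other three, which is the property the parity chunk provides in this sketch.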

2. The data storage system according to claim 1, wherein at least one of the storage devices is selected to serve as the dispatch device (see GERSHANECK [0034]-[0036]: a first system of the multiple systems is the one to receive the write request, divide the object and distribute the chunks to the other systems).

3. The data storage system according to claim 1, wherein the summary calculation comprises a hash algorithm, and the representative value comprises a hash value of the current data chunk (see GERSHANECK [0038], [0044]: hash).

5. The data storage system according to claim 1: wherein the at least one destination storage device checks whether the representative value exists in a lookup table (see CHO [0040]: check whether hash exists in the table), the at least one destination storage device abandons storing the current data chunk in the storage space, adjusts and increases a reference number value corresponding to the representative value, and updates the reference number value to the lookup table when the lookup table has the representative value (see CHO [0042]: if the data is present, the data is not written in the storage device and a reference count in the table is updated); and the at least one destination storage device stores the current data chunk in a physical address of the storage space, sets the reference number value corresponding to the representative value to an initial value, and records the representative value, the physical address, and the reference number value in the lookup table when the lookup table does not have the representative value (see CHO [0041]: when a hash is not present in the table, a new entry is created and the data is stored on the storage device).  
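
For illustration only, the lookup-table behavior mapped to CHO in claim 5 can be sketched as follows. This is a minimal sketch under stated assumptions and is not CHO's implementation: the class name, the in-memory list standing in for the storage space, and the initial reference count of 1 are hypothetical.

```python
class StorageNode:
    """Destination storage device that deduplicates by representative value."""

    def __init__(self) -> None:
        self.blocks: list[bytes] = []  # simulated storage space; list index = physical address
        self.lookup: dict[str, list[int]] = {}  # representative value -> [physical address, reference count]

    def write(self, rep_value: str, chunk: bytes) -> bool:
        """Return True if the chunk was stored, False if it was deduplicated."""
        entry = self.lookup.get(rep_value)
        if entry is not None:
            entry[1] += 1  # value already present: skip the write, increase the reference count
            return False
        self.blocks.append(chunk)  # value absent: store the chunk
        self.lookup[rep_value] = [len(self.blocks) - 1, 1]  # assumed initial reference count of 1
        return True
```

Writing the same representative value twice stores the chunk once and leaves a reference count of 2 in the table.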

8. The data storage system according to claim 1, wherein the at least one destination storage device comprises a first destination storage device and a second destination storage device, and the dispatch device transmits the current data chunk and the representative value to the first destination storage device and the second destination storage device through the communication network (see GERSHANECK [0042]: multiple storage devices are part of the system, parts of the data object are transmitted to each device).

14. GERSHANECK discloses A global deduplication method, comprising: receiving a data writing request (see [0034]: one system, among the multiple systems, receives a write request from a client), dividing an original data corresponding to the data writing request into at least one data chunk by the dispatch device (see [0036]: object that is received from the client is divided into a number of chunks), performing a summary calculation on a current data chunk in the at least one data chunk by the dispatch device, so as to generate a representative value corresponding to the current data chunk (see [0038]: a hash value of the object is calculated), performing a first distribution calculation on the representative value by the dispatch device, so as to determine a destination location corresponding to the representative value (see [0039]-[0040]: a modulo operation is performed on the hash value to determine a target system to send the object), transmitting the current data chunk and the representative value to at least one destination storage device among the storage devices through the communication network according to the destination location (see [0042]: the first system transmits the chunks over the one or more networks to the target system; [0044]: the object metadata is also transmitted, the object metadata includes the hash that was calculated {representative value}), and checking the representative value by the at least one destination storage device, so as to determine whether to store the current data chunk in a storage space of the at least one destination storage device (see EFSTATHOPOULOS and CHO below), wherein the dispatch device comprises an erasure coding device (see [0036]-[0037]: parity data is generated; [0029]: each system includes a service computing device, this would be the erasure coding device since it executes the program that generates the parity), the global deduplication method further comprising: performing an erasure coding calculation on the original data to generate at least one erasure data (see [0036]-[0037]: generate parity data) and to generate at least one key value corresponding to the erasure data (see [0037]-[0041]: parity data is generated using the original data, in the given example, this results in three data chunks and one parity chunk.  Then a hash is computed for the object data, the hash is used to determine where each chunk is stored.  A modulo four algorithm is used to distribute the three data chunks and one parity chunk to the systems.  Mod4 of the hash is the key value corresponding to the erasure data).
EFSTATHOPOULOS and CHO disclose the following limitation that is not disclosed by GERSHANECK: checking the representative value by the at least one destination storage device, so as to determine whether to store the current data chunk in a storage space of the at least one destination storage device (see EFSTATHOPOULOS column 10, lines 15-35: the data object and the fingerprint can be transmitted to the target storage device, each storage device can perform deduplication locally; CHO [0035], [0042]: the index, which is a hash, is transmitted along with the data to the storage node, the storage node either stores the data if the data is not present, or updates a reference count in the mapping if the data is present).  Allowing deduplication to be performed locally on the storage node can achieve a substantial rate of deduplication while improving the scalability of a deduplicated data system (see EFSTATHOPOULOS, column 10, lines 30-35).  Since GERSHANECK is directed to a distributed storage system, the ability to perform deduplication while improving scalability would be beneficial.  CHO discloses one way to implement the deduplication disclosed by EFSTATHOPOULOS, by using the hash to determine whether the data object is already present.  This results in a deduplication system that efficiently uses storage space (see CHO [0003]).
	It would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art to which said subject matter pertains to modify GERSHANECK to decide to store a data chunk based on the representative value, as disclosed by EFSTATHOPOULOS and CHO.  One of ordinary skill in the art would have been motivated to make such a modification to achieve a substantial rate of deduplication and utilize the storage space efficiently, as taught by EFSTATHOPOULOS and CHO.  GERSHANECK, EFSTATHOPOULOS and CHO are analogous/in the same field of endeavor as the references are directed to distributed storage systems and the calculation of hash values when storing data objects in a distributed storage system.

15. The global deduplication method according to claim 14, wherein at least one of the storage devices is selected to serve as the dispatch device (see GERSHANECK [0034]-[0036]: a first system of the multiple systems is the one to receive the write request, divide the object and distribute the chunks to the other systems).

16. The global deduplication method according to claim 14, wherein the summary calculation comprises a hash algorithm, and the representative value comprises a hash value of the current data chunk (see GERSHANECK [0038], [0044]: hash).

18. The global deduplication method according to claim 14, further comprising: checking whether the representative value exists in a lookup table by the at least one destination storage device (see CHO [0040]: check whether hash exists in the table); abandoning storing the current data chunk in the storage space, adjusting and increasing a reference number value corresponding to the representative value, and updating the reference number value to the lookup table by the at least one destination storage device when the lookup table has the representative value (see CHO [0042]: if the data is present, the data is not written in the storage device and a reference count in the table is updated); and storing the current data chunk in a physical address of the storage space, setting the reference number value corresponding to the representative value to an initial value, and recording the representative value, the physical address, and the reference number value in the lookup table by the at least one destination storage device when the lookup table does not have the representative value (see CHO [0041]: when a hash is not present in the table, a new entry is created and the data is stored on the storage device).

21. The global deduplication method according to claim 14, wherein the at least one destination storage device comprises a first destination storage device and a second destination storage device, the global deduplication method further comprising: transmitting the current data chunk and the representative value to the first destination storage device and the second destination storage device by the dispatch device through the communication network (see GERSHANECK [0042]: multiple storage devices are part of the system, parts of the data object are transmitted to each device).

27. The data storage system according to claim 1, wherein the data writing request comprises an original key value (see GERSHANECK [0034]: object identifier), and the dispatch device records a mapping relationship between the original key value and the representative value in a lookup table (see CHO [0039]: logical address used by the host device is associated with the hash).

28. The global deduplication method according to claim 14, wherein the data writing request comprises an original key value (see GERSHANECK [0034]: object identifier), the global deduplication method further comprising: recording a mapping relationship between the original key value and the representative value in a lookup table by the dispatch device (see CHO [0039]: logical address used by the host device is associated with the hash).


Claims 4 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over GERSHANECK (U.S. Patent Application Publication #2020/0034339), EFSTATHOPOULOS (U.S. Patent #8,898,120) and CHO (U.S. Patent Application Publication #2018/0143780) as applied to claims 1-3, 5, 8, 14-16, 18 and 21 above, and further in view of UDUPI (U.S. Patent Application Publication #2016/0349993).

4. The data storage system according to claim 1 (see GERSHANECK above), wherein the first distribution calculation comprises a Ceph CRUSH algorithm (see UDUPI below).
UDUPI discloses the following elements that are not disclosed by GERSHANECK: wherein the first distribution calculation comprises a Ceph CRUSH algorithm (see [0026]-[0028]: ceph CRUSH algorithm is used to distribute data chunks in a storage system).  The ceph CRUSH algorithm avoids a single point of failure, a performance bottleneck and physical limit to scalability when storing data in a distributed storage system (see [0025]).
	It would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art to which said subject matter pertains to modify GERSHANECK to use the ceph CRUSH algorithm, as disclosed by UDUPI.  One of ordinary skill in the art would have been motivated to make such a modification to avoid a single point of failure, as taught by UDUPI.  GERSHANECK and UDUPI are analogous/in the same field of endeavor as both references are directed to distributed storage systems and a way of mapping data to the multiple storage devices.

17. The global deduplication method according to claim 14 (see GERSHANECK above), wherein the first distribution calculation comprises a Ceph CRUSH algorithm (see UDUPI below).
UDUPI discloses the following elements that are not disclosed by GERSHANECK: wherein the first distribution calculation comprises a Ceph CRUSH algorithm (see [0026]-[0028]: ceph CRUSH algorithm is used to distribute data chunks in a storage system).  The ceph CRUSH algorithm avoids a single point of failure, a performance bottleneck and physical limit to scalability when storing data in a distributed storage system (see [0025]).
	It would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art to which said subject matter pertains to modify GERSHANECK to use the ceph CRUSH algorithm, as disclosed by UDUPI.  One of ordinary skill in the art would have been motivated to make such a modification to avoid a single point of failure, as taught by UDUPI.  GERSHANECK and UDUPI are analogous/in the same field of endeavor as both references are directed to distributed storage systems and a way of mapping data to the multiple storage devices.

Claims 7 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over GERSHANECK (U.S. Patent Application Publication #2020/0034339), EFSTATHOPOULOS (U.S. Patent #8,898,120) and CHO (U.S. Patent Application Publication #2018/0143780) as applied to claims 1-3, 5, 8, 14-16, 18 and 21 above, and further in view of SHANMUGANATHAN (U.S. Patent Application Publication #2014/0307701).

7. The data storage system according to claim 1 (see GERSHANECK above), wherein the dispatch device comprises a main dispatch device and a backup dispatch device (see SHANMUGANATHAN below), the data writing request further comprises an original key value (see CHO [0039]: logical address used by the host device is associated with the hash), and the main dispatch device records a mapping relationship between the original key value and the representative value in a lookup table, and provides the lookup table to the backup dispatch device (see SHANMUGANATHAN below).
SHANMUGANATHAN discloses the following elements not disclosed by GERSHANECK: the dispatch device comprises a main dispatch device and a backup dispatch device, and the main dispatch device records a mapping relationship between the original key value and the representative value in a lookup table, and provides the lookup table to the backup dispatch device (see [0018]: deduplication map is propagated to other nodes in the cluster).  GERSHANECK already discloses the use of hash values associated with the data objects.  The first system that receives the write request calculates this hash value.  The hash would then clearly reside on this first system; however, it is not clear if this table is sent to other systems as well.  SHANMUGANATHAN discloses propagating this information to other nodes in the system.  This provides for additional redundancy to allow other nodes to access the deduplicated data on any node (see [0022]) and provide fault tolerance (see [0023]).
	It would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art to which said subject matter pertains to modify GERSHANECK to transmit the relationship to a backup device, as disclosed by SHANMUGANATHAN.  One of ordinary skill in the art would have been motivated to make such a modification to provide redundancy and fault tolerance, as taught by SHANMUGANATHAN.  GERSHANECK and SHANMUGANATHAN are analogous/in the same field of endeavor as both references are distributed storage systems.
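
For illustration only, recording a mapping between the original key value and the representative value on a main dispatch device and providing that table to a backup dispatch device can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from SHANMUGANATHAN or CHO: the class and method names, the use of SHA-256, and the snapshot-copy transfer are hypothetical.

```python
import hashlib


class Dispatcher:
    """Dispatch device holding a table of original key value -> representative value."""

    def __init__(self) -> None:
        self.key_to_hash: dict[str, str] = {}

    def record(self, original_key: str, chunk: bytes) -> str:
        """Compute the representative value for a chunk and record the mapping."""
        rep_value = hashlib.sha256(chunk).hexdigest()  # assumed hash function
        self.key_to_hash[original_key] = rep_value
        return rep_value

    def provide_table_to(self, backup: "Dispatcher") -> None:
        """Give the backup a snapshot copy of the table (assumed transfer mechanism)."""
        backup.key_to_hash = dict(self.key_to_hash)
```

Because the backup receives a copy, it can resolve previously recorded keys on its own if the main dispatch device becomes unavailable; entries recorded after the transfer are not visible to the backup until the table is provided again.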

20. The global deduplication method according to claim 14 (see GERSHANECK above), wherein the dispatch device comprises a main dispatch device and a backup dispatch device (see SHANMUGANATHAN below), and the data writing request further comprises an original key value (see CHO [0039]: logical address used by the host device is associated with the hash), the global deduplication method further comprising: recording a mapping relationship between the original key value and the representative value in a lookup table by the main dispatch device; and providing the lookup table to the backup dispatch device by the main dispatch device (see SHANMUGANATHAN below).
SHANMUGANATHAN discloses the following elements not disclosed by GERSHANECK: the dispatch device comprises a main dispatch device and a backup dispatch device, the global deduplication method further comprising: recording a mapping relationship between the original key value and the representative value in a lookup table by the main dispatch device; and providing the lookup table to the backup dispatch device by the main dispatch device (see [0018]: deduplication map is propagated to other nodes in the cluster).  GERSHANECK already discloses the use of hash values associated with the data objects.  The first system that receives the write request calculates this hash value.  The hash would then clearly reside on this first system; however, it is not clear if this table is sent to other systems as well.  SHANMUGANATHAN discloses propagating this information to other nodes in the system.  This provides for additional redundancy to allow other nodes to access the deduplicated data on any node (see [0022]) and provide fault tolerance (see [0023]).
	It would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art to which said subject matter pertains to modify GERSHANECK to transmit the relationship to a backup device, as disclosed by SHANMUGANATHAN.  One of ordinary skill in the art would have been motivated to make such a modification to provide redundancy and fault tolerance, as taught by SHANMUGANATHAN.  GERSHANECK and SHANMUGANATHAN are analogous/in the same field of endeavor as both references are distributed storage systems.

Allowable Subject Matter
Claims 9-13 and 22-26 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDWARD J DUDEK JR whose telephone number is (571)270-1030. The examiner can normally be reached Monday - Friday, 8:00A-4:00P.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Rones, can be reached at 571-272-4085. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/EDWARD J DUDEK JR/ Primary Examiner, Art Unit 2136