DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
Independent claim 15 recites in part, “One or more non-transitory computer-readable media storing instructions that, when executed by one or more processors, program the one or more processors of a system to…”  Examiner is interpreting the non-transitory computer-readable media and the one or more processors in light of the specification at paragraphs [0102] – [0103], which detail various hardware embodiments of computing devices, processors and computer-readable media.  Examiner is of the position that claim 15 recites sufficient structure to support the recited functionality, such that independent claim 15 does not invoke 35 U.S.C. 112(f).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
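A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.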

Claims 1-2, 5-10 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Condict U.S. Pub. No. 2013/0018854 (hereinafter “Condict”) in view of Brasche et al. U.S. Pub. No. 2014/0330889 (hereinafter “Brasche”).
Regarding independent claim 1, Condict discloses:
A system comprising: one or more processors; and one or more non-transitory computer-readable media maintaining executable instructions, which, when executed by the one or more processors, configure the one or more processors to perform operations comprising… (Condict at paragraph [0112] discloses a computer readable medium and a plurality of programmable processors.)

receiving a data object at the system, the data object including object data, wherein the system is a first system of three or more systems, each system located at a different respective geographic location (Condict at paragraph [0014] discloses in part, “When a write request is received at one of the nodes, one of the nodes in the cluster is selected to store the new or modified data according to a deduplication criterion…”  Additionally, Condict at paragraph [0040] discloses in part, “…the clustered storage server system 202 includes a plurality of server nodes 208 (208.1-208.N)…”  Examiner is interpreting Condict reciting a plurality of server nodes, 208.1-208.N, as reading on …three or more systems.  Lastly, Condict at paragraph [0043] discloses in part, a distributed storage system, which Examiner is interpreting as reading on …each system located at a different respective geographic location.)

determining a value representative of content of the object data of the data object (Condict at paragraphs [0100] – [0101] discloses computing a content hash and similarity hash for the new or modified data segments and chunks.  Examiner is interpreting the hash values disclosed in Condict as reading on determining a value representative…)

While Condict at paragraph [0096] discloses a write request comprising a plurality of chunks and nodes, Condict at paragraph [0019] discloses load balancing with respect to assigning write data to nodes, and Condict at paragraph [0048] discloses storing parity information, Condict does not disclose determining a number of chunks based on a number of systems, more specifically, Condict does not disclose:
determining a plurality of chunks by dividing the object data into a plurality of data chunks based on a total number of the systems and determining a parity chunk for the plurality of data chunks.
Brasche at paragraph [0031] teaches in part, “In other more balanced embodiments, the chunks are equally distributed over the receiving node to distribute the load in a balanced way over the network.  In some of the embodiments, the number of chunks equals the number of receiving nodes and each node receives one chunk.”
Both the Condict reference and the Brasche reference, in the sections cited by the Examiner, are in the field of endeavor of load balancing chunks of data across nodes.  Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the load balancing of nodes using gravitational coefficients as disclosed in Condict with the load balancing of nodes by equally distributing chunks of data across the nodes as taught in Brasche, in order to balance the workload of the nodes and increase the efficiency of the system.
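For illustration only, the limitation of dividing object data into chunks based on the total number of systems and determining a parity chunk can be sketched as below. The sketch assumes M systems, M-1 equal-length data chunks, zero padding, and a single XOR parity chunk, so that the number of chunks equals the number of systems. The function name and these choices are hypothetical and are not taken from Condict, Brasche, or the claims.

```python
def split_with_parity(data: bytes, num_systems: int) -> list[bytes]:
    """Return (num_systems - 1) data chunks followed by one XOR parity chunk."""
    if num_systems < 3:
        raise ValueError("this sketch assumes three or more systems")
    # Assumption: one chunk per system, with the last chunk holding parity.
    num_data = num_systems - 1
    # Ceiling division so every data chunk has the same length.
    chunk_len = -(-len(data) // num_data)
    # Assumption: zero-pad the tail so the data divides evenly.
    padded = data.ljust(chunk_len * num_data, b"\x00")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(num_data)]
    # XOR all data chunks together, byte by byte, to form the parity chunk.
    parity = bytearray(chunk_len)
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return chunks + [bytes(parity)]
```

With three systems, an eight-byte object yields two four-byte data chunks and one four-byte parity chunk.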

determining a respective role value corresponding to each of the systems (Condict at paragraph [0016] discloses computing the “geometric center” value of each node based on a content hash and a similarity hash for each data segment in each chunk of the node.  Examiner is interpreting the “geometric center” disclosed in Condict as reading on …role value.)

sending respective ones of the chunks to respective ones of the systems based on the respective role values and the value representative of the content of the object data (Condict at paragraph [0018] discloses, in response to a write request, computing a similarity hash for a chunk that includes the new write data segment, comparing the hash value to the “geometric centers” of the nodes, and writing the data to the node that has the most similar “geometric center”.)

Regarding dependent claim 2, all of the particulars of claim 1 have been addressed above.  Additionally, Condict discloses:
wherein the operation of sending respective ones of the chunks to respective ones of the systems based on the respective role values and the value representative of the content of the object data further comprises: 
determining, using a hashing algorithm, a hash value from the object data as the value representative of the content; and determining the respective systems to receive the respective chunks based on a function of the hash value and the respective role values  (Condict at paragraph [0018] discloses in part, “When a write request is received at any of the nodes in the cluster, the receiving node computes a new similarity hash for a chunk that includes a segment to be written (the "containing chunk") and compares it to the geometric center of each of the nodes, to determine which of the nodes geometric center is "closest" to the containing chunk in terms of its content. In one embodiment, the new or modified data segment is then written to the node whose data is determined to be closest (i.e., the node which has stored data most similar to the write data).”)

Regarding dependent claim 5, all of the particulars of claim 1 have been addressed above.  Additionally, Condict as modified with Brasche discloses:
the operations further comprising retaining one of the chunks by the system based on the role value assigned to the system so that one chunk of the plurality of chunks is maintained at each system of the three or more systems (Brasche at paragraph [0031] teaches load balancing of chunks of data across nodes and more specifically teaches, “In some of the embodiments, the number of chunks equals the number of receiving nodes and each node receives one chunk.”)

Regarding dependent claim 6, all of the particulars of claim 1 have been addressed above.  Additionally, Condict discloses:
receiving object metadata with the object; and sending, to each system that receives a chunk, at least a portion of the object metadata including the value representative of content (Condict at paragraph [0018] discloses in part, “When a write request is received at any of the nodes in the cluster, the receiving node computes a new similarity hash for a chunk that includes a segment to be written (the "containing chunk") and compares it to the geometric center of each of the nodes…”  Examiner is of the position that the similarity hash generated for the new or modified segment of Condict reads on object metadata…and that the similarity hash is sent to each system or node for comparison with that node’s “geometric center”.)

Regarding dependent claim 7, all of the particulars of claim 1 have been addressed above.  Additionally, Condict discloses:
the operations further comprising performing deduplication at the system by comparing data of a chunk retained by the system with data of chunks already stored by the system to determine that the chunk retained by the system is a duplicate of a chunk already stored by the system (Condict at paragraphs [0060] – [0063] discloses comparing chunks comprising deduplication segments using hashing.) 

Regarding dependent claim 8, all of the particulars of claims 1 and 7 have been addressed above.  Additionally, Condict discloses:
the operations further comprising: adding metadata for the chunk retained by the system to a metadata data structure; associating the metadata for the chunk with a pointer to the data of the chunk already stored by the system; and indicating that the data of the chunk retained by the system is to be deleted (Condict at paragraph [0005] discloses deleting duplicate data and updating pointers.)

Regarding dependent claim 9, all of the particulars of claims 1 and 7 have been addressed above.  Additionally, Condict discloses:
wherein each other system of the three or more systems, located at the different respective geographic locations is configured to determine whether the chunk received by that system has data that is a duplicate of data of a chunk already received by that system (Condict at paragraph [0014] discloses deduplication across, and comparison across multiple storage server nodes.)



Regarding dependent claim 10, all of the particulars of claims 1 and 7 have been addressed above.  Additionally, Condict discloses:
wherein the operation of comparing the data of the chunk retained by the system with the data of the chunks already stored by the system further comprises: determining a fingerprint representative of the data of the chunk retained by the system; comparing the fingerprint with a plurality of fingerprints in a fingerprint data structure previously generated for a plurality of respective chunks already stored by the system; and based on a match between the fingerprint and one of the fingerprints in the fingerprint data structure, performing a byte-by-byte comparison of the data of the chunk retained by the system and the data of the chunk already stored by the system corresponding to the matched fingerprint in the fingerprint data structure (Condict at paragraph [0073] discloses in part, “Any hash function that operates on contents that are longer than the output of the hash function will have some collisions, since there are more different contents than possible function output values.) The content hash of a deduplication segment can also be called the segment's "fingerprint", since it identifies a particular data segment contents with a very high confidence level.”)
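As a minimal sketch of the fingerprint comparison followed by a byte-by-byte comparison recited in claim 10, the example below keeps a fingerprint data structure as a dictionary. The class name, the use of SHA-256 as the fingerprint, and the in-memory dictionary are assumptions made for illustration and are not drawn from Condict.

```python
import hashlib


class ChunkStore:
    """Hypothetical in-memory store keyed by chunk fingerprint."""

    def __init__(self) -> None:
        # Fingerprint data structure: fingerprint -> bytes of the stored chunk.
        self._by_fingerprint: dict[str, bytes] = {}

    def add(self, chunk: bytes) -> bool:
        """Store the chunk; return True if it duplicates a stored chunk."""
        # Assumption: SHA-256 serves as the fingerprint function.
        fingerprint = hashlib.sha256(chunk).hexdigest()
        stored = self._by_fingerprint.get(fingerprint)
        if stored is None:
            # No fingerprint match, so the chunk is new and is retained.
            self._by_fingerprint[fingerprint] = chunk
            return False
        # Fingerprint match: confirm with a byte-by-byte comparison, since
        # hash collisions are possible in principle. This sketch does not
        # store a colliding, non-identical chunk.
        return stored == chunk
```

The byte-by-byte check runs only after a fingerprint match, so the more expensive comparison is limited to likely duplicates.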

Regarding independent claim 14, claim 14 is rejected under the same rationale as claim 1.

Regarding independent claim 15, claim 15 is rejected under the same rationale as claim 1.

Claims 3-4 are rejected under 35 U.S.C. 103 as being unpatentable over Condict in view of Brasche in further view of Yuan et al. U.S. Pub. No. 2013/0007008 (hereinafter “Yuan”).
Regarding dependent claim 3, all of the particulars of claims 1-2 have been addressed above.  While Condict discloses load balancing and determining which nodes chunks go to based on a hash value, Condict does not disclose a modulo operation, more specifically, Condict does not disclose:
wherein the operation of determining the respective systems to receive the respective chunks based on the function of the hash value and the respective role values further comprises: 
determining a result of a modulo operation on the hash value, wherein the modulo operation is based on the total number of the systems; 
determining the respective systems to receive the respective chunks based on the result of the modulo operation and the respective role values; and sending the respective chunks to respective ones of the systems by matching the result of the modulo operation to one of the role values and distributing the chunks in a sequence based on ascending or descending role values.
However, Yuan in the Abstract teaches using a modulo operation on a hash value, the modulo operation being dependent on the number of storage modules in the system, and using the result to determine which storage module the value will be stored in.
Both the Condict reference and the Yuan reference, in the sections cited by the Examiner, are in the field of endeavor of distributing data across a plurality of storage devices.  Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the load balancing of nodes using similarity hashing and gravitational coefficients as disclosed in Condict with the modulo operation taught in Yuan, in order to balance the workload of the nodes and increase the efficiency of the system.

Regarding dependent claim 4, all of the particulars of claims 1-2 have been addressed above.  With respect to the claim limitation reciting a modulo operation, claim 4 is rejected under the same rationale as claim 3 above.  While Condict discloses load balancing and determining which nodes chunks go to based on a hash value, Condict does not disclose a modulo operation and using a round robin distribution, more specifically, Condict does not disclose:
wherein the operation of determining the respective systems to receive the respective chunks based on the function of the hash value and the respective role values further comprises: determining a result of a modulo operation on the hash value, wherein the modulo operation is based on the total number of the systems; determining the respective systems to receive the respective chunks based on the result of the modulo operation and the respective role values; and sending the respective chunks to respective ones of the systems using a round robin distribution by matching the result of the modulo operation to one of the role values and distributing the chunks in a sequence based on ascending or descending role values, wherein the sequence is based on an order of data in the content of the object data.
However, Yuan at paragraph [0039] and Figure 3 provided below teaches in part, “As shown in FIG. 3, using L=3 as an example, the key value of KEY mod 3==0 and the corresponding data are outputted to the backend storage module 0 (the corresponding identifier of the backend storage module is 0); the key value of KEY mod 3==1 and the corresponding data are outputted to the backend storage module 1; and the key value of KEY mod 3==2 and the corresponding data are outputted to the backend storage module 2.”
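The modulo routing described in the quoted passage can be sketched as follows. The function name and the sample keys are hypothetical; only the rule that a key is routed to the backend storage module whose identifier equals KEY mod L comes from the quotation.

```python
def backend_for_key(key: int, num_modules: int) -> int:
    """Return the identifier of the backend storage module for a key."""
    if num_modules <= 0:
        raise ValueError("num_modules must be positive")
    # KEY mod L selects the module whose identifier equals the remainder.
    return key % num_modules


# Using L = 3 as in the quoted example: keys 9, 10 and 11 leave
# remainders 0, 1 and 2, so they go to modules 0, 1 and 2 respectively.
assignments = {key: backend_for_key(key, 3) for key in (9, 10, 11)}
```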

    [Image: FIG. 3 of Yuan (media_image1.png, greyscale)]

Examiner is of the position that Condict discloses load balancing in distributing chunks across nodes and sorting in terms of identifying the node which has a “geometric center” most closely matching a similarity hash, that Brasche teaches distributing chunks equally to all the nodes in the system, and that Yuan teaches a modulo operation to distribute chunks across nodes.  Examiner is further of the position that the equal distribution of chunks across nodes, when combined with identifying a node that most closely matches a similarity hash, reads on a round robin distribution in ascending or descending order.
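One possible reading of the round robin distribution recited in claim 4 can be sketched as follows, for illustration only. The sketch assumes the role values are the integers 0 through M-1, that there is one chunk per system, that SHA-256 supplies the hash value, and that chunks are distributed in ascending role order starting from the role matching the modulo result. The function name and each of these choices are hypothetical and come from none of the cited references.

```python
import hashlib


def round_robin_roles(object_data: bytes, num_systems: int) -> list[int]:
    """Return, for each chunk index, the role value of the receiving system."""
    if num_systems <= 0:
        raise ValueError("num_systems must be positive")
    # Assumption: SHA-256 of the object data is the content hash value.
    hash_value = int.from_bytes(hashlib.sha256(object_data).digest(), "big")
    # The modulo result is matched to the role value of the first recipient.
    start = hash_value % num_systems
    # Later chunks go to ascending role values, wrapping around at M.
    return [(start + i) % num_systems for i in range(num_systems)]
```

Under these assumptions, identical object data always starts at the same role, and each system receives exactly one chunk.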
 
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Condict in view of Brasche in further view of Yoshida PCT/JP2014/058861 (hereinafter “Yoshida”).
Regarding dependent claim 11, all of the particulars of claim 1 have been addressed above.  With respect to the claim limitations reciting deduplication, pointers and deletion, claim 11 is rejected under the same rationale as claims 7-8.  However Condict does not disclose retaining an object for a threshold time, more specifically, Condict does not disclose:
wherein the system is configured to retain a complete version of the object data for a threshold time, the operations further comprising: determining that the object data is a duplicate of data already stored by the system; creating, for the data object, a pointer to the data already stored by the system; indicating that the object data is to be deleted; determining that the threshold time has expired; and using the data already stored by the system as the object data for determining the plurality of chunks.
However, Yoshida at paragraph [0074] teaches periodically running a deduplication procedure.  Examiner is therefore of the position that duplicate information would be stored on the system for a threshold period of time and then removed.
Both the Condict reference and the Yoshida reference, in the sections cited by the Examiner, are in the field of endeavor of deduplication.  Before the effective filing date of the claimed invention it would have been obvious to one of ordinary skill in the art to combine the checking for duplicate information across a plurality of nodes as disclosed in Condict with the periodic deduplication taught in Yoshida to reduce the overhead cost of deduplication (See Yoshida at paragraph [0007]).

Claims 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Condict in view of Brasche in further view of Lee U.S. Pub. No. 2015/0149819 (hereinafter “Lee”).
Regarding dependent claim 12, all of the particulars of claim 1 have been addressed above.  Condict does not disclose:
the operations further comprising: receiving, from a client device, a request for the data object; receiving, M-2 chunks from the other systems, where M is the total number of the systems; and reconstructing the object data from a chunk retained at the system and the M-2 other chunks; and sending the reconstructed object data to the client device.
However, Lee at paragraph [0005] teaches a distributed storage using a parity chunk operating method that ensures the availability of data even if a data failure occurs.  Additionally, Lee at paragraphs [0079] – [0080] teaches detecting a data chunk failure during a writing process, requesting data chunks, and recovering the failed data.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the routing of data for improved deduplication as disclosed in Condict with the parity chunk operating method taught in Lee, in order to prevent data loss due to a failure (See Lee at paragraph [0005]).

Regarding dependent claim 13, all of the particulars of claim 1 have been addressed above.  Condict as modified with Brasche and Lee discloses:
wherein one of the M-2 chunks or the chunk retained at the system is the parity chunk, the operations further comprising: determining data for a missing data chunk using the parity chunk and others of the M-2 chunks and the chunk retained at the system; and reconstructing the object data using the data for the missing data chunk determined using the parity chunk and others of the M-2 chunks and the chunk retained at the system (Lee at paragraphs [0079] – [0080] discloses using a parity chunk to recover failed (i.e., missing) data.)
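For illustration only, recovering a missing data chunk from a parity chunk can be sketched as below. The sketch assumes a single XOR parity chunk over equal-length data chunks, in which case the missing chunk is the XOR of every surviving chunk, including the parity chunk. The function names are hypothetical, and the sketch is not a description of Lee's parity chunk operating method.

```python
def xor_chunks(chunks: list[bytes]) -> bytes:
    """XOR equal-length chunks together, byte by byte."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    length = len(chunks[0])
    if any(len(chunk) != length for chunk in chunks):
        raise ValueError("all chunks must have the same length")
    result = bytearray(length)
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            result[i] ^= byte
    return bytes(result)


def recover_missing_chunk(surviving: list[bytes]) -> bytes:
    """Rebuild the one missing chunk from all surviving chunks.

    Assumption: exactly one chunk is missing, and `surviving` holds every
    other data chunk plus the XOR parity chunk.
    """
    return xor_chunks(surviving)
```

For example, with data chunks d0 and d1 and parity p = d0 XOR d1, the chunk d1 can be rebuilt from d0 and p alone.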

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANTHONY G GEMIGNANI whose telephone number is (571)272-1018.  The examiner can normally be reached on M-F 8-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain T Alam, can be reached on 571-272-3978.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/A.G.G./Examiner, Art Unit 2154
/HOSAIN T ALAM/Supervisory Patent Examiner, Art Unit 2154