DETAILED ACTION
1.	This communication is responsive to the Amendment filed 02/22/2021.
Claims 1-2, 7-8 and 13-14 have been amended.  Claims 19-20 have been added.  Claims 1-20 are pending in this application.  This action is made Final.

Notice of Pre-AIA or AIA Status
2.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting
3.	The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

4.	Claims 1-20 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-18 of U.S. Patent No. 10,210,186.  Although the conflicting claims are not identical, they are not patentably distinct from each other.  

U.S. Patent Application 16/240,358
Claim 1 
A data processing method in a data processing system comprising a client, the data processing method comprising: 
receiving, by the client, data; 
dividing, by the client, the data into multiple data blocks; 
extracting one or more bits from fingerprint values obtained for each of the multiple data blocks; converting a bit whose value is zero in the extracted bits into negative one to obtain a converted bit; and 
obtaining a new vector by adding the converted bit that is at a same location in each of the fingerprint values of the multiple data blocks.  

U.S. Patent No. 10,210,186
Claim 1
A data processing method configured to be performed in a data processing system, the data processing system comprising a client, a first storage node, and a second storage node, the client coupled to the first storage node and the second storage node, and the data processing method comprising: 
prestoring, by the client, a first vector and a second vector locally to the client, the first vector representing a feature of data blocks stored in the first storage node, and the second vector representing a feature of data blocks stored in the second storage node; periodically updating, by the client, the first vector and the second vector prestored locally to the client;  
receiving, by the client, data;  
dividing, by the client, the data into multiple data blocks;  
obtaining, by the client, fingerprint values of the multiple data blocks and a third vector representing a feature of the data received by the client;  
comparing, by the client, the third vector with the first vector and the second vector to determine the first storage node as a first target storage node;  
determining, by the client, non-deduplicate data blocks from the multiple data blocks by comparing the fingerprint values of the multiple data blocks and fingerprint values of the data blocks stored in the first storage node;  
storing, by the client, the non-deduplicate data blocks from the multiple data blocks to the first target storage node;  
extracting one or more bits from each of the fingerprint values of the multiple data blocks;  
converting a bit whose value is zero in the extracted bits into negative one to obtain a converted bit;  
adding the converted bit that is at a same location in each of the fingerprint values of the multiple data blocks to obtain the third vector;  and 
comparing, in a same multidimensional space, each location of the first vector, the second vector, and the third vector to determine the first vector that forms an included angle with a smallest cosine value with the third vector, the first storage node corresponding to the first vector comprising the first target storage node. 
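
For illustration only, the steps recited in common by the two claims above (dividing the data into blocks, obtaining fingerprint values, extracting bits, converting each zero bit into negative one, and adding the converted bits at the same location) can be sketched as follows. The function names, the fixed block size, the use of SHA-1 as the fingerprint, and the choice of the leading bits are all assumptions made for this sketch; neither claim specifies them.

```python
import hashlib


def split_blocks(data: bytes, block_size: int = 4) -> list[bytes]:
    """Divide the data into fixed-size blocks (block size is an assumption)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def fingerprint_bits(block: bytes, num_bits: int = 8) -> list[int]:
    """Extract the leading num_bits bits of the block's SHA-1 fingerprint."""
    digest = hashlib.sha1(block).digest()
    value = int.from_bytes(digest, "big")
    total_bits = len(digest) * 8
    return [(value >> (total_bits - 1 - i)) & 1 for i in range(num_bits)]


def feature_vector(data: bytes, block_size: int = 4, num_bits: int = 8) -> list[int]:
    """Sum the converted bits (0 becomes -1, 1 stays 1) at each bit location."""
    vector = [0] * num_bits
    for block in split_blocks(data, block_size):
        for i, bit in enumerate(fingerprint_bits(block, num_bits)):
            vector[i] += 1 if bit == 1 else -1
    return vector
```

With a single block every component of the result is either 1 or -1; with k blocks every component lies between -k and k.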

It is noted that the claimed limitations of claims 1-20 of Patent Application 16/240,358 are similar to those of claims 1-18 of U.S. Patent No. 10,210,186.  It appears to be proper to apply the judicially created doctrine of obviousness-type double patenting to the claims at issue.

Claim Rejections - 35 USC § 103
5.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

6.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


7.	The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

8.	Claims 1-18 are rejected under 35 U.S.C. 103 as being unpatentable over Condict (US 2011/0099351) in view of Friedlander et al. (US 2014/0344548) hereinafter Friedlander.

In claim 1, Condict discloses “A data processing method in a data processing system comprising a client, the data processing method comprising: 
receiving, by the client, data ([0056] the storage manager 460 generates operations to load (retrieve) the requested data from disk 212 if it is not resident in memory 320.  If the information is not in memory 320, the storage manager 460 indexes into a metadata file to access an appropriate entry and retrieve a logical VBN.  The storage manager 460 then passes a message structure including the logical VBN to the RAID system 480; the logical VBN is mapped to a disk identifier and disk block number (DBN) and sent to an appropriate driver (e.g., SCSI) of the disk driver system 490.  The disk driver accesses the DBN from the specified disk 212 and loads the requested data block(s) in memory for processing by the node); 
dividing, by the client, the data into multiple data blocks ([0059], the data 500 stored in each node and any new data that is added to any node in the cluster is divided logically into contiguous sequences called "deduplication segments", "data segments" or simply "segments". These segments 501 are the basic units of deduplication/data sharing); 
extracting one or more bits from fingerprint values obtained for each of the multiple data blocks ([0070] For each deduplication segment, whether fixed size or variable-sized, a content hash is computed from the contents of the segment.  This content hash is a fixed-length sequence of bits.  In one embodiment it could be a 64-bit checksum computed from the contents.  In another, it could be the SHA-1 function computed on the contents of the segment.  The important property of the hash function is that any two segments with different contents have a very high probability of having different content hash values; that is, the hash function produces few collisions.  The content hash of a deduplication segment can also be called the segment's "fingerprint", since it identifies a particular data segment contents with a very high confidence level); 
obtaining a new vector by adding the converted bit that is at a same location in each of the fingerprint values of the multiple data blocks ([0076] For each chunk 503, a similarity hash 510 is defined, based on the content hashes 508 of all of the segments 501 in the chunk.  Assuming each content hash 508 is n bits long and there are k segments in each chunk 503, the similarity hash 510 can be an array of n integers, each in the range of 0 through k, where the integer value at position i is the number of occurrences of the value 1 in bit position i across the content hashes 508 of all of the segments 501 in the chunk)”.
Condict does not appear to explicitly disclose the following limitation; however, Friedlander discloses “converting a bit whose value is zero in the extracted bits into negative one to obtain a converted bit ([0045] In order to sum and compare these logical addresses, they are all first run through a logical address vector converter 410, which flips each "0" in the original logical addresses to a "-1", as depicted in converted logical addresses 302b-308b.  These converted logical addresses 302b-308b are then summed in a logical address vector summer 412, which produces the summed address vector 314.  Summed address vector 314 is a sum of each bit in each particular bit location in each of the converted logical addresses 302b-308b; [0051] converting, by an address vector converter, each zero bit in a set of address vectors that describe a set of physical addresses to a negative one bit to generate a set of converted address vectors.  An address vector summer then sums each bit position from the set of address vectors to generate a summation address vector)”.
Hence, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Condict and Friedlander. The suggestion/motivation for doing so would have been to provide an improved method of analyzing stored data by converting raw data into a first logical address and payload data, where the first logical address describes metadata about the payload data (Abstract).
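
As a minimal sketch of the two cited computations, the passages quoted above from Condict [0076] (counting the occurrences of the value 1 at each bit position) and from Friedlander [0045] (flipping each 0 to -1 and summing per bit position) can be written side by side. The function names and the toy bit patterns are hypothetical and appear in neither reference.

```python
def count_similarity_hash(content_hashes: list[list[int]]) -> list[int]:
    """Condict [0076] as quoted: entry i counts the 1 bits at position i."""
    return [sum(bits) for bits in zip(*content_hashes)]


def signed_sum_vector(bit_vectors: list[list[int]]) -> list[int]:
    """Friedlander [0045] as quoted: flip each 0 to -1, then sum per position."""
    return [sum(1 if bit else -1 for bit in bits) for bits in zip(*bit_vectors)]


# Three toy 4-bit hashes (k = 3 segments, n = 4 bits); values are made up.
hashes = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
]
counts = count_similarity_hash(hashes)   # [2, 0, 2, 2]
signed = signed_sum_vector(hashes)       # [1, -3, 1, 1]
```

For these inputs the counting form yields integers in the range 0 through k, while the signed form yields integers in the range -k through k.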

In claim 2, Condict teaches 
The data processing method of claim 1, further comprising comparing, in a same multidimensional space, each location of a first vector, a second vector, and the new vector to determine a storage node as a first target storage node for deduplication, the first vector representing a feature of data blocks stored in a first storage node, and the second vector representing a feature of data blocks stored in a second storage node ([0065] keeping track of a "center of gravity" of each node, treating the position of each of the node's chunks as the center of a particle of equal mass.  As the storage used on a node approaches its physical limit, or the rate of I/O operations to and from the node approaches the bandwidth limit, its gravitational coefficient is reduced towards zero.  Each new chunk of data tends to be sent to the node that has the strongest gravitational attraction for it, based on the distance of the chunk's location from the center of gravity of the node, and the node's gravitational coefficient).  

In claim 3, Condict teaches 
The data processing method of claim 2, further comprising: 
determining, by the client, non-deduplicate data blocks from the multiple data blocks by comparing the fingerprint values of the multiple data blocks and fingerprint values of the data blocks stored in the first storage node ([0062] The technique then computes a value 510 called a "similarity hash" for each chunk, as a function of the content hashes 508 of all of the segments in the chunk.  The similarity hash 510 in one embodiment is an integer vector that contains a fixed-number of components, where the number of components is the same as the number of bits, n, in each content hash.  This vector is treated as a point in n-dimensional space); and 
storing, by the client, the non-deduplicate data blocks ([0062] An important property of a similarity hash 510 is that two chunks 503 that contain many identical segments 501 will have similarity hashes that are close to one another in the n-dimensional vector space, i.e., closer than would be expected by chance.  The similarity hash 510 of a chunk may also be referred to as the "position" of the chunk in n-space).  

In claim 4, Condict teaches 
The data processing method of claim 2, further comprising comparing, in the same multidimensional space, each location of the new vector, the first vector, and the second vector to determine the first vector as a vector closest to the new vector ([0063] To decide which node of the cluster is most likely to contain data that is most similar to new data that is to be added to the storage cluster, the technique keeps track of, for each node, the geometric center (or just "center") of the positions of all the chunks on the node.  When a new chunk of data is added to the cluster, it is sent to the node whose center is closest to the "position" of the chunk).  
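
The routing rule quoted from Condict [0063] (send a new chunk to the node whose center is closest to the chunk's position) can be sketched as below. Computing the center as a component-wise mean and measuring closeness by squared Euclidean distance are assumptions of this sketch; the quoted passages do not fix either choice, and the node names and positions are made up.

```python
def geometric_center(positions: list[list[float]]) -> list[float]:
    """Component-wise mean of the chunk positions held on one node (assumed)."""
    count = len(positions)
    return [sum(column) / count for column in zip(*positions)]


def squared_distance(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance; avoids the square root when only ranking."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_node(chunk_position: list[float],
                 node_centers: dict[str, list[float]]) -> str:
    """Return the name of the node whose center is nearest to the chunk."""
    return min(
        node_centers,
        key=lambda node: squared_distance(chunk_position, node_centers[node]),
    )


centers = {
    "node_a": geometric_center([[2, 0, 2], [4, 0, 0]]),  # [3.0, 0.0, 1.0]
    "node_b": geometric_center([[0, 4, 0], [0, 2, 2]]),  # [0.0, 3.0, 1.0]
}
target = closest_node([3, 1, 1], centers)  # "node_a"
```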

In claim 5, Condict teaches 
The data processing method of claim 3, further comprising sending the fingerprint values of the multiple data blocks to the first storage node to search for duplicate data by comparing the fingerprint values of the multiple data blocks and the fingerprint values of the data blocks stored in the first storage node to determine non-deduplicate data blocks from the multiple data blocks ([0095] identifies the node whose geometric center is closest to the new similarity hash for the chunk.  This is a search among a relatively small number of items, i.e., the number of nodes in the cluster.  For each node, this operation involves a computation of the distance between two points in n-dimensional space.  The distance measure can be, but does not have to be, Euclidean (which involves expensive squaring and square root operations)).  

In claim 6, Condict teaches 
The data processing method of claim 3, further comprising loading the fingerprint values of the data blocks stored in the first storage node to search for duplicate data by comparing the fingerprint values of the multiple data blocks and the fingerprint values of the data blocks stored in the first storage node to determine non-deduplicate data blocks from the multiple data blocks ([0067], large data sequences (files, extents, or blocks) are broken into fixed-length or variable-length pieces called segments for the purpose of deduplication.  The goal of deduplication is that two different segments with the same contents do not exist.  In a storage cluster, it is desirable to achieve this across all the nodes of the cluster, such that no two nodes of the cluster contain a segment with the same contents).
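
For illustration, the fingerprint comparison recited in claims 3, 5 and 6 (determining non-deduplicate data blocks by comparing the fingerprint values of the incoming blocks with those of the blocks stored in the first storage node) can be sketched as a set lookup. The helper names, the use of SHA-1, and the decision to also drop repeats within the incoming data are assumptions of this sketch, not features taken from the claims or from Condict.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Fingerprint of a block; SHA-1 is used here only as an example hash."""
    return hashlib.sha1(block).hexdigest()


def non_duplicate_blocks(blocks: list[bytes],
                         stored_fingerprints: set[str]) -> list[bytes]:
    """Return the blocks whose fingerprints are not already known to the node."""
    seen = set(stored_fingerprints)  # copy so the caller's set is untouched
    fresh = []
    for block in blocks:
        value = fingerprint(block)
        if value not in seen:
            seen.add(value)  # also suppresses repeats within the new data
            fresh.append(block)
    return fresh


stored = {fingerprint(b"aaaa")}
to_store = non_duplicate_blocks([b"aaaa", b"bbbb", b"bbbb", b"cccc"], stored)
# to_store == [b"bbbb", b"cccc"]
```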

Claims 7-12 are essentially the same as claims 1-6 except that they recite the claimed invention as a system, and are rejected for the same reasons as applied hereinabove.

Claims 13-18 are essentially the same as claims 1-6 except that they recite the claimed invention as a nonvolatile computer readable storage medium, and are rejected for the same reasons as applied hereinabove.

Allowable Subject Matter
9.	Newly added claims 19 and 20, which recite “comparing, in a same multidimensional space, each location of a first vector, a second vector, and the new vector to determine a storage node as a first target storage node for deduplication, wherein the first vector represents a feature of data blocks stored in a first storage node, wherein the second vector represents a feature of data blocks stored in a second storage node, and wherein the first storage node corresponding to the first vector is the first target storage node responsive to a first included angle formed by the first vector and the new vector having a smaller cosine value than a second included angle formed by the second vector and the new vector”, are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
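
As a hedged sketch of the comparison recited in claims 19 and 20, the cosine of the included angle between the new vector and each stored vector can be computed as the dot product divided by the product of the vector lengths. The function names and sample vectors are hypothetical. The selection rule below follows the quoted wording literally (the node whose included angle has the smaller cosine value); because cosine decreases as the angle grows from 0 to 180 degrees, a reader who takes the limitation to mean the smallest angle would select the larger cosine instead.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine of the included angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0:
        raise ValueError("cosine is undefined for a zero-length vector")
    return dot / norm


def select_target_node(new_vector: list[float],
                       node_vectors: dict[str, list[float]]):
    """Pick the node per the literal claim wording (smaller cosine value)."""
    cosines = {name: cosine(vec, new_vector) for name, vec in node_vectors.items()}
    return min(cosines, key=cosines.get), cosines


node_vectors = {"first": [1, -3, 1, 1], "second": [3, 1, -1, 1]}
chosen, cosines = select_target_node([3, 1, -1, 1], node_vectors)
# cosines["first"] is 0.0 (orthogonal); cosines["second"] is approximately 1.0
```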

Response to Arguments
10.	In the remarks, the applicant argues that:
The Office Action proposes modifying or combining Condict with Friedlander to instead compute the similarity hash (i.e., the purported analog in Condict to the “new vector” of exemplary claim 1), by adding converted bits in the same location in each fingerprint value of multiple data blocks, where the converted bits are obtained by converting bit values of zero into negative one. However, such modification or combination is inappropriate. For example, neither Condict nor Friedlander describes how Condict’s geometric center could be computed if Condict’s similarity hash were modified as proposed, thus rendering Condict inoperable for its intended purpose and/or destroying a principle of operation of Condict. Additionally, neither Condict nor Friedlander describes how Condict’s center of gravity could be computed if Condict’s similarity hash were modified as proposed, which renders Condict further inoperable for its intended purpose and/or destroys a principle of operation of Condict.
Examiner Responds: Friedlander discloses hashing raw data to generate logical addresses (FIG. 2 and 3 [0030][0031][0032][0033]), using a logical address vector converter to convert the logical addresses, and comparing logical addresses to determine their relativity/similarity (FIG. 4 [0045][0046][0051][0052][0053]).  Condict discloses computing a hash for each data segment in each chunk stored in each node ([0015]); for each chunk 503, a similarity hash 510 is defined, based on the content hashes 508 of all of the segments 501 in the chunk (Fig. 5A [0076]); for each node 208 of the cluster, an array of integers of the same size as the similarity hash is also maintained, which is called the geometric center of the node ([0083]); and tracking a "center of gravity" of each node, treating the position of each of the node's chunks as the center of a particle of equal mass ([0065][0100][0101]).  The Applicant argues, without supporting evidence or explanation, that combining Friedlander’s logical address vector conversion with Condict would render Condict’s teaching inoperable and/or destroy a principle of operation. The Applicant also questions how Condict’s “geometric center” and “center of gravity” would be computed if bits from the fingerprint values were converted according to Friedlander’s teaching.  However, Condict does not suggest that “converting bits of fingerprint value” or “particular bits of fingerprint value” will alter the principle of computing the “geometric center” and “center of gravity”.  Condict and Friedlander are analogous art because they are both from the area of hashing data to determine/identify the similarity between data, and therefore they are combinable.

11.	To expedite prosecution, the Examiner suggests that the Applicant (1) file a Terminal Disclaimer to overcome the Double Patenting rejections; and (2) incorporate the allowable subject matter into all of the independent claims.
Conclusion
12.	THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUAWEN A PENG whose telephone number is (571)270-5215.  The examiner can normally be reached on Mon thru Fri 8 am to 4 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Boris Gorney, can be reached at 571-270-5626.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/HUAWEN A PENG/Primary Examiner, Art Unit 2158