DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
This office action is in response to the communication filed on 04/29/2019.
Claims 1-19 are pending for consideration.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 7 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

	Regarding claim 7, the claim recites a limitation “the data extent” on line 8 of the claim. There is a lack of antecedent basis for the limitation.  The term “data extent” was not mentioned before that limitation.
 
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-6, 10-12 and 18-19 are rejected under 35 U.S.C.101 because the claimed invention is directed to abstract ideas without significantly more.
	Step 1 Statutory Category:
		Claims 1-6 and 10-12 are directed to a method of performing deduplication, which is a process. The claims are directed to a statutory category.
		Claim 18 is directed to an apparatus comprising processing circuitry coupled to memory, which is a machine.  The claim is directed to a statutory category.
		Claim 19 is directed to a computer program product comprising a non-transitory computer-readable storage medium, which is an article of manufacture. The claim is directed to a statutory category.
	Step 2A Prong 1 Judicial exception:
		The independent claims recite the following limitations which have been identified as reciting a Mental Process:
		Claim 1 recites “… selecting … a sub-block of a block of… by applying a deterministic selection criterion; performing a lookup … of a digest…; effecting storage of the block …”;
	Step 2A Prong 2, additional elements that integrate into a practical application of the exception:
		Claim 1 includes steps such as “… applying a deterministic selection criterion…;  … a digest generated by hashing the selected sub-block …; entry identifying a previously processed block …; … pointing to the previously processed block”.  Applying a deterministic selection step without details can be as simple as selecting the first entry.  Generating a digest of data is a common hashing mechanism and is also a simple application of a math formula that can be done with pen and paper.  Identifying a previously processed block is the simple idea of referencing another piece of data that has been written.  An example would be referencing another article by its title and publication date, or referencing prior art by its ID and page/section numbers, without copying and pasting the information inline.  These are insignificant extra-solution activities, see MPEP 2106.05(b)(I). When considered individually or as an ordered combination, the claims as a whole do not integrate the judicial exception into a practical application (see MPEP 2106.04(d)(II)).
		The idea of searching for a duplicated segment of data using its signature
	Step 2B significantly more:
		Claim 1 includes steps such as “… applying a deterministic selection criterion…;  … a digest generated by hashing the selected sub-block …; entry identifying a previously processed block …; effecting storage of the block … pointing to the previously processed block”. The additional steps of applying a deterministic selection, generating a signature of a piece of data, and referencing a written piece of data are basic human actions performed on a general purpose computer.  They are well known and well understood ideas for a person skilled in the art, see MPEP 2106.05(d)(II) and Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015).  When considered individually or viewed as an ordered combination, the claims as a whole do not amount to significantly more than the abstract idea (See MPEP 2106.05 (a)).
		Independent claims 18 and 19 are a system claim and an article of manufacture claim respectively corresponding to claim 1.  The claims are similarly rejected for the same reasons as that of claim 1.
		Regarding dependent claim 2, the claim recites “… identifying a target range of the previously processed block whose contents match a corresponding range of the block of data; and wherein pointing to the previously processed block includes storing metadata in connection with the block that associates the identified target range as being part of the block of data.”  These steps do not cure the deficiencies of claim 1 with respect to the abstract idea.  They are observation and evaluation of data; instead of recording the duplicated data, the data is recorded using a reference to the duplicated data.  This is simply a reference to another document using its ID, such as a publication ID or a book title, chapter and page number.  These are basic human processes applied on a conventional computer using generic hardware.  As a result, claim 2 is an abstract idea.
	Regarding claim 3, the claim recites “the previously processed block also includes a unique range that does not match contents of the block of data”.  This is an intended use that does not have any side effect on the method being performed or on how the comparison result is used.  The indication that some part of the data does not match when doing a data subset search is not a new idea.  It neither adds value nor amounts to significantly more than an abstract idea (See MPEP 2106.05 (a)).  As a result, the claim is still an abstract idea.
	Regarding claim 4, the claim recites “the entry further identifies a first offset … wherein the selected sub-block is located at a second offset within the block of data; identifying the target range includes comparing the second offset with the first offset, and determining that the second offset differs from the first offset”.  These are further details of observation, evaluation and determination.  These are basic human actions applied on a generic computer.  The further idea of using an offset to identify a start of a block of data is not new and does not improve on existing technologies.  The determination that the second offset differs from the first offset is not significantly more.  When comparing data, it’s expected to either match or not match.   When considered individually or as an ordered combination, the limitations do not integrate the judicial exception into a practical application (see MPEP 2106.04(d)(II)) and are not significantly more (See MPEP 2106.05 (a)).  As a result, the claim is an abstract idea.
	Regarding claim 5, the claim recites “… identifying the target range includes: generating a first block digest by hashing the block in its entirety; comparing the first block digest with a previously-stored second block digest that was generated by hashing the previously processed block in its entirety, and determining that the first block digest differs from the second block digest.”  The use of hashing of data for comparison is a common practice for a person skilled in the art.  Comparing two blocks of data is a basic human activity implemented on a conventional computer with conventional hardware.  That the result of a comparison is either a match (same) or a mismatch (different) is an expected outcome.
As a result, when these limitations are considered individually or as an ordered combination, the claim as a whole does not integrate the judicial exception into a practical application (see MPEP
identifies a first offset …; and wherein identifying the target range includes: identifying the prime sub-block as part of the target range; and comparing a first adjacent sub-block ...”.  These steps are basic human actions of observation, evaluation and determination that are performed on a generic computer with conventional hardware.  They can be done with pen and paper.  The claim further recites “the previously processed block having an offset adjacent to the first offset with a second adjacent sub-block of the block of data having an offset adjacent to the second offset to identify whether the first adjacent sub-block is also part of the target range”. Doing a sequential search starting at a location of matched data is a basic human process applied on a generic computer.  It’s a primitive search method without any novelty in itself.  Combined with other limitations as a whole or in an ordered sequence, the method does not integrate the judicial exception into a practical application (see MPEP 2106.04(d)(II)) and is not significantly more than an abstract idea (See MPEP 2106.05 (a)).  The claim is an abstract idea.

	Regarding claim 10, the claim recites “… generating a block digest by hashing the block in its entirety; and inserting another entry, indexed by the block digest, into the deduplication table, the other entry identifying the block as having been processed”.  Generating a unit ID of a block of data using a hashing method is not new and is commonly used in the art.  The saving of this hash and marking a block of data as
identifying the target range includes: generating a first block digest by hashing the block in its entirety; and comparing the first block digest to a previously-stored second block digest that was generated by hashing the previously processed block in its entirety, and in response, determining that the first block digest equals the second block digest”.  These additional details do not cure the deficiencies of the parent claim with regard to the abstract idea.  They use a common technique of hashing for blocks of data for identification and 
	
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-3, 5, 10-12 and 18-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Colgrove et al. (US 9940060 B1 hereinafter Colgrove).
	Regarding claim 1, Colgrove teaches a method of performing deduplication, the method comprising:
		selecting, by applying a deterministic selection criterion, a sub-block of a block of data that contains multiple sub-blocks (Colgrove col. 3 lines 47-51: each data block of the stream of data blocks 110 to generate corresponding hash values 120 before determining whether to store the data block at the persistent storage resource 170 [Examiner note: since each data block of the stream of data blocks was used, the 1st data block of the stream of data blocks corresponds to the sub-block.  The first data block is deterministic since it’s not changed if the same method is applied again to the same stream of data].);
		performing a lookup, into a deduplication table, of a digest generated by hashing the selected sub-block, the lookup matching an entry indexed by the digest in the deduplication table, the entry identifying a previously processed block (Colgrove col. 4 lines 13-14: … A deduplication map level 166 may refer to a data structure, such as a table with rows (also referred to as “entries” herein); Colgrove col. 4 lines 37-45: The rows may define a value (also referred to as a “key” herein), such as a complete hash value (also referred to as a “hash value” herein), for a data block. The row may also identify a location of the data block (also referred to as a “value of a key-value pair” associated with the key), such as an address in the persistent storage resource 170; Colgrove col. 3 lines 52-54: ... If the corresponding hash values matches previously stored hash values, the contents of the data blocks 110 may be a copy of contents of a previously received data blocks [Examiner note: the hash value corresponds to the digest, deduplication map corresponds to deduplication table].); and
		effecting storage of the block, including pointing to the previously processed block (Colgrove col. 3 lines 55-58: Instead of storing the contents of the data blocks 110, a pointer to the previously received data blocks with the matching hash values may be used to replace the contents of the subsequent data blocks).
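
For illustration only, the select, lookup and point flow recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Colgrove or from the application: the names `select_sub_block`, `store_block`, `dedup_table` and `storage`, the 4-byte sub-block size, the choice of SHA-256, and the "first sub-block" selection criterion are all hypothetical. A practical system would also verify which range of the two blocks actually matches before relying on the pointer.

```python
import hashlib

SUB_BLOCK_SIZE = 4  # hypothetical sub-block size in bytes


def select_sub_block(block: bytes) -> bytes:
    """Assumed deterministic selection criterion: always take the first sub-block."""
    return block[:SUB_BLOCK_SIZE]


def digest(data: bytes) -> str:
    """Digest of a sub-block (SHA-256 is an arbitrary choice for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def store_block(block_id: str, block: bytes, dedup_table: dict, storage: dict) -> bool:
    """Store a block; return True if it was stored as a pointer to an earlier block."""
    key = digest(select_sub_block(block))
    previous_id = dedup_table.get(key)  # lookup into the deduplication table
    if previous_id is not None:
        # Entry found: record a pointer to the previously processed block.
        storage[block_id] = ("pointer", previous_id)
        return True
    # No entry: keep the data and register the block under the sub-block digest.
    storage[block_id] = ("data", block)
    dedup_table[key] = block_id
    return False
```

In this sketch the same input always yields the same selected sub-block, which is the only sense in which the selection is "deterministic" here.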

	Regarding claim 2, Colgrove teaches the method of claim 1,
		wherein the method further comprises identifying a target range of the previously processed block whose contents match a corresponding range of the block of data (Colgrove Fig. 1 [image: media_image1.png]; Colgrove col. 3 lines 52-54: If the corresponding hash values matches previously stored hash values, the contents of the data blocks 110 may be a copy of contents of a previously received data blocks.); and
		wherein pointing to the previously processed block includes storing metadata in connection with the block that associates the identified target range as being part of the block of data (Colgrove col. 3 lines 55-58: Instead of storing the contents of the data blocks 110, a pointer to the previously received data blocks with the matching hash values may be used to replace the contents of the subsequent data blocks).
	Regarding claim 3, Colgrove teaches the method of claim 2 wherein the previously processed block also includes a unique range that does not match contents of the block of data (Colgrove col. 10 lines 31-36: If one or more, but not all the hash values 120, match the partial hash values of index summary 162 in volatile memory 160, storage system 155 may determine to look at data in the index summary 162 around the matching partial hash values and expand the search from that point or points).
	Regarding claim 5, Colgrove teaches the method of claim 3, wherein identifying the target range includes:
		generating a first block digest by hashing the block in its entirety (Colgrove col. 12 lines 4-6: the deduplication process may perform a hash function on the data block 110 to generate a hash value 120.);
		comparing the first block digest with a previously-stored second block digest that was generated by hashing the previously processed block in its entirety (Colgrove col. 12 lines 6-8: … the hash value 120 may be compared with the partial hash values 212 that are stored in the index summary level 210), and
		determining that the first block digest differs from the second block digest does not match with any of the partial hash values 212 currently stored in the index summary level 210 … then a copy of the received data block 110 may not already be stored in the persistent storage resource 170. As such, the received data block 110 may be stored in the persistent storage resource 170).
	Regarding claim 10, Colgrove teaches the method of claim 3 wherein the method further comprises:
		generating a block digest by hashing the block in its entirety (Colgrove col. 12 lines 4-6: The deduplication process may perform a hash function on the data block 110 to generate a hash value 120.); and
		inserting another entry, indexed by the block digest, into the deduplication table, the other entry identifying the block as having been processed (Colgrove col. 12 lines 15-21: … the index summary level 210 and the deduplication map level 220 may be updated or recreated to register the received data block 110. For example, an entry 222 of the deduplication map level 220 may be modified to include the hash value 120 of the data block 110 and a physical location identifier 224 of the data block 110 is persistent storage resource 170.).
	Regarding claim 11, Colgrove teaches the method of claim 2 wherein contents of the block of data in its entirety are identical to the contents of the identified target range (Colgrove col. 12 lines 40-49: if the hash value 120 of the received data block 110 is included in one of the entries 222 of the deduplication map 
	Regarding claim 12, Colgrove teaches the method of claim 11, wherein identifying the target range includes:
		generating a first block digest by hashing the block in its entirety (Colgrove col. 12 lines 4-6: … The deduplication process may perform a hash function on the data block 110 to generate a hash value 120.); and
		comparing the first block digest to a previously-stored second block digest that was generated by hashing the previously processed block in its entirety, and in response, determining that the first block digest equals the second block digest (Colgrove col. 11 lines 56-62: The deduplication map level 220 may include multiple pages 221. Each of the pages 221 may include multiple entries 222 where each entry 222 includes a complete hash value 223 and a physical location identifier 224 of a data block stored in persistent storage resource 170. Each entry 222 of the pages 221 may include a different complete hash value 223; Colgrove col. 12 lines 27-30: Each entry 222 in the page 221 may be searched to determine whether the hash value 120 of the data block 110 is currently included in one of the entries 222 of the page 221; Colgrove col. 12 lines 40-44: if the hash value 120 of the received data block 110 is included in one of the entries 222 of the deduplication map level 220, then the received data block 110 may be a duplicate or a copy of another data block.).
	Regarding claims 18-19, the claims are a system claim and an article of manufacture claim, respectively, corresponding to method claim 1.  Claims 18-19 are rejected for the same reasons as claim 1.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Colgrove in view of Khan et al. (US 20170220295 A1 hereinafter Khan) and further in view of Williams (US 5990810 A hereinafter Williams).

	Regarding claim 4, Colgrove teaches the method of claim 3.
		However, Colgrove does not explicitly teach the entry further identifies a first offset within the previously processed block, the first offset providing a location of contents within the previously processed block that are shared with the selected sub-block;
			wherein the selected sub-block is located at a second offset within the block of data; and
			wherein identifying the target range includes comparing the second offset with the first offset, and determining that the second offset differs from the first offset.
		Khan teaches a method for deduplication where an entry further identifies a first offset within the previously processed block, the first offset providing a location of contents within the previously processed block that are shared with the selected sub-block (Khan [0031]: … the duplicate analysis module 206 illustratively includes a hash generator 208 configured to generate a hash 216 of each sub-block of a data block to be stored, as described in more detail below .... In this method, referred to herein as hash based index addressing, at least a portion of the computed hash acts as an address range (i.e., a pointer) and this address range has several locations (i.e., entries) in which that data sub-block may be stored. In at least some embodiments, each stored address range (i.e., pointer) may include bits that indicate which entry in the address range contains the exact data of the stored data sub-block [Examiner note: the pointer corresponds to the first offset]);
			wherein the selected sub-block is located at a second offset within the block of data (Khan Fig. 9 [image: media_image2.png] [Examiner note: element 901 shows multiple sub-blocks, with index 0, 1 … 63.  The index of the sub-block corresponds to the second offset]); and
			wherein identifying the target range includes comparing the second  [sub-block] with the first  [sub-block], and determining that the second  [sub-block] differs from the first  [sub-block]  ([Examiner note: the crossed over limitations are discussed below]; Khan [0018]: The data storage controller 102 is configured to perform the deduplication operation in response to a data storage instruction by comparing each sub-block of a data block to be stored to previously stored sub-blocks. … a pointer to the physical address of the location in the data block at which the sub-block is stored is added to a data pointer table associated with the data block to be stored. As subsequent sub-blocks of the data block are to be written to the data table, the data storage device 100 compares each subsequent sub-block to the previously stored sub-blocks to determine whether the particular sub-block is duplicative of an earlier stored sub-block. As discussed below, the data storage device may use one of multiple methods to determine if sub-blocks are duplicates; Khan [0019] If a subsequent sub-block is not a duplicate of any previously stored sub-blocks, the subsequent sub-block is stored in the data table and a new pointer is added to the data pointer table ... [Examiner note: Khan discloses comparing a sub-block within the data block to a sub-block from the previously stored sub-blocks; however, Khan does not disclose comparing the two sub-blocks using the sub-blocks’ offsets.  This is discussed below.]).
		It is obvious to a person of ordinary skill in the art before the effective filing date to incorporate the teachings of Khan, which discloses identifying a first offset within a previously processed block, the second offset of the selected sub-block within the block of data, and identifying the target range by comparing the second offset with the first offset and determining that the second offset differs from the first offset, into the teaching of Colgrove to result in the limitations:
			the entry further identifies a first offset within the previously processed block, the first offset providing a location of contents within the previously processed block that are shared with the selected sub-block;
			wherein the selected sub-block is located at a second offset within the block of data; and
			wherein identifying the target range includes comparing the second [sub-block] with the first [sub-block], and determining that the second [sub-block] differs from the first [sub-block].
conserves capacity of the memory 116, which can be used to store additional data).
		Although the combined teachings of Colgrove and Khan teach the limitations of claim 4 (see discussion above), the combined teachings of Colgrove and Khan do not explicitly teach comparing using the offset of the first sub-block and the offset of the second sub-block.
		Williams teaches comparing sub-blocks using the references to the sub-blocks (Williams col. 15 line 64 to col. 16 line 4: Comparing subblocks. In most applications of this invention, there will be a need at some stage to identify identical subblocks.  This can be done in a variety of ways:
	Compare the subblocks themselves.
	Compare the hashes of the subblocks.
	Compare identities of the subblocks.
	Compare references to the subblocks [Examiner note: the references to the subblocks correspond to the offsets of the sub-blocks]).
		It is obvious to a person of ordinary skill in the art before the effective filing date to incorporate the teaching of Williams, which teaches comparing sub-blocks can be done using either the sub-blocks themselves or the references to the sub-blocks, into the combined teachings of Colgrove and Khan to result in the limitations of the claimed invention.
efficient method for identifying identical portions of data within a group of blocks of data, and for using this identification to increase the efficiency of systems that store and communicate data).
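
For illustration only, three of the comparison options quoted from Williams above (comparing the subblocks themselves, their hashes, or references to them) can be sketched as follows. This is a minimal sketch under stated assumptions, not code from Williams: the function names, the use of SHA-256, and the representation of a reference as a hypothetical `(block_id, offset)` tuple are all assumptions made for this example.

```python
import hashlib


def same_by_content(a: bytes, b: bytes) -> bool:
    """Compare the subblocks themselves, byte for byte."""
    return a == b


def same_by_hash(a: bytes, b: bytes) -> bool:
    """Compare the hashes of the subblocks (SHA-256 chosen arbitrarily here)."""
    return hashlib.sha256(a).digest() == hashlib.sha256(b).digest()


def same_by_reference(ref_a: tuple, ref_b: tuple) -> bool:
    """Compare references to the subblocks.

    A reference is modeled as a hypothetical (block_id, offset) tuple; equal
    references name the same stored subblock without reading its contents.
    """
    return ref_a == ref_b
```

Comparing references is the cheapest of the three but only detects that two references name the same stored subblock; it cannot show that two differently located subblocks hold equal contents.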

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Colgrove in view of Georgiev (US 20150370495 A1 hereinafter Georgiev).
	Regarding claim 6, Colgrove teaches the method of claim 3.
		However, Colgrove does not explicitly teach:
			wherein an entry further identifies a first offset within the previously processed block, the first offset providing a location of a prime sub-block within the previously processed block whose contents are shared with the selected sub-block;
			wherein the selected sub-block is located at a second offset within the block of data; and
			wherein identifying the target range includes:
				identifying the prime sub-block as part of the target range; and
				comparing a first adjacent sub-block of the previously processed block having an offset adjacent to the first offset with a second adjacent sub-block of the block of data having an offset adjacent to the second offset to identify whether the first adjacent sub-block is also part of the target range.
		Georgiev teaches an entry further identifies a first offset within the previously processed block, the first offset providing a location of a prime sub-block within the previously processed block whose contents are shared with the selected sub-block (Georgiev [0038]: …  Deduplication is region-based, i.e., performed for generally arbitrary-size contiguous units of storage referred to herein as “regions” and having respective start and end addresses (or offsets); Georgiev [0061]: … The offset field 154 for an entry stores the block address of that data block on the physical volume 52. The offset value indicates address alignment of the data block, which is relevant to the quality of the entry and thus used in evaluation as described below; Georgiev [0078]: … This example depicts data blocks of an area of a volume being written to with increasing logical block addresses from left to right. The result is that a single hash table entry for duplicated block D covers a range of blocks (a-f).
[image: media_image3.png]
[Examiner note: the offset of D in “abcDef” on the left side in the first row corresponds to the first offset; D in “abcDef” on the left side of the first row corresponds to the prime sub-block; D in “abcD” on the right side of the second row corresponds to the selected sub-block]);
			wherein the selected sub-block is located at a second offset within the block of data (Georgiev [0078]: … This example depicts data blocks of an area of a volume being written to with increasing logical block addresses from left to right. The result is that a single hash table entry for duplicated block D covers a range of blocks (a-f).
[image: media_image3.png]
[Examiner note: the offset of D in “abcD” (to the right) of the second row corresponds to the second offset of the selected sub-block]); and
			wherein identifying the target range includes: identifying the prime sub-block as part of the target range (Georgiev [0078]: … 
[image: media_image3.png]
[Examiner note: “abcDef” of the last row corresponds to the target range; The second row of the table shows D as part of the target range]); and
			comparing a first adjacent sub-block of the previously processed block having an offset adjacent to the first offset with a second adjacent sub-block of the block of data having an offset adjacent to the second offset to identify whether the first adjacent sub-block is also part of the target range (Georgiev [0017] … upon occurrence of a hit in the hash table for a given data block in a range of newly written data blocks, comparing the data blocks of the range to corresponding data blocks in a range identified by the hit to maximize a size of a corresponding region to be identified by the translation table as duplicate data; Georgiev [0078]:

[image: media_image3.png]
[Examiner note: in the second row, a scan left of D was made to identify “c” as part of the target range]).
		It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Georgiev, which discloses a method for extending the matching range of data sub-blocks with previously processed data sub-blocks, into the teaching of Colgrove to result in the limitations of the claimed invention.
		One of ordinary skill in the art would have been motivated to do so because both Colgrove and Georgiev are directed to the same field of endeavor, and using Georgiev’s teaching would help improve the efficiency of data deduplication (Georgiev [0005]: Key aspects of the disclosed technique include range-based identification of duplicate data, which can promote efficient operation by avoiding a need for block-by-block translations when performing storage I/O operations.).
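
For illustration only, the adjacent sub-block comparison discussed above (starting from the prime sub-block and extending the target range left and right while neighboring sub-blocks keep matching) can be sketched as follows. This is a minimal sketch under stated assumptions, not Georgiev's implementation: the function name `extend_target_range`, the modeling of blocks as Python lists of sub-blocks, and the inclusive index pair returned are all hypothetical choices for this example.

```python
def extend_target_range(prev_block, new_block, first_offset, second_offset):
    """Return (start, end), inclusive sub-block indices in prev_block, of the
    matching range around the prime sub-block at first_offset.

    prev_block / new_block: sequences of sub-blocks (hypothetical model).
    second_offset: index of the selected sub-block within new_block.
    Assumes prev_block[first_offset] == new_block[second_offset].
    """
    start = end = first_offset

    # Scan left: compare sub-blocks adjacent to the two offsets, moving outwards.
    p, n = first_offset - 1, second_offset - 1
    while p >= 0 and n >= 0 and prev_block[p] == new_block[n]:
        start = p
        p -= 1
        n -= 1

    # Scan right in the same way until a comparison fails or a block ends.
    p, n = first_offset + 1, second_offset + 1
    while p < len(prev_block) and n < len(new_block) and prev_block[p] == new_block[n]:
        end = p
        p += 1
        n += 1

    return start, end
```

Sub-blocks of the new block that fall outside the returned range would, in such a scheme, be the contents that still need to be written to persistent storage.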

	Regarding claim 7, Colgrove in view of Georgiev teaches the method of claim 6, wherein identifying the target range further includes:
		working outwards from the prime sub-block, comparing additional sub-blocks of the previously processed block to corresponding sub-blocks of the block of data until comparison fails (Georgiev [0078]: … 
[image: media_image3.png]
[Examiner note: in the second row, a scan left of D was made to identify “a-D” as part of the target range; in the third row, “e”, which is to the right of “D”, was checked and found to be in the range, and likewise “f” in the fourth row.]); and
		identifying each additional sub-block of the previously processed block for which comparison to a corresponding sub-block of the block of data succeeded as part of the data extent (Georgiev [0078]: … 
[image: media_image3.png]
[Examiner note: the 
		Colgrove in view of Georgiev thus far teaches the limitations of claim 7 (see discussion above).  However, Colgrove in view of Georgiev thus far does not yet teach wherein effecting storage of the block further includes writing contents of the block of data that are not also included within the identified target range to persistent storage.
		Georgiev further teaches wherein effecting storage of the block further includes writing contents of the block of data that are not also included within the identified target range to persistent storage (Georgiev [0081] … Thus, the comparison that is performed above may also further detect non-duplicate new data blocks in a range of data blocks, and such non-duplicate new data blocks are processed to create one or more new entries in the hash table distinct from the existing entry for the prior data; Georgiev Fig. 14: 
    [Image: media_image4.png]
[Examiner note: each entry of the table above contains a value for OFFSET 154.  This is the physical address of the data.  Because each entry records a physical address for its content, the reference discloses that the contents of the block of data that are not within the target range are also written to physical storage]).
		It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Georgiev, which discloses writing contents of the block of data that are not included within the identified target range to persistent storage, into the teaching of Colgrove to result in the limitations of the claimed invention.
		One of ordinary skill in the art would be motivated to do so as both Colgrove and Georgiev are directed to the same field of endeavor; and using Georgiev’s teaching would help improve the efficiency of data deduplication (Georgiev [0005]: Key aspects of the disclosed technique include range-based identification of duplicate data, which can promote efficient operation by avoiding a need for block-by-block translations when performing storage I/O operations.)
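
For illustration only, the outward comparison discussed above can be sketched in Python. This is a minimal sketch and is not taken from Colgrove or Georgiev: the function names `extend_match` and `unmatched_sub_blocks` are hypothetical, and the sketch assumes both blocks are lists of equally sized sub-blocks compared at the same index. Starting from a sub-block already known to match, it scans left and then right until a comparison fails, and then lists the sub-blocks outside the matched range, which are the ones that would still need to be written to persistent storage.

```python
def extend_match(new_block, old_block, prime_idx):
    """Return the inclusive (start, end) index range of matching sub-blocks
    around prime_idx, or None if the prime sub-block itself does not match.

    Assumption: sub-blocks are compared at identical indices in both blocks.
    """
    if new_block[prime_idx] != old_block[prime_idx]:
        return None
    limit = min(len(new_block), len(old_block))
    start = end = prime_idx
    # Scan left until a comparison fails or the block start is reached.
    while start > 0 and new_block[start - 1] == old_block[start - 1]:
        start -= 1
    # Scan right until a comparison fails or the block end is reached.
    while end + 1 < limit and new_block[end + 1] == old_block[end + 1]:
        end += 1
    return start, end


def unmatched_sub_blocks(new_block, match_range):
    """Return (index, sub_block) pairs that fall outside the matched range."""
    start, end = match_range
    return [(i, sb) for i, sb in enumerate(new_block) if not (start <= i <= end)]


# Usage: the prime sub-block is at index 1; indices 1..2 match.
old = [b"a", b"D", b"e", b"f"]
new = [b"q", b"D", b"e", b"z"]
rng = extend_match(new, old, 1)
leftover = unmatched_sub_blocks(new, rng)
```

In this usage example only the sub-blocks at indices 0 and 3 remain to be written; the matched range would be recorded as a reference to the previously processed block.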

	Regarding claim 8, Colgrove in view of Georgiev teaches the method of claim 6, wherein identifying the target range further includes: 		successfully comparing additional sub-blocks of the previously processed block up to an end of the previously processed block with corresponding sub-blocks of the block of data (Georgiev [0017] … upon occurrence of a hit in the hash table for a given data block in a range of newly written data blocks, comparing the data blocks of the range to corresponding data blocks in a range identified by the hit to maximize a size of a corresponding region to be identified by the translation table as duplicate data.; Georgiev [0078]:
 … 
    [Image: media_image3.png]
[Examiner note: Georgiev discloses the comparison to maximize the range.  In the table above, Georgiev discloses scanning on the left and right to find duplicates.  As a result, it would go to the end of the previously processed block if the blocks continue to match.]); and		identifying each additional sub-block of the previously processed block for which comparison to a corresponding sub-block of the block of data succeeded as part of the target range (Georgiev [0017] … upon occurrence of a hit in the hash table for a given data block in a range of newly written data blocks, comparing the data blocks of the range to corresponding data blocks in a range identified by the hit to maximize a size of a corresponding region to be identified by the translation table as duplicate data.; Georgiev [0078]: … 
    [Image: media_image3.png]
); 		Colgrove in view of Georgiev teaches the limitations of claim 8 (see discussion above). However, Colgrove in view of Georgiev thus far does not yet teach wherein the method further includes:			identifying an adjacent previously processed block located adjacent to the previously processed block within a logical address space;			comparing additional sub-blocks of the adjacent previously processed block with corresponding sub-blocks of the block of data; and			identifying each additional sub-block of the adjacent previously processed block for which comparison to a corresponding sub-block of the block of data succeeded as part of another target range whose contents are included within the block of data; and			wherein pointing to the previously processed block further includes storing metadata in connection with the block that associates the identified other target range as being part of the block of data.
	Georgiev further teaches that the method further includes:			identifying an adjacent previously processed block located adjacent to the previously processed block within a logical address space (Georgiev Fig. 4: 
    [Image: media_image5.png]
 ;Georgiev 
    [Image: media_image3.png]
 [Examiner note: the virtual vol. 50 in figure 4 corresponds to logical address space]);		comparing additional sub-blocks of the adjacent previously processed block with corresponding sub-blocks of the block of data (Georgiev [0078]: … 
    [Image: media_image3.png]
); and		identifying each additional sub-block of the adjacent previously processed block for which comparison to a corresponding sub-block of the block of data succeeded as part of another target range whose contents are included within the block of data (Georgiev [0079]: … These contents can be quickly checked against corresponding blocks within DS regions in order to extend the DS regions accordingly. Thus when a write at offset x occurs and x is the end of the last duplicate the data for this write is compared with the data present at x+last F.sub.Δ where last F.sub.Δ is from the last duplicate found for the current thread. If the data matches, confirming duplication, the existing DS range is extended. [Examiner note: x corresponds to the offset of the other target range]); and		wherein pointing to the previously processed block further includes storing metadata in connection with the block that associates the identified other target range as being part of the block of data (Georgiev Fig. 6:
    [Image: media_image6.png]
[Examiner note: when extending the range, the start address or end address of the range 75 changes to reflect the new range to include data from the identified other target range.  The start address or end address is associated with the other target range.  As a result, the metadata (each row of Fig. 6) is associated with the other target range]).

		One of ordinary skill in the art would be motivated to do so as both Colgrove and Georgiev are directed to the same field of endeavor; and using Georgiev’s teaching would help improve the efficiency of data deduplication (Georgiev [0005]: Key aspects of the disclosed technique include range-based identification of duplicate data, which can promote efficient operation by avoiding a need for block-by-block translations when performing storage I/O operations.)

	Regarding claim 9, Colgrove in view of Georgiev teaches the method of claim 8.
		Colgrove in view of Georgiev thus far does not teach:
			wherein all sub-blocks of the block of data compare successfully to corresponding sub-blocks of the previously processed block and the adjacent previously processed block; and
			wherein effecting storage of the block further includes not writing any portion of the block to persistent storage;
		Georgiev further teaches:			wherein all sub-blocks of the block of data compare successfully to corresponding sub-blocks of the previously processed block and the adjacent previously processed block (Georgiev [0037]: … duplicate data are detected and mapped to locations of shared data on a physical volume (PHYS VOL) 52. This reduces the amount of physical storage space required to support a virtual volume 50 of a given size. In the illustrated example, duplicate copies of blocks A and B of the virtual volume 50 are realized by respective pointers to a single shared instance of each of A and B); and 			wherein effecting storage of the block further includes not writing any portion of the block to persistent storage (Georgiev [0037]: … duplicate data are detected and mapped to locations of shared data on a physical volume (PHYS VOL) 52. This reduces the amount of physical storage space required to support a virtual volume 50 of a given size. In the illustrated example, duplicate copies of blocks A and B of the virtual volume 50 are realized by respective pointers to a single shared instance of each of A and B. The physical regions that would otherwise have been used to store the duplicate regions A and B can be used to store other data. Such space is made available by reporting a correspondingly larger virtual volume size [Examiner note: Georgiev discloses that when both regions A and B are found to be duplicates, no additional physical space is needed for storing A and B.  As a result, Georgiev discloses not writing any portion of the duplicated block to persistent storage]).
		It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Georgiev, which discloses a method for storing data blocks when all data blocks match previously written data blocks without writing any portion of the data blocks to persistent storage, into the teaching of Colgrove to result in the limitations of the claimed invention.
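		One of ordinary skill in the art would be motivated to do so as both Colgrove and Georgiev are directed to the same field of endeavor; and using Georgiev’s teaching would help improve the efficiency of data deduplication (Georgiev [0005]: Key aspects of the disclosed technique include range-based identification of duplicate data, which can promote efficient operation by avoiding a need for block-by-block translations when performing storage I/O operations).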

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Colgrove in view of Jaroch (US 20190245692 A1 hereinafter Jaroch).

	Regarding claim 14, Colgrove teaches the method of claim 1.
		However, Colgrove does not teach wherein selecting, by applying the deterministic selection criterion, includes:			calculating an entropy of each of the multiple sub-blocks of the block; and			selecting a sub-block from the multiple sub-blocks having a highest calculated entropy as the selected sub-block.
		Jaroch teaches wherein selecting, by applying the deterministic selection criterion, includes: 			calculating an entropy of each of the multiple sub-blocks of the block (Jaroch [0035]: … The processing further includes determining a number of entropy values associated with multiple sub-areas of the image pixels of the image.); and			selecting a sub-block from the multiple sub-blocks having a highest calculated entropy as the selected sub-block (Jaroch [0035]: ... A highest entropy value among the entropy values and a sub-area corresponding to the highest entropy value are determined. An anchor point 15-a within the image pixels of the image is identified. The identified anchor point 15-a has coordinate values matching coordinate values of the determined sub-area, and is used to search or query the database 17 to check if the image is original or flagged as being, for example, copyright protected. [Examiner note: using the anchor point to determine that an image is original means the image data shows it is not a duplicate.  If an image is copyright protected, the image sub-area data would indicate that it matches another image’s sub-area data]).
		It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Jaroch to select a target block for data matching using entropy calculation into the teaching of Colgrove to result in the limitations of the claimed invention.
		One of ordinary skill in the art would be motivated to do so as it would help improve the efficiency of data deduplication (Jaroch [0021]: Additionally, the subject technology avoids the need to sample the entire image for finding the highest local entropy, which can be quite time consuming.)
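
For illustration only, the entropy-based selection recited above can be sketched in Python. This is a minimal sketch and is not taken from Colgrove or Jaroch: it assumes byte-level Shannon entropy as the entropy measure and fixed-size sub-blocks, and the function names `shannon_entropy` and `select_highest_entropy` are hypothetical. Ties are resolved in favor of the first sub-block so that the selection stays deterministic.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def select_highest_entropy(block: bytes, sub_block_size: int):
    """Split block into fixed-size sub-blocks and return (index, entropy)
    of the sub-block with the highest entropy (first one wins on a tie)."""
    subs = [block[i:i + sub_block_size]
            for i in range(0, len(block), sub_block_size)]
    entropies = [shannon_entropy(s) for s in subs]
    best = max(range(len(subs)), key=lambda i: entropies[i])
    return best, entropies[best]


# Usage: three 8-byte sub-blocks with 1, 8, and 2 distinct byte values.
block = b"\x00" * 8 + bytes(range(8)) + b"\x01" * 4 + b"\x02" * 4
index, entropy = select_highest_entropy(block, 8)
```

Here the middle sub-block, which contains eight distinct byte values, is the one selected.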

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Colgrove in view of Jaroch and further in view of Stoakes et al. (US 20130179408 A1 hereinafter Stoakes).
Regarding claim 15, Colgrove in view of Jaroch teaches the method of claim 14 wherein the method further comprises, for another block of data that also contains multiple sub-blocks:		calculating an entropy of each of the multiple sub-blocks of the other block (Jaroch [0035]: … The processing further includes determining a number of entropy values associated with multiple sub-areas of the image pixels of the image.);		selecting a sub-block from the multiple sub-blocks of the other block having a highest calculated entropy as a candidate sub-block (Jaroch [0035]: ... A highest entropy value among the entropy values and a sub-area corresponding to the highest entropy value are determined. [Examiner note: image data of sub-area corresponds to the candidate sub-block]);		cif the hash value 120 of the received data block 110 is included in one of the entries 222 of the deduplication map level 220, then the received data block 110 may be a duplicate or a copy of another data block.).
comparing the highest calculated entropy of the candidate sub-block to a predetermined threshold value, yielding a threshold result; and
			in response to the comparison yielding, as its threshold result, a determination that the highest calculated entropy of the candidate sub-block is less than the predetermined threshold value, 
		Stoakes teaches comparing the highest calculated entropy of the candidate sub-block to a predetermined threshold value, yielding a threshold result (Stoakes [0071]: … If the decision at 315 is that the blocklet entropy is not below the threshold or within the range, then method 300 will proceed to 360 where a duplicate determination will be commenced.); and			in response to the comparison yielding, as its threshold result, a determination that the highest calculated entropy of the candidate sub-block is less than the predetermined threshold value,  lower the entropy of the data, the less likely that a conventional rolling hash will find a boundary in the data and the more likely that a maximum sized blocklet will be produced; Stoakes [0071]: … If the decision at 315 is that the blocklet entropy is not below the threshold or within the range, then method 300 will proceed to 360 where a duplicate determination will be commenced. As described above, additional information may be provided to the duplicate determiner since it may be faster to compress the blocklet than to access high-latency memory (e.g., disk) to do a full duplicate determination. [Examiner note: Stoakes discloses performing duplicate determination if the entropy is equal to or above a threshold.  Stoakes discloses that with low entropy, it is more likely that a maximum sized blocklet will be produced for a boundary.  As a result, Stoakes teaches that low entropy would be more likely to produce a boundary of a whole data block versus a sub-block.  Stoakes further discloses that it may be faster to compress the blocklet than to go on with deduplication on the blocklet.  The examiner interprets the claimed invention as checking for data deduplication of the whole block, instead of continuing to search for data duplication on the sub-block level, when the threshold level of entropy of the sub-block is not met.  
The choice of compression or performing data deduplication on the whole block (versus sub-block) is simply swapping one known method with another known method that achieves a desirable result]).
		It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of Stoakes, which teaches comparing a blocklet’s entropy to a threshold to determine whether searching for duplication on the sub-block level should be performed or whether a maximum data block size should be used for the duplication search, into the combined teachings of Colgrove and Jaroch to result in the limitations of the claimed invention.
improve efficiency in a data de-duplication process, the blocklet pattern identifying may be performed independently from a data structure and process used by the duplicate blocklet determiner).

	Regarding claim 16, Colgrove in view of Jaroch and Stoakes teaches the method of claim 15 wherein the method further comprises, in response to looking up, in the deduplication table, the other digest generated by hashing the other block in its entirety (Colgrove col. 12 lines 4-6: The deduplication process may perform a hash function on the data block 110 to generate a hash value 120; Colgrove col. 12 lines 9-15: If the part of the hash value 120 does not match with any of the partial hash values 212 currently stored in the index summary level 210 …):		failing to find any entry indexed by the other digest in the deduplication table (Colgrove col. 12 lines 9-15: If the part of the hash value 120 does not match with any of the partial hash values 212 currently stored in the index summary level 210 …);		writing contents of the other block to persistent storage (Colgrove col. 12 lines 12-15: … then a copy of the received data block 110 may not already be stored in the persistent storage resource 170. As such, the received data block 110 may be stored in the persistent storage resource 170); and		inserting another entry indexed by the other digest into the deduplication table, the other entry identifying the other block as having been processed (Colgrove col. 12 lines 15-21: … the index summary level 210 and the deduplication map level 220 may be updated or recreated to register the received data block 110. For example, an entry 222 of the deduplication map level 220 may be modified to include the hash value 120 of the data block 110 and a physical location identifier 224 of the data block 110 is persistent storage resource 170.).

	Regarding claim 17, Colgrove in view of Jaroch and Stoakes teaches the method of claim 15 wherein the method further comprises, in response to looking up, in the deduplication table, the other digest generated by hashing the other block in its entirety (Colgrove col. 12 lines 4-6: The deduplication process may perform a hash function on the data block 110 to generate a hash value 120; Colgrove col. 12 lines 9-15: If the part of the hash value 120 does not match with any of the partial hash values 212 currently stored in the index summary level 210 …):		finding another entry indexed by the other digest in the deduplication table, the other entry identifying another previously processed block (Colgrove col. 12 lines 40-49: if the hash value 120 of the received data block 110 is included in one of the entries 222 of the deduplication map level 220, then the received data block 110 may be a duplicate or a copy of another data block. The contents of the received data block 110 may not be stored in the persistent storage resource 170 and the received data block 110 may be stored as a pointer to the physical location (e.g., physical location identifier 224) identified by the entry 222 that includes the matching complete hash value 223 of the other data block.); and		effecting storage of the other block by pointing to the previously processed block and not writing any portion of the other block to persistent storage (Colgrove col. 12 lines 44-49: … The contents of the received data block 110 may not be stored in the persistent storage resource 170 and the received data block 110 may be stored as a pointer to the physical location (e.g., physical location identifier 224) identified by the entry 222 that includes the matching complete hash value 223 of the other data block).
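
For illustration only, the miss and hit paths of the deduplication-table lookup cited from Colgrove col. 12 above can be sketched in Python. This is a minimal sketch and is not Colgrove's implementation: the class name `DedupStore`, the use of SHA-256 as the hash function, the in-memory list standing in for persistent storage, and the single-level table (rather than an index summary level plus a deduplication map level) are all assumptions made for brevity. On a miss the block is written and a new table entry is inserted; on a hit nothing is written and only a pointer to the existing physical location is recorded.

```python
import hashlib


class DedupStore:
    """Toy whole-block deduplicating store (hypothetical, for illustration)."""

    def __init__(self):
        self.table = {}      # digest -> physical location of the stored block
        self.storage = []    # stand-in for persistent storage
        self.pointers = {}   # logical address -> physical location

    def write(self, logical_addr: int, block: bytes) -> bool:
        """Store block at logical_addr. Returns True if data was physically
        written (table miss), False if only a pointer was recorded (hit)."""
        digest = hashlib.sha256(block).digest()
        location = self.table.get(digest)
        wrote = location is None
        if wrote:
            # Miss: write the contents and register the block in the table.
            self.storage.append(block)
            location = len(self.storage) - 1
            self.table[digest] = location
        # Hit or miss, the logical address points at the physical location.
        self.pointers[logical_addr] = location
        return wrote

    def read(self, logical_addr: int) -> bytes:
        return self.storage[self.pointers[logical_addr]]


# Usage: the second write of identical contents stores only a pointer.
store = DedupStore()
first = store.write(0, b"abc")
second = store.write(1, b"abc")
```

The sketch trusts the digest match alone; a production system might additionally compare block contents on a hit, which is outside what is shown here.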
	Allowable Subject Matter
Claim 13 is objected to as being dependent upon rejected base claims, but would be allowable if rewritten in independent form including all of the limitations of the base claims and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
20180349053 A1	Data Deduplication in a Storage System	
20170199891 A1	DISTRIBUTED DATA DEDUPLICATION IN A GRID OF PROCESSORS
20130268496 A1	INCREASED IN-LINE DEDUPLICATION EFFICIENCY

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Vy Huy Ho whose telephone number is (571) 272-3261.  The examiner can normally be reached on Monday - Friday 7:30 am-5:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Vital can be reached on (571) 272-4215.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





02/23/2021
/V.H.H/
Examiner, Art Unit 2162

/PIERRE M VITAL/Supervisory Patent Examiner, Art Unit 2162