DETAILED ACTION
This is in response to the application filed on 01/29/2021. 
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Applicant preliminarily amended the claims on 08/06/2021, canceling claims 1-30 as originally filed on 01/29/2021 and adding 20 new claims, which were incorrectly numbered as claims 2-21.  Claims 1-30 as originally filed should have been indicated as “canceled,” and the new claims numbered 2-21 should have been numbered as claims 31-50. See the claim objection below. 

Claim Objections
Claims 2-21 are objected to for incorrect numbering.
The numbering of claims is not in accordance with 37 CFR 1.126 which requires the original numbering of the claims to be preserved throughout the prosecution.  When claims are canceled, the remaining claims must not be renumbered.  When new claims are presented, they must be numbered consecutively beginning with the number next following the highest numbered claims previously presented (whether entered or not).
Claims 1-30 as originally filed on 01/29/2021 were preliminarily canceled on 08/06/2021. The listing of the claims should have indicated claims 1-30 as “canceled.” Furthermore, the applicant incorrectly numbered the new claims as claims 2-21. 
In order to advance the prosecution of the case, misnumbered claims 2-21 have been renumbered by the Examiner as claims 31-50. 
Applicant is advised to renumber the claims correctly and to indicate the proper status of each claim. Correction is required. 

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 08/27/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees.  A nonstatutory double patenting rejection is appropriate where the claims at issue are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the reference application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO internet Web site contains terminal disclaimer forms which may be used.  Please visit http://www.uspto.gov/forms/.  The filing date of the application will determine what form should be used.  A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.  For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.  
Claims 2-21 (as renumbered to claims 31-50) are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-29 of U.S. Patent No. 10,938,961 in view of Shilane et al., US 10,838,990 (Shilane, hereafter).
As shown below, the subject matter claimed in claim 2 (as renumbered to claim 31) of the instant application is disclosed in claim 1 of the patent and is covered by the patent except the bolded sections:

Instant Application, claim 2:
2. (New) A method for data processing, comprising: (a) generating at least a first segment and a second segment from one or more input data streams, wherein said first segment comprises a first plurality of chunks and said second segment comprises a second plurality of chunks; (b) computing (i) a first sketch of said first segment and (ii) a second sketch of said second segment, wherein said first sketch comprises a first set of features that are representative of or unique to said first segment and said first set of features are computed using a first subset of chunks selected from said first plurality of chunks, wherein said second sketch comprises a second set of features that are representative of or unique to said second segment and said second set of features are computed using a second subset of chunks selected from said second plurality of chunks, wherein said first set of features corresponds to said first segment and said second set of features corresponds to said second segment; (c) processing said first sketch and said second sketch to generate a similarity metric indicative of whether said second segment is similar to said first segment; and (d) when said similarity metric is equal to or greater than a similarity threshold, performing a differencing operation on said first segment relative and said second segment to determine a difference between said first segment and said second segment at a chunk level.

U.S. Patent No. 10,938,961, claim 1:
1. A method for data reduction, comprising: (a) receiving one or more input data streams from one or more client applications; (b) generating at least a first segment and a second segment from said one or more input data streams, wherein said first segment comprises a first plurality of chunks and said second segment comprises a second plurality of chunks; (c) computing (i) a first sketch of said first segment and (ii) a second sketch of said second segment, wherein said first sketch comprises a first set of features that are representative of or unique to said first segment, wherein said second sketch comprises a second set of features that are representative of or unique to said second segment, wherein said first set of features corresponds to said first segment and said second set of features corresponds to said second segment; (d) processing said first sketch and said second sketch to generate a similarity score that is indicative of a degree of similarity between said second segment and said first segment; (e) determining a hash strength based at least in part on said similarity score; (f) selecting a hash function having said hash strength to provide high processing throughput for a differencing operation, wherein said hash function having said hash strength is selected from a first plurality of hash functions having different hashing strengths; and (g) subsequent to (f), (1) using said hash function having said hash strength selected in (f) to perform said differencing operation on said second segment relative to said first segment when said similarity metric is greater than or equal to a similarity threshold, or (2) storing said first segment and said second segment in a database without performing said differencing operation when said similarity metric is less than said similarity threshold.



On the other hand, Shilane discloses regions of data (e.g. data segments) comprising a plurality of data chunks. Shilane further discloses generating sketches for regions based on the sketches of their data chunks. The sketch for a chunk is computed based on features of the chunk (See Shilane: at least col. 6, lines 5-23, col. 11, lines 9-45, and Figs. 22-23 and associated text). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the claimed subject matter of U.S. Patent No. 10,938,961 with Shilane’s teaching so that said first set of features is computed using a first subset of chunks selected from said first plurality of chunks, and said second set of features is computed using a second subset of chunks selected from said second plurality of chunks. The motivation for doing so would have been to improve compression of the data chunks of similar segments by first identifying similar data segments and then recognizing and compressing the data chunks of those segments based on the sketches of the data chunks. 

	
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-5, 7-19, and 21 (as renumbered to claims 31-34, 36-48 and 50) are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al., US 8,849,772 (Huang, hereafter) in view of Shilane et al., US 10,838,990 (Shilane, hereafter).
Regarding claim 2 (as renumbered to claim 31),
Huang discloses a method for data processing, comprising: 
(a) generating at least a first segment and a second segment from one or more input data streams (See Huang: at least Figs. 2A-3 and col. 4, lines 13-30, generating a plurality of segments, including a first and a second segment, from data streams (e.g. new and stored data segments), wherein the segments include sub-segments (i.e. one or more characters));
(b) computing (i) a first sketch of said first segment and (ii) a second sketch of said second segment, wherein said first sketch comprises a first set of features that are representative of or unique to said first segment and said first set of features are computed (See Huang: at least col. 3, lines 8-33, generating sketches for a plurality of segments (first and second segments) based on characteristics (i.e. features) associated with each segment);
(c) processing said first sketch and said second sketch to generate a similarity metric indicative of whether said second segment is similar to said first segment (See Huang: at least Fig. 7, col. 3, lines 8-33, and col. 9, lines 30-67, processing the sketches to generate similarity metrics such as similarity fractions/percentages); and 
(d) when said similarity metric is equal to or greater than a similarity threshold, performing a differencing operation on said first segment relative and said second segment to determine a difference between said first segment and said second segment at a chunk level (See Huang: at least Fig. 7, col. 3, lines 8-33, and col. 9, line 30 through col. 10, line 4, when the similarity percentage/fraction for two segments is over a threshold, performing a delta/differencing operation for the matched segments at character level).
Although Huang discloses computing sketches for segments of data streams using sets of values characterizing the data segment (i.e. sets of features), and the segments include a plurality of characters (i.e. sub-segments), Huang does not expressly teach dividing said first and second segments into a plurality of chunks and computing features of the first and second subsets of chunks to generate first and second sketches for the first and second segments.
On the other hand, Shilane discloses regions of data (e.g. data segments) comprising a plurality of data chunks. Shilane further discloses generating sketches for regions based on the sketches of their data chunks. The sketch for a chunk is computed based on features of the chunk (See Shilane: at least col. 6, lines 5-23, col. 11, lines 9-45, and Figs. 22-23 and associated text). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Huang with Shilane’s teaching so that said first segment comprises a first plurality of chunks and said second segment comprises a second plurality of chunks; and said first set of features is computed using a first subset of chunks selected from said first plurality of chunks, and said second set of features is computed using a second subset of chunks selected from said second plurality of chunks. The motivation for doing so would have been to improve compression of the data chunks of similar segments by first identifying similar data segments and then recognizing and compressing the data chunks of those segments based on the sketches of the data chunks. 
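For illustration only, steps (a) through (d) as recited in claim 2 can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of Huang, Shilane, or the instant application: the fixed-size chunking, the selection of the k chunks with the smallest SHA-1 values as the subset of chunks, the Jaccard ratio as the similarity metric, the 0.5 threshold, and all function names are hypothetical choices made for the example.

```python
import hashlib


def chunk(segment: bytes, size: int = 8) -> list[bytes]:
    """Split a segment into fixed-size chunks (assumed chunking scheme)."""
    return [segment[i:i + size] for i in range(0, len(segment), size)]


def chunk_hash(c: bytes) -> int:
    """Hash one chunk to a 64-bit integer (truncated SHA-1, an assumption)."""
    return int.from_bytes(hashlib.sha1(c).digest()[:8], "big")


def sketch(chunks: list[bytes], k: int) -> set[int]:
    """Sketch = hashes of the k chunks with the smallest hash values.

    Only a subset of the chunks contributes features (min-hash style).
    """
    return set(sorted({chunk_hash(c) for c in chunks})[:k])


def similarity(s1: set[int], s2: set[int]) -> float:
    """Jaccard ratio of two sketches, used here as the similarity metric."""
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 1.0


def chunk_diff(ref_chunks: list[bytes], new_chunks: list[bytes]):
    """Chunk-level difference: chunks of the new segment absent from the reference."""
    ref = {chunk_hash(c) for c in ref_chunks}
    return [(i, c) for i, c in enumerate(new_chunks) if chunk_hash(c) not in ref]


def process(first: bytes, second: bytes, k: int = 4, threshold: float = 0.5):
    """Return (similarity, diff); diff is None when below the threshold."""
    c1, c2 = chunk(first), chunk(second)
    score = similarity(sketch(c1, k), sketch(c2, k))
    if score >= threshold:
        return score, chunk_diff(c1, c2)
    return score, None
```

In this sketch, two identical segments yield a similarity of 1.0 and an empty difference, while segments sharing no chunks yield 0.0 and no differencing is performed.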
Regarding claim 3 (as renumbered to claim 32),
the combination of Huang and Shilane discloses wherein said first subset of chunks or said second subset of chunks are selected using a fitting algorithm (See Huang: at least col. 5, lines 17-37 and Shilane: at least col. 6, lines 5-23, col. 11, lines 9-45, and col. 24, lines 10-29).
Regarding claim 4 (as renumbered to claim 33),
the combination of Huang and Shilane discloses wherein said fitting algorithm comprises a minimum hash function (See Huang: at least col. 5, lines 17-37 and Shilane: at least col. 6, lines 5-23, col. 11, lines 9-45, and col. 24, lines 10-29).
Regarding claim 5 (as renumbered to claim 34),
the combination of Huang and Shilane discloses wherein said differencing operation comprises generating a reference set of hashes on said first plurality of chunks and generating a second set of hashes on said second plurality of chunks (See Huang: at least col. 5, lines 45-67 and col. 7, lines 26-49 and Shilane: at least col. 6, lines 5-23, col. 11, lines 9-45, and col. 24, lines 10-29). 
Regarding claim 7 (as renumbered to claim 36),
the combination of Huang and Shilane discloses wherein said differencing operation further comprises comparing said second set of hashes to said reference set of hashes in a sequential order to determine a match (See Huang: at least col. 5, lines 45-67 and col. 7, lines 26-49 and Shilane: at least col. 6, lines 5-23, col. 11, lines 9-45, and col. 24, lines 10-29).  
Regarding claim 8 (as renumbered to claim 37),
the combination of Huang and Shilane discloses wherein said differencing operation further comprises generating and storing a single pointer that references collectively to a series of sequential chunks from a subset of said second plurality of chunks, upon determining that the series of sequential chunks have hashes that find a match from said reference set of hashes, and a follow-on subsequent chunk to said series of sequential chunks has a hash that does not find a match from said reference set of hashes (See Huang: at least Fig. 6 and its associated text and Shilane: at least Figs. 14-16, col. 5, lines 20-37, col. 8, lines 34-44, generating and storing a single bucket/reference for a group of similar chunks upon determining a match between the sketches/hashes of chunks and stored sketches/hashes).  
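For illustration only, the single-pointer limitation recited in claim 8 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Huang or Shilane: the use of SHA-1, the tuple layout of the output entries, and the names `h` and `encode_runs` are hypothetical. Hashes of the second plurality of chunks are compared in sequential order against the reference set of hashes, and each run of consecutively matching chunks is collapsed into one pointer that is emitted when a follow-on chunk fails to continue the run.

```python
import hashlib


def h(c: bytes) -> str:
    """Hash one chunk (SHA-1 is an assumption for this example)."""
    return hashlib.sha1(c).hexdigest()


def encode_runs(ref_chunks: list[bytes], new_chunks: list[bytes]) -> list[tuple]:
    """Encode new_chunks relative to ref_chunks.

    Output entries are ("ptr", ref_start, run_length) for a run of
    sequential matching chunks, or ("lit", chunk) for an unmatched chunk.
    """
    # Reference set of hashes, mapped to the first position of each chunk.
    ref_index: dict[str, int] = {}
    for i, c in enumerate(ref_chunks):
        ref_index.setdefault(h(c), i)

    out: list[tuple] = []
    run = None  # [ref_start, run_length] of the run currently being extended
    for c in new_chunks:
        pos = ref_index.get(h(c))
        # Extend the run only if this chunk is the next sequential reference chunk.
        if pos is not None and run is not None and pos == run[0] + run[1]:
            run[1] += 1
            continue
        # The run ended: store a single pointer for the whole series.
        if run is not None:
            out.append(("ptr", run[0], run[1]))
            run = None
        if pos is not None:
            run = [pos, 1]
        else:
            out.append(("lit", c))
    if run is not None:
        out.append(("ptr", run[0], run[1]))
    return out
```

Because one pointer stands in for a whole series of chunks, the resulting encoding holds fewer pointers than a per-chunk encoding would.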
Regarding claim 9 (as renumbered to claim 38),
the combination of Huang and Shilane discloses wherein said single pointer is used in part to produce a sparse index comprising of a reduced set of pointers (See Shilane: at least col. 16, lines 1-28, Fig. 9, and Figs. 14-16, sketch index).  
Regarding claim 10 (as renumbered to claim 39),
the combination of Huang and Shilane discloses wherein said similarity threshold is at least 50% (See Huang: at least col. 2, lines 48-54, col. 3, lines 8-33, and col. 9, lines 30-67. Huang discloses similarity thresholds such as 75% or 80%. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to make the similarity threshold at least 50%, since it has been held that discovering an optimum value of a result effective variable involves only routine skill in the art. In re Boesch, 617 F.2d 272, 205 USPQ 215 (CCPA 1980)).
Regarding claim 11 (as renumbered to claim 40),
the combination of Huang and Shilane discloses wherein said first plurality of chunks or said second plurality of chunks have variable lengths (See Huang: at least col. 5, lines 30-32 and Shilane: at least col. 19, lines 47-48).  
Regarding claim 12 (as renumbered to claim 41),
the combination of Huang and Shilane discloses wherein said first subset of chunks is less than 10% of said first plurality of chunks (See Huang: at least col. 2, lines 48-54, col. 3, lines 8-33, and col. 9, lines 30-67 and Shilane: at least col. 6, lines 5-23, col. 11, lines 9-45, generating a sketch based on the sketches of some of the chunks. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to make said first subset of chunks less than 10% of said first plurality of chunks, since it has been held that discovering an optimum value of a result effective variable involves only routine skill in the art. In re Boesch, 617 F.2d 272, 205 USPQ 215 (CCPA 1980)).
Regarding claim 13 (as renumbered to claim 42),
the combination of Huang and Shilane discloses wherein said first subset of chunks is less than 1% of said first plurality of chunks (See Huang: at least col. 2, lines 48-54, col. 3, lines 8-33, and col. 9, lines 30-67 and Shilane: at least col. 6, lines 5-23, col. 11, lines 9-45, generating a sketch based on the sketches of some of the chunks. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to make said first subset of chunks less than 1% of said first plurality of chunks, since it has been held that discovering an optimum value of a result effective variable involves only routine skill in the art. In re Boesch, 617 F.2d 272, 205 USPQ 215 (CCPA 1980)).
Regarding claim 14 (as renumbered to claim 43),
the combination of Huang and Shilane discloses wherein said second subset of chunks is less than 10% of said second plurality of chunks (See Huang: at least col. 2, lines 48-54, col. 3, lines 8-33, and col. 9, lines 30-67 and Shilane: at least col. 6, lines 5-23, col. 11, lines 9-45, generating a sketch based on the sketches of some of the chunks. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to make said second subset of chunks less than 10% of said second plurality of chunks, since it has been held that discovering an optimum value of a result effective variable involves only routine skill in the art. In re Boesch, 617 F.2d 272, 205 USPQ 215 (CCPA 1980)).
Regarding claim 15 (as renumbered to claim 44),
the combination of Huang and Shilane discloses wherein said second subset of chunks is less than 1% of said second plurality of chunks (See Huang: at least col. 2, lines 48-54, col. 3, lines 8-33, and col. 9, lines 30-67 and Shilane: at least col. 6, lines 5-23, col. 11, lines 9-45, generating a sketch based on the sketches of some of the chunks. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to make said second subset of chunks less than 1% of said second plurality of chunks, since it has been held that discovering an optimum value of a result effective variable involves only routine skill in the art. In re Boesch, 617 F.2d 272, 205 USPQ 215 (CCPA 1980)).
Regarding claims 16-19 and 21 (as renumbered to claims 45-48 and 50),
the scopes of the claims are substantially the same as claims 2-5 and 7 (as renumbered to claims 31-34 and 36), respectively, and are rejected on the same basis as set forth for the rejections of claims 2-5 and 7 (as renumbered to claims 31-34 and 36), respectively.

Claims 6 and 20 (as renumbered to claims 35 and 49) are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al., US 8,849,772 in view of Shilane et al., US 10,838,990 and further in view of Vijayan et al., US 2013/0232309 (Vijayan, hereafter).
Regarding claim 6 (as renumbered to claim 35),
the combination of Huang and Shilane discloses a reference set of hashes; however, it does not explicitly teach wherein said reference set of hashes are weak hashes.  
On the other hand, Vijayan discloses weak hashes (See Vijayan: at least paras. 31 and 70). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of the combination of Huang and Shilane with Vijayan’s teaching so that said reference set of hashes are weak hashes. The motivation for doing so would have been to save computing resources, because weak hash functions typically require fewer resources.
Regarding claim 20 (as renumbered to claim 49),
the scope of the claim is substantially the same as claim 6 (as renumbered to claim 35), and is rejected on the same basis as set forth for the rejection of claim 6 (as renumbered to claim 35).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Shilane et al., US 8,918,390 discloses a delta compression process in a data storage system that selects a data chunk to delta compress and generates a sketch for the selected data chunk. The method and system search for a set of candidate data chunks with a matching sketch and rank the set of candidate data chunks by degree of sketch matching. 
Shabi et al., US 2020/0341669 discloses (a) selecting, by applying a deterministic selection criterion, a sub-block of a block of data that contains multiple sub-blocks; (b) performing a lookup, into a deduplication table, of a digest generated by hashing the selected sub-block, the lookup matching an entry indexed by the digest in the deduplication table, the entry identifying a previously processed block; and (c) effecting storage of the block, including pointing to the previously processed block. 
Wallace et al., US 10,135,462 discloses deduplicating sub-chunks in a data storage system by selecting a data chunk to deduplicate and generating a sketch for the selected data chunk. A similar data chunk is searched for using the sketch. A set of fingerprints corresponding to sub-chunks of the similar data chunk is loaded. The set of fingerprints for the similar data chunk is compared to a set of fingerprints of the selected data chunk, and the selected chunk is encoded as a set of references to identical sub-chunks of the similar data chunk and at least one unmatched sub-chunk.


Points of Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HARES JAMI whose telephone number is (571)270-1291.  The examiner can normally be reached on M-F 9:00a-5:00p.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Vital, can be reached at 571-272-4215.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Hares Jami/
Primary Examiner, Art Unit 2162
08/23/2022