DETAILED ACTION
Remarks
The instant application, having Application Number 17/387,895 and filed on July 28, 2021, has a total of 20 claims pending; there are 1 independent claim and 19 dependent claims, all of which are presented for examination by the examiner.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Examiner Notes
The examiner cites particular columns and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.
The examiner requests that, in response to this Office action, support be shown for language added to any original claims on amendment and for any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this Office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).

Drawings
The applicant’s drawings submitted are acceptable for examination purposes.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.


Claims 2, 3, 5, 6, 7, 13, 14 and 19 are rejected under 35 U.S.C. § 103 as being unpatentable over Singhai et al. (US 2017/0123689 A1, ‘Singhai’, hereafter) in view of Bhaskar et al. (US 2014/0223029 A1, ‘Bhaskar’, hereafter) and further in view of Wallace et al. (US Patent No. 9,514,146 B1, ‘Wallace’, hereafter).

Regarding claim 2. Singhai teaches a method for data processing, comprising: 
(a) receiving one or more input data streams from one or more client applications (Singhai, [0121] and Fig. 5 discloses, receive data stream where the received data stream herein is interpreted as the first input data stream from one or more client applications over a computer network. Further, Fig. 1, [0090] discloses, a computer network is used as a method of communication between devices); 
(d) comparing said first set of fingerprints with said second set of fingerprints to generate a similarity score indicative of a degree of similarity between said first segment and said second segment (The signature fingerprint computation engine then analyzes the fingerprints by parsing and comparing the fingerprints of the data blocks of the incoming data stream to one or more fingerprints associated with a plurality of reference data blocks and/or reference data sets stored in storage and determines if a match exists, Singhai, [0097]); and
(e) when said similarity score is equal to or greater than a similarity threshold (the matching engine may determine that content of the set of data blocks share a degree of similarity with one or more reference data sets stored in a data store based on an identifier (e.g. resemblance hash). A degree of similarity may include a threshold of similar content between a set of data blocks of an incoming data stream and that of reference data sets stored in storage. In one embodiment, a degree of similarity can be determined by comparing resemblance hashes (i.e. sketches) of data blocks to that of reference data sets. If a similarity exists, the method may advance to block. … The incoming data set may preserve a degree of similarity with that of the reference data set (i.e. previously saved version of the document) based on satisfying a threshold (i.e. a sketch of the current version of the document ‘incoming data set’ is within resemblance of that of the previous version ‘reference data set’ sketch), Singhai, [0158]), processing said first set of chunks of said first segment and said second set of chunks of said second segment by performing a differencing operation to determine a difference between said first segment and said second segment at a chunk level (the matching engine in cooperation with the signature fingerprint computation engine applies a similarity-based algorithm to detect similarities between incoming data and data previously stored in storage. In some embodiments, the matching engine identifies similarity between incoming data and data previously stored by comparing resemblance hashes (e.g. hash sketches) associated with the incoming data and the data previously stored in storage, Singhai, [0097], [0098], [0184]).
Singhai does not teach
(b) generating at least a first segment and a second segment from said one or more input data streams, wherein said first segment comprises a first set of chunks and said second segment comprises a second set of chunks; 
However, Bhaskar teaches
(b) generating at least a first segment and a second segment from said one or more input data streams, wherein said first segment comprises a first set of chunks and said second segment comprises a second set of chunks (Output block assembler reconstructs input data block based on literal segments received from data recovery portion and matched segments received from de-compressor byte cache. …. Each match descriptor specifies where the matched bytes are in de-compressor byte cache, the length of the match and the location of the match segment in decompressed block. Output block assembler simply has to construct the matched part of the block by simply copying the matched byte segments from de-compressor byte cache and placing them in the correct locations of decompressed block, Bhaskar [0088-0089]); 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Singhai and Bhaskar before him/her, to modify Singhai, which teaches similarity-based content matching for storage applications and data deduplication, with the teaching of Bhaskar, which teaches staged data compression where each stage reflects a progressive increase in granularity, resulting in a scalable approach that exhibits improved efficiency and compression performance.  One would have been motivated to do so for the benefit of an efficient, scalable approach to high-compression-gain lossless long-range compression of data traffic (e.g., Internet traffic) (Bhaskar, Abstract and [0007]).
Singhai and Bhaskar do not teach
(c) computing (i) a first set of fingerprints of said first set of chunks and (ii) a second set of fingerprints of said second set of chunks; 
However, Wallace teaches
(c) computing (i) a first set of fingerprints of said first set of chunks and (ii) a second set of fingerprints of said second set of chunks (Wallace, Column 26, Lines 28-32, discloses a hash of a chunk, an encrypted hash of a chunk …, wherein the hash of a chunk, and the encrypted hash of a chunk are interpreted as the plurality of chunk hashes for said subset of chunks);
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Wallace, which teaches techniques for improving data compression of a storage system in an online manner, into the combination of Singhai, which teaches similarity-based content matching for storage applications and data deduplication, and Bhaskar, which teaches staged data compression where each stage reflects a progressive increase in granularity, resulting in a scalable approach that exhibits improved efficiency and compression performance. Additionally, this would reduce issues associated with a lack of storage capacity in an online storage system.  The motivation for doing so would be to improve data compression of a storage system in an online manner (Wallace, Column 1, Lines 24-25).
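For illustration only, steps (b) through (e) of claim 2 as mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Singhai, Bhaskar, Wallace, or the instant application: fixed-size chunking, SHA-256 as the chunk fingerprint, a Jaccard ratio as the similarity score, a 0.5 threshold, and the names `chunk`, `fingerprints`, `similarity`, and `diff_segments` are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, for demonstration only


def chunk(segment: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Step (b), simplified: split one segment into a set of chunks."""
    return [segment[i:i + size] for i in range(0, len(segment), size)]


def fingerprints(chunks: list[bytes]) -> set[str]:
    """Step (c), simplified: one SHA-256 fingerprint per chunk."""
    return {hashlib.sha256(c).hexdigest() for c in chunks}


def similarity(fp_a: set[str], fp_b: set[str]) -> float:
    """Step (d), simplified: Jaccard ratio of the two fingerprint sets."""
    if not fp_a and not fp_b:
        return 1.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def diff_segments(first: bytes, second: bytes, threshold: float = 0.5):
    """Step (e), simplified: difference at chunk level only when similar enough."""
    first_chunks, second_chunks = chunk(first), chunk(second)
    fp_first = fingerprints(first_chunks)
    fp_second = fingerprints(second_chunks)
    score = similarity(fp_first, fp_second)
    if score < threshold:
        return score, None  # below threshold: no differencing performed
    # Chunks of the second segment whose fingerprint is absent from the first.
    changed = [
        (i, c)
        for i, c in enumerate(second_chunks)
        if hashlib.sha256(c).hexdigest() not in fp_first
    ]
    return score, changed


first = b"AAAABBBBCCCCDDDD"
second = b"AAAABBBBXXXXDDDD"
score, changed = diff_segments(first, second)
```

In this example the two segments share three of five distinct chunk fingerprints, so the score meets the assumed threshold and only the third chunk of the second segment is reported as a difference.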
Regarding claim 3. Singhai as modified teaches, wherein said differencing operation comprises: generating a reference set of hashes based on said first set of chunks and generating a second set of hashes based on said second set of chunks, and comparing said second set of hashes to said reference set of hashes in a sequential order (Singhai [0099], [0118], [0148]). 
Regarding claim 5. Singhai as modified teaches, wherein said differencing operation further comprises generating and storing a single pointer that references collectively to a series of sequential chunks from said second set of chunks, upon determining that (a) the series of sequential chunks have hashes that find a match from said reference set of hashes, and (b) a follow-on subsequent chunk to said series of sequential chunks has a hash that does not find a match from said reference set of hashes (Singhai [0099-0100], [0097]).  
Regarding claim 6. Singhai as modified teaches, wherein said single pointer is used in part to produce a sparse index comprising of a reduced set of pointers (Singhai [0099-0100], [0097]).
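For illustration only, the single-pointer limitation of claims 5 and 6 can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Singhai or the instant application: SHA-256 chunk hashes, a dictionary as the reference set of hashes, and the hypothetical names `h` and `coalesce`. A run of sequential matching chunks is emitted as one `("ptr", start, length)` entry once a follow-on chunk fails to match; as a simplification, a run is also closed when a match is not contiguous in the reference.

```python
import hashlib


def h(c: bytes) -> str:
    """Hash of a single chunk (SHA-256 is an assumption)."""
    return hashlib.sha256(c).hexdigest()


def coalesce(reference_chunks, second_chunks):
    """Compare second-set hashes to the reference set in sequential order,
    collapsing each run of consecutive matches into one pointer."""
    ref_index = {}
    for i, c in enumerate(reference_chunks):
        ref_index.setdefault(h(c), i)  # first position of each hash

    out = []
    run = None  # [start position in reference, run length]
    for c in second_chunks:
        pos = ref_index.get(h(c))
        # Extend the current run when the match continues it contiguously.
        if pos is not None and run is not None and pos == run[0] + run[1]:
            run[1] += 1
            continue
        # Otherwise close the current run, if any, as a single pointer.
        if run is not None:
            out.append(("ptr", run[0], run[1]))
            run = None
        if pos is not None:
            run = [pos, 1]          # start a new run
        else:
            out.append(("lit", c))  # unmatched chunk kept literally
    if run is not None:
        out.append(("ptr", run[0], run[1]))
    return out


reference = [b"a", b"b", b"c", b"d"]
second = [b"a", b"b", b"c", b"x", b"d"]
index = coalesce(reference, second)
```

Here three sequential matches collapse into one pointer, so the resulting index holds two pointers instead of four, which is the sense in which the pointer set is reduced.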
Regarding claim 7. Singhai as modified teaches, wherein said similarity threshold is at least 50% (Singhai [0010], [0158]).  
Regarding claim 13. Singhai as modified teaches, wherein said first set of chunks or said second set of chunks have variable lengths (data blocks vary in length, Bhaskar [0050], [0080], [0107]).
Regarding claim 14. Singhai as modified teaches, wherein said first set of fingerprints are computed based on a plurality of hashes associated with a first subset of chunks selected from said first set of chunks, and said second set of fingerprints are computed based on a plurality of hashes associated with a second subset of chunks selected from said second set of chunks (Bhaskar [0043]).
Regarding claim 19. Singhai as modified teaches, wherein the first set of fingerprints or the second set of fingerprints comprise a plurality of hashes generated using one or more hashing algorithms (hash function algorithm, Singhai [0130], [0156], [0099]).

Claim 4 is rejected under 35 U.S.C. § 103 as being unpatentable over Singhai et al. (US 2017/0123689 A1, ‘Singhai’, hereafter) in view of Bhaskar et al. (US 2014/0223029 A1, ‘Bhaskar’, hereafter) in view of Wallace et al. (US Patent No. 9,514,146 B1, ‘Wallace’, hereafter) and further in view of Auh (US 2018/0075262 A1).

Regarding claim 4. Singhai, Bhaskar and Wallace do not teach, wherein said second set of hashes are weak hashes.  
However, Auh teaches wherein said second set of hashes are weak hashes (a digest operation called MD4 which may produce 128 bit long hashes of source data. The MD4 hash may be considered a severely weak hashing algorithm as compared to a hash operation such as 512 bit SHA2 due to its susceptibility to collisions which may be an undesirable trait in hashing algorithms, Auh [0369]).  
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Singhai, Bhaskar, Wallace and Auh before him/her, to further modify Singhai with the teaching of Auh’s data-centric model of computer software design, in which user data may be prioritized over applications.  One would have been motivated to do so for the benefit of re-defining and restructuring certain common data-oriented logical operations in a way that serves users’ privacy, security, convenience and/or capabilities (Auh, Abstract, [0003]).
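For illustration only, one common way a weak hash may be used in a differencing operation is as an inexpensive pre-filter that is confirmed by a stronger hash. This is a minimal sketch under stated assumptions and is not drawn from Auh or the instant application: Adler-32 (via `zlib.adler32`) stands in for the weak hash in place of the MD4 digest discussed in Auh, SHA-256 stands in for the strong hash, and the names `weak`, `strong`, `build_reference`, and `matches` are hypothetical.

```python
import hashlib
import zlib


def weak(c: bytes) -> int:
    """Cheap, collision-prone checksum (Adler-32 is an assumption)."""
    return zlib.adler32(c)


def strong(c: bytes) -> str:
    """Collision-resistant hash used to confirm a weak match."""
    return hashlib.sha256(c).hexdigest()


def build_reference(chunks):
    """Map each weak hash to the strong hashes of the chunks that produced it."""
    table = {}
    for c in chunks:
        table.setdefault(weak(c), set()).add(strong(c))
    return table


def matches(c: bytes, table) -> bool:
    """Check the weak hash first; compute the strong hash only on a weak hit."""
    candidates = table.get(weak(c))
    if candidates is None:
        return False
    return strong(c) in candidates


table = build_reference([b"alpha", b"beta"])
hit = matches(b"alpha", table)
miss = matches(b"gamma", table)
```

The strong hash is computed only for chunks whose weak hash already appears in the reference table, so a weak-hash collision costs extra work but cannot produce a false match.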

Claims 8-12 are rejected under 35 U.S.C. § 103 as being unpatentable over Singhai et al. (US 2017/0123689 A1, ‘Singhai’, hereafter) in view of Bhaskar et al. (US 2014/0223029 A1, ‘Bhaskar’, hereafter) in view of Wallace et al. (US Patent No. 9,514,146 B1, ‘Wallace’, hereafter) and further in view of Tobin et al. (US Patent No. 9,753,935 B1, ‘Tobin’, hereafter).

Regarding claim 8. Singhai, Bhaskar and Wallace do not teach, wherein said second segment is of a same size as said first segment.  
However, Tobin teaches wherein said second segment is of a same size as said first segment (Tobin, Col 2, line 42 – Col 3, line 43 and Col 18, line 58 – Col 19, line 6).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Singhai, Bhaskar, Wallace and Tobin before him/her, to further modify Singhai with the teaching of Tobin’s database system, which includes components for storing time-series data and executing custom, user-defined computational expressions in substantially real-time.  One would have been motivated to do so for the benefit of receiving and processing requests associated with time-series data and providing results to a user device. The database comprises a time-series database storing a first data segment file and a second data segment file, where the first data segment file comprises data associated with a first time-series, where a size of the first data segment file is within a first size range, and where a size of the second data segment file is within the first size range (Tobin, Abstract, Col 2, lines 19-41).
Regarding claim 9. Singhai as modified teaches, wherein said second segment is of a different size than said first segment  (Tobin, Col 2, line 42 – Col 3, line 43 and Col 18, line 58 – Col 19, line 6).
Regarding claim 10. Singhai as modified teaches, wherein said first segment and said second segment each has a size ranging from about 1 megabyte (MB) to about 4 MB (Tobin, Col 18, line 58 – Col 19, line 6).  
Regarding claim 11. Singhai as modified teaches, wherein said first set of chunks and said second set of chunks have different number of chunks (Tobin, Col 18, line 58 – Col 19, line 6).  
Regarding claim 12. Singhai as modified teaches, wherein said first set of chunks and said second set of chunks have a same number of chunks (Tobin, Col 18, line 58 – Col 19, line 6).  

Claims 15-18 are rejected under 35 U.S.C. § 103 as being unpatentable over Singhai et al. (US 2017/0123689 A1, ‘Singhai’, hereafter) in view of Bhaskar et al. (US 2014/0223029 A1, ‘Bhaskar’, hereafter) in view of Wallace et al. (US Patent No. 9,514,146 B1, ‘Wallace’, hereafter) and further in view of Doerner et al. (US 2017/0177266 A1, ‘Doerner’, hereafter).

Regarding claim 15. Singhai, Bhaskar and Wallace do not teach wherein said first subset of chunks is less than 10% of said first set of chunks.  
However, Doerner teaches wherein said first subset of chunks is less than 10% of said first set of chunks (The indirect pattern recognition may be provided using hash based signatures of less than an entire item. For example, the indirect pattern recognition may be based on a subset of chunks that is less than all the chunks associated with data stored in, Doerner [0150]).  
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Doerner, which teaches deduplication of data objects using hashing functions and algorithms, into the combination of Singhai, which teaches similarity-based content matching for storage applications and data deduplication, Bhaskar, which teaches staged data compression where each stage reflects a progressive increase in granularity, and Wallace, which teaches techniques for improving data compression of a storage system in an online manner, resulting in a scalable approach that exhibits improved efficiency and compression performance. Additionally, this would reduce issues associated with a lack of storage capacity.  The motivation for doing so would be to reduce waste of process cycles during the data deduplication process (Doerner, par. [0008]).
Regarding claim 16. Singhai as modified teaches, wherein said first subset of chunks is less than 1% of said first set of chunks (Doerner [0150]).  
Regarding claim 17. Singhai as modified teaches, wherein said first subset of chunks and said second subset of chunks are selected from said first and second sets of chunks respectively using one or more fitting algorithms based on said plurality of hashes generated for said first and second sets of chunks (Doerner, Fig. 14, par. [0101], [0137], algorithm functions, wherein the algorithm functions are interpreted as comprising the fitting algorithms, the subsets being selected from said plurality of chunks using one or more fitting algorithms on a plurality of hashes generated for said plurality of chunks).
Regarding claim 18. Singhai as modified teaches, wherein said one or more fitting algorithms comprises a minimum hash function (Doerner, Fig. 14, par. [0101], [0137], algorithm functions, wherein the algorithm functions are interpreted as comprising the fitting algorithms, the subsets being selected from said plurality of chunks using one or more fitting algorithms on a plurality of hashes generated for said plurality of chunks).
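For illustration only, selecting a small subset of chunks by a minimum hash function, as recited in claims 15 through 18, can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Doerner or the instant application: a bottom-k selection over truncated SHA-1 values stands in for the claimed fitting algorithm, and the names `chunk_hash` and `bottom_k` are hypothetical.

```python
import hashlib


def chunk_hash(c: bytes) -> int:
    """Integer hash of a chunk (first 8 bytes of SHA-1; an assumption)."""
    return int.from_bytes(hashlib.sha1(c).digest()[:8], "big")


def bottom_k(chunks, k: int) -> list[int]:
    """Keep only the k smallest distinct chunk hashes as the fingerprint set.

    The selection depends only on chunk content, not on chunk order, so two
    segments that share most chunks tend to select overlapping subsets.
    """
    return sorted({chunk_hash(c) for c in chunks})[:k]


segment_chunks = [b"chunk-%d" % i for i in range(1000)]
sketch = bottom_k(segment_chunks, 10)  # 10 of 1000 chunks, i.e. 1%
```

With `k` fixed at 10 and 1000 chunks in the segment, the selected subset is 1% of the set of chunks; a smaller `k` or a larger segment lowers that fraction further.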

Claims 20 and 21 are rejected under 35 U.S.C. § 103 as being unpatentable over Singhai et al. (US 2017/0123689 A1, ‘Singhai’, hereafter) in view of Bhaskar et al. (US 2014/0223029 A1, ‘Bhaskar’, hereafter) in view of Wallace et al. (US Patent No. 9,514,146 B1, ‘Wallace’, hereafter) and further in view of Goldfarb et al. (US 2017/0364700 A1, ‘Goldfarb’, hereafter).

Regarding claim 20. Singhai, Bhaskar and Wallace do not teach, wherein said one or more hashing algorithms are selected from the group consisting of Secure Hash Algorithm 0 (SHA-0), Secure Hash Algorithm 1 (SHA-1), Secure Hash Algorithm 2 (SHA-2), and Secure Hash Algorithm 3 (SHA-3).  
However, Goldfarb teaches wherein said one or more hashing algorithms are selected from the group consisting of Secure Hash Algorithm 0 (SHA-0), Secure Hash Algorithm 1 (SHA-1), Secure Hash Algorithm 2 (SHA-2), and Secure Hash Algorithm 3 (SHA-3) (hash function algorithms SHA-1, SHA-2, SHA-3 can be used to manipulate selection of data on a database, Goldfarb [0099]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Goldfarb, which teaches immutable logging of access requests to distributed file systems, into the combination of Singhai, which teaches similarity-based content matching for storage applications and data deduplication, Bhaskar, which teaches staged data compression where each stage reflects a progressive increase in granularity, and Wallace, which teaches techniques for improving data compression of a storage system in an online manner, resulting in a scalable approach that exhibits improved efficiency and compression performance. The motivation for doing so would be to increase security and integrity of data in a data store (Goldfarb, par. [0004]).
Regarding claim 21. Singhai as modified teaches, wherein said first set of fingerprints and said second set of fingerprints are generated using two or more different hashing algorithms selected from said group (hash function algorithm, Singhai [0130], [0156], [0099]).


Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, if any, is considered pertinent to applicant’s disclosure.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HASANUL MOBIN whose telephone number is (571) 270-1289.  The examiner can normally be reached Monday through Friday, 8:00 AM to 5:00 PM EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached on 571-272-4034.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.  Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/HASANUL MOBIN/
Primary Examiner, Art Unit 2168