DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments regarding Vijayan not teaching the newly amended limitations of “segment filter” have been fully considered and are persuasive.  Therefore, the rejection of claims 1, 3, 11-13, 15 & 16 under 35 U.S.C. § 102(a)(1) has been withdrawn.  However, upon further consideration, new grounds of rejection are made in view of the Cheung reference, previously used in the rejection of claims 6, 7, 14 & 19 under 35 U.S.C. § 103.

The additional argument by Applicant that “Vijayan uses similar terms (e.g., chunks, header)” is not persuasive and is addressed below.

Applicant’s specification at paragraph [0064] describes, in pertinent part, “In addition to the segment header 208 with its segment filter, each stream segment 206 contains a set of ‘chunks’ 210.  A chunk 210 can again be can be regarded as a data structure.”  (Emphasis added).



Vijayan teaches that each chunk includes a header and payload.  (Vijayan, paragraph [0288]).  The headers include a variety of information such as file identifiers, volumes, offsets, or other information associated with the payload items, a chunk sequence number, etc.  (Vijayan, paragraph [0289]).  A data stream has a stream header and stream payload.  (Vijayan, paragraph [0292]).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3, 6, 7, 11-16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Vijayan, et al. (cited in the IDS filed 10/01/2019, US 2014/0201170, hereinafter “Vijayan”) in view of Cheung (cited in the Form 892 mailed 01/27/2021, US 2012/0159098, hereinafter “Cheung”).

	Regarding claim 1, Vijayan teaches
A non-transitory computer-readable storage medium comprising a set of computer-readable instructions stored thereon, which, when executed by a computer system, cause the computer system to perform operations [Vijayan, ¶¶ 288-292], comprising: 
receiving, in computer memory of the computer system, a data item to be stored [Vijayan, ¶ 288, last sentence]; 
[Vijayan, ¶ 288, first and second sentences]; 
obtaining a first identifier from the data item [Vijayan, ¶ 291, stream header contains file identifier of single-instance or non-single-instance data]; 
writing the first identifier of the data item to the [segment filter of the] stream segment header of the stream segment to which the data item is written [Vijayan, ¶ 291, stream header contains file identifier of single-instance or non-single-instance data]; 
obtaining a second identifier from the data item, the second identifier being different from the first identifier [Vijayan, ¶¶ 288-289, file identifier included in chunk identifier]; and 
writing the second identifier of the data item to a header of the data chunk of the stream segment to which the data item is written [Vijayan, ¶¶ 288-289, file identifier included in chunk identifier], 
the first identifier in the stream segment header identifying whether any data items in any of the plurality of data chunks of the stream segment include the first identifier [Vijayan, ¶ 292, last four sentences] and 
the second identifier in the header of the data chunk of the stream segment identifying whether any data items in the data chunk include the second identifier [Vijayan, ¶ 289, first three sentences].

Vijayan does not explicitly teach the stream segment header including a segment filter.  However, Cheung teaches the stream segment header including a segment filter [Cheung, ¶¶ 0131 & 0132, “A Bloom filter is a data structure well known to persons skilled in the relevant art(s). A Bloom filter is a compact set that may be used by program code to reliably determine if an item is not a member of a set.”].

Vijayan and Cheung are analogous art because they are in the same field of endeavor, data storage and retrieval.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Vijayan with the Bloom filter usage taught in Cheung to achieve the well-known result of efficiently testing for the presence or non-presence of a member in a set in big data and data stream applications.
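For illustration only, the write path recited in claim 1, with a Bloom filter of the kind described in Cheung placed in the stream segment header, can be sketched as follows.  The sketch is not taken from Vijayan, Cheung, or Applicant's specification.  The names BloomFilter, Chunk, and StreamSegment, the "tenant" and "user" keys, the bit-array size, the hash count, and the chunk capacity are all hypothetical assumptions.

```python
import hashlib


class BloomFilter:
    """Compact set that can prove absence but only suggest presence."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class Chunk:
    def __init__(self):
        self.header_filter = BloomFilter()  # holds second identifiers
        self.items = []


class StreamSegment:
    def __init__(self, chunk_capacity=2):
        self.segment_filter = BloomFilter()  # holds first identifiers
        self.chunk_capacity = chunk_capacity
        self.chunks = [Chunk()]

    def write(self, item):
        # Start a new chunk when the current one is full (assumed policy).
        chunk = self.chunks[-1]
        if len(chunk.items) >= self.chunk_capacity:
            chunk = Chunk()
            self.chunks.append(chunk)
        chunk.items.append(item)
        self.segment_filter.add(item["tenant"])  # first identifier
        chunk.header_filter.add(item["user"])    # second identifier
        return chunk


segment = StreamSegment()
written_to = segment.write({"tenant": "acme", "user": "alice", "body": "login"})
```

A Bloom filter never reports a stored identifier as absent, so segment.segment_filter.might_contain("acme") is True after the write above.  A True result for an identifier that was never written remains possible and would have to be resolved by inspecting the chunks.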

	Regarding claim 3, the method claim corresponds to the limitations of claim 1, and is rejected for the same reasons discussed above.

	Regarding claim 6, the combination of Vijayan and Cheung teaches the method of claim 3, wherein the first identifier of the data item is written into a probabilistic data structure contained in the stream segment header of the stream segment to which the data item is written [Cheung, ¶ 0131, data chunk identifiers that are referenced in a stream map chunk are included in a Bloom filter].

Regarding claim 7, the combination of Vijayan and Cheung teaches the method of claim 3, wherein the second identifier of the data item is written into a probabilistic data structure contained in the header of the data chunk of the stream segment to which the data item is written [Cheung, ¶ 0131, data chunk identifiers that are referenced in a stream map chunk are included in a Bloom filter].

	Regarding claim 11, the combination of Vijayan and Cheung teaches the method of claim 3, further comprising: writing data concerning the data item to a stream partition such that the stream partition indexes the stream segment to which the data item is written [Vijayan, ¶ 289, second sentence].

	Regarding claim 12, Vijayan teaches
A non-transitory computer-readable storage medium comprising a set of computer-readable instructions stored thereon, which, when executed by a computer system, cause the computer system to perform operations, comprising: 
querying a stream segment header of a stream segment with a query identifier [Vijayan, ¶ 292, last four sentences], wherein the stream segment comprises a stream segment header and one or more data chunks [Vijayan, ¶ 288, first and second sentences], wherein each data chunk comprises data items and the stream segment header [and the associated segment filter] comprises identifiers of the data items in the data chunks [Vijayan, ¶ 292, last four sentences], wherein the querying of the stream segment header identifies whether any of the data items in the data chunks of the stream segment have the query identifier [Vijayan, ¶ 292, last four sentences]; 
querying the data chunks of the stream segment with the query identifier to identify which data chunks of the stream segment have the query identifier [Vijayan, ¶ 289, first three sentences]; and 
[Vijayan, ¶ 289, first three sentences].

	Vijayan does not explicitly teach an associated segment filter; and based on determining at least one data item in the data chunks of the segment filter have the query identifier.
	However, Cheung teaches an associated segment filter [Cheung ¶¶ 0131 & 0132, “A Bloom filter is a data structure well known to persons skilled in the relevant art(s). A Bloom filter is a compact set that may be used by program code to reliably determine if an item is not a member of a set.”]; and 
based on determining at least one data item in the data chunks of the segment filter have the query identifier [Cheung ¶¶ 0131 & 0132].

Vijayan and Cheung are analogous art because they are in the same field of endeavor, data storage and retrieval.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Vijayan with the Bloom filter usage taught in Cheung to achieve the well-known result of efficiently testing for the presence or non-presence of a member in a set in big data and data stream applications.


querying the data items in the data chunks to identify which data items in the data chunks have the query identifier [Vijayan, ¶ 289, first three sentences]; and 
retrieving all or deleting all of the data items in all of the data chunks of the stream segment that have said query identifier [Vijayan, ¶ 289, last sentence].
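For illustration only, the tiered query recited above (segment header first, then chunk headers, then the data items of the matching chunks) can be sketched as follows.  The sketch is not taken from Vijayan or Cheung.  The dictionary layout, the "header_ids" and "id" keys, and the function names are hypothetical, and a plain Python set stands in for the segment filter so that the example stays short; a set gives exact answers where a Bloom filter would give probabilistic ones.

```python
def query_segment(segment, query_id):
    """Return (chunk_index, item) pairs whose identifier equals query_id."""
    # Step 1: consult the segment header; skip the whole segment on a miss.
    if query_id not in segment["header_ids"]:
        return []
    hits = []
    for index, chunk in enumerate(segment["chunks"]):
        # Step 2: consult each chunk header to find candidate chunks.
        if query_id not in chunk["header_ids"]:
            continue
        # Step 3: scan the items of candidate chunks only.
        hits.extend((index, item) for item in chunk["items"]
                    if item["id"] == query_id)
    return hits


def delete_matches(segment, query_id):
    """Remove every matching item; return how many were removed."""
    removed = 0
    for chunk in segment["chunks"]:
        kept = [item for item in chunk["items"] if item["id"] != query_id]
        removed += len(chunk["items"]) - len(kept)
        chunk["items"] = kept
    return removed


segment = {
    "header_ids": {"acme", "globex"},
    "chunks": [
        {"header_ids": {"acme"},
         "items": [{"id": "acme", "body": "a1"}]},
        {"header_ids": {"acme", "globex"},
         "items": [{"id": "globex", "body": "g1"},
                   {"id": "acme", "body": "a2"}]},
    ],
}

acme_hits = query_segment(segment, "acme")        # found in chunks 0 and 1
missing_hits = query_segment(segment, "initech")  # rejected at the header
```

The headers are deliberately left unchanged by delete_matches.  After a deletion the header may still answer "possibly present", and the item scan in step 3 then returns no matches, which mirrors the behavior of a Bloom filter that does not support removal.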

	Regarding claim 14, the combination of Vijayan and Cheung teaches the non-transitory computer-readable storage medium of claim 12, wherein the identifiers of the data items are contained in a probabilistic data structure in the stream segment header [Cheung, ¶ 0131, data chunk identifiers that are referenced in a stream map chunk are included in a Bloom filter].

Regarding claim 15, the combination of Vijayan and Cheung teaches the non-transitory computer-readable storage medium of claim 12, wherein the identifier of the data item is a first identifier [Vijayan, ¶ 291, stream header contains file identifier of single-instance or non-single-instance data], each data chunk has a header which includes second identifiers of the data items in the data chunk [Vijayan, ¶¶ 288-289, file identifier included in chunk identifier], the second identifier being different from the first identifier [Vijayan, ¶¶ 288-289, file identifier included in chunk identifier], and wherein the instructions further comprise instructions for: 
[Vijayan, ¶ 289, first three sentences].

	Regarding claim 16, the combination of Vijayan and Cheung teaches the non-transitory computer-readable storage medium of claim 15, further comprising instructions for querying the data items in the data chunk to identify which data items in the data chunk have the second query identifier [Vijayan, ¶ 289, first three sentences].

	Regarding claim 19, the combination of Vijayan and Cheung teaches the non-transitory computer-readable storage medium of claim 15, wherein the second identifiers of the data items are contained in a probabilistic data structure in the header of the data chunks [Cheung, ¶ 0131, data chunk identifiers that are referenced in a stream map chunk are included in a Bloom filter].

Claims 2, 4, 5, 17 & 18 are rejected under 35 U.S.C. 103 as being unpatentable over Vijayan in view of Cheung, and further in view of Katzenberger (cited in the IDS filed 10/01/2019, USPN 5,970,496, hereinafter “Katzenberger”).

	Regarding claim 2, the combination of Vijayan and Cheung teaches the non-transitory computer-readable storage medium of claim 1, but does not explicitly teach wherein the first identifier is a coarse identifier that applies to a first number of data items in the data chunks of the stream segment, and wherein the second identifier is a fine identifier that applies to a second, smaller number of data items in the data chunks of the stream segment.

	However, Katzenberger teaches wherein the first identifier is a coarse identifier that applies to a first number of data items in the data chunks of the stream segment, and wherein the second identifier is a fine identifier that applies to a second, smaller number of data items in the data chunks of the stream segment [Katzenberger, Figure 2 and column 6, lines 15-27, describing parent/child hierarchical relationships between chunks/nodes].

	Vijayan, Cheung, and Katzenberger are analogous art because they are in the same field of endeavor, data storage and retrieval.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Vijayan and Cheung with the hierarchical data relations taught in Katzenberger to achieve the well-known result of logical entity relationships between data elements.

	Regarding claim 4, the method claim corresponds to the limitations of claim 2, and is rejected for the same reasons discussed above.

	Regarding claim 5, the combination of Vijayan, Cheung, and Katzenberger teaches the method of claim 3 wherein the first identifier is a tenant identifier, which identifies an organization, and the second identifier is a user identifier, which identifies Katzenberger, column 6, lines 28-37, parent/child relationships in data structures broadly describe data types, and the exact information represented is determined by a given implementation].

	Regarding claim 17, the non-transitory computer-readable storage medium claim corresponds to the limitations of claim 2, and is rejected for the same reasons discussed above.

	Regarding claim 18, the non-transitory computer-readable storage medium claim corresponds to the limitations of claim 5, and is rejected for the same reasons discussed above.

Claims 8 & 9 are rejected under 35 U.S.C. 103 as being unpatentable over Vijayan in view of Cheung, and further in view of the article by Wu, et al., entitled “Improving Accessing Efficiency of Cloud Storage Using De-Duplication and Feedback Schemes” (cited in the Form 892 mailed 01/27/2021, published in March 2014, hereinafter “Wu”).

	Regarding claim 8, the combination of Vijayan and Cheung teaches the method of claim 3, but does not explicitly teach carrying out consistent hashing on the first identifier obtained from the data item; and identifying the stream segment to which the data item is to be written based, at least in part, on the result of the consistent hashing on the first identifier obtained from the data item.

	However, Wu teaches carrying out consistent hashing on the first identifier obtained from the data item; and identifying the stream segment to which the data item is to be written based, at least in part, on the result of the consistent hashing on the first identifier obtained from the data item [Wu, page 209, § D, DHT uses consistent hashing for key remapping].

Vijayan, Cheung, and Wu are analogous art because they are in the same field of endeavor, data storage and retrieval.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Vijayan and Cheung with the consistent hashing techniques taught in Wu to achieve the well-known result of efficiently remapping keys (e.g., identifiers) to other data sets during changes in datasets.

	Regarding claim 9, the combination of Vijayan, Cheung, and Wu teaches the method of claim 8, wherein the stream segment is part of a log stream, there being plural log streams each containing different one or more stream segments, and wherein identifying the stream segment to which the data item is to be written based on the result of the consistent hashing comprises identifying the log stream having the stream segment based on the result of the consistent hashing [Wu, page 209, § D, DHT uses consistent hashing for key remapping].
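For illustration only, consistent hashing of a first identifier to select among plural log streams can be sketched as follows.  The sketch shows a generic hash ring and is not taken from Wu's DHT.  The class name HashRing, the stream names, the replica count, and the use of SHA-256 are hypothetical assumptions.

```python
import bisect
import hashlib


def _hash(key):
    # Map a string to a point on the ring (64-bit integer).
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, streams, replicas=64):
        # Each log stream is placed at several points to even out the load.
        self._ring = sorted((_hash(f"{stream}#{i}"), stream)
                            for stream in streams
                            for i in range(replicas))
        self._points = [point for point, _ in self._ring]

    def stream_for(self, identifier):
        # Walk clockwise to the first point past the identifier's hash,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect_right(self._points, _hash(identifier))
        return self._ring[index % len(self._ring)][1]


streams = ["log-a", "log-b", "log-c"]
ring = HashRing(streams)
target = ring.stream_for("acme")  # same identifier always maps to one stream
```

The property of interest is that adding a log stream moves an identifier only onto the new stream, if it moves at all.  Identifiers that stay with their original stream need no remapping.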

Claims 10 & 20 are rejected under 35 U.S.C. 103 as being unpatentable over Vijayan in view of Cheung, and further in view of Bentley (cited in the IDS filed 10/01/2019, US 2003/0036888, hereinafter “Bentley”).

	Regarding claim 10, the combination of Vijayan and Cheung teaches the method of claim 3, but does not explicitly teach wherein: writing the data item to a data chunk of a stream segment comprises writing the data item as a line entry in the data chunk, the line entry comprising a line header, and comprising: writing a schema ID to the line header, the schema ID identifying how to extract at least one of the first identifier and the second identifier from the data item.

	However, Bentley teaches the method of claim 3, wherein: writing the data item to a data chunk of a stream segment comprises writing the data item as a line entry in the data chunk, the line entry comprising a line header, and comprising: writing a schema ID to the line header, the schema ID identifying how to extract at least one of the first identifier and the second identifier from the data item [Bentley, ¶ 52].

Vijayan, Cheung, and Bentley are analogous art because they are in the same field of endeavor, data storage and retrieval.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Vijayan and Cheung with the data compression and encryption techniques taught in Bentley to achieve the well-known result of compressing and/or 

Regarding claim 20, the combination of Vijayan, Cheung, and Bentley teaches the non-transitory computer-readable storage medium of claim 12, wherein the data items are line entries in the data chunks, each line entry comprising a line header, each line header comprising a schema ID which identifies how to extract an identifier from the data item, and wherein the computer-readable instructions are such that querying the data items in a data chunk to identify which data items in the data chunk have the query identifier comprises inspecting the schema ID from the line header of a line entry and using the identified schema to extract the identifier from the line entry [Bentley, ¶ 52].
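For illustration only, the use of a schema ID in a line header to extract an identifier from a line entry can be sketched as follows.  The sketch is not taken from Bentley.  The schema registry SCHEMAS, the two example schemas, the "tenant" field, and the function names are hypothetical assumptions.

```python
import json

# Hypothetical schema registry: schema ID -> function that pulls the
# identifier out of a line payload stored in that schema's format.
SCHEMAS = {
    1: lambda payload: payload.split(",")[0],           # CSV, first field
    2: lambda payload: json.loads(payload)["tenant"],   # JSON, "tenant" key
}


def make_line(schema_id, payload):
    # A line entry is a line header carrying the schema ID, plus the data.
    return {"header": {"schema_id": schema_id}, "payload": payload}


def extract_identifier(line):
    # Inspect the schema ID, then apply that schema's extraction rule.
    extractor = SCHEMAS[line["header"]["schema_id"]]
    return extractor(line["payload"])


def lines_with_identifier(chunk_lines, query_id):
    return [line for line in chunk_lines
            if extract_identifier(line) == query_id]


chunk_lines = [
    make_line(1, "acme,alice,login"),
    make_line(2, '{"tenant": "globex", "user": "bob"}'),
    make_line(2, '{"tenant": "acme", "user": "carol"}'),
]
acme_lines = lines_with_identifier(chunk_lines, "acme")
```

Because each line entry names its own schema, line entries stored in different formats can share a data chunk and still be queried with a single identifier.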

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Scott A. Waldron whose telephone number is (571)272-5898.  The examiner can normally be reached on Monday - Friday 9:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Neveen Abel-Jalil can be reached on (571)270-0474.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private 




/Scott A. Waldron/
Primary Examiner, Art Unit 2152
04/30/2021