DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This office action is in response to the application filed on April 10, 2019, in which claims 1-20 are presented for examination.

Information Disclosure Statement
The information disclosure statements filed on April 10, 2019 and January 29, 2021 comply with the provisions of 37 CFR 1.97, 1.98 and MPEP § 609. They have been placed in the application file. The information referred to therein has been considered as to the merits.

Claim Objections
Claims 1-2, 5-7, 9-12, 15-17, 19, and 20 are objected to because of the following informalities: these claims recite the variables “S”, “n”, “F1...Fn”, “U{F1...Fn}”, “UF1...UFN”, and “F1...Fa”, which are not defined in the claims. Applicant is advised to define each of these variables in the claims. Appropriate correction is required.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-6, 8-9, 11, 14-16, and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Hamilton et al. (hereinafter “Hamilton”), US 2010/0318759, in view of Wu et al. (hereinafter “Wu”), US 8,886,605.
As to claim 1, Hamilton discloses a method, comprising:
measuring an amount of physical storage space used, or expected to be used, by a portion of a dataset S of segments (see [0038], evaluate characteristics (availability) of storage locations in chunk store and distribute chunks of backup data accordingly),
wherein measuring the amount of physical storage space comprises:
receiving information that identifies an ad-hoc group of size 'n' of files F1...Fn that makes up a subset of the dataset S (see [0026], receive and maintain backup information corresponding to one or more files, system images by dividing each file or other data into one or more chunks (segments, blocks));
determining a number of unique segments in the dataset S (see [0025] and [0026], determine only unique segments or blocks of the one or more files not previously stored at the chunk store by generating a signature for each block data);
identifying a respective unique segment set UF1...UFN for each of the 'n' files in the ad-hoc group of files (see [0025], identify unique blocks from the one or more blocks based upon the generated signatures and signatures of chunks stored in a distributed chunk store);
performing a set union operation U{F1...Fn} on the unique segment sets UF1...UFN (see [0026], a signature list is utilized to compare the contents of two arbitrarily different files to ascertain whether the chunk store already stores a particular chunk, in order to identify unique chunks of a file or other information not stored in the chunk store, such that a minimal amount of data is transferred to the chunk store).
However, Hamilton does not explicitly disclose the claimed “determining a sum of sizes of the unique segment sets UF1...UFN, wherein the sum is the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files F1...Fn”.
Meanwhile, Wu discloses the claimed “determining a sum of sizes of the unique segment sets UF1...UFN, wherein the sum is the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files F1...Fn” (see col.3, lines 45-58 and col.7, line 62-col.8, line 2, determining whether the number of disk seeks necessary to retrieve and restore files comprised of the set of data segments exceeds a predetermined threshold and, if so, having the client send all data segment contents of the data segment set, along with their fingerprints, to the deduplication server to ensure a group of contiguous data segments for storage within the SIS content volume).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Hamilton to determine a sum of sizes of the unique segment sets UF1...UFN, wherein the sum is the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files, in order to reduce the access time to the newly stored set of data segments, thereby improving the overall efficiency of data access.
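
For illustration only, the measuring steps recited in claim 1 (per-file unique segment sets, a set union, and a sum of sizes) might be sketched as below. This is a minimal sketch under stated assumptions, not the applicant's method and not the disclosure of Hamilton or Wu: the function names, the use of SHA-1 as the fingerprint, and the modeling of a file as a list of byte-string segments are all hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Set


def fingerprint(segment: bytes) -> str:
    # Assumption: a segment is identified by the SHA-1 digest of its content.
    return hashlib.sha1(segment).hexdigest()


def unique_segment_set(segments: Iterable[bytes], sizes: Dict[str, int]) -> Set[str]:
    # Build the unique segment set UFi for one file and record each
    # segment's size, keyed by fingerprint, in the shared `sizes` map.
    result: Set[str] = set()
    for seg in segments:
        fp = fingerprint(seg)
        sizes[fp] = len(seg)
        result.add(fp)
    return result


def physical_space_used(files: List[List[bytes]]) -> int:
    # Union the per-file unique segment sets, then sum the sizes of the
    # segments in the union; a segment shared by several files counts once.
    sizes: Dict[str, int] = {}
    per_file = [unique_segment_set(f, sizes) for f in files]
    union: Set[str] = set().union(*per_file)
    return sum(sizes[fp] for fp in union)
```

In this sketch, two files that share a segment contribute that segment's size only once, which is why the result can be smaller than the sum of the logical file sizes.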

memory in the at least one content storage volume, wherein the storing is performed in response to determining that the fingerprint corresponding to the first file-data segment is not equivalent to the fingerprint of any previously-stored data segment).

As to claim 5, the combination of Hamilton and Wu discloses the invention as claimed. In addition, Hamilton discloses the method as recited in claim 1, further comprising reporting, to a consumer, the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files F1...Fa (see [0038], evaluate characteristics (availability) of storage locations in chunk store and distribute chunks of backup data accordingly).

As to claim 6, the combination of Hamilton and Wu discloses the invention as claimed. In addition, Hamilton discloses the method as recited in claim 1, wherein each segment in the dataset S of segments is represented as a respective fingerprint (see [0029], the segmentation component employs a fingerprint function to ascertain chunk boundaries).

As to claim 8, the combination of Hamilton and Wu discloses the invention as claimed. In addition, Hamilton discloses the method as recited in claim 1, wherein the dataset is a segmented and deduplicated dataset (see [0029], the segmentation component employs a fingerprint function to ascertain chunk boundaries).

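As to claim 9, the combination of Hamilton and Wu discloses the invention as claimed. In addition, Hamilton discloses the method as recited in claim 1, wherein determining a sum of sizes of the unique segment sets UF1...UFN comprises determining an average size of at least some segments in each of the 'n' files (see [0030], the segmentation component identifies chunk boundaries to be byte positions at which the fingerprint function satisfies a condition).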
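
The chunk-boundary technique cited from Hamilton at [0029]-[0030], in which a boundary is a byte position where a fingerprint function satisfies a condition, might look roughly like the sketch below. Everything here is an assumption made for illustration: the toy fingerprint (a byte sum over a short window), the window length, the mask condition, and the function names are hypothetical and are not taken from Hamilton.

```python
from typing import List


def chunk_boundaries(data: bytes, window: int = 4, mask: int = 0x0F) -> List[int]:
    # A position is a boundary when the fingerprint of the preceding
    # `window` bytes has its low bits (selected by `mask`) all zero.
    boundaries: List[int] = []
    for end in range(window, len(data) + 1):
        fp = sum(data[end - window:end])  # toy fingerprint, not a Rabin hash
        if fp & mask == 0:
            boundaries.append(end)
    return boundaries


def split_chunks(data: bytes, window: int = 4, mask: int = 0x0F) -> List[bytes]:
    # Cut the data at every boundary; any trailing bytes form the last chunk.
    chunks: List[bytes] = []
    start = 0
    for end in chunk_boundaries(data, window, mask):
        chunks.append(data[start:end])
        start = end
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because boundaries depend only on nearby content, identical runs of bytes in different files tend to produce identical chunks, which is what makes signature-based deduplication effective.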

As to claims 11, 14-16, and 18-19, these claims are directed to a non-transitory storage medium having stored therein instructions for executing the method of claims 1, 4-6, and 8-9 above. They are rejected under the same rationale.

Claims 2-3, 7, 10, 12-13, 17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hamilton et al. (hereinafter “Hamilton”), US 2010/0318759, in view of Wu et al. (hereinafter “Wu”), US 8,886,605, and further in view of Varadan et al. (hereinafter “Varadan”), US 8,700,578.
As to claim 2, the combination Hamilton and Wu discloses the invention as claimed, except for the claimed “wherein determining a number of unique segments in the dataset S comprises performing a non-random statistical sampling of the segments in the dataset S”.
On the other hand, Varadan discloses the method as recited in claim 1, wherein determining a number of unique segments in the dataset S comprises performing a non-random statistical sampling of the segments in the dataset S (see col.3, lines 14-19, statistical sampling involves the use of a bloom filter).
	Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined system of Hamilton 

As to claim 3, the combination Hamilton and Wu discloses the invention as claimed, except for the claimed “wherein the non-random statistical sampling comprises content-based sampling”.
On the other hand, Varadan discloses the method as recited in claim 2, wherein the non-random statistical sampling comprises content-based sampling (see col.10, line 67-col.11, line 2, use of content-based fingerprints, applying a bloom filter to a fingerprint of each of the deduplicated segments associated with a second file system namespace).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined system of Hamilton and Wu to use a bloom filter to perform content-based sampling in order to accurately determine the storage usage of a file system namespace.
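
As a point of reference for the term "content-based sampling", one common form selects a segment for the sample based solely on the value of its fingerprint, so the same content is always either in or out of the sample. The sketch below illustrates that idea only; it is not asserted to be the sampling used in the application or in Varadan (which is cited above for a bloom filter). The function names, the SHA-1 fingerprint, and the `rate` parameter are hypothetical.

```python
import hashlib
from typing import Iterable, Set


def sample_fingerprints(segments: Iterable[bytes], rate: int = 4) -> Set[int]:
    # Keep a fingerprint only when its value is divisible by `rate`.
    # The decision depends on content alone, so it is non-random and
    # duplicates of a segment always map to the same sample entry.
    sample: Set[int] = set()
    for seg in segments:
        fp = int.from_bytes(hashlib.sha1(seg).digest()[:8], "big")
        if fp % rate == 0:
            sample.add(fp)
    return sample


def estimate_unique_count(segments: Iterable[bytes], rate: int = 4) -> int:
    # Scale the number of sampled unique fingerprints back up by the rate.
    return len(sample_fingerprints(segments, rate)) * rate
```

With `rate=1` every fingerprint is sampled and the estimate equals the exact unique count; larger rates trade accuracy for a smaller sample.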

As to claim 7, the combination of Hamilton and Wu discloses the invention as claimed, except for the claimed “wherein the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files F1...F~ is determined without counting all of the segments in the ad-hoc group of size 'n' of files F1...Fa”.
On the other hand, Varadan discloses the method as recited in claim 1, wherein the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files F1...F~ is determined without counting all of the segments in the ad-hoc group of size 'n' of files F1...Fa (see col.1, lines 63-67 and col.6, lines 24-31, estimating the physical space that is uniquely utilized by a collection of logical objects in a deduplicated storage system; to estimate storage usage by a particular MTree, the estimator is configured to identify and access a summary_x, where x is the MTree identified by an MTree ID provided to the estimator).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to have modified the combined system of Hamilton and Wu to determine the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files in order to accurately estimate storage usage of storage system.

As to claim 10, the combination Hamilton and Wu discloses the invention as claimed, except for the claimed “wherein, for each of the 'n' files, determining an average size of at least some segments in the file comprises estimating the average size of segments in that file as being an arithmetic mean of all segments in that file as being an arithmetic mean of all segments in that file”.
On the other hand, Varadan discloses the method as recited in claim 9, wherein, for each of the 'n' files, determining an average size of at least some segments in the file comprises estimating the average size of segments in that file as being an arithmetic mean of all segments in that file as being an arithmetic mean of all segments in that file (see col.1, lines 63-67 and col.8, lines 22-28, the size of each segment is implementation specific; likewise, the size of each fingerprint varies depending on the type of hashing function. Although they vary in size, an average segment is roughly 8 KB and a typical fingerprint is roughly 20 bytes, which are used to estimate the physical space that is uniquely utilized by a collection of logical objects in a deduplicated storage system).
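
For reference, the arithmetic recited in claims 9 and 10 (an average segment size taken as the arithmetic mean of a file's segment sizes, then used to size a set of unique segments) might be sketched as follows. This is a hedged illustration only; the function names and the multiplication of a unique-segment count by the mean are assumptions, not a statement of how the application or Varadan performs the estimate.

```python
from typing import Sequence


def mean_segment_size(segment_sizes: Sequence[int]) -> float:
    # Arithmetic mean of the segment sizes of one file (0.0 for an empty file).
    if not segment_sizes:
        return 0.0
    return sum(segment_sizes) / len(segment_sizes)


def estimated_unique_bytes(unique_count: int, segment_sizes: Sequence[int]) -> float:
    # Approximate the size of a file's unique segment set as
    # (number of unique segments) x (mean segment size of the file).
    return unique_count * mean_segment_size(segment_sizes)
```

For example, a file with segments of 4096, 8192, and 12288 bytes has a mean segment size of 8192 bytes, so two unique segments would be estimated at 16384 bytes.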


As to claims 12-13, 17, and 20, these claims are directed to a non-transitory storage medium having stored therein instructions for executing the method of claims 2-3, 7, and 10 above. They are rejected under the same rationale.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
At step 1, claim 1 recites a method comprising a combination of concrete devices, and therefore is a machine (physical storage space), which is a statutory category of invention.
At step 2A, prong one, claim 1 recites receiving information ..., determining a number of unique segments ..., performing a set union operation ..., and determining a sum of sizes ....
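The limitations of determining a number of unique segments and performing a set union operation, as drafted, are a process that, under the broadest reasonable interpretation, covers a mathematical calculation but for the recitation of generic computer components. That is, nothing in the claim elements precludes the steps from practically being performed by mathematical calculation. For example, “determining” and “performing” in the context of this claim encompass a calculation of determining a number of unique segments using a hash function and performing a set union operation.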
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in a mathematical calculation but for the recitation of generic computer components, then it falls within the “Mathematical concepts” grouping of abstract ideas. Accordingly, claim 1 recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim only recites the additional element "physical storage space". The physical storage space in these steps is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component. The "determining a sum of the unique segment sets" does not add a meaningful limitation to the method, and is an extra-solution activity that does not meaningfully limit the claim. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a physical storage space to be used by a portion of a dataset of segments amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

At step 1, claim 11 recites a method comprising a combination of concrete devices, and therefore is a machine (processor), which is a statutory category of invention.
At step 2A, prong one, claim 11 recites receiving information ..., determining a number of unique segments ..., performing a set union operation ..., and determining a sum of sizes ....
The limitations of determining a number of unique segments and performing a set union operation, as drafted, are a process that, under the broadest reasonable interpretation, covers a mathematical calculation but for the recitation of generic computer components (processor). That is, nothing in the claim elements precludes the steps from practically being performed by mathematical calculation. For example, “determining” and “performing” in the context of this claim encompass a calculation of determining a number of unique segments using a hash function and performing a set union operation.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in a mathematical calculation but for the recitation of generic computer components (processor, non-transitory storage medium), then it falls within the “Mathematical concepts” grouping of abstract ideas. Accordingly, claim 11 recites an abstract idea.
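This judicial exception is not integrated into a practical application. In particular, the claim only recites the additional elements "processor, non-transitory storage medium". The processor and non-transitory storage medium in these steps are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using a generic computer component. The "determining a sum of the unique segment sets" does not add a meaningful limitation to the method, and is an extra-solution activity that does not meaningfully limit the claim. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.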
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using a physical storage space to be used by a portion of a dataset of segments amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Claim 2 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 2 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein determining a number of unique segments in the dataset S comprises performing a non-random statistical sampling of the segments in the dataset S", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 12 is dependent on claim 11 and includes all the limitations of claim 11. Therefore, claim 12 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein determining a number of unique segments in the dataset S comprises performing a non-random statistical sampling of the segments in the dataset S", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 3 is dependent on claim 2 and includes all the limitations of claim 1. Therefore, claim 3 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein the non-random statistical sampling comprises content-based sampling", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 13 is dependent on claim 12 and includes all the limitations of claim 11. Therefore, claim 13 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein the non-random statistical sampling comprises content-based sampling", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 4 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 4 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein the method is performed in memory", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.




Claim 5 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 5 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "reporting, to a consumer, the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 15 is dependent on claim 11 and includes all the limitations of claim 11. Therefore, claim 15 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "reporting, to a consumer, the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

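Claim 6 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 6 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein each segment in the dataset S of segments is represented as a respective fingerprint", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.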

Claim 16 is dependent on claim 11 and includes all the limitations of claim 11. Therefore, claim 16 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein each segment in the dataset S of segments is represented as a respective fingerprint", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 7 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 7 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files F1...F~ is determined without counting all of the segments in the ad-hoc group of size 'n' of files F1...Fa", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 17 is dependent on claim 11 and includes all the limitations of claim 11. Therefore, claim 17 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein the amount of physical storage space used or expected to be used by the ad-hoc group of size 'n' of files F1...F~ is determined without counting all of the segments in the ad-hoc group of size 'n' of files F1...Fa", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.



Claim 18 is dependent on claim 11 and includes all the limitations of claim 11. Therefore, claim 18 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein the dataset is a segmented and deduplicated dataset", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 9 is dependent on claim 1 and includes all the limitations of claim 1. Therefore, claim 9 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein determining a sum of sizes of the unique segment sets UF1...UFN comprises determining an average size of at least some segments in each of the 'n' files", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 19 is dependent on claim 11 and includes all the limitations of claim 11. Therefore, claim 19 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein determining a sum of sizes of the unique segment sets UF1...UFN comprises determining an average size of at least some segments in each of the 'n' files", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 10 is dependent on claim 9 and includes all the limitations of claim 1. Therefore, claim 10 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein, for each of the 'n' files, determining an average size of at least some segments in the file comprises estimating the average size of segments in that file as being an arithmetic mean of all segments in that file as being an arithmetic mean of all segments in that file", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.

Claim 20 is dependent on claim 19 and includes all the limitations of claim 11. Therefore, claim 20 recites the same abstract idea of "a mathematical calculation". The claim recites the additional limitations of "wherein, for each of the 'n' files, determining an average size of at least some segments in the file comprises estimating the average size of segments in that file as being an arithmetic mean of all segments in that file as being an arithmetic mean of all segments in that file", which elaborates on the abstract idea of a mathematical calculation, and therefore, does not amount to significantly more than the abstract idea.





Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20180075055 (involved in determining a subset of references in a central reference table for inclusion in an updated partial instantiation of the table based on data segment size information and data segment utilization frequency information. The updated instantiation of the table is transmitted to a computing system of client systems such that partial instantiation of a table local to the system is different from the partial instantiation of the table local to another system, and does not include an entry corresponding to a data segment based on transmission of updated instantiations).
US 20150154220 (involved in determining second computing system to which to transmit an updated partial instantiation of the reference table. The updated partial instantiation of the reference table is transmitted to the second computing system. A partial instantiation of the reference table local to the second computing system includes the entry corresponding to the first data segment and a partial instantiation of the reference table local to a third computing system does not include the entry corresponding to the first data segment).
US 20110099200 (involved in communicatively linking a set of computer devices via a communication network, where each computer device includes a data storage storing a set of data objects and a fingerprint generation module. A portion of the data objects is processed with the fingerprint generation module to generate a fingerprint for the portion of the objects. The fingerprints are stored in a searchable manner in the data store. A copy of one of the data objects associated with one of the computer devices is retrieved using a data manager based on the generated fingerprint).
US 9679040 (involved in distributing a subsequent incremental metadata update from a cloud controller to other cloud controllers for a distributed file-system that notifies the other cloud controllers of new-file data and includes deduplication updates related to the new-file data for enabling the other cloud controllers to update reference counts and entries in own respective deduplication tables to reflect addition of the new-file data to the distributed file-system, where the cloud controller maintains a deduplication table that tracks deduplicated data for the distributed file-system).
US 8983952 (involved in receiving a data stream (205) and metadata (210) corresponding to the data stream. The data stream manager selects and reorders a subset of a set of extents based on the metadata, and partitions a first portion of the data stream into a set of segments (235) using a first segment size. The data stream manager partitions a second portion of the data stream into a set of segments using a second segment size different from the first segment size, where a portion of the reordered subset of extents is combined to be included in a single segment).
US 8370315 (involved in inserting the calculated fingerprints into index, where the fingerprints to be unique are buffered in a write buffer, so that a series of insert operations can occur at a predetermined time so as to minimize time that the index is inaccessible for lookup 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JEAN M CORRIELUS whose telephone number is (571)272-4032.  The examiner can normally be reached on Monday-Friday 6:30a-10p(Midflex).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Vital can be reached on (571)272-4215.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.






For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.