Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner's Amendment

An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in a telephone interview with the Applicant's representative, Geoffrey Staniford (registration number 43151).
The application has been amended as follows:

In the claims

1 - 10. 	(Canceled)   

11.	     A computer-implemented method of performing deduplicated backups in a computer network comprising:
	performing a garbage collection process of a file system to map each data segment fingerprint to a unique bit position in a perfect hash vector (PHV), wherein each bit represents a segment liveness, wherein liveness comprises a referencing of the segment by any current content of the file system; 
	converting the PHV to a region-based vector using region identifiers for each region of a plurality of regions, and container identifiers of the segments as keys;  
	defining two bits per segment in the region-based vector, wherein a first bit is set to a binary live or dead value, and a second bit indicating a deduplication decision of the segment; 
	grouping ingested data into one or more regions of the plurality of regions based on the container identifier and region identifier;
	first calculating a liveness of each region to classify a corresponding region as live versus dead by determining that a percentage of live segments in the corresponding region is greater than a defined liveness threshold;
	allowing a programmed deduplication operation to be performed to prevent duplicate data segments being stored, and setting the first bit to the live binary value;
	second calculating a number of deduplicated segments of each region to determine that a number of segments in the corresponding region subject to the deduplication operation is less than a defined deduplication threshold; and
	overriding the allowing step to prevent the deduplication operation being performed to thereby allow duplicate data segments to be stored, and setting the second bit to a no deduplication state.

12 - 13. 	(Canceled) 	


14.	The method of claim 11 wherein the first calculating step comprises:
	tallying a number of live segments and a number of dead segments in the corresponding region based on the fingerprint marking;
	subtracting the number of dead segments from the number of live segments to obtain a difference that determines the percentage of live segments;
	defining the corresponding region as dead if the difference is less than the defined liveness threshold; and
	defining the corresponding region as sufficiently live if the difference is greater than or equal to the defined liveness threshold.

15 - 16.	(Canceled)   

17.	(Original)   The method of claim 11 further comprising maintaining the PHV between the garbage collection process and a subsequent garbage collection process to represent region liveness of the computer network.

18.	(Original)   The method of claim 17 wherein the computer network comprises at least part of a deduplication backup system including a data storage server running a Data Domain file system (DDFS).



20. 	(Currently amended)   A computer program product, comprising a non-transitory computer-readable medium having a computer-readable program code embodied therein, the computer-readable program code which, when executed by a processor, cause the processor to perform a method to implement a garbage collection assisted deduplication backup process in a computer network, comprising: 
	dividing data to be stored in network storage media into a plurality of segments;
	calculating a hash fingerprint for each segment of the plurality of segments;
	maintaining an index table wherein each entry maps a fingerprint to a region of a plurality of regions, and a container identifier;
	maintaining, in the fingerprint, a first bit indicating a liveness status of a corresponding segment, the first bit set to a live binary value or a dead binary value, and a second bit indicating a deduplication decision of the corresponding segment;
	first determining, after an index lookup to the index table, that a percentage of live segments in the region is greater than a defined liveness threshold;
	allowing a programmed deduplication operation to be performed to prevent duplicate data segments being stored and setting the first bit to the live binary value;

	overriding the allowing step to prevent the deduplication operation being performed to thereby allow the duplicate segments to be stored and setting the second bit to a no deduplication state;
	performing a garbage collection process of a file system to map each data segment fingerprint to a unique bit position in a perfect hash vector (PHV), wherein each bit represents a segment liveness, wherein liveness comprises a referencing of the segment by any current content of the file system; 
	converting the PHV to a region-based vector using region identifiers for each region of a plurality of regions, and container identifiers of the segments as keys;  
	defining two bits per segment in the region-based vector, wherein a first bit is set to a binary live or dead value, and a second bit indicating a deduplication decision of the segment; 
	grouping ingested data into one or more regions of the plurality of regions based on the container identifier and region identifier;
	first calculating a liveness of each region to classify a corresponding region as live versus dead by determining that a percentage of live segments in the corresponding region is greater than a defined liveness threshold;
	allowing a programmed deduplication operation to be performed to prevent duplicate data segments being stored, and setting the first bit to the live binary value;
	second calculating a number of deduplicated segments of each region to determine that a number of segments in the corresponding region subject to the deduplication operation is less than a defined deduplication threshold; and
	overriding the allowing step to prevent the deduplication operation being performed to thereby allow duplicate data segments to be stored, and setting the second bit to a no deduplication state.

21. (New)   The computer program product of claim 20 wherein the first calculating step comprises:
	tallying a number of live segments and a number of dead segments in the corresponding region based on the fingerprint marking;
	subtracting the number of dead segments from the number of live segments to obtain a difference that determines the percentage of live segments;
	defining the corresponding region as dead if the difference is less than the defined liveness threshold; and
	defining the corresponding region as sufficiently live if the difference is greater than or equal to the defined liveness threshold.

22.	(New)	     A system for performing deduplicated backups in a computer network comprising:
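	a processor comprising a garbage collection component of a file system mapping each data segment fingerprint to a unique bit position in a perfect hash vector (PHV), wherein each bit represents a segment liveness, wherein liveness comprises a referencing of the segment by any current content of the file system; and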
	a garbage collection assistance component of the processor:
		converting the PHV to a region-based vector using region identifiers for each region of a plurality of regions, and container identifiers of the segments as keys;  
		defining two bits per segment in the region-based vector, wherein a first bit is set to a binary live or dead value, and a second bit indicating a deduplication decision of the segment; 
		grouping ingested data into one or more regions of the plurality of regions based on the container identifier and region identifier;
		first calculating a liveness of each region to classify a corresponding region as live versus dead by determining that a percentage of live segments in the corresponding region is greater than a defined liveness threshold;
		allowing a programmed deduplication operation to be performed to prevent duplicate data segments being stored, and setting the first bit to the live binary value;
		second calculating a number of deduplicated segments of each region to determine that a number of segments in the corresponding region subject to the deduplication operation is less than a defined deduplication threshold; and
		overriding the allowing step to prevent the deduplication operation being performed to thereby allow duplicate data segments to be stored, and setting the second bit to a no deduplication state.


23.	(New)   The system of claim 22 wherein the first calculating step comprises:
	tallying a number of live segments and a number of dead segments in the corresponding region based on the fingerprint marking;
	subtracting the number of dead segments from the number of live segments to obtain a difference that determines the percentage of live segments;
	defining the corresponding region as dead if the difference is less than the defined liveness threshold; and
	defining the corresponding region as sufficiently live if the difference is greater than or equal to the defined liveness threshold.

24.	(New)   The system of claim 22 wherein the garbage collection assistance component further maintains the PHV between the garbage collection process and a subsequent garbage collection process to represent region liveness of the computer network.

25.	(New)   The system of claim 22 wherein the computer network comprises at least part of a deduplication backup system including a data storage server running a Data Domain file system (DDFS).

26.	(New)   The system of claim 25 wherein the file system implements a log structured file system in which data and metadata are written sequentially to a log that is implemented as a circular buffer.
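
For illustration only, the decision logic recited in claims 11, 20, and 22, and the difference test of the dependent "first calculating" claims, can be sketched as below. This is a minimal sketch and not the applicant's implementation. The bit layout (LIVE_BIT, NO_DEDUP_BIT), the Region class, the function names, and the handling of a region that fails the liveness test are all assumptions introduced here; the claims do not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed layout of the two bits kept per segment (not specified in the claims).
LIVE_BIT = 0b01      # first bit: 1 = live, 0 = dead
NO_DEDUP_BIT = 0b10  # second bit: 1 = no deduplication state


@dataclass
class Region:
    """Hypothetical per-region view of the region-based vector."""
    segment_flags: List[int]  # one two-bit value per segment in the region
    dedup_count: int = 0      # segments in the region subject to deduplication


def live_percentage(region: Region) -> float:
    """First calculating step: percentage of live segments in the region."""
    if not region.segment_flags:
        return 0.0
    live = sum(1 for flags in region.segment_flags if flags & LIVE_BIT)
    return 100.0 * live / len(region.segment_flags)


def classify_by_difference(region: Region, liveness_threshold: int) -> str:
    """Dependent-claim variant: tally live and dead segments, subtract, compare."""
    live = sum(1 for flags in region.segment_flags if flags & LIVE_BIT)
    dead = len(region.segment_flags) - live
    difference = live - dead
    return "dead" if difference < liveness_threshold else "sufficiently live"


def dedup_decision(region: Region,
                   liveness_threshold_pct: float,
                   dedup_threshold: int) -> Tuple[bool, int]:
    """Return (allow_dedup, two_bit_flags) for a segment ingested into region."""
    flags = 0
    if live_percentage(region) > liveness_threshold_pct:
        # Region is sufficiently live: allow deduplication, mark segment live.
        allow = True
        flags |= LIVE_BIT
        if region.dedup_count < dedup_threshold:
            # Too few deduplicated segments: override and store the duplicate.
            allow = False
            flags |= NO_DEDUP_BIT
    else:
        # Assumption: a region that fails the liveness test is not deduplicated
        # against. The claims do not recite this branch.
        allow = False
        flags |= NO_DEDUP_BIT
    return allow, flags
```

With a region of four segments of which three are live and a liveness threshold of 50 percent, the sketch allows deduplication when the region's deduplicated-segment count meets the deduplication threshold and overrides it otherwise.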


Reasons for Allowance

	The following is an examiner’s statement of reasons for allowance:
	The closest prior art of record, Fred Douglis, Abhinav Duggal, Philip Shilane, Tony Wong, The Logic of Physical Garbage Collection in Deduplicating Storage, 2/27/2017-3/2/2017, Usenix, https://www.usenix.org/system/files/conference/fast17/fast17-douglis.pdf, hereinafter referenced as Usenix, discloses a system that estimates the liveness of containers by dividing the number of live fingerprints by the total number of fingerprints in each container and selects containers below a liveness threshold. Berrington et al., US2015/0234710, teaches that a predetermined number of blocks modified since the last checkpoint are subject to deduplication.
	After further consideration of the prior art of record, the prior art, alone or in combination, does not teach or disclose, in combination with the other limitations of the independent claims, the inventive concept of using a PHV to determine deduplication of live data segments and overriding the deduplication decision. The limitations of the inventive concept, in combination with the other limitations of the independent claims, make the claims novel and unobvious over the prior art of record.
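
For comparison, the container selection attributed above to the first cited reference (live fingerprints divided by total fingerprints, with containers below a threshold selected) can be sketched as follows. This is an illustrative sketch based only on the summary in this action, not code from the reference; the function names and the dictionary-based container layout are assumptions.

```python
from typing import Dict, Iterable, List, Set


def container_liveness(fingerprints: Iterable[str], live: Set[str]) -> float:
    """Fraction of a container's fingerprints that are live (0.0 if empty)."""
    fps = list(fingerprints)
    if not fps:
        return 0.0
    return sum(1 for fp in fps if fp in live) / len(fps)


def select_containers(containers: Dict[int, List[str]],
                      live: Set[str],
                      threshold: float) -> List[int]:
    """Return identifiers of containers whose liveness is below the threshold."""
    return sorted(cid for cid, fps in containers.items()
                  if container_liveness(fps, live) < threshold)
```

Note that this selection operates per container and only estimates liveness; it does not carry a per-segment deduplication decision bit or an override step.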





Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALLEN S LIN whose telephone number is (571)270-0612.  The examiner can normally be reached on M-F 9-5.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alford Kindred can be reached on (571)272-4037.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ALLEN S LIN/
Examiner, Art Unit 2153