DETAILED ACTION
This office action is in response to a Request for Continued Examination filed April 26, 2021 for application 16/993,888.
Claims 1, 4, 7, and 10 have been amended.   Claims 2-3 have been cancelled.   No claims are new.  Thus claims 1 and 4-16 have been examined.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Acknowledgement is made of applicant’s claim for foreign priority based on an application filed in Japan.   Examiner notes the priority documents to JP2019-230475 have been received by the USPTO.
The objections and rejections from the prior correspondence that are not restated herein are withdrawn.


Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 04/26/2021 has been entered.
 
 
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 4-6, 8, 10, 12, and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Shilane’990 (Shilane et al., US 10,838,990 B1) and further in view of Shilane’902 (Shilane et al., US 9,116,902 B1).

Regarding claim 1, Shilane’990 teaches A storage system comprising: a drive having a physical storage area (Shilane’990, Fig. 1, Storage Units 108 and 109, and column 6, lines 58-64); and a controller (Shilane’990, FIG 24, storage system 2400 and related paragraphs column 25, lines 10-42) configured to process data input into and output from the drive (Shilane’990, FIG 24, File Service Interface 2404.  See also Shilane’990, column 28, lines 17-23), wherein: the controller includes a cache area configured to store data to be read out of or written into the drive (Shilane’990, FIG. 14 intermediate storage such as flash or memory 1402, where intermediate storage is an example of cache memory used to hold data read or written to Storage Disk 1403.   See also Shilane’990, column 21, lines 54-63.), 
and the controller: groups a plurality of non-identical pieces of data stored in the cache area based on a similarity degree among the plurality of pieces of data (Shilane’990,  selects a group, which has an access frequency of which is smaller than a first set value and which has the similar degree among data included in the group that is larger than a second set value (Shilane’990 Fig. 18, and supporting paras Column 21, lines 10-33 that discloses data with similar data chunks are stored together in one chunk.   Shilane’990 column 23, lines 10-18 discloses reorganizing data so that similar data chunks may be overwritten with non-similar chunks.  Thus the less frequently used data moved to a new location if they are not accessed more frequently than other data that is accessed ‘more frequently’.   Thus the “set value” is construed to be the value required for inclusion in a given group of ‘more frequently’ accessed data for inclusion in a given group as per the methods of Fig. 18, 20, 21, etc... )
compresses data of the selected group in group units, and stores the compressed data in the drive (Shilane’990, column 9, lines 63-67, discloses A and A’ and B and B’ are reorganized and then compressed into compressed file 215 for storage (for example, to Storage Units 208 and 209).  Shilane’990, column 21, lines 63-67, discloses that similar chunks such as A1, A2... A100 may be compressed (grouped) in one region and similar chunks such as B1, B2,... B7 are compressed (grouped) and stored in compression region B.   Shilane’990, column 4, lines 62-64 discloses that a compression region is a block of storage area.  Thus the less frequently accessed data of Shilane’990 is previously compressed and stored in the drive.)
wherein for another group whose access frequency is higher than the first set value, all data of the another group is maintained in the cache (Shilane’990 column 23, lines 10-19 discloses that data that is accessed ‘more frequently’ may be taken from one storage area, combined with another data chunk stored in another storage area, and reorganized and stored together in the storage area, thus potentially creating a third storage area/region with a group containing data whose access frequency is higher than the first set value.  Examiner notes the limit ‘more frequently’, which establishes the cutoff value to determine if data should be combined, may be construed as the ‘first set value’.   Shilane’990 column 21, lines 54-63, discloses that all data in a compression region is cached on a regional basis, thus all of the data in the region of ‘more frequently’ accessed data is brought into the region in its entirety when it is cached.)
Shilane’990 does not explicitly teach the similar degree among data included in the group that is larger than a second set value, wherein all data of the another group is maintained in the cache area.
Shilane’902, of a similar field of endeavor, further teaches the similar degree among data included in the group that is larger than a second set value. (Shilane’902, column 16, lines 17-18, ‘Location status information for each candidate base chunk with at least one similar feature or a minimum degree of similarity can be determined’, where the second set value is the ‘at least minimum degree of similarity’.),
all data of the another group is maintained in the cache area (Shilane’902 column 5, lines 23-24 clarifies ‘The working memory 106 can include a cache 181 to store frequently used data’, thus confirming that ‘frequently used’ data such as the second group will be 
Shilane’990 and Shilane’902 are in a similar field of endeavor as both relate to identifying similar data and compressing it according to a delta compression method.  Thus it would have been obvious before the effective filing date of the claimed invention to incorporate the minimum degree of similarity of Shilane’902 when making caching decisions into the solution of Shilane’990. One would be motivated to do so in order to (Shilane’902, column 16, lines 24-28) ensure that at least one feature is in common when considering delta compression, thereby improving the overall compression and storage space utilization efficiency.
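
For illustration only, the two-threshold selection recited in claim 1 and discussed above can be sketched as follows. This is a minimal sketch under stated assumptions, not code from Shilane’990, Shilane’902, or the application: the names Group, partition_groups, compress_group, first_set_value, and second_set_value are hypothetical, the access frequency and similarity degree are assumed to be precomputed numbers, zlib stands in for whatever compressor is actually used, and groups that meet neither condition are assumed to remain in the cache.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Group:
    # Hypothetical container for a group of non-identical cached chunks.
    name: str
    chunks: list              # list of bytes objects held in the cache
    access_frequency: float   # assumed precomputed access metric
    similarity_degree: float  # assumed precomputed, e.g. in the range 0..1


def partition_groups(groups, first_set_value, second_set_value):
    """Split groups into (to_compress, keep_in_cache).

    A group is selected for compression only when its access frequency is
    below first_set_value AND its similarity degree is above
    second_set_value. Every other group stays in the cache (an assumption
    for groups that are rarely accessed but not similar enough).
    """
    to_compress, keep_in_cache = [], []
    for group in groups:
        rarely_accessed = group.access_frequency < first_set_value
        similar_enough = group.similarity_degree > second_set_value
        if rarely_accessed and similar_enough:
            to_compress.append(group)
        else:
            keep_in_cache.append(group)
    return to_compress, keep_in_cache


def compress_group(group):
    """Compress all chunks of one group together, i.e. in group units."""
    return zlib.compress(b"".join(group.chunks))
```

Compressing the concatenated chunks of one group, rather than each chunk separately, is what lets the compressor exploit redundancy between similar chunks.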

Regarding claim 4, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 1 above.  Shilane’990 further teaches wherein when a dirty cache ratio in the cache area is equal to or larger than a predetermined value, the controller selects and compresses the group of data stored in the cache area (Shilane’990, column 18, lines 19-21 and 29-30 teaches that data may be grouped and compressed when it is moved from one tier to another lower tier when a tier has reached a threshold capacity.   Thus the highest tier may be a cache memory, and the next lower tier may be a storage disk, and the data is moved from the cache to the disk when the cache capacity reaches a predetermined % value.  As is well known in the art, dirty data is data in the cache that has been modified but has not yet been written to disk and must be written to disk.  Thus dirty data is not available for use by other cache requests and represents a reduction in capacity.  Since dirty data is one form of reduction 


Regarding claim 5, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 1 above.  Shilane’990 further teaches wherein the data to be grouped includes a plurality of pieces of data stored in the cache area based on separate write requests (Shilane’990, column 20, lines 21-36 teaches that a newly arrived chunk to be stored (an example of a write request) may be grouped and gathered with a previously stored (written) request and written once the bucket contains sufficient data, thus a plurality of pieces of data within a chunk (data stored in the cache area) are based on separate write requests.).  


Regarding claim 6, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 1 above.  Shilane’990 further teaches wherein the controller manages the similarity degree among data included in the group compressed and stored in the drive (Shilane’990, column 20, line 60 through column 21, line 10 ‘determine if similar chunks exist on the storage system’ ), selects a group having a similarity degree among data equal to or less than a reference value from a plurality of groups  (Shilane’990, column 20, line 60 through column 21, line 10 ‘If similar chunks exists’.  See also Shilane’990, column 2, lines 53 ‘chunks similar to the new chunk based on the matched sketch’ that discloses the similarity measure in column 21 lines 1-2 is based on a matched sketch to the existing chunk, where the similarity including compressed data and stored in the drive and reads and decompresses the compressed data included in the selected group from the drive, and stores the decompressed data in the cache area for regrouping and recompression (Shilane’990, column 21, lines 2-6 discloses the existing chunks for the matching sketch are read in, decompressed, merged with the newly arrived data, compressed, and written out to the storage system.).  

Regarding claim 8, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 1 above.  Shilane’990 further teaches wherein the controller evaluates the similarity degree based on a hash value of the data (Shilane’990, column 2, lines 9-15, ‘A sketch can be generated by identifying “features” of a data chunk...  In one example, a rolling hash function (e.g., a Rabin fingerprint) is applied over all overlapping small regions of the data chunk (e.g., a 32-byte window) and the features are selected from maximal hash values generated in the process. ’ ), and an input value used to calculate the hash value is a plurality of character strings of the same length included in the data (Shilane’990, column 2, lines 9-15 ‘a rolling hash function (e.g., a Rabin fingerprint) is applied over all overlapping small regions of the data chunk (e.g., a 32-byte window)’.   See also Shilane’990, column 26, lines 53-54 ‘one or more data patterns characterizing a chunk’.  As is well known in the art, a Rabin fingerprint applies to strings, thus Shilane’990 discloses an input string used to calculate the hash value is a plurality of 32-byte character strings in a rolling window.), 
Shilane’902 further discloses and an appearance frequency of each of the character strings in the data is equal to or more than a predetermined number (Shilane’902, column 16, ‘candidate base chunk with at least one similar feature or a minimum degree of similarity can be determined’.).  
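
For illustration only, the sketch-based similarity evaluation quoted above (a rolling hash over all overlapping 32-byte windows, with features selected from the maximal hash values) can be sketched as follows. This is a minimal sketch under stated assumptions: a simple polynomial rolling hash is swapped in for the Rabin fingerprint named in the quotation, and the names window_hashes, sketch, and shared_features, the base and modulus, and the choice of four features per sketch are hypothetical and do not come from Shilane’990 or Shilane’902.

```python
def window_hashes(data, window=32, base=257, mod=(1 << 61) - 1):
    """Hash every overlapping `window`-byte region with a polynomial
    rolling hash (a stand-in for a Rabin fingerprint)."""
    if len(data) < window:
        return []
    high = pow(base, window - 1, mod)  # weight of the byte leaving the window
    h = 0
    for byte in data[:window]:
        h = (h * base + byte) % mod
    hashes = [h]
    for i in range(window, len(data)):
        # Drop the oldest byte, shift, and append the newest byte.
        h = ((h - data[i - window] * high) * base + data[i]) % mod
        hashes.append(h)
    return hashes


def sketch(data, num_features=4, window=32):
    """Select the largest distinct window hashes as the chunk's features."""
    distinct = sorted(set(window_hashes(data, window)), reverse=True)
    return set(distinct[:num_features])


def shared_features(chunk_a, chunk_b, num_features=4, window=32):
    """Count features common to both chunks; a larger count suggests
    the chunks are more similar."""
    return len(sketch(chunk_a, num_features, window)
               & sketch(chunk_b, num_features, window))
```

Each window has the same length, so the inputs to the hash are equal-length byte strings, which is the property the rejection of claim 8 relies on.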

Regarding claim 10,  A data compression method for a storage system (Shilane’990 [Abstract] ‘Techniques for improving data compression of a storage system’), wherein the storage system includes: a drive having a physical storage area (Shilane’990, column 6, lines 58-59, ‘Storage units 108-109 may include a single storage device such as a hard disk, a tape drive, ’);
The remainder of claim 10 recites limitations described in claim 1 above, and thus are rejected based on the teachings and rationale as described in claim 1 above.

Regarding claim 12, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 1 above. Shilane’990 further teaches wherein the controller (Shilane’990, FIG 24, deduplication storage system 2400 and related paragraphs column 25, lines 10-42.  See also Shilane’990, column 28, lines 17-23) calculates the similarity degree among the plurality of non-identical pieces of data stored in the cache area (Shilane’990, column 20, line 60 through column 21, line 10 ‘If similar chunks exists’.  See also Shilane’990, column 2, lines 53 ‘chunks similar to the new chunk based on the matched sketch’ that discloses the similarity measure in column 21 lines 1-2 is based on a matched sketch to the existing chunk, where the similarity value of the existing chunk is an example of a reference value from a plurality of groups.    Shilane’990 column 2, lines 38-43 discloses ‘only the original data chunk and a difference (i.e., the delta) between the two similar data chunks are stored rather than two 
 
Regarding claim 14, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 10 above. Shilane’990 further teaches comprising the step of performing, by the controller, deduplication before the step of grouping of the plurality of non-identical pieces of data (Shilane’990 column 19, lines 45-60 ‘For each chunk, a sketch is calculated.  This could be done after deduplication if it is a deduplicating storage system’.   Thus the step of deduplication occurs before the step of identifying similar blocks for grouping.).  

Regarding claim 15, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 10 above. 
The remainder of claim 15 recites limitations described in claim 12 above, and thus are rejected based on the teachings and rationale as described in claim 12 above.


Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Shilane’990 and further in view of Volvovski (Volvovski et al., US 2020/0097181 A1).
Regarding claim 7, Shilane’990 teaches A storage system comprising: a drive having a physical storage area (Shilane’990, Fig. 1, Storage Units 108 and 109, and column 6, lines 58-64); and a controller (Shilane’990, FIG 24, storage system 2400 and related paragraphs column 25, lines 10-42) configured to process data input into and output from the drive (Shilane’990, FIG 24, File Service Interface 2404.  See also Shilane’990, column 28, lines 17-23), wherein: the controller includes a cache area configured to store data to be read out of or written into the drive (Shilane’990, FIG. 14 intermediate storage such as flash or memory 1402, where intermediate storage is an example of cache memory used to hold data read or written to Storage Disk 1403.), 
and the controller: groups a plurality of non-identical pieces of data stored in the cache area based on a similarity degree among the plurality of pieces of data (Shilane’990, column 10, lines 10-16 discloses that similar files such as A and A’ may be stored together.  Shilane’990 column 2, lines 38-43 discloses ‘only the original data chunk and a difference (i.e., the delta) between the two similar data chunks are stored rather than two entire data chunks for the delta compression technique.’  Thus Shilane’990 groups non-identical A and A’, and the common text and the delta between them is stored to represent A’.), 
selects a group, compresses data of the selected group in group units, and stores the compressed data in the drive (Shilane’990, column 9, lines 63-67, discloses A and A’ and B and B’ are reorganized and then compressed into compressed file 215 for storage (for example, to Storage Units 208 and 209).  Shilane’990, column 21, lines 63-67, discloses that similar chunks such as A1, A2... A100 may be compressed (grouped) in one region and similar chunks such as B1, B2,... B7 are compressed (grouped) and stored in compression region B.   Shilane’990, column 4, lines 62-64 discloses that a compression region is a block of storage area.)
wherein the selection of the group is performed based on an access frequency to data in the group and the similarity degree among a plurality of data included in the group; 
and wherein the controller completes decompression (Shilane’990 column 11, lines 1-9 discloses that when compressed data is restored it is decompressed by decompressor 123 to recover the original data.)
However, Shilane’990 does not explicitly teach for data included in the same group, arranges data having a high access frequency at a closer position to a head side of the group than data having a low access frequency, and wherein the controller completes decompression and the compressed group from data at the head side of the group.
Volvovski, of a similar field of endeavor, further teaches for data included in the same group (Volvovski [0038] discloses co-locating data based on access frequency), arranges data having a high access frequency at a closer position to a head side of the group than data having a low access frequency (Volvovski [0038] discloses placing highly modified data together in high performance areas of the storage device (e.g., the outer diameter of a hard drive).    Thus Volvovski organizes data based on access frequency and places the data with the highest access frequency at the outer diameter, which is an example of a close position to a head side of the group since it has a lower track number.)
and the compressed group from data at the head side of the group (Volvovski [0038] discloses the system performs compaction, which per Volvovski [0019] may involve compression, which per Volvovski [0038] is data where the most frequently accessed data is placed at the outer diameter, which is at track 0.  When the group is decompressed, the data is read from the beginning with data at track 0, which is the head side of the group.).
Shilane’990, Shilane’902, and Volvovski are in a similar field of endeavor as all relate to locating similar data in data strings to improve compression outcome of stored data.   Thus it would have been obvious before the claimed invention was effectively filed to incorporate placing the most frequently accessed compressed data at the outer tracks into the solution of Shilane’990 and Shilane’902 that may be implemented using hard disks.  One would be motivated to do so in order to (Volvovski [0038]) increase storage efficiency and performance.


Claims 9 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Shilane’990 in view of Shilane’902 as described in claim 1 above, and further in view of Gonczi (Gonczi et al., US 2020/0341668 A1).
Regarding claim 9, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 1 above.  Shilane’990 further teaches wherein the controller performs deduplication determination on data related to a write request, performs deduplication processing when deduplication is necessary (Shilane’990, column 19, lines 48-60 that discloses deduplication may be performed before analyzing similarity), 
  and performs the grouping, the compression, and the storage of the data in the drive when deduplication is unnecessary.
Gonczi, of a similar field of endeavor, further teaches and performs the grouping, the compression, and the storage of the data in the drive when deduplication is unnecessary (Gonczi, FIG. 8 and related paragraphs [0120]-[0123].  Note that if there is a full block deduplication available, it is performed at step 1408 and processing stops.  Otherwise, at step 1411 a test is performed to determine if partial deduplication can be performed when at least one sub-block matches at least one sub-block of the target, then partial deduplication is performed.   Thus Shilane’990 in view of Gonczi would group, compress and store partial chunks (examples of sub-blocks) as discussed in claim 1 above at step 1414 of Gonczi.).  
Shilane’990, Shilane’902, and Gonczi are in a similar field of endeavor as all relate to locating similar data in data strings to improve compression outcome of stored data.  Thus it would have been obvious before the claimed invention was effectively filed to incorporate the metadata and its tracking as disclosed by Gonczi into the solution of Shilane’990 and Shilane’902.   The metadata that tracks the similarity metric may be tracked as metadata in an entry of the SSM structure  which (Gonczi[0003]) allows multiple host systems to access the single data storage system that contains shared data stored therein.   Thus duplicate data, even on a sub-block level (or partial write level) can be shared.


Regarding claim 11, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 1 above.  However the combination does not explicitly teach wherein the grouping of the plurality of non-identical pieces of data is performed by the controller after deduplication processing
Gonczi, of a similar field of endeavor, further teaches wherein the grouping of the plurality of non-identical pieces of data is performed by the controller after deduplication processing (Gonczi, FIG. 8 and related paragraphs [0120]-[0123].  Note that if there is a full block deduplication available, it is performed at step 1408 and processing stops.  Otherwise, at step 1411 a test is performed to determine if partial deduplication can be performed when at least one sub-block matches at least one sub-block of the target, then partial deduplication is performed.   Thus Shilane’990 and Shilane’902 in view of Gonczi would group, compress and store partial chunks (examples of sub-blocks) as discussed in claim 1 above at step 1414 of Gonczi.).
The motivation to combine Gonczi into the combination of  Shilane’990 and Shilane’902  is the same as set forth in claim 9 above.





Claims 13 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Shilane’990 in view of Shilane’902 as described in claim 12 above, and further in view of Yang (Yang et al., US 2012/0137059 A1).
Regarding claim 13, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 12 above.   However, the combination does not explicitly teach wherein the controller calculates the similarity degree based on a number of common words between the non-identical pieces of data stored in the cache memory.
Yang, of a similar field of endeavor, further discloses wherein the controller calculates the similarity degree based on a number of common words between the non-identical pieces of data stored in the cache memory (Yang [0127] discloses that when a newly arrived data block arrives, the system compares the sub-signatures of the block to the sub-signatures of the reference blocks and, if the number of matching sub-signatures exceeds a similarity threshold, the system proceeds to 1910 where delta compression is performed; otherwise it stores the new block as an independent block in the cache.   See also Yang [0123] that discloses a 4KB block may calculate 4K-2 sub-signatures from all sets of three consecutive bytes in the 4KB block. Thus the 3 consecutive bytes are an example of a word, and each sub-signature represents a hash of a rolling set of words within the block.   The similarity of the newly arrived block is calculated based on the number of common sub-blocks (words) between the non-identical arriving word and the available cached data.)
Shilane’990, Shilane’902, and Yang are in a similar field of endeavor as all relate to compressing storage data using a delta cache.   Thus it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate 

Regarding claim 16, the combination of Shilane’990 and Shilane’902 teaches all of the limitations of claim 15 above. 
The remainder of claim 16 recites limitations described in claim 13 above, and thus are rejected based on the teachings and rationale as described in claim 13 above.
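
For illustration only, the sub-signature comparison attributed to Yang in the discussion of claim 13 above (one sub-signature per set of three consecutive bytes, with similarity judged by the number of matches against a threshold) can be sketched as follows. This is a minimal sketch under stated assumptions: each three-byte window is used directly as its own sub-signature rather than being hashed, and the names count_windows, sub_signatures, common_count, is_similar, and threshold are hypothetical and do not come from Yang.

```python
def count_windows(block, width=3):
    """Number of overlapping `width`-byte windows in a block.
    For a 4096-byte block and width 3 this is 4096 - 2 = 4094."""
    return max(len(block) - width + 1, 0)


def sub_signatures(block, width=3):
    """Set of distinct `width`-byte windows (treated here as 'words')."""
    return {block[i:i + width] for i in range(count_windows(block, width))}


def common_count(new_block, reference_block, width=3):
    """Number of distinct words the two blocks have in common."""
    return len(sub_signatures(new_block, width)
               & sub_signatures(reference_block, width))


def is_similar(new_block, reference_block, threshold, width=3):
    """True when the count of common words exceeds the threshold, i.e. the
    new block would be delta-compressed against the reference block
    instead of being stored as an independent block."""
    return common_count(new_block, reference_block, width) > threshold
```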


Response to Remarks
Examiner thanks applicant for their claim updates and remarks in their response of 01/04/2021.  Applicant's argument on page 12 that Gonczi shows that data is arranged not by access frequency but by the appearance frequency of symbol is persuasive.  Therefore, the rejection to claim 7 has been withdrawn.  However, upon further consideration, a new ground(s) for rejection is made based on Shilane’990 and Shilane’902 as disclosed in the office action above and further in view of newly cited Volvovski.   

Applicant argues on page 10 ‘Shilane 990 discloses extracting data whose access frequency is high in the group and compressing data whose access frequency is low.  By grouping data where access frequency is high in the same group regardless of similarity degree 
Examiner respectfully disagrees.  While it is true that the claim does not require regrouping based upon access frequency, it also does not preclude performing regrouping operations based on access frequency as taught by Shilane ‘990 in view of Shilane ‘902.  Examiner notes that the order of grouping is not explicitly claimed, thus “compressing a group whose access frequency is low first” is not claimed.  Applicant claims two separate groups, one group that is accessed less frequently than a first set value and has a similarity degree larger than a second set value, and a second group whose access frequency is higher than the first set value and is maintained in the cache.   No reference to the similarity of the data in the second set is claimed other than the access frequency.
As shown in the rejection above, the regrouping operations of Shilane provide clear scenarios where a given group has a known access frequency level established by ‘more frequently’ and creates another group that contains similar data that is not accessed ‘more frequently’ (thus is accessed less frequently), as required by the claim.

Applicant argues on page 12 that Gonczi shows that data is arranged not by access frequency but by the appearance frequency of symbol.
Applicant’s argument with respect to claim 1 is persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) for rejection is made based on Shilane’990 in view of Shilane’902 and further in view of newly cited Volvovski and Gonczi.  See the office action above for further details.



Conclusion
  The prior art made of record and not relied upon is considered pertinent to the applicant’s disclosure: Wayback copy of www.active-undelete.com:80/hdd_basic.htm taken 3/25/2018 that discloses Hard Disk Drive terminology to clarify that track 0 is at the front/head of the disk. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JANICE M. GIROUARD whose telephone number is (469)295-9131.  The examiner can normally be reached on M-F 9:30 - 7:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi can be reached on 571-270-7519.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-






/J.M.G./Examiner, Art Unit 2138
/William E. Baughman/Primary Examiner, Art Unit 2138