DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 8-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
The claims do not fall within at least one of the four categories of patent eligible subject matter because, under their broadest reasonable interpretation, they fail to exclude software per se. No hardware device is recited in any of claims 8-20, and the recited functionality may be understood to be software functionality.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-3, 5-10 and 12-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by US Pre-Grant Publication 2019/0379394 to Hallak.

With regard to independent claim 8,
	Hallak teaches a system having executable logic that implements a method (Hallak: ¶¶0052-0053 – logic/instructions executed to perform taught method) comprising:
 	for a plurality of data units, determining a plurality of clusters of data units based on an extent of similarity of content between the plurality of data units (Hallak: ¶0027 – similarity between data units is used for clustering. See fig. 1 compression steps S120-S130.); 
	for each of the plurality of clusters, selecting one or more of the data units as a reference portion for the cluster (Hallak: ¶0028 – “selected” reference blocks used for compression of similar blocks. See fig. 1 process for compression steps S140-S160.); and 
	for each cluster, compressing each data unit of the cluster based at least in part on the one or more reference portions of the cluster. (Hallak: ¶0028 – reference blocks used for compression of similar blocks. See fig. 1 process for compression steps S140-S160.)
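
For illustration only, the three recited steps (clustering by content similarity, selecting a reference per cluster, and compressing each data unit against that reference) can be sketched as below. This is a minimal sketch under stated assumptions and does not reproduce the implementation of Hallak or of the application: the shingle-based Jaccard measure, the 0.5 threshold, the choice of the first member of a cluster as its reference, and all function names are hypothetical. Only the zlib preset-dictionary interface is a standard library feature.

```python
import zlib


def similarity(a: bytes, b: bytes, k: int = 4) -> float:
    """Jaccard similarity of the sets of k-byte shingles of a and b (assumed measure)."""
    sa = {a[i:i + k] for i in range(len(a) - k + 1)}
    sb = {b[i:i + k] for i in range(len(b) - k + 1)}
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cluster_units(units, threshold: float = 0.5):
    """Greedy clustering: a unit joins the first cluster whose reference it
    resembles; otherwise it starts a new cluster and serves as its reference.
    Returns a list of (reference, members) pairs."""
    clusters = []
    for unit in units:
        for reference, members in clusters:
            if similarity(unit, reference) >= threshold:
                members.append(unit)
                break
        else:
            clusters.append((unit, [unit]))
    return clusters


def compress_with_reference(unit: bytes, reference: bytes) -> bytes:
    """Deflate-compress a unit using the cluster reference as a preset dictionary."""
    compressor = zlib.compressobj(zdict=reference)
    return compressor.compress(unit) + compressor.flush()


def decompress_with_reference(blob: bytes, reference: bytes) -> bytes:
    """Inverse of compress_with_reference; the same reference must be supplied."""
    decompressor = zlib.decompressobj(zdict=reference)
    return decompressor.decompress(blob) + decompressor.flush()
```

Under these assumptions, each member of a cluster would be passed to compress_with_reference together with that cluster's reference, and the reference itself must be retained so that the members can later be decompressed.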

With regard to dependent claim 9, which depends upon independent claim 8,
	Hallak teaches the system of claim 8, wherein, for each cluster, each data unit of the cluster is compressed using a compression technology that uses the one or more reference portions of the cluster as a dictionary. (Hallak: abstract – reference of a block used for delta comparison with other blocks and lookups of a given block, i.e. “dictionary”. See ¶0028 – reference blocks used for compression of similar blocks. See also fig. 1 process for compression steps S140-S160.)

With regard to dependent claim 10, which depends upon independent claim 8,
	Hallak teaches the system of claim 8, wherein determining the plurality of clusters includes, for each of the plurality of data units: 
	generating a hash value for the data unit (Hallak: ¶¶0009-0010 – similarity hash generated and used to identify similar blocks. See also above citations directed to similarity.); and 
	determining an extent of similarity of the data unit to other data units of the plurality of data units based at least in part on the generated hash value. (Hallak: ¶¶0009-0010 – similarity hash determined and used to identify similar blocks. See also above citations directed to similarity.)
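
For illustration only, one way a hash value can be used to estimate similarity between data units is a MinHash-style signature, sketched below. This is an assumption made for the example; neither the claims nor the cited paragraphs of Hallak are stated here to use MinHash, and the function names, shingle size, and number of hash functions are hypothetical.

```python
import hashlib


def similarity_hash(unit: bytes, num_hashes: int = 32, k: int = 4) -> tuple:
    """MinHash-style signature: for each salted hash function, keep the
    smallest digest over all k-byte shingles of the unit."""
    shingles = {unit[i:i + k] for i in range(len(unit) - k + 1)} or {unit}
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a distinct salt gives a distinct hash function
        signature.append(
            min(hashlib.blake2b(s, digest_size=8, salt=salt).digest() for s in shingles)
        )
    return tuple(signature)


def estimated_similarity(sig_a: tuple, sig_b: tuple) -> float:
    """Fraction of matching signature positions, an estimate of the Jaccard
    similarity of the underlying shingle sets."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

In such a scheme, only the fixed-size signatures need to be compared, so the extent of similarity between two data units can be estimated without comparing their full contents.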

With regard to dependent claim 12, which depends upon independent claim 8,
	Hallak teaches the system of claim 8, wherein the method further comprises: 
	receiving a write operation to an additional data unit not included in the plurality of data units (Hallak: ¶¶0043-0044 – write requests cause mapping operations to take place. See ¶0028 – reference blocks used for compression of similar blocks. See also fig. 1 process for compression steps S140-S160, as well as ¶0059, which states that reference to a first element is understood to encompass a second instance without limitation.);
	assigning the additional data unit to a first of the plurality of clusters; and compressing the additional data unit using the reference portion of the first cluster. (Hallak: ¶¶0043-0044 – write requests cause mapping operations to take place. See ¶0028 – reference blocks used for compression of similar blocks. See also fig. 1 process for compression steps S140-S160, as well as ¶0059, which states that reference to a first element is understood to encompass a second instance without limitation.)

With regard to dependent claim 13, which depends upon dependent claim 12,
	Hallak teaches the system of claim 12, wherein the additional data unit is assigned to the first cluster based at least in part on its proximity in data space to the representative portion of the first cluster. (Hallak: abstract – delta between a given data block and a reference portion is found. See also above citations directed to clustering.)
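
For illustration only, assigning a newly written data unit to an existing cluster by proximity to each cluster's reference, and then compressing it against that reference, can be sketched as below. The Jaccard distance over byte shingles, the dictionary of references keyed by cluster identifier, and the function names are hypothetical and are not drawn from Hallak or from the application.

```python
import zlib


def distance(a: bytes, b: bytes, k: int = 4) -> float:
    """Jaccard distance between the k-byte shingle sets of a and b (assumed metric)."""
    sa = {a[i:i + k] for i in range(len(a) - k + 1)}
    sb = {b[i:i + k] for i in range(len(b) - k + 1)}
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def assign_and_compress(new_unit: bytes, references: dict):
    """Pick the cluster whose reference is nearest to the new unit, then
    compress the unit with that reference as a zlib preset dictionary.
    `references` maps a cluster identifier to that cluster's reference bytes."""
    cluster_id = min(references, key=lambda cid: distance(new_unit, references[cid]))
    compressor = zlib.compressobj(zdict=references[cluster_id])
    return cluster_id, compressor.compress(new_unit) + compressor.flush()
```

The returned cluster identifier would need to be stored alongside the compressed bytes, since decompression requires the same reference that was used for compression.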

With regard to dependent claim 14, which depends upon independent claim 8,
	Hallak teaches the system of claim 8, further comprising, for at least a first reference portion: 
	determining whether to maintain the first reference portion in compressed form or uncompressed form based at least on: how much memory space is consumed by the first reference portion in uncompressed form; and how frequently the first reference portion is used. (Hallak: ¶0030 – block selected for compression based upon uncompressed size. See also above citations directed to reference portion.)


	Claims 1-3 and 5-7 are each similar to claims 8-10 and 12-14 respectively and are each rejected under a similar respective rationale.

	Claims 15-17 and 18-20 are each similar to claims 8-10 and 12-14 respectively and are each rejected under a similar respective rationale.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 4 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Hallak in view of US Pre-Grant Publication 2021/0019556 to Ganguly.



With regard to dependent claim 11, which depends upon independent claim 8,
	Hallak teaches the system of claim 8.
	Hallak does not fully and explicitly teach wherein, for each of the plurality of clusters, selecting one or more of the data units as a reference portion for the cluster includes running a clustering algorithm in training mode to select the one or more reference portions.  
	Ganguly teaches a system wherein, for each of a plurality of clusters, selecting one or more data units as a reference portion for the cluster includes running a clustering algorithm in training mode to select one or more reference portions. (Ganguly: ¶0025 – training a neural network or other type of model through machine learning. See abstract – clustering and compression performed. See also ¶0021 – measurement of distance from corresponding families, i.e. “reference portions”.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the machine learning model training functionality of Ganguly into the clustering compression system of Hallak by programming the instructions of Hallak (Hallak: ¶¶0052-0053) to train a machine learning model, as taught by Ganguly. Both systems are directed to similarity-based clustering and data compression (Hallak: abstract, fig. 1; Ganguly: abstract). An advantage obtained through training a machine learning model would have been desirable to implement in the clustering compression system of Hallak. In particular, the motivation to combine the Hallak and Ganguly references would have been to improve the performance of a data processing system through effective clustering. (Ganguly: abstract, ¶0050)

	Claim 4 is similar in scope to claim 11 and is rejected under a similar rationale.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:

	-US Pre-Grant Publication 2017/0123677 to Singhai for reference data sets
	-US Pre-Grant Publications 2019/0065519, 2020/0387743 and 2018/0234234 to Ohtsuji, Wick and Hurley respectively, each for similarity-based clustering

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAL L BOGACKI whose telephone number is (571)270-5125. The examiner can normally be reached Monday - Thursday 9:30am - 7:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JAMES K TRUJILLO, can be reached at (571)272-3677. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MICHAL BOGACKI
Examiner
Art Unit 2157



/M.L.B./ Examiner, Art Unit 2157

/James Trujillo/ Supervisory Patent Examiner, Art Unit 2157