DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 2/12/21 has been entered.

   1.   REJECTIONS BASED ON PRIOR ART
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
	Claim Rejections - 35 U.S.C. § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 and 3-22 are rejected under 35 U.S.C. 103 as being unpatentable over Delaney (US PGPUB # 2018/0074723 A1) in view of Gregor (US PGPUB # 2016/0232440 A1) and Eilert (US PGPUB # 2010/0318718 A1) and Sun (US PGPUB # 2020/0117539 A1).

	With respect to claim 1, the Delaney reference teaches an apparatus comprising: 
a host interface circuit configured to receive a memory access request, wherein the memory access request is associated with a data set;  (see fig. 1, HBA 110; and paragraph 19, where a host 104 includes a host bus adapter (HBA) 110 in communication with a storage controller 108.a, 108.b of the storage system 102. The HBA 110 provides an interface for communicating with the storage controller 108.a, 108.b, and in that regard, may conform to any suitable hardware and/or software protocol) 
at least one non-volatile memory storage circuit configured to store a transformed data set; (paragraph 21, where storage system 102 executes the data transactions on behalf of the hosts 104 by writing, reading, or otherwise accessing data on the relevant storage devices 106) and 
a translation circuit (see fig. 1, storage controller) configured to: 
based on a write memory access, convert an original version of the data set to the transformed data set, (paragraph 22, where storage controllers 108a-b perform in-line read decompression and write compression) and 

	However, the Delaney reference does not explicitly teach a translation circuit comprising a machine learning circuit; that the reconstructed data set includes an approximation of the data set that differs from the data set (emphasis added); or that the at least one non-volatile memory storage circuit is configured to store a persistent state of the machine learning circuit used to convert the original version of the data set to the transformed data set.
	The Gregor reference teaches it is conventional to have a translation circuit comprising a machine learning circuit. (see fig. 1; and paragraph 20, where encoder neural network 102 compresses data items received during training)
At the time the invention was effectively filed, it would have been obvious to a person of ordinary skill in the art to modify the Delaney reference to have a translation circuit comprising at least one machine learning circuit, as taught by the Gregor reference.
The suggestion/motivation for doing so would have been to employ one or more layers of nonlinear units to predict an output for a received input.  (Gregor, paragraph 3)
	However, the combination of the Delaney and Gregor references does not explicitly teach that the reconstructed data set that includes an approximation of the data set that differs from the data set (emphasis added); and wherein the at least one non-volatile memory storage circuit is configured to store a persistent state of the 
	The Eilert reference teaches it is conventional for the reconstructed data set to include an approximation of the data set that differs from the data set. (paragraph 29, where compressed data is read via decompression; and lossy compression results in higher levels of compression but there may be changes in the data. Lossy compression [analogous to the ‘approximation of the data that differs from the data set’ as claimed] may be appropriate for storage of video data where small changes in the data pattern wouldn't result in a significant degradation in user experience)
At the time the invention was effectively filed, it would have been obvious to a person of ordinary skill in the art to modify the combination of the Delaney and Gregor references to have the reconstructed data set include an approximation of the data set that differs from the data set, as taught by the Eilert reference.
The suggestion/motivation for doing so would have been to achieve higher compression, which results in fewer bits of storage.  (Eilert, paragraph 29)
	However, the combination of the Delaney, Gregor, and Eilert references does not explicitly teach the at least one non-volatile memory storage circuit is configured to store a persistent state of the machine learning circuit used to convert the original version of the data set to the transformed data set.
	The Sun reference teaches it is conventional to have the at least one non-volatile memory storage circuit be configured to store a persistent state of the machine learning circuit used to convert the original version of the data set to the transformed data set. 
At the time the invention was effectively filed, it would have been obvious to a person of ordinary skill in the art to modify the combination of the Delaney, Gregor, and Eilert references to have the at least one non-volatile memory storage circuit be configured to store a persistent state of the machine learning circuit used to convert the original version of the data set to the transformed data set, as taught by the Sun reference.
The suggestion/motivation for doing so would have been to have the DNN weights stored in an SSD that can provide high reliability while still maintaining memory performance.  (Sun, paragraph 58)
Therefore it would have been obvious to combine the Delaney, Gregor, Eilert, and Sun references for the benefits shown above to obtain the invention as specified in the claim.

With respect to claim 3, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 1, wherein the translation circuit comprises a plurality of machine learning circuits, and wherein the translation circuit is configured to select one of the plurality to convert the data set based, at least in part upon an amount of fidelity desired by the host and the amount of fidelity provided by the 

With respect to claim 4, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 3, wherein the fidelity desired by the host is a fixed value for a set of storage parameters, wherein the storage parameters are selected from a group consisting essentially of: namespace identifier, host identifier, logical block address ranges, non-volatile memory set identifier, non-volatile memory express submission queue identifier, stream identifier, Ethernet media access control identifier, network addresses, transport parameter, and date and time. (Delaney, paragraph 26, where compressed data may include data that is compressed according to one or more compression algorithms, such as Lempel-Ziv-Oberhumer (LZO) compression; and paragraph 47, where a second compression algorithm may be determined to be more effective than a previous compression algorithm that was used to compress the data; effectiveness may be measured, for example, by the amount that the compression algorithm is able to reduce the memory footprint of the data)

With respect to claim 5, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 3, wherein the desired fidelity is dynamically adjustable; (Delaney, paragraph 47, where a second compression algorithm may be determined to be more effective than a previous compression algorithm that was used to compress the data; effectiveness may be measured, for example, by the amount that the compression algorithm is able to reduce the memory footprint of the data. Accordingly, the storage controller may read the already compressed data from the memory)
wherein the desired fidelity is dynamically adjusted based, at least in part, upon a data type associated with the memory access request; (Delaney, paragraph 36, where, in more detail, when the available resources exceed a threshold, the host write requests may be processed via inline compression processing as described in block 214. When the available resources are below the threshold, host write requests may be processed via background compression processing as described in block 216) and 
wherein the desired fidelity is dynamically adjusted based, at least in part upon a software application associated with the memory access request. (Delaney, paragraph 36, where, in more detail, when the available resources exceed a threshold, the host write requests may be processed via inline compression processing as described in block 214. When the available resources are below the threshold, host write requests may be processed via background compression processing as described in block 216)

With respect to claim 6, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 1, wherein the transformed data set has 

With respect to claim 7, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 1, wherein the translation circuit is configured to de-duplicate the data set, where de-duplication of the data is performed at a block level, relative to data already stored in the at least one non-volatile memory storage circuit. (Delaney, paragraph 37, where data may also be processed by de-duplication techniques instead of or in addition to performing the compression techniques)

With respect to claim 8, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 1, wherein, based on a read memory access, the host interface is configured to return the transformed data set, wherein the transformed data set is smaller than or the same size as the original version of the data set, and the transformed data set is an approximation of the data set. (Delaney, paragraph 22, where storage controllers 108a-b perform in-line read decompression and write compression [which makes data smaller]; and see fig. 1, HBA 110; and paragraph 19, where a host 104 includes a host bus adapter (HBA) 110 in communication with a storage controller 108.a, 108.b of the storage system 102. The HBA 110 provides an interface for communicating with the storage controller 108.a, 

With respect to claim 9, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 1, wherein the at least one non-volatile memory storage circuit comprises: a field that associates an addressing value included in the memory access with the transformed data set, and a field that indicates which of the machine learning circuits was used to create the transformed data set. (Delaney, paragraph 26, where compressed data may include data that is compressed according to one or more compression algorithms, such as Lempel-Ziv-Oberhumer (LZO) compression; and paragraph 47, where a second compression algorithm may be determined to be more effective than a previous compression algorithm that was used to compress the data; effectiveness may be measured, for example, by the amount that the compression algorithm is able to reduce the memory footprint of the data. Accordingly, the storage controller may read the already compressed data from the memory)

With respect to claim 10, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 1, wherein the translation circuit includes a flash translation layer circuit configured to create a version of the transformed data set that is equal to the original version of the data set, and create a reconstructed data set that is equal to the original version of the data set; (Delaney, paragraph 22, where 
 wherein the translation circuit is configured to select, based upon a fidelity requirement, between employing the flash translation layer circuit or the machine learning circuit to process the transformed data set. (Delaney, paragraph 26, where compressed data may include data that is compressed according to one or more compression algorithms, such as Lempel-Ziv-Oberhumer (LZO) compression; and paragraph 47, where a second compression algorithm may be determined to be more effective than a previous compression algorithm that was used to compress the data; effectiveness may be measured, for example, by the amount that the compression algorithm is able to reduce the memory footprint of the data. Accordingly, the storage controller may read the already compressed data from the memory)

With respect to claim 11, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 1, wherein the machine learning circuit includes a neural network; and 
wherein the translation circuit is configured to dynamically adjust a number of layers in the neural network based, at least in part, upon a fidelity requirement. (Gregor, paragraph 3; and Delaney, paragraph 47, where a second compression algorithm may be determined to be more effective than a previous compression algorithm that was used to compress the data; effectiveness may be measured, for example, by the amount that the compression algorithm is able to reduce the memory footprint of the 

With respect to claim 12, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 1, wherein the translation circuit is configured to: 
employ an observed reconstruction delta to train the machine learning circuit, and employ the observed reconstruction delta to determine the fidelity that can be achieved by selecting a particular machine learning circuit from a plurality of machine learning circuits; (Gregor, paragraph 25, where in some implementations the data item generation system 100 may train the encoder neural network 102 and the decoder neural network 104 to autoencode input data items. For example, the data item generation subsystem 100 may train the encoder neural network 102 and decoder neural network 106 to generate an updated neural network output 110 that is a reconstruction of the input data item 108) and 
wherein the at least one non-volatile memory storage circuit is also configured to store a persistent state of the at least one machine learning circuit with the transformed data set. (Gregor, paragraph 26)

With respect to claim 13, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 1, wherein the learning circuit includes a common encoder neural network and multiple decoder neural networks. (Gregor, paragraph 5)

With respect to claim 14, the combination of the Delaney, Gregor, Eilert, and Sun references teaches the apparatus of claim 1, wherein the translation circuit is configured to determine to perform a lossy conversion of the original version of the data set to the transformed data set, at least in part, upon a fidelity target; (Delaney, paragraph 26, where compressed data may include data that is compressed according to one or more compression algorithms, such as Lempel-Ziv-Oberhumer (LZO) compression; and paragraph 47, where a second compression algorithm may be determined to be more effective than a previous compression algorithm that was used to compress the data; effectiveness may be measured, for example, by the amount that the compression algorithm is able to reduce the memory footprint of the data. Accordingly, the storage controller may read the already compressed data from the memory) and 
wherein the translation circuit is configured to dynamically adjust an amount of loss based, at least in part, upon a fidelity target. (Delaney, paragraph 47, where a second compression algorithm may be determined to be more effective than a previous compression algorithm that was used to compress the data; effectiveness may be measured, for example, by the amount that the compression algorithm is able to reduce the memory footprint of the data. Accordingly, the storage controller may read the already compressed data from the memory)

Claims 15-19 are the system implementation of claims 1-14, and rejected under the same rationale as above.  The Examiner notes the Delaney reference teaches the 

Claims 20-22 are another method implementation of claims 1-14, and are rejected under the same rationale as above.

   2.   ARGUMENTS CONCERNING PRIOR ART REJECTIONS
Rejections - 35 U.S.C. 102/103
	Applicant’s arguments (see remarks dated 1/12/21) with respect to claims 1 and 3-22 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground of rejection is made in view of the combination of the Delaney, Gregor, Eilert, and Sun references to teach the newly added limitation of “the at least one non-volatile memory storage circuit is configured to store a persistent state of the machine learning circuit used to convert the original version of the data set to the transformed data set” as shown in the rejections above.

   3.  CLOSING COMMENTS
	Conclusion
        a.   STATUS OF CLAIMS IN THE APPLICATION
	The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. § 707.07(i):
        a(1)  CLAIMS REJECTED IN THE APPLICATION
	Per the instant office action, claims 1 and 3-22 have received a first action on the 
      b.   DIRECTION OF FUTURE CORRESPONDENCES 
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to Prasith Thammavong, whose telephone number is (571) 270-1040. The examiner can normally be reached Monday through Friday, 1-9:30 PM EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Adam Queler, can be reached at (571) 272-4140.  The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
/PRASITH THAMMAVONG/
Primary Examiner, Art Unit 2137