DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The Examiner suggests amending the claim(s) to include a “non-transitory computer readable storage medium” to overcome the rejection above.   



	
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim recites sorting/categorizing entries for deduplication and updating categories based on criteria. This judicial exception is not integrated into a practical application because all of the above limitations encompass steps that a person would perform when sorting or batching based on selected criteria, and each step can practically be performed in the mind as a mental step. Nothing in the claim precludes the steps of categorizing and adjusting from being performed in the human mind; the steps therefore fall within the mental processes grouping of abstract ideas, that is, the claim recites a judicial exception under Prong 1 of Step 2A.
Because the claim recites a judicial exception, Prong 2 of Step 2A determines whether the recited judicial exception is integrated into a practical application. For example, a claim may integrate the exception into a practical application if an additional element reflects an improvement in the functions of a computer, or an improvement to other technology or technical field.
Though the claim recites a cache and a machine learning system, the machine learning system that dynamically adjusts characteristics associated with the entries and the cache are recited at a high level of generality, i.e., as a generic system performing generic computer functions of categorizing and adjusting. These additional elements, considered in the context of categorizing and adjusting as a whole, do not integrate the abstract idea into a practical application. Rather, these additional limitations merely use a computer (server, a user device) to perform generic computer activity, namely categorizing and adjusting. Such elements are not sufficient to integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Accordingly, the abstract idea recited in claim 1 is not integrated into a practical application.
Under Step 2B, claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional computer elements, which are recited at a high level of generality, provide conventional computer functions that do not add meaningful limits to practicing the abstract idea. The machine learning system as recited does not provide any requisite detail of what the system comprises or how its components achieve the categorizing and updating.
Accordingly, the additional limitations, considered individually and in combination, do not provide an inventive concept.
Examiner notes that the preamble of the Applicant’s claim is not afforded patentable weight because the preamble is not “necessary to give life, meaning, and vitality” to the claim.
Claims 2-9, whether considered individually or together with claim 1, do not include language that would preclude the categorizing and updating steps of claim 1 from practically being performed in the human mind. The further limitations defining the updating and categorizing do not integrate the abstract idea into a practical application. The claims do not include additional elements that are sufficient to amount to significantly more than the abstract idea because the additional computer elements, which are recited at a high level of generality, provide conventional computer functions that do not add meaningful limits to practicing the abstract idea. The machine learning system as recited does not provide any requisite detail of what the system comprises or how its components achieve the categorizing and updating.

Claims 10-18 are the system claims corresponding to the method claims 1-9 and are rejected under the same reasons set forth in connection with the 101 rejection of claims 1-9.
Claims 19-20 are the computer program product claims corresponding to the method claims 1-2 and are rejected under the same reasons set forth in connection with the 101 rejection of claims 1-2.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 10-12 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Mallaiah, 20140114932 (herein Mallaiah), in view of Acuna, 20120323861 (herein Acuna).
Per claim 1, Mallaiah discloses: processing I/O workload of the storage system (fig. 3, ¶0035; a deduplication probability threshold is determined based on a performance metric of a storage system. In some cases, the performance metric may include or be a function of an SLO. In one embodiment, the performance metric is a read and/or a write response time of the data storage system. In another embodiment, the performance metric is, or is a function of, a read and/or a write throughput rate of the data storage system); categorizing deduplication entries stored in a deduplication cache into a set of deduplication groups based on a data deduplication probability associated with the deduplication entries (fig. 3, ¶0038; statistical information derived from one or more previous deduplication operations may indicate that data objects smaller than 200 KB have a 15% probability of having a deduplication benefit, data objects having a size from 200 KB to 800 KB have a 30% probability of having a deduplication benefit, data objects having a size from 800 KB to 2 MB have a 55% probability of having a deduplication benefit, and data objects larger than 2 MB have a 65% probability of having a deduplication benefit. The statistics may be broken down into fewer or more data object size categories); and dynamically adjusting deduplication characteristics associated with the set of deduplication groups based on an I/O workload associated with the storage system, wherein the adjusting comprises dynamically adjusting the categorizations (fig. 3, ¶0038; The one or more characteristics of the data object that are used to determine the probability of deduplication for the data object can include characteristics such as the size of the data object (e.g., number of bytes), the type of the data object (e.g., a spreadsheet), the owner of the data object, the last modified date/time of the data object, or the update frequency of the data object).
Mallaiah dynamically adjusts deduplication characteristics but does not specifically disclose: using a machine learning system to dynamically adjust deduplication characteristics associated with the set of deduplication groups based on an I/O workload associated with the storage system; and training the machine learning system to categorize the deduplication entries by using data resulting from the processing.
However, Acuna discloses: using a machine learning system to dynamically adjust deduplication characteristics associated with the set of deduplication groups based on an I/O workload associated with the storage system (¶0045; a CIM client and CIM agent for caching of queries in conjunction with deduplication. In one embodiment, the method 700 begins (step 702) CIM Agent builds most-used recipe chains (step 704). The building of the most-used recipe chains may be accomplished by either manually feeding the recipes to the CIM agent or the CIM Agent learning the recipes using machine learning); and training the machine learning system to categorize the deduplication entries by using data resulting from the processing (¶0046; If the query by CIM Client is a CQL query, the method 700 will cache and deduplicate the precompiled query and result by doing a predicate analysis (step 716). The method 700 will categorize (manually feed or learn using machine learning) the query from CIM Client in different groups (step 718)…. Similarly, parts of query processing may be deduplicated (using checksum as indexes as shown above); the examiner notes that the combination of Mallaiah and Acuna discloses machine learning categorization wherein the categorization is in view of deduplication entries/characteristics).
It would have been obvious to one having ordinary skill in the art to combine Acuna’s machine learning with Mallaiah’s deduplication using lightweight statistics in order to learn patterns and behavior and thereby improve efficiency (¶0023-0024).
Per claim 2, Mallaiah discloses: iteratively updating the deduplication characteristics associated with the set of deduplication groups as data is processed for inline deduplication in the storage system, wherein the set of deduplication groups is updated dynamically (¶0048; this data may be gathered only from post-processing deduplication operations. In other cases, this data may also be gathered from inline deduplication operations. The characteristic-based statistics may include historical deduplication results that can be parsed based on many different types of data object characteristics including sizes of data objects, types of data objects, owners of the data objects, last modified date of data objects, update frequencies of data objects, and/or other data object characteristics).
Per claim 3, Mallaiah discloses: wherein iteratively updating the deduplication characteristics associated with the set of deduplication groups as data is processed for inline deduplication in the storage system comprises: [Patent Application Docket Number: 110349.02; Applicants: Yubing Wang, et al.; EMC CONFIDENTIAL] updating the deduplication characteristics upon processing of at least one data entry for inline deduplication (¶0058 and ¶0060; Based on how many data objects are being inline deduplicated and the availability of resources in storage server 540, storage server 540 may determine that the performance requirements of storage server 540 can still be met even if the deduplication probability threshold is lowered. In other words, based on recent performance, sufficient resources may be available to still meet the performance requirements even if the deduplication probability threshold is dropped from 60% to 50%. Once this adjustment is made, more data objects will be deduplicated inline because all received data objects having a deduplication probability of at least 50%, rather than the previous 60%, will now be inline deduplicated).
Claims 10-12 are the system claims corresponding to the method claims 1-3 and are rejected under the same reasons set forth in connection with the rejection of claims 1-3.
Claims 19-20 are the computer program product claims corresponding to the method claims 1-3 and are rejected under the same reasons set forth in connection with the rejection of claims 1-3.
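For illustration only, the size-based probability figures quoted from Mallaiah ¶0038 and the threshold change quoted from ¶0058 and ¶0060 (60% lowered to 50%) can be sketched as below. This is a minimal sketch and is not taken from either reference: the function names, the treatment of sizes falling exactly on a category boundary, and the use of 1 KB = 1024 bytes are assumptions made for the example.

```python
from bisect import bisect_right

KB = 1024          # assumption: binary kilobytes
MB = 1024 * KB

# Category boundaries from the quoted passage; which side an exact
# boundary value falls on is an assumption of this sketch.
SIZE_BOUNDS = [200 * KB, 800 * KB, 2 * MB]
# Probability of a deduplication benefit for each size category, as quoted.
PROBABILITIES = [0.15, 0.30, 0.55, 0.65]


def dedup_probability(size_bytes: int) -> float:
    """Return the quoted probability for the size category of a data object."""
    return PROBABILITIES[bisect_right(SIZE_BOUNDS, size_bytes)]


def should_dedup_inline(size_bytes: int, threshold: float) -> bool:
    """Deduplicate inline only if the probability is at least the threshold."""
    return dedup_probability(size_bytes) >= threshold


if __name__ == "__main__":
    one_mb_object = 1 * MB  # falls in the 800 KB to 2 MB category (55%)
    print(should_dedup_inline(one_mb_object, 0.60))  # threshold before lowering
    print(should_dedup_inline(one_mb_object, 0.50))  # threshold after lowering
```

Under these assumptions, a 1 MB data object is excluded from inline deduplication at the 60% threshold and included once the threshold is lowered to 50%, which is the effect the quoted passage describes.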


Claims 4-5 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Mallaiah, 20140114932 (herein Mallaiah), and Acuna, 20120323861 (herein Acuna), in view of Raymond, 20110246741 (herein Raymond).
Per claim 4, the combined teachings of Mallaiah and Acuna do not specifically disclose: wherein each deduplication group is defined by a weighted Gaussian distribution.
However, Raymond discloses: wherein each deduplication group is defined by a weighted Gaussian distribution (¶0024; normal distribution (Gaussian) defining the deduplication sets).
It would have been obvious to one having ordinary skill in the art to combine the teachings of Mallaiah, Acuna and Raymond because Raymond’s hash dictionary provides faster inline deduplication (¶0010).
Per claim 5, Raymond discloses: wherein iteratively updating the deduplication characteristics associated with the set of deduplication groups as data is processed for inline deduplication in the storage system comprises: dynamically adjusting parameters associated with the weighted Gaussian distribution (¶0020 and ¶0024; graph 100 showing a normal distribution of repeating hash values used in one example of a data deduplication process. In the graph 100, the repeating hash values are shown under the line 110 with a normal distribution with a standard deviation shown at 114 (e.g., "a" of 1 sigma)).
Claims 13-14 are the system claims corresponding to the method claims 4-5 and are rejected under the same reasons set forth in connection with the rejection of claims 4-5.
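For illustration only, the notion of a group defined by a weighted Gaussian distribution whose parameters are adjusted as data is processed (claims 4-5) can be sketched as below. Neither the claim language nor the quoted passages of Raymond specify a formula. The sketch assumes a weighted normal density and an incremental (Welford-style) update of the mean and variance, and the class and attribute names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class WeightedGaussian:
    """Hypothetical group model: weight * Normal(mean, variance)."""
    weight: float
    mean: float = 0.0
    m2: float = 0.0    # running sum of squared deviations from the mean
    count: int = 0

    def update(self, x: float) -> None:
        """Adjust mean and variance incrementally with one new observation."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of the observations seen so far."""
        return self.m2 / self.count if self.count else 0.0

    def density(self, x: float) -> float:
        """Weighted normal density at x; undefined while variance is zero."""
        var = self.variance
        if var == 0.0:
            raise ValueError("variance is zero; density is undefined")
        norm = math.sqrt(2.0 * math.pi * var)
        return self.weight * math.exp(-((x - self.mean) ** 2) / (2.0 * var)) / norm


if __name__ == "__main__":
    group = WeightedGaussian(weight=0.5)
    for value in (1.0, 2.0, 3.0):
        group.update(value)
    print(group.mean, group.variance, group.density(2.0))
```

The incremental update was chosen only because it lets the parameters change one observation at a time without storing earlier observations; other update rules would fit the claim language equally well.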


Claims 6-9 and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Mallaiah, 20140114932 (herein Mallaiah), and Acuna, 20120323861 (herein Acuna), in view of Colgrove, 20130346720 (herein Colgrove).
Per claim 6, the combined teachings of Mallaiah and Acuna do not specifically disclose: wherein each deduplication entry in the deduplication cache includes a digest associated with a portion of data and a block mapping metadata associated with the portion of data.
However Colgrove discloses: wherein each deduplication entry in the deduplication cache includes a digest associated with a portion of data and a block mapping metadata associated with the portion of data (¶0069-70; In another embodiment, the mapping table may comprise information used to deduplicate data (deduplication table related information). The information stored in the deduplication table may include mappings between one or more calculated hash values for a given data component and a physical pointer to a physical location in one of the storage devices 176a-176m holding the given data component. In addition, a length of the given data component and status information for a corresponding entry may be stored in the deduplication table).
It would have been obvious to one having ordinary skill in the art to combine the teachings of Mallaiah, Acuna and Colgrove because Colgrove’s mapping structure reduces I/O traffic and data movement (¶0060; a direct map manipulation may greatly reduce I/O traffic and data movement within the storage devices 176a-176m. The combined time for both servicing the storage access request and performing the dependent read operations from SSDs may be less than servicing a storage access request from a spinning HDD).
Per claim 7, Colgrove discloses: wherein the deduplication entries are categorized into the set of deduplication groups to maximize the data deduplication in the storage system (¶0185 and ¶0186; The attributes update logic 1864 within the control logic 1860 may determine which entries in the tables 1830 and 1840 may be updated during an identified event, such as the events listed above corresponding to block 414 of method 400. The table entries movement logic 1866 may determine how entries within a deduplication table (e.g., fingerprint tables corresponding to the deduplication table) are stored and moved within the table. In addition, the logic 1866 may determine a manner for storage and movement of stored data in physical locations in storage devices 176a-176m. Similarly, the logic 1866 may determine how virtual-to-physical mappings are performed).
Per claim 8, Mallaiah discloses: wherein maximizing data deduplication comprises retaining the deduplication entries in the deduplication cache for a period of time based on the data deduplication probability associated with data represented by the deduplication entries, wherein a subset of the deduplication entries is retained in the deduplication cache longer as the data deduplication probability increases (fig. 5, ¶0063; the number of data objects being inline deduplicated in a specified time period may be monitored and the deduplication probability threshold may be adjusted to maintain a target rate of data object inline deduplication. In one variation, the rate may be measured based on a quantity of data being inline deduplicated rather than a number of data objects).
Per claim 9, Mallaiah discloses: wherein maximizing data deduplication comprises retaining the deduplication entries in the deduplication cache for a period of time based on the data deduplication probability associated with data represented by the deduplication entries, wherein a subset of the deduplication entries is retained in the deduplication cache longer as the data deduplication probability increases (fig. 5, ¶0024 and ¶0045; the probability of deduplication for a particular data object is determined to be sufficient to justify inline deduplication if it exceeds a specified deduplication probability threshold for the data storage system. The threshold for the system can be set and/or adjusted based on performance of the data storage system and system resource availability. In this way, the limited data storage computing resources available to perform deduplication inline are used for only those data objects that have one or more characteristics that suggest there will be a deduplication benefit associated with the data object. As discussed below, various characteristics may be used to determine the deduplication probability for a data object).
Claims 15-18 are the system claims corresponding to the method claims 6-9 and are rejected under the same reasons set forth in connection with the rejection of claims 6-9.
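For illustration only, a deduplication table of the kind quoted from Colgrove ¶0069-70 (a calculated hash value mapped to a physical pointer, together with a length and status information) can be sketched as below. This is a minimal sketch and not Colgrove's implementation: the class names, the choice of SHA-256 as the hash, the integer pointer, and the "valid" status string are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class DedupEntry:
    """Block mapping metadata kept for one digest (field names hypothetical)."""
    physical_pointer: int   # location of the stored data component
    length: int             # length of the data component in bytes
    status: str = "valid"   # status information for the entry


class DedupTable:
    """Maps a digest of a data component to its block mapping metadata."""

    def __init__(self) -> None:
        self._entries: Dict[str, DedupEntry] = {}

    def store(self, data: bytes, next_free_pointer: int) -> Tuple[int, bool]:
        """Return (pointer, was_duplicate) for the given data component."""
        digest = hashlib.sha256(data).hexdigest()  # assumption: SHA-256 digest
        entry = self._entries.get(digest)
        if entry is not None and entry.length == len(data):
            # Duplicate: reuse the existing physical location, write nothing new.
            return entry.physical_pointer, True
        self._entries[digest] = DedupEntry(next_free_pointer, len(data))
        return next_free_pointer, False


if __name__ == "__main__":
    table = DedupTable()
    print(table.store(b"block A", 0))   # first copy is recorded at pointer 0
    print(table.store(b"block A", 8))   # duplicate resolves to pointer 0
```

The second call returns the pointer recorded by the first call, so the duplicate data component is resolved through the table rather than written again.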
Response to Arguments
Applicant's arguments filed 3/11/22 have been fully considered but they are not persuasive.
The applicant argues: Acuna (paragraph [0008]) teaches, "Multiple access paths to identical data are determined for the most-used data access chains". In contrast to Acuna, amended Claim 1 recites, "using a machine learning system to dynamically adjust deduplication characteristics associated with the set of deduplication groups based on an the I/O workload associated with the storage system, wherein the adjusting comprises dynamically adjusting the categorizations; and training the machine learning system to categorize the deduplication entries by using data resulting from the processing". Acuna teaches, "recipes/data access chains" and "Multiple access paths to identical data are determined for the most-used data access chains" where "the CIM Agent learning the recipes using machine learning". Acuna fails to teach or suggest, "training the machine learning system to categorize the deduplication entries by using data resulting from the processing".
The examiner respectfully disagrees and asserts that the combination of Mallaiah and Acuna discloses “training the machine learning system to categorize the deduplication entries by using data resulting from the processing.” Mallaiah discloses a deduplication system wherein the entries are categorized based on a probability threshold. Acuna discloses categorizing client queries when caching and deduplicating using machine learning. Examiner notes that while the claim recites training, the claim does not set forth the metes and bounds of the training. The mere presence of analyzing the entries and determining the probabilities of deduplication benefit, as seen in Mallaiah, or categorizing them based on certain criteria, as seen in Acuna, constitutes training the machine learning system. Based on the claim language, the training is qualified by categorizing the entries. Therefore, Acuna’s machine learning categorization of entries combined with Mallaiah’s deduplication categories discloses “training the machine learning system to categorize the deduplication entries by using data resulting from the processing.” If it is the applicant’s intent to claim a specific training process, the applicant is encouraged to do so in light of the specification.
The applicant argues: The step of "using a machine learning system to dynamically adjust deduplication characteristics associated with the set of deduplication groups based on an the I/O workload associated with the storage system, wherein the adjusting comprises dynamically adjusting the categorizations; and training the machine learning system to categorize the deduplication entries by using data resulting from the processing" does not recite any of the judicial exceptions enumerated in the 2019 PEG. For instance, the claim does not recite any mathematical relationships, formulas, or calculations. Further, the claim does not recite a mental process because the steps are not practically performed in the human mind. Finally, the claim does not recite any method of organizing human activity such as a fundamental economic concept or managing interactions between people.
The examiner respectfully disagrees and asserts that nothing in the claim precludes the steps of processing, categorizing, adjusting and training from being performed in the human mind; the steps therefore fall within the mental processes grouping of abstract ideas, that is, the claim is directed to a judicial exception. The mere addition of training a machine learning system is not sufficient to integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea of sorting and categorizing.

Remark
Examiner respectfully requests, in response to this Office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist Examiner in prosecuting the application.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BABOUCARR FAAL whose telephone number is (571)270-5073. The examiner can normally be reached M-F 8:30-5:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tom VO, can be reached at (571) 272-3642. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/BABOUCARR FAAL/Primary Examiner, Art Unit 2138