DETAILED ACTION
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	This communication is in response to the Applicant’s submission filed 21 March 2022 [hereinafter Response], where:
Claims 1, 3, 8, 10, 15, and 17 have been amended.
Claims 2, 9, and 16 have been cancelled.
Claims 1, 3-8, 10-15, and 17-20 are pending.
Claims 1, 3-8, 10-15, and 17-20 are rejected.
Information Disclosure Statement
3.	Information disclosure statements were submitted on 15 February 2022 and 21 March 2022. The submissions comply with the provisions of 37 CFR 1.97. Accordingly, the Examiner considered the information disclosure statements.
Claim Rejections - 35 U.S.C. § 103
4.	The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
5.	The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. § 103 are summarized as follows:
1. 	Determining the scope and contents of the prior art.
2. 	Ascertaining the differences between the prior art and the claims at issue.
3. 	Resolving the level of ordinary skill in the pertinent art.
4. 	Considering objective evidence present in the application indicating obviousness or nonobviousness.
6.	This application currently names joint inventors. In considering patentability of the claims the Examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the Examiner to consider the applicability of 35 U.S.C. § 102(b)(2)(C) for any potential 35 U.S.C. § 102(a)(2) prior art against the later invention.
7.	Claims 1, 4, 5, 7, 8, 11, 12, 14, 15, 18, and 19 are rejected under 35 U.S.C. § 103 as being unpatentable over US Patent 9542296 to Engers [hereinafter Engers] in view of Bermejo et al., “Incremental Wrapper-based Subset Selection with Replacement: an advantageous alternative to sequential forward selection,” JCCM (2009) [hereinafter Bermejo] and US Published Application 20180060192 to Eggert et al. [hereinafter Eggert].
Regarding claim 1, Engers teaches [a] failure prediction system for predicting when persistent storage failures of clients will occur (Engers, abstract, teaches that based on a failure prediction model, a predicted probability of a failure of the selected storage device is determined (that is, a failure prediction system)), comprising:
persistent storage for storing (Engers, 15:6-11, teaches [u]sers of the block data storage service may each create one or more persistent storage volumes that each have a specified amount of block data storage space, and may initiate use of such a persistent storage volume (that is, persistent storage for storing)):
baseline training data (Engers, 2:32-35, teaches dynamic statistical model can be established using data from hosts on the provider network. Such data may include not only SMART data reported from the disk devices, but also I/O data and kernel log reports from the hosts (that is, baseline training data)), and
refinement training data (Engers, 3:42-43, teaches [s]imulations may be executed to determine weights, and the weights can be updated as new data becomes available (that is, the updating of the weights is a refinement, in which the new data is refinement training data)); and
a predictor programmed to:
generate an initial prediction model using an initial machine learning algorithm and the baseline training data (Engers, 2:32-37, teaches dynamic statistical model (that is, a machine learning algorithm) can be established (that is, generate an initial prediction model) using data from hosts on the provider network. Such data may include not only SMART data reported from the disk devices, but also I/O data and kernel log reports from the hosts. Furthermore, the data may include historic data where disks have previously failed (that is, the baseline training data); for further example, Engers, 2:46-50, teaches that [w]ith the accessed data, in one embodiment machine-learning can be used to cluster the data by various axes (that is, data clustering is an initial machine learning algorithm));
generate a refined model using:
the refinement training data,
a second machine learning algorithm, and
the initial model (Engers, 4:33-38, teaches that [a]n expert system may take available information pertaining to actual failures of devices and use the information as input (that is, the refinement training data) to a rules-based system to generate updated event probabilities (that is, “to update” pertains to the initial model). The available information may be provided to, for example, a Bayesian process (that is, a second machine learning algorithm) to determine an updated probability for the event), . . . ;
* * *
generate a prediction using:
the refined model (Engers, 4:39-41, teaches that [w]ithin this operating environment (that is, the operating environment includes the refined model), failure prediction engine 100 may determine a predicted probability of a failure (that is, generate a prediction using)), and
live data from a client of the clients (Engers, 4:41-48, teaches that [f]ailure prediction engine 100 may gather data from other components of the operating environment, such as data store 150. Data store 150 may collect information from storage devices 130 and other resources 140, among others. The failure prediction engine 100 may also collect information stored in log files and other locations. The information may also be obtained by querying devices for data that is not currently being stored in a log file (that is, generate a prediction using: . . . live data from a client of the clients));
make a determination that the prediction implicates an action (Engers, 7:32-36, teaches calculating the predicted probability of failure based at least in part on the historical and current data associated with the failure of the selected storage devices and a failure prediction model (that is, make a determination that the prediction implicates an action)); and
initiate performance of the action based on the determination (Engers, claim 6, teaches identify[ing] for replacement (that is, initiate performance of the action) the one of the plurality of storage device in response to determining that the predicted probability of failure (that is, based on the determination) of the one of the plurality of storage devices meets a criterion (that is, initiate performance of the action based on the determination)).
Though Engers teaches the feature that machine-learning can be used to cluster the data (including attributes or features) by various axes, where “clusters” are inclusive of feature classification of data, Engers, however, does not explicitly teach -
* * *
. . . , wherein generating the refined model comprises:
generating the refinement training data based on a subset of a plurality of features of the baseline training data, wherein the subset of a plurality of features comprises a portion of the plurality of features included in the baseline training data . . . ;
* * *
But Bermejo teaches -
* * *
. . . , wherein generating the refined model comprises:
generating the refinement training data based on a subset of a plurality of features of the baseline training data (Bermejo, left column of p. 1, “I. Introduction,” first paragraph, teaches Feature (or variable, or attribute) Subset Selection (FSS) is the process of identifying the input variables which are relevant to a particular learning (or data mining) problem (that is, given the baseline training data, the “FSS” is generating the refinement training data based on a subset of a plurality of features of the baseline training data)), wherein the subset of a plurality of features comprises a portion of the plurality of features included in the baseline training data (Bermejo, left column of p. 2, “I. Introduction,” first full paragraph, teaches Incremental Wrapper-based FSS (IWSS) uses relevance criterion in order to decide when a new attribute must be included in the selected subset (that is, a portion of the plurality of features included in the baseline training data)) . . . ;
* * *
Engers and Bermejo are from the same or similar field of endeavor. Engers teaches storage device attributes for failure monitoring based on a failure prediction model. Bermejo teaches feature subset selection in classification-oriented datasets with a very large number of attributes. Thus, it would have been obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to modify the failure prediction via machine learning of Engers with the feature subset selection of Bermejo.
The motivation for doing so is to improve the performance of learned models by removing the most irrelevant and redundant features. (Bermejo, left column of p. 1, “I. Introduction,” first paragraph).
Though Engers and Bermejo teach the features of training a model and applying an FSS algorithm to generate a subset of baseline data, the combination of Engers and Bermejo, however, does not explicitly teach -
. . . , wherein generating the refined model comprises:
generating the refinement training data . . . , wherein the subset of a plurality of features comprises a portion of the plurality of features . . . that are associated with an initial prediction generated using the initial prediction model; and
refining the initial model, to obtain the refined model, using: 
the second machine learning algorithm, and the refinement training data;
* * *
But Eggert teaches:
. . . , wherein generating the refined model comprises:
generating the refinement training data . . . , wherein the subset of a plurality of features comprises a portion of the plurality of features . . . that are associated with an initial prediction generated using the initial prediction model (Eggert Fig. 9 teaches new, enhanced version of the failure prediction model can be generated and loaded into new and existing storage devices during field use:

[Figure: Eggert Fig. 9]

Eggert ¶ 0067 teaches field FPM performance data, FPM failure notifications, and the FA results are utilized by the generator unit to build an enhanced, second FPM (that is, the subset of a plurality of features comprises a portion of the plurality of features . . . that are associated with an initial prediction generated using the initial prediction model)); and 
refining the initial model, to obtain the refined model, using: 
the second machine learning algorithm, and the refinement training data (Eggert ¶ 0027 teaches modeling allows each individual device to report potential upcoming failures to a local user, in a manner similar to the way in which SMART systems operate. In addition, as noted above the real world failure information is passed back to a central location (e.g., the processing server), which uses the information from real world failures and predicted failures to improve the model (that is, “predicted failures” is an initial prediction). The improved model (that is, the second machine learning algorithm) is thereafter uploaded to the drives); and
* * *
Engers, Bermejo, and Eggert are from the same or similar field of endeavor. Engers teaches storage device attributes for failure monitoring based on a failure prediction model. Bermejo teaches feature subset selection in classification-oriented datasets with a very large number of attributes. Eggert teaches generating an updated, second failure prediction model responsive to the transferred data as well as from data from at least a second data storage device. Thus, it would have been obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to modify the combination of Engers and Bermejo providing failure prediction model generation having feature subset selection with the updated secondary model of Eggert.
The motivation for doing so is to significantly improve the effectiveness of the failure prediction model by using large data sets available from the field. (Eggert ¶ 0029).
Examiner notes that the Applicant’s preamble does not afford patentable weight to the Applicant’s claims because the claim preamble is not “necessary to give life, meaning, and vitality” to the claim. Moreover, because the Applicant’s preamble merely states the purpose or intended use of the invention rather than any distinct definition of any of the claimed invention’s limitations, the preamble is not considered a limitation and is of no significance to claim construction.
Regarding claims 8 and 15, Engers teaches [a] method for operating a persistent storage failure prediction system (Engers, 4:31-36, teaches an expert system that utilizes logical inferences based on the available information may be used. An expert system may take available information pertaining to actual failures of devices and use the information as input to a rules-based system to generate updated event probabilities), and [a] non-transitory computer readable medium comprising computer readable program code, which when executed by a computer processor enables the computer processor to perform a method for operating a persistent storage failure prediction system (Engers, 13:7-17), comprising:
generating an initial prediction model using an initial machine learning algorithm and baseline training data (Engers, 2:32-37, teaches dynamic statistical model (that is, a machine learning algorithm) can be established (that is, generate an initial prediction model) using data from hosts on the provider network. Such data may include not only SMART data reported from the disk devices, but also I/O data and kernel log reports from the hosts. Furthermore, the data may include historic data where disks have previously failed (that is, the baseline training data); for further example, Engers, 2:46-50, teaches that [w]ith the accessed data, in one embodiment machine-learning can be used to cluster the data by various axes (that is, data clustering is an initial machine learning algorithm));
generating a refined model using:
refinement training data,
a second machine learning algorithm, and
the initial model (Engers, 4:33-38, teaches that [a]n expert system may take available information pertaining to actual failures of devices and use the information as input (that is, the refinement training data) to a rules-based system to generate updated event probabilities (that is, “to update” pertains to the initial model). The available information may be provided to, for example, a Bayesian process (that is, a second machine learning algorithm) to determine an updated probability for the event), . . . ;
* * *
generating a prediction using:
the refined model (Engers, 4:39-41, teaches that [w]ithin this operating environment (that is, the operating environment includes the refined model), failure prediction engine 100 may determine a predicted probability of a failure (that is, generating a prediction using)), and
live data from a client of the clients (Engers, 4:41-48, teaches that [f]ailure prediction engine 100 may gather data from other components of the operating environment, such as data store 150. Data store 150 may collect information from storage devices 130 and other resources 140, among others. The failure prediction engine 100 may also collect information stored in log files and other locations. The information may also be obtained by querying devices for data that is not currently being stored in a log file (that is, generate a prediction using: . . . live data from a client of the clients));
making a determination that the prediction implicates an action (Engers, 7:32-36, teaches calculating the predicted probability of failure based at least in part on the historical and current data associated with the failure of the selected storage devices and a failure prediction model (that is, making a determination that the prediction implicates an action));
initiating performance of the action based on the determination (Engers, claim 6, teaches identify[ing] for replacement (that is, initiating performance of the action) the one of the plurality of storage device in response to determining that the predicted probability of failure (that is, based on the determination) of the one of the plurality of storage devices meets a criterion (that is, initiate performance of the action based on the determination)).
Though Engers teaches the feature that machine-learning can be used to cluster the data (including attributes or features) by various axes, where “clusters” are inclusive of feature classification of data, Engers, however, does not explicitly teach -
* * *
. . . , wherein generating the refined model comprises:
generating the refinement training data based on a subset of a plurality of features of the baseline training data, wherein the subset of a plurality of features comprises a portion of the plurality of features included in the baseline training data . . . ;
* * *
But Bermejo teaches -
* * *
. . . , wherein generating the refined model comprises:
generating the refinement training data based on a subset of a plurality of features of the baseline training data (Bermejo, left column of p. 1, “I. Introduction,” first paragraph, teaches Feature (or variable, or attribute) Subset Selection (FSS) is the process of identifying the input variables which are relevant to a particular learning (or data mining) problem (that is, given the baseline training data, the “FSS” is generating the refinement training data based on a subset of a plurality of features of the baseline training data)), wherein the subset of a plurality of features comprises a portion of the plurality of features included in the baseline training data (Bermejo, left column of p. 2, “I. Introduction,” first full paragraph, teaches Incremental Wrapper-based FSS (IWSS) uses relevance criterion in order to decide when a new attribute must be included in the selected subset (that is, a portion of the plurality of features included in the baseline training data)) . . . ;
* * *
Engers and Bermejo are from the same or similar field of endeavor. Engers teaches storage device attributes for failure monitoring based on a failure prediction model. Bermejo teaches feature subset selection in classification-oriented datasets with a very large number of attributes. Thus, it would have been obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to modify the failure prediction via machine learning of Engers with the feature subset selection of Bermejo.
The motivation for doing so is to improve the performance of learned models by removing the most irrelevant and redundant features. (Bermejo, left column of p. 1, “I. Introduction,” first paragraph).
Though Engers and Bermejo teach the features of training a model and applying an FSS algorithm to generate a subset of baseline data, the combination of Engers and Bermejo, however, does not explicitly teach -
. . . , wherein generating the refined model comprises:
generating the refinement training data . . . , wherein the subset of a plurality of features comprises a portion of the plurality of features . . . that are associated with an initial prediction generated using the initial prediction model; and
refining the initial model, to obtain the refined model, using: 
the second machine learning algorithm, and the refinement training data;
* * *
But Eggert teaches:
. . . , wherein generating the refined model comprises:
generating the refinement training data . . . , wherein the subset of a plurality of features comprises a portion of the plurality of features . . . that are associated with an initial prediction generated using the initial prediction model (Eggert Fig. 9 teaches new, enhanced version of the failure prediction model can be generated and loaded into new and existing storage devices during field use:

[Figure: Eggert Fig. 9]

Eggert ¶ 0067 teaches field FPM performance data, FPM failure notifications, and the FA results are utilized by the generator unit to build an enhanced, second FPM (that is, the subset of a plurality of features comprises a portion of the plurality of features . . . that are associated with an initial prediction generated using the initial prediction model)); and 
refining the initial model, to obtain the refined model, using: 
the second machine learning algorithm, and the refinement training data (Eggert ¶ 0027 teaches modeling allows each individual device to report potential upcoming failures to a local user, in a manner similar to the way in which SMART systems operate. In addition, as noted above the real world failure information is passed back to a central location (e.g., the processing server), which uses the information from real world failures and predicted failures to improve the model (that is, “predicted failures” is an initial prediction). The improved model (that is, the second machine learning algorithm) is thereafter uploaded to the drives); and
* * *
Engers, Bermejo, and Eggert are from the same or similar field of endeavor. Engers teaches storage device attributes for failure monitoring based on a failure prediction model. Bermejo teaches feature subset selection in classification-oriented datasets with a very large number of attributes. Eggert teaches generating an updated, second failure prediction model responsive to the transferred data as well as from data from at least a second data storage device. Thus, it would have been obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to modify the combination of Engers and Bermejo providing failure prediction model generation having feature subset selection with the updated secondary model of Eggert.
The motivation for doing so is to significantly improve the effectiveness of the failure prediction model by using large data sets available from the field. (Eggert ¶ 0029).
Examiner notes that the Applicant’s preamble does not afford patentable weight to the Applicant’s claims because the claim preamble is not “necessary to give life, meaning, and vitality” to the claim. Moreover, because the Applicant’s preamble merely states the purpose or intended use of the invention rather than any distinct definition of any of the claimed invention’s limitations, the preamble is not considered a limitation and is of no significance to claim construction.
Examiner notes that the terms “computer processor” and “computer readable medium” recited in Applicant’s claims are interpreted to be well-known hardware structures.
Regarding claims 4, 11, and 18, the combination of Engers, Bermejo, and Eggert teaches all of the limitations of claims 1, 8, and 15, respectively, as described in detail above.
wherein the refinement training data comprises training data obtained from at least two of the clients (Engers, 4:33-38, teaches that [a]n expert system may take available information pertaining to actual failures of devices (that is, “devices” are a plurality, in which the data is obtained from at least two of the clients) and use the information as input (that is, the refinement training data or simply training data) to a rules-based system to generate updated event probabilities (that is, “to update” pertains to the initial model)).
Regarding claims 5, 12, and 19, the combination of Engers, Bermejo, and Eggert teaches all of the limitations of claims 4, 11, and 18, respectively, as described in detail above. 
wherein the live data comprises second training data obtained from only the client of the clients (Engers, 4:41-48, teaches that [f]ailure prediction engine 100 may gather data from other components of the operating environment, such as data store 150 (that is, second training data obtained from only the client of the clients). Data store 150 may collect information from storage devices 130 and other resources 140, among others).
Regarding claims 7 and 14, the combination of Engers, Bermejo, and Eggert teaches all of the limitations of claims 4 and 11, respectively, as described in detail above. 
Bermejo teaches -
wherein the baseline training data comprises more features than the refinement data (Bermejo, Figure 1, teaches an IWSS canonical algorithm:

[Figure: Bermejo Figure 1, IWSS canonical algorithm]

Bermejo, left column at p. 2, “II. Incremental Wrapper-based FSS,” starting at last partial paragraph, teaches data resulting from evaluating that subset is stored in BestData (that is, being a “subset,” application of the baseline training data is wherein the baseline training data comprises more features than the refinement data)).
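For illustration only, the incremental wrapper-based selection that Bermejo describes (a new attribute is added to the selected subset only when a relevance criterion is met) can be sketched as below. This is a minimal sketch under stated assumptions and is not Bermejo's Figure 1: the function names, the leave-one-out 1-nearest-neighbour evaluator, the externally supplied feature ranking, and the strict-improvement criterion are all hypothetical choices made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Rows = Sequence[Sequence[float]]


def loo_1nn_accuracy(X: Rows, y: Sequence[int], features: List[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier.

    Assumption: this stands in for the wrapper's evaluation step; any
    classifier and validation scheme could be substituted.
    """
    if not features:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_j, best_d = -1, float("inf")
        for j in range(len(X)):
            if i == j:
                continue
            # Squared Euclidean distance restricted to the candidate features.
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_d, best_j = d, j
        if y[best_j] == y[i]:
            correct += 1
    return correct / len(X)


def incremental_wrapper_selection(
    X: Rows,
    y: Sequence[int],
    ranked_features: List[int],
    evaluate: Callable[[Rows, Sequence[int], List[int]], float] = loo_1nn_accuracy,
) -> Tuple[List[int], float]:
    """Walk a pre-ranked feature list, keeping a feature only if it helps.

    Assumption: the relevance criterion is "strictly improves the score".
    """
    selected: List[int] = []
    best_score = 0.0
    for f in ranked_features:
        candidate = selected + [f]
        score = evaluate(X, y, candidate)
        if score > best_score:
            # Keep the enlarged subset and remember its score.
            selected, best_score = candidate, score
    return selected, best_score
```

In this sketch the returned subset never has more features than the ranked list it was drawn from, which is the relationship the Examiner relies on for "more features than the refinement data."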
8.	Claims 3, 10, and 17 are rejected under 35 U.S.C. § 103 as being unpatentable over US Patent 9542296 to Engers [hereinafter Engers] in view of Bermejo et al., “Incremental Wrapper-based Subset Selection with Replacement: an advantageous alternative to sequential forward selection,” JCCM (2009) [hereinafter Bermejo] and US Published Application 20180060192 to Eggert et al. [hereinafter Eggert] and further in view of US Published Application 20190008461 to Gupta et al. [hereinafter Gupta].
Regarding claims 3, 10, and 17, the combination of Engers, Bermejo, and Eggert teaches all of the limitations of claims 1, 8, and 15, respectively, as described in detail above. 
Eggert teaches - 
wherein refining the initial model comprises:
introducing the refinement training data into the initial model (Eggert ¶ 0028 teaches further information flows to the processing circuit used to generate the enhanced [failure prediction model] from failed devices that have been physically returned, such as in the case of warranty repairs. . . . In this case, the model (that is, the initial model) is updated in two ways-from data reporting from the field showing what is being detected and predicted (both good and bad), and from warranty analysis of returned devices to see what false positives and confirmed failures are being identified (that is, with the refinement training data features of Bermejo and Eggert, Eggert teaches introducing the refinement training data into the initial model)); and
* * *
Though Engers, Bermejo, and Eggert teach the features that machine-learning can be used to cluster the data (including attributes or features) by various axes, where “clusters” are inclusive of feature classification of data and feature selection subsets to generate a second failure prediction model, the combination of Engers, Bermejo, and Eggert, however, does not explicitly teach -
generating the refined model by performing one process from a group of processes consisting of:
creating a new split above a first existing split in the initial model,
extending a second existing split in the initial model, and
splitting an existing leaf of the initial model into at least two child nodes in the refined model.
But Gupta teaches -
generating the refined model by performing one process from a group of processes (Gupta ¶ 0173 teaches decision tree 600 (i.e., Decision Tree-based algorithm) is shown in Fig. 6, wherein each end-node predicts a specific error (or source) (that is, generating the refined model by performing one process from a group of processes)) consisting of:
creating a new split above a first existing split in the initial model (Gupta ¶ 0173 teaches [t]he Decision Tree is generated using an algorithm which recursively splits a node (parent) (that is, creating a new split above a first existing split in the initial model)),
extending a second existing split in the initial model (Gupta ¶ 0173 teaches [t]he (child) nodes are continued (that is, extending) to be split until a termination criterion is met (that is, extending a second existing split in the initial model)), and
splitting an existing leaf of the initial model into at least two child nodes in the refined model (Gupta ¶ 0173 teaches [t]he Decision Tree is generated using an algorithm which recursively splits a node (parent), containing training data, into two child nodes (Left and Right) based on a predictor variable, its threshold value, and an objective error criterion (that is, splitting an existing leaf of the initial model into at least two child nodes in the refined model)).
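For illustration only, the recursive node splitting quoted from Gupta ¶ 0173 (a parent node holding training data is split into Left and Right children on a predictor variable and threshold, and children continue to be split until a termination criterion is met) can be sketched as below. This is a minimal sketch under stated assumptions, not Gupta's algorithm: Gini impurity as the objective error criterion, a depth limit as the termination criterion, and all function names are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List, Optional, Sequence, Tuple


def gini(labels: Sequence[Any]) -> float:
    """Gini impurity of a label collection (assumed error criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(
    rows: Sequence[Sequence[float]], labels: Sequence[Any]
) -> Optional[Tuple[float, int, float]]:
    """Return (weighted impurity, feature index, threshold), or None."""
    best: Optional[Tuple[float, int, float]] = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must produce two non-empty children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(
    rows: List[Sequence[float]], labels: List[Any],
    depth: int = 0, max_depth: int = 3,
) -> Dict[str, Any]:
    """Recursively split a parent node into Left and Right child nodes."""
    majority = Counter(labels).most_common(1)[0][0]
    # Termination criterion (assumed): pure node or depth limit reached.
    if depth >= max_depth or gini(labels) == 0.0:
        return {"leaf": majority}
    split = best_split(rows, labels)
    if split is None:
        return {"leaf": majority}
    _, f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([rows[i] for i in left],
                           [labels[i] for i in left], depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right],
                            [labels[i] for i in right], depth + 1, max_depth),
    }


def predict(tree: Dict[str, Any], row: Sequence[float]) -> Any:
    """Follow splits from the root until an end-node (leaf) is reached."""
    while "leaf" not in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]
```

Each leaf of the resulting dictionary plays the role of an end-node that predicts a specific outcome; the sketch builds a tree from scratch and does not by itself show refinement of a previously trained model.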
Engers, Bermejo, Eggert, and Gupta are from the same or similar field of endeavor. Engers teaches storage device attributes for failure monitoring based on a failure prediction model. Bermejo teaches feature subset selection in classification-oriented datasets with a very large number of attributes. Eggert teaches generating an updated, second failure prediction model responsive to the transferred data as well as from data from at least a second data storage device. Gupta teaches the use of decision trees for model generation to predict a specific error. Thus, it would have been obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to modify the combination of Engers, Bermejo, and Eggert providing failure prediction model generation having feature subset selection to generate an updated secondary model with the decision tree predictive model generation incorporating node splitting of Gupta.
The motivation for doing so is to continuously or periodically monitor devices to confirm the accuracy and/or detect any degradation of performance of the device based on a predictor importance. (Gupta ¶ 0094).
9.	Claims 6, 13, and 20 are rejected under 35 U.S.C. § 103 as being unpatentable over US Patent 9542296 to Engers [hereinafter Engers] in view of Bermejo et al., “Incremental Wrapper-based Subset Selection with Replacement: an advantageous alternative to sequential forward selection,” JCCM (2009) [hereinafter Bermejo] and US Published Application 20180060192 to Eggert et al. [hereinafter Eggert] and further in view of US Published Application 20080250265 to Chang et al. [hereinafter Chang].
Regarding claims 6, 13, and 20, the combination of Engers, Bermejo, and Eggert teaches all of the limitations of claims 4, 11, and 18, respectively, as described in detail above. 
Though Engers, Bermejo, and Eggert teach the features that machine-learning can be used to cluster the data (including attributes or features) by various axes, where “clusters” are inclusive of feature classification of data and feature selection subsets to generate a second failure prediction model, the combination of Engers, Bermejo, and Eggert, however, does not explicitly teach -
wherein the training data comprises at least one feature selected from a group of features consisting of:
workload features (Chang ¶ 0070 teaches [t]he distinction between external and internal features is of particular interest when workload characteristics evolve (that is, workload features));
self-monitoring, analysis and reporting technology features (Chang ¶ 0072 teaches the failure predicate may be obtained from an administrator or, potentially, by another system component that uses outlier detection (or, in general, unsupervised learning techniques) to agnostically label suspect behavior (that is, self-monitoring, analysis and reporting technology features));
disk health status features (Chang ¶ 0067 teaches [f]or example, say we want to issue an early alarm for processing hot-spots, which might be characterized by, e.g., “processing time>5 ms,” and which may be due to memory or buffer exhaustion (that is, disk health status features)); and
input-output stack statistical features (Chang ¶ 0070 teaches the term external features is employed. An example would be the output rate of an upstream operator (i.e., the data arrival rate before any buffering and queueing in the downstream operator) (that is, input-output stack statistical features)).
Engers, Bermejo, Eggert, and Chang are from the same or similar field of endeavor. Engers teaches storage device attributes for failure monitoring based on a failure prediction model. Bermejo teaches feature subset selection in classification oriented datasets with a very large number of attributes. Eggert teaches generating an updated, second failure prediction model responsive to the transferred data as well as from data from at least a second data storage device. Chang teaches that measurement resources are allocated to the metrics that have been selected as key features by the prediction model. Accordingly, it would have been obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to modify the combination of Engers, Bermejo, and Eggert, which provides failure prediction model generation having feature subset selection to generate an updated secondary model, with the feature identification of Chang.
The motivation for doing so is to increase the efficiency of failure prediction with a low overhead by employing an adaptive measurement sampling scheme. (Chang ¶ 0063).
Response to Arguments
10.	Examiner has fully considered Applicant’s arguments and responds to them below.
11.	Regarding the rejection under Section 102, Applicant argues “amended independent claims 1, 8, and 15 are patentable over Engers. By virtue of their dependence, the remaining claims are patentable for at least the same reasons. Accordingly, withdrawal of this rejection is respectfully requested.” (Response at pp. 8-9).
Examiner agrees. In view of Applicant’s amendments to the claims, Examiner cites to the features taught by the prior art references of Bermejo and Eggert, as set out hereinabove.
The rejection clearly sets forth which claim limitations are taught by each of the prior art references, and the reason why it would be obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to combine their teachings.
12.	Regarding the rejection under Section 103, Applicant argues “Chang is completely silent with regards to ‘wherein the subset of a plurality of features comprises a portion of the plurality of features included in the baseline training data that are associated with an initial prediction generated using the initial prediction model,’” (Response at pp. 10-11).
Examiner agrees. With regard to the amended claim language, Examiner relies on the features taught by Bermejo and Eggert. The BRI of the instant claim language covers the teachings of these prior art references, as set out in detail in the rejections hereinabove. 
The rejection clearly sets forth which claim limitations are taught by each of the prior art references, and the reason why it would be obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to combine their teachings.
13.	Applicant argues “that Gupta fails to supply that which Engers and Chang lack. That is, Gupta is also silent with regard to above-referenced limitation (ii) of the amended independent claims.” (Response at pp. 11-12).
Examiner submits that the argument is now moot in view of the reliance on the teachings of Bermejo and Eggert with respect to the instant claims, which necessitated the new grounds of rejection presented in this Office action. These references, and their incorporation into the rejections, are set out in detail hereinabove.
The rejection clearly sets forth which claim limitations are taught by each of the prior art references, and the reason why it would be obvious to a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention to combine their teachings.
Conclusion
14.	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
15.	The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure:
(Bermejo et al., “Speeding up incremental wrapper feature subset selection with Naïve Bayes classifier,” Knowledge-Based Systems (2014)) teaches combination of the [Naïve Bayes] classifier (which is known to be largely beneficial for FSS) with incremental wrapper [Feature Subset Selection] algorithms. The merit of this approach is analyzed both theoretically and experimentally, and the results show an impressive speed-up for the embedded FSS process.
(Chaves et al., “BaNHFaP: A Bayesian Network based Failure Prediction Approach for Hard Disk Drives,” IEEE (2016)) teaches Recursive Feature Elimination (RFE), which is a technique that tries to eliminate irrelevant or redundant features, leading to a smaller but more representative data.
16.	Any inquiry concerning this communication or earlier communications from the Examiner should be directed to KEVIN L. SMITH whose telephone number is (571) 272-5964. Normally, the Examiner is available on Monday-Thursday 0730-1730. 
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the Examiner by telephone are unsuccessful, the Examiner’s supervisor, KAKALI CHAKI, can be reached on 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/K.L.S./
Examiner, Art Unit 2122
/BRIAN M SMITH/
Primary Examiner, Art Unit 2122
1 Examiner notes with appreciation the Applicant’s clarification regarding claims 3, 10, and 17, as pointed out in footnote 1 at page 10 of the Response.