Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Drawings
The drawings are objected to because they fail to provide labels or a key for the axes and plots of Fig. 4 as described in the specification.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.



Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised 
Claims 1, 7-8, 10, 16-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Poovalapil US 2019/0385045 in view of Brownlee’s “How to Implement the Backpropagation Algorithm From Scratch In Python”.

1. A method for storage management, comprising:
obtaining a historical usage of storage capacity for a storage device, and a historical feature characterizing the historical usage of storage capacity;
generating a predicted usage of storage capacity for the storage device based on the historical feature and a predictor for predicting a usage of storage capacity; and
updating the predictor by comparing the historical usage of storage capacity with the predicted usage of storage capacity.
Poovalapil teaches obtaining a historical usage of storage capacity for a storage device, and a historical feature characterizing the historical usage of storage capacity;
"Examples of the types of data samples collected may include...a remaining rated write endurance (RRWE) indication...a remaining drive space indication, and a remaining storage capacity indication. The RRWE may be an indication of the program erase (PE) cycles performed on an SSD…S.M.A.R.T. attribute identifier 0x05 and name 'Percentage Used Estimate' that indicates the estimated percentage of the SSD's endurance that has been consumed…The remaining drive space indication is an indication of the number of free blocks as a percentage of the total blocks of the storage device 118. Once the remaining drive space has reached 0%, the drive may be characterized as having reached an endpoint in its lifetime. The remaining storage capacity indication is an indication of an aggregation of all storage devices 118 (e.g., SSDs) connected…The collected data samples reflect hardware status of the storage device 118 and historical usage of the storage device 118 by the information handling system 104 during its operation and may therefore be used to make predictions about future usage and endpoints of the storage device 118" [0021]

The RRWE is considered a historical feature, and the Percentage Used Estimate is considered the historical usage of storage capacity.
“deltas, or changes, or distances, are computed between data samples that are adjacent in time. For example, a data sample value obtained one day is subtracted from the data sample value obtained the previous day to determine a delta/change/distance in the value of the particular data sample type between the two days. The delta/change/distance may be used as input to the clustering and function approximation modules, which are described below. The rates of change of the various data samples may be used by the information handling system 104 to adaptively predict storage endpoints…” [0022]
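
For illustration only, the delta computation described in the quoted passage can be sketched as follows. The function name and the sample readings are hypothetical and do not appear in Poovalapil; the sign convention (previous day minus current day) follows the wording of [0022].

```python
from typing import List


def adjacent_deltas(samples: List[float]) -> List[float]:
    """Return the change between each pair of time-adjacent samples.

    Each later value is subtracted from the value obtained just before it,
    so a shrinking reading yields a positive delta.
    """
    return [earlier - later for earlier, later in zip(samples, samples[1:])]


# Hypothetical daily "remaining drive space" readings, in percent.
remaining_space = [80.0, 78.5, 77.0, 74.0]
deltas = adjacent_deltas(remaining_space)
print(deltas)
```

A series of N samples yields N-1 deltas, which is the form that would be passed on to the clustering and function approximation stages.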

	“function approximator module receives the data samples and their corresponding cluster labels as input and forwards them to the neural network 300…for function approximation…neural network 300 returns a polynomial function that fits the pattern present in the data samples…The output of the neural network 300 is the coefficients of the combination of the Gaussian functions.” [0024]

generating a predicted usage of storage capacity for the storage device based on the historical feature and a predictor for predicting a usage of storage capacity; and
“The delta/change/distance may be used as input to the clustering and function approximation modules, which are described below. The rates of change of the various data samples may be used by the information handling system 104 to adaptively predict storage endpoints…” [0022]

“used to make predictions about future usage and endpoints of the storage device 118" [0021]
	
“function approximator module receives the data samples and their corresponding cluster labels as input and forwards them to the neural network 300…for function approximation…neural network 300 returns a polynomial function that fits the pattern present in the data samples…The output of the neural network 300 is the coefficients of the combination of the Gaussian functions.” [0024]

“104 outputs a graph of available spares or remaining drive space of the storage device 118 based on the predictive function generated at block 704…displaying available spare or remaining drive space and extending out to the available spare or remaining drive space endpoint” [0039]

“predict when the remaining storage capacity…will reach its end of life, e.g., the remaining storage capacity will reach 0% or some other threshold value” [0042]

“outputs a graph of remaining storage capacity of the information handling system 104 based on the predictive function” [0043]

“enable a user…to maintain storage hardware…to avoid data loss caused by unavailability of storage” [0044]

The prediction takes the metrics as inputs and produces a prediction of future usage, e.g. Percentage Used Estimate, and hence constitutes a predictor.
updating the predictor by comparing the historical usage of storage capacity with the predicted usage of storage capacity.
“neural network 300 executes…receives the data samples, their corresponding cluster labels, and the cluster centers as input…approximates a function that may be used to predict an endpoint for a storage device” [0025]

	“104 outputs a predictive function encompassing the data samples” [0026]

	“input layer comprises a plurality of input nodes…which provide the data samples…output layer comprises a plurality of output nodes…output nodes Y 306 are endpoint percentages, e.g., percent of remaining available writes, spares, or drive space…Each connection between two nodes has an associated weight value…in one embodiment, the weights are initialized with random values and are adjusted, or trained, based on the storage device usage data samples using backpropagation” [0027]

	The weights of the predictor are updated using backpropagation, which applies an adjustment to each weight corresponding to the difference between the data produced by the model (predicted usage) and the known data (historical usage).

Reference is made to Brownlee to discuss backpropagation in detail:
“The principle of the backpropagation approach is to model a given function by modifying internal weightings of input signals to produce an expected output signal. The system is trained using a supervised learning method, where the error between the system’s output and a known expected output is presented to the system and used to modify its internal state.
Technically, the backpropagation algorithm is a method for training the weights in a multilayer feed-forward neural network. As such, it requires a network structure to be defined of one or more layers where one layer is fully connected to the next layer. A standard network structure is one input layer, one hidden layer, and one output layer.” [P2-3; Backpropagation Algorithm]
“Each neuron has a set of weights that need to be maintained. One weight for each input connection and an additional weight for the bias. We will need to store additional properties for a neuron during training, therefore we will use a dictionary to represent each neuron and store properties by names such as ‘weights‘ for the weights.
A network is organized into layers. The input layer is really just a row from our training dataset. The first real layer is the hidden layer. This is followed by the output layer that has one neuron for each class value.” [P4, Initialize Network]
“Error is calculated between the expected outputs and the outputs forward propagated from the network. These errors are then propagated backward through the network from the output layer to the hidden layer, assigning blame for the error and updating weights as they go.
The math for backpropagating error is rooted in calculus, but we will remain high level in this section and focus on what is calculated and how rather than why the calculations take this particular form.” [P8, Back Propagate Error].
	Finally, error is computed by finding the difference (a comparison) between the expected data and the data produced by the predictor, as in [P8-9, Error Backpropagation].
	Hence, although Poovalapil does not expressly disclose the details of backpropagation, the skilled artisan would have understood “backpropagation” [0027] as recited by Poovalapil to refer to the technique for adjusting the weights of a neural network using the input data set and the computed data set, as further detailed by Brownlee.
	Hence, it would have been obvious to the skilled artisan before the effective filing date of the claimed invention to perform backpropagation, as disclosed by Poovalapil and detailed by Brownlee, using the historical usage data, e.g. RRWE, and the product of the neural network in order to adjust the weights of the neural network as instructed.
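
For illustration only, the comparison underlying the weight update can be sketched as below. This is a simplified assumption, not the network of Poovalapil or the code of Brownlee: it uses a single linear neuron, for which backpropagation reduces to the delta rule. All names and values are hypothetical.

```python
from typing import List, Tuple


def predict(weights: List[float], bias: float, features: List[float]) -> float:
    """Predicted usage as a weighted sum of the historical features."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def update(
    weights: List[float],
    bias: float,
    features: List[float],
    historical_usage: float,
    learning_rate: float = 0.1,
) -> Tuple[List[float], float]:
    """Adjust the weights using the error between predicted and historical usage."""
    # The comparison: difference between the model output and the known value.
    error = predict(weights, bias, features) - historical_usage
    # Gradient step on the squared error 0.5 * error**2.
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias


# Hypothetical normalized feature and historical usage value.
weights, bias = [0.0], 0.0
for _ in range(50):
    weights, bias = update(weights, bias, [1.0], 0.5)
```

Each call moves the prediction toward the historical value in proportion to the error. A multilayer network would propagate the same error term backward through its hidden layers.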

7. The method of claim 1, further comprising:
obtaining a further historical usage of storage capacity for the storage device, and a further historical feature characterizing the further historical usage of storage capacity;
generating a further predicted usage of storage capacity for the storage device based on the further historical feature and the updated predictor; and
determining performance of the predictor by comparing the further historical usage of storage capacity and the further predicted usage of storage capacity.
	The combination teaches claim 1. Claim 7 further recites performing a second iteration of the steps of claim 1. Reapplying the same techniques to new data is obvious, as it is a duplication of parts [MPEP 2144.04].
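
For illustration only, the performance determination recited in claim 7 can be pictured as an error measure computed over the further data. The use of mean absolute error is an assumption made for this sketch; the claim as recited does not name a metric, and the function name and values are hypothetical.

```python
from typing import List


def mean_absolute_error(actual: List[float], predicted: List[float]) -> float:
    """Average absolute difference between actual and predicted usage."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical further historical usage and the updated predictor's output.
further_historical = [10.0, 20.0]
further_predicted = [12.0, 17.0]
score = mean_absolute_error(further_historical, further_predicted)
```

A lower score would indicate that the updated predictor tracks the further historical usage more closely.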

	Claim 8 is rejected on similar grounds as claim 7, as it is noted that steps of a method claim are not required to be performed in a particular order, unless specifically recited. Accordingly, claim 8 merely recites a cycle of the same process, at least as presently recited. See MPEP 2144.04.
The Examiner suggests amending the limitations “obtaining…” and “predicting…” to specifically precede the first use of the predictor, as set forth as starting conditions in [SPEC, 0026].

	Claim 10 is rejected on similar grounds as claim 1, as it is directed to the device for storage management performing the method of claim 1. Claim 10 further recites a processing unit and memory comprising code to perform the method of claim 1 (processing device having instructions to perform the tasks and functions implemented by circuitry or computer program embodied on a nontransitory computer readable medium [Poovalapil, 0046]).
	Claim 16 is rejected on similar grounds as claim 7, as it is the device performing the method of claim 7.
	Claim 17 is rejected on similar grounds as claim 8, as it is the device performing the method of claim 8. Claim 17 further recites processing unit and memory comprising code to perform the method of claims 1-7 (processing device having instructions to perform the tasks and functions implemented by 
	Claim 19 is rejected on similar grounds as claim 10, as it is the non-transitory computer readable medium embodying the method of claim 1 in the apparatus of claim 10.
	Claim 20 is rejected on similar grounds as claim 17, as it is the non-transitory computer readable medium embodying the method of claim 8 in the apparatus of claim 17.


Claims 2-4 and 11-13 are rejected under 35 U.S.C. 103 as being unpatentable over the combination as applied to claim 1 above, and further in view of Kholidy’s “Efficient Hybrid Prediction Approach for Predicting Cloud Consumer Resource Needs”.
2. The method of claim 1, wherein obtaining the historical feature comprises:
obtaining, for the storage device, raw data associated with the historical usage of storage capacity; and
determining, as the historical feature, a feature in the raw data, relevance of the feature to the historical usage of storage capacity exceeding a predetermined threshold.
	The combination teaches claim 1, wherein obtaining the historical feature comprises:
obtaining, for the storage device, raw data associated with the historical usage of storage capacity; and
Collecting RRWE and other metrics [Poovalapil, 0021].

Where the combination is silent to feature selection by comparison to a threshold, Kholidy’s “Efficient Hybrid Prediction Approach for Predicting Cloud Consumer Resource Needs” discloses:
determining, as the historical feature, a feature in the raw data, relevance of the feature to the historical usage of storage capacity exceeding a predetermined threshold.
	“Feature selection process selects a subset of important features from the dataset by removing the irrelevant features for simple and accurate data. Usually, prediction models are based on a continuous observation of a number of specific features. Selecting the right features for prediction modelling helps in reducing the potential erroneous predictions which can occur even if the prediction algorithm is optimal. Furthermore, this process improves the computation speed and reduces the memory requirement and hence increasing the efficiency of the prediction approach. For the target classes (CPU, memory, network utilization, response time, and throughput), we use specific feature metrics that come with Amazon Cloud-Watch tool [11] which are collected every 60 seconds by some customized Java batch scripts such as CPUUtilization, CPUCreditUsage, CPUCreditBalance, DiskReadOps, DiskWriteOps, DiskReadBytes, DiskWriteBytes, NetworkIn, NetworkOut, MemoryUtilized, MemoryAvailable, SwapUtilized. The DDHPA employs the PSO approach to select the best linear and nonlinear features and attributes from the input dataset to be used by ARIMA and MSVR models respectively. The main target is to determine the relevance of each input attribute in the dataset to the target prediction class e.g., CPUUtilization, CPUCreditBalance, Memory Utilized are more relevance to the CPU Utilization class than DiskReadOps and SwapUtilized. This phase is implemented by combining the PSO approach to the Kernel Adatron and Support Vector Machine (KA-SVM) algorithm [4].
The KA-SVM is used to evaluate the fitness values of the PSO by comparing the test data characteristics. We use KA-SVM algorithm to emulate SVM training procedures and to evaluate the fitness values of the PSO. Using KA with the SVM helps to find a maximal margin hyper-plane in a higher feature space. This is equivalent to the nonlinear boundaries of the decision in the input space. Furthermore, KA increases the classification accuracy by reducing the amount of training and testing data. The KA procedure is given at lines 5 to 22 in Algorithm 2. After applying the feature selection approach, the input features are normalized or scaled to values between 0 and 10. The normalization process should be applied to avoid attributes with large numeric ranges dominating those in smaller ones.” [P3-4, 4.2.1: Feature Selection].
A determination that metrics are relevant or irrelevant indicates a predetermined minimum threshold for relevance. See “If fitness of Xi is greater than that of pbesti” and “If fitness of Xk is greater than that of gbest” [Algorithm 2].
	Hence, Kholidy discloses the use of feature selection technique to assess the relevance of one or more input metrics collected for targeting an output value and to cull irrelevant input metrics for the purpose of improving the accuracy and efficiency of the predictor.
	It would have been obvious to the skilled artisan before the effective filing date of the claimed invention to incorporate Feature Selection Techniques as disclosed by Kholidy to the selection of input metrics of Poovalapil in order to improve the accuracy and efficiency of the predictor.
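
For illustration only, selecting features whose relevance exceeds a predetermined threshold can be sketched as follows. Kholidy uses a PSO approach combined with KA-SVM; this sketch substitutes absolute Pearson correlation as the relevance measure purely for brevity. The function names, metric names, threshold, and values are hypothetical.

```python
import math
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(
    raw: Dict[str, List[float]], usage: List[float], threshold: float = 0.8
) -> List[str]:
    """Keep the raw metrics whose relevance to usage exceeds the threshold."""
    return [name for name, values in raw.items()
            if abs(pearson(values, usage)) > threshold]


# Hypothetical raw metrics and historical usage of storage capacity.
raw_data = {"writes": [1.0, 2.0, 3.0, 4.0], "noise": [1.0, -1.0, 1.0, -1.0]}
historical_usage = [10.0, 20.0, 30.0, 40.0]
selected = select_features(raw_data, historical_usage)
```

Here the metric that tracks usage is retained and the uncorrelated metric is culled, which is the effect attributed to feature selection above.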

3. The method of claim 1, wherein generating the predicted usage of storage capacity comprises: 
normalizing the historical feature; and
generating the predicted usage of storage capacity by inputting the normalized historical feature to the predictor.
	The combination teaches claim 1, wherein generating the predicted usage of storage capacity comprises: normalizing the historical feature (“input features are normalized or scaled to values between 0 and 10” [Kholidy, P4, C1]); and generating the predicted usage of storage capacity by inputting the normalized historical feature to the predictor [Poovalapil, 0027].
	Hence, it would have been obvious to the skilled artisan before the effective filing date of the claimed invention to further normalize the one or more input data metrics disclosed by Poovalapil in order to “avoid attributes with large numeric ranges dominating those in smaller ones.” [Kholidy, P4, C1].
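
For illustration only, scaling a feature to values between 0 and 10 can be sketched as below. Kholidy does not state the scaling method in the quoted passage, so min-max scaling is an assumption, and the function name and values are hypothetical.

```python
from typing import List


def min_max_scale(values: List[float], low: float = 0.0, high: float = 10.0) -> List[float]:
    """Linearly rescale values so the smallest maps to low and the largest to high."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # A constant series carries no range information; map it to low.
        return [low for _ in values]
    span = largest - smallest
    return [low + (v - smallest) * (high - low) / span for v in values]


# Hypothetical raw feature values with a large numeric range.
scaled = min_max_scale([100.0, 150.0, 200.0])
```

After scaling, every feature occupies the same range, so no single attribute dominates by magnitude alone.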

4. The method of claim 1, wherein comparing the historical usage of storage capacity with the predicted usage of storage capacity comprises:
normalizing the historical usage of storage capacity; and
comparing the normalized historical usage of storage capacity with the predicted usage of storage capacity.
	The combination teaches claim 1. Kholidy further discloses:
normalizing the historical usage of storage capacity; and
“input features are normalized or scaled to values between 0 and 10” [Kholidy, P4, C1]. The historical usage is also an input, and hence is normalized for similar reasons.
comparing the normalized historical usage of storage capacity with the predicted usage of storage capacity.
“normalization process should be applied to avoid attributes with large numeric ranges dominating those in smaller ones.” [Kholidy, P4, C1]
	Hence, it would have been obvious to the skilled artisan before the effective filing date of the claimed invention to further normalize the historical usage of storage capacity as disclosed by Kholidy in order to “avoid attributes with large numeric ranges dominating those in smaller ones” [Kholidy, P4, C1] and to enable comparison on the same scale as the predicted usage of storage capacity.

	Claims 11-13 are rejected on similar grounds as claims 2-4, as each is directed to the device performing the method of claims 2-4, respectively.

Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over the combination as applied to claim 1 above, and supported by Song’s “Host load prediction with long short-term memory in cloud computing”.
5. The method of claim 1, wherein the predictor is a Long Short-Term Memory (LSTM) neural network.
	The combination teaches claim 1. Where the combination is silent, Song discloses applying an LSTM neural network as a predictor [P6558-6559]. Song suggests employing an LSTM neural network for performing predictions for load on a host because LSTM has certain advantages, such as the capability to “learn long-term dependencies” [P6556].
	Hence, it would have been obvious to the skilled artisan before the effective filing date of the claimed invention to replace hidden layer C1-Cn of the neural network of the combination with LSTM blocks as suggested by Song in order to obtain advantageous features of LSTM NNs, e.g. “long-term dependencies…learning…how long to remember and when to forget history data…” [P6556].
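
For illustration only, one time step of a standard LSTM block can be sketched as below, showing how the forget and input gates govern what is retained from history. This is the textbook formulation with scalar state, not necessarily the configuration used by Song; the function names and parameter layout are hypothetical.

```python
import math
from typing import Callable, Dict, Tuple

Gate = Tuple[float, float, float]  # (input weight, recurrent weight, bias)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(
    x: float, h_prev: float, c_prev: float, params: Dict[str, Gate]
) -> Tuple[float, float]:
    """Advance a scalar LSTM cell by one step; returns (hidden, cell) state."""

    def gate(name: str, activation: Callable[[float], float]) -> float:
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    forget = gate("forget", sigmoid)        # how much old cell state to keep
    admit = gate("input", sigmoid)          # how much new information to admit
    candidate = gate("candidate", math.tanh)
    expose = gate("output", sigmoid)        # how much cell state to expose
    c = forget * c_prev + admit * candidate
    h = expose * math.tanh(c)
    return h, c


# With all-zero parameters every sigmoid gate is 0.5 and the candidate is 0.
zero_params = {name: (0.0, 0.0, 0.0)
               for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

The cell state `c` is the path by which information persists across steps, and the forget gate scales it at each step.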
	Claim 14 is rejected on similar grounds as claim 5, as it is the apparatus performing the method of claim 5.

Claims 6, 9, 15, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over the combination as applied to claim 2 above, and supported by Amazon’s CloudWatch.
6. The method of claim 1, wherein the historical feature comprises at least one of: an average size of data written for the storage device within a predetermined time interval, an average number of write operations per second for the storage device within the predetermined time interval, a physical space size for the storage device used within the predetermined time interval, and a timestamp associated with the predetermined time interval.
	The combination teaches claim 1, and further teaches wherein the historical feature comprises at least one of:
an average size of data written for the storage device within a predetermined time interval, an average number of write operations per second for the storage device within the predetermined time interval (DiskWriteOps), a physical space size for the storage device used within the predetermined time interval (DiskWriteBytes), and a timestamp associated with the predetermined time interval.
	Specifically, Kholidy discloses “Disk IOs/sec” [P2, IV]; “DiskWriteOps” [P4, C1]; “DiskWriteBytes” and others for Disk [Table 1] as features correlated with the resource utilization of a disk (“CPU, memory, and disk storage” [P2, IV]). Hence, it would have been obvious to the skilled artisan before the effective filing date of the claimed invention to use, e.g. Disk IOs/sec, DiskWriteOps, and DiskWriteBytes as disclosed by Kholidy as features of the combination with a reasonable expectation of success, as already indicated in at least one test as depicted by Kholidy.
	
Supporting documentation is provided herein regarding the meanings of the metrics provided by Amazon Cloud-Watch in the disclosure of Kholidy. In particular, DiskWriteBytes refers to a total amount of write data written over an interval, and DiskWriteOps refers to a number of completed write operations over the interval, which can be rewritten as IOPS by dividing by the interval:
DiskWriteBytes: “Bytes written to all instance store volumes available to the instance.
This metric is used to determine the volume of the data the application writes onto the hard disk of the instance. This can be used to determine the speed of the application.
The number reported is the number of bytes received during the period. If you are using basic (five-minute) monitoring, you can divide this number by 300 to find Bytes/second. If you have detailed (one-minute) monitoring, divide it by 60.” [P4].
DiskWriteOps: “Completed write operations to all instance store volumes available to the instance in a specified period of time.
To calculate the average I/O operations per second (IOPS) for the period, divide the total operations in the period by the number of seconds in that period.” [P3]
Therefore, the skilled artisan would have understood Kholidy’s exemplary features to be reflective of at least one of an average number of write operations per second for the storage device within the predetermined time interval (DiskWriteOps) and a physical space size for the storage device used within the predetermined time interval (DiskWriteBytes), as above.
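
For illustration only, the conversion from per-period totals to per-second averages described in the quoted CloudWatch passages can be sketched as below. The function name and the sample totals are hypothetical; the 300-second and 60-second periods are those stated in the quoted text.

```python
def per_second(total_in_period: float, period_seconds: float) -> float:
    """Average rate over a period, given the total reported for that period."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return total_in_period / period_seconds


# Hypothetical DiskWriteOps total under basic (five-minute) monitoring.
average_write_iops = per_second(6000, 300)

# Hypothetical DiskWriteBytes total under detailed (one-minute) monitoring.
average_write_bytes_per_second = per_second(1_200_000, 60)
```

Dividing DiskWriteOps by the period length gives an average number of write operations per second within the interval.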

Claim 9 is rejected on similar grounds as claim 6.
Claim 15 is rejected on similar grounds as claim 6, as it is the apparatus performing the method of claim 6.
	Claim 18 is rejected on similar grounds as claim 9, as it is the device performing the method of claim 9.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Frey “Cloud Storage Prediction with Neural Networks”

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HEWY H LI whose telephone number is (571)272-8714.  The examiner can normally be reached on Mon-Fri 10-6.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/HEWY H LI/Examiner, Art Unit 2136                                                                                                                                                                                                        
/CHARLES RONES/Supervisory Patent Examiner, Art Unit 2136