Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 8/2/21 has been entered.

Response to Arguments
Applicant’s arguments with respect to claim(s) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.
Claims 1, 2, 6-8, 10, 11, 13-17, 19, and 20 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Visconti (Pub. No. US 2020/0209946) in view of Gopalan (Pub. No. US 2020/0104189) and further in view of Sharkh (2017 NPL “An evergreen cloud: Optimizing energy efficiency in heterogeneous cloud computing architectures”).
Claim 1, Visconti teaches “a system, comprising: at least one processor and a memory; wherein the at least one processor is configured to: receive a first set metric data of a virtual machine, the first set of metric data including CPU usage, disk I/O usage, and network usage of the virtual machine at equally-spaced time points over a first time period ([Fig. 6] hours of week [0055] In the example of FIG. 2, for an observation time divided into time divisions, a plurality of virtual machines executing on physical machines may be monitored (202). [0067] The resulting metric collection 316, including, in the example, virtual machine CPU, network, and disc utilization metrics, may be provided for power on/off pattern detection 318. As described, such usage patterns represent recurring periods in which the virtual machines may be powered on or off based on recommendations of exact time schedules 320.); when the forecasted CPU usage of the virtual machine is below a threshold, initiate actions to reduce resource consumption of the virtual machine ([0075] For example, a given virtual machine may be defined to be idle within an hour, if, for that hour, the virtual machine has a CPU utilization that is lower than a specified threshold, a network byte rate that is lower than a certain rate threshold, and/or a disc transfer rate that is lower than a predefined threshold. Thus, for example, a virtual machine may be considered idle during an hour in which the CPU utilization of the virtual machine is less than, e.g., 5%, a network byte rate used by the virtual machine is lower than, e.g., 200 kilobits per second, and/or a disk transfer rate of the virtual machine is lower than, e.g., 100 kilobytes per second. Of course, other suitable ranges and parameter thresholds may be used. 
[0053] By turning off groups of virtual machines using the power manager 102 as described herein, the cloud provider 110 may be provided with an ability, for example, to turn off some or all virtual machines executing on a particular physical machine.)”.
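For illustration only, the idle test described in Visconti [0075] can be sketched as follows. The names HourlySample, is_idle, and idle_hours are hypothetical and do not appear in any reference of record. The numeric thresholds are the example values quoted above (5% CPU, 200 kilobits per second, 100 kilobytes per second). Visconti recites "and/or"; this sketch assumes the conjunctive reading, in which all three metrics must fall below their thresholds.

```python
from dataclasses import dataclass
from typing import List

# Example thresholds quoted from Visconti [0075]; other values may be used.
CPU_IDLE_PERCENT = 5.0
NETWORK_IDLE_KBITS_PER_S = 200.0
DISK_IDLE_KBYTES_PER_S = 100.0


@dataclass
class HourlySample:
    """One hour of averaged metrics for a virtual machine (hypothetical type)."""
    cpu_percent: float
    network_kbits_per_s: float
    disk_kbytes_per_s: float


def is_idle(sample: HourlySample) -> bool:
    # Assumption: conjunctive reading of "and/or" (all metrics below threshold).
    return (
        sample.cpu_percent < CPU_IDLE_PERCENT
        and sample.network_kbits_per_s < NETWORK_IDLE_KBITS_PER_S
        and sample.disk_kbytes_per_s < DISK_IDLE_KBYTES_PER_S
    )


def idle_hours(samples: List[HourlySample]) -> List[int]:
    # Return the indices of the equally spaced hours in which the VM is idle.
    return [hour for hour, sample in enumerate(samples) if is_idle(sample)]
```

Under these assumptions, the hours returned by idle_hours would be the candidate hours for the power-off schedule described in Visconti [0053] and [0067].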
However, Visconti may not explicitly teach the remaining limitations.
Gopalan teaches “monitor the virtual machine during a second time period to collect a second set of metric data, the first time period differs from the second time period ([Fig. 3A] current VM workload demand 300a [0025] The current demand 141 … can include the demand of the workload 121 across the different resource dimensions (e.g., CPU, memory, storage, network, etc.).); apply the selected time series forecasting model to the second set of metric data to forecast the CPU usage ([Fig. 3C] forecasted and aggregated workload demand applied subsequent to current demand)”.
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to apply the teachings of Gopalan to the teachings of Visconti in order to provide a system that utilizes forecasting models. The motivation for applying the teaching of Gopalan to that of Visconti is to provide a system that improves the forecast of Visconti over time. Visconti and Gopalan are analogous art directed towards forecasting VM resources. Together, Visconti and Gopalan teach the limitations addressed above. Since the teachings were analogous art known at the filing time of the invention, one of ordinary skill could have applied the teachings of Gopalan to the teachings of Visconti by known methods and gained expected results.
However, the combination may not explicitly teach selecting the forecasting model having the highest prediction accuracy.
Gopalan does teach utilizing a plurality of machine learning algorithms ([0030] The forecasted demand data values can be determined according to historical data, machine learning models, and/or other data.).
Sharkh, in an analogous art, teaches selecting the forecasting model of highest accuracy for the problem at hand, and thus teaches “train a plurality of time series forecasting models on the first set of metric data for the first time period to predict a future time when a virtual machine will be idle; select one of the plurality of time series forecasting models having a highest prediction accuracy ([7.2 Classifier and classification tool] Machine learning (ML) classifiers automatically analyze a large data set composed of several attributes and decide what information is most relevant. This builds the classifier’s ability to predict the values of a specific preselected attribute. This value (which could be qualitative or quantitative) is the classification. Classifiers are used in many application fields. A commonly used tool that has a variety of the most common classifiers readily implemented is Weka [28]. Weka is a software workbench that includes several ready to use ML techniques [28]. Once the data is formatted in the format readable by Weka (.arff format) which defines what is the relation name, the attributes and their possible values and the data rows themselves, the tool can pre-process and classify. The relation defined for this work to predict the number of future requests is VM-predictor. We have tested multiple classifiers to find the classifier most suitable to our DIP technique and the energy efficiency problem. Table 4 contains the classifier names as in Weka and the classification precision measured using root absolute error. It is seen from the table that classifiers differ in their achieved precision. The highest performing classifiers for this specific case is REPtree with a root absolute error of 7.8% and then meta bagging and KStar classifiers. Fig. 6 shows a sample of the visual results gained for individual prediction using the REPtree classifier. Most of the values lie in or around the line which has a slope of 1. This indicates the equality of the predicted and the actual values of the number of future requests. [1. Introduction] 2 – A new technique called Dynamic Idleness Prediction (DIP) is introduced where the future demands for VMs are considered when placing/scheduling the VM on a host. This technique is based on using an artificial intelligence classifier (in our case REPtree) to predict the nature of the load every VM will receive in a prespecified future period. [Table 4] REPtree selected)”.
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to apply the teachings of Sharkh to the teachings of Visconti and Gopalan in order to provide a system that utilizes the most accurate modeling method. The motivation for applying the teaching of Sharkh to that of Visconti and Gopalan is to provide a system that improves upon the forecasting accuracy of Gopalan. Visconti, Gopalan, and Sharkh are analogous art directed towards forecasting VM resources. Together, Visconti, Gopalan, and Sharkh teach every limitation of the claimed invention. Since the teachings were analogous art known at the filing time of the invention, one of ordinary skill could have applied the teachings of Sharkh to the teachings of Visconti and Gopalan by known methods and gained expected results.
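For illustration only, the general technique mapped above (train several forecasting models on a first period, select the most accurate, then apply the selected model to data from a second period) can be sketched as follows. The three toy forecasters, the holdout split, and the use of mean absolute error are assumptions chosen for brevity. They are not the classifiers of Sharkh (e.g., REPtree in Weka), the models of Gopalan, or the models recited in the claims.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Forecaster = Callable[[List[float], int], List[float]]


def naive_last(history: List[float], horizon: int) -> List[float]:
    # Repeat the most recent observation.
    return [history[-1]] * horizon


def moving_average(history: List[float], horizon: int) -> List[float]:
    # Repeat the mean of the last three observations.
    return [mean(history[-3:])] * horizon


def seasonal_naive(history: List[float], horizon: int) -> List[float]:
    # Repeat the last full cycle; a period of 4 samples is assumed.
    period = 4
    return [history[-period + (i % period)] for i in range(horizon)]


MODELS: Dict[str, Forecaster] = {
    "naive_last": naive_last,
    "moving_average": moving_average,
    "seasonal_naive": seasonal_naive,
}


def mean_absolute_error(actual: List[float], predicted: List[float]) -> float:
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def select_best_model(series: List[float], holdout: int) -> Tuple[str, Dict[str, float]]:
    # Fit on the early part of the first period, score on the held-out tail,
    # and return the name of the model with the lowest error.
    train, test = series[:-holdout], series[-holdout:]
    errors = {
        name: mean_absolute_error(test, model(train, holdout))
        for name, model in MODELS.items()
    }
    return min(errors, key=errors.get), errors


# First period: CPU usage (%) at equally spaced time points (synthetic data).
first_period = [10.0, 80.0, 80.0, 10.0] * 4
best_name, errors = select_best_model(first_period, holdout=4)

# Second period: apply the selected model to newly collected data.
second_period = [12.0, 78.0, 81.0, 3.0]
forecast = MODELS[best_name](second_period, 4)
```

In this sketch the forecast for the period immediately following the second period could then be compared against an idle threshold such as the 5% CPU example of Visconti [0075].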
Claim 2, the combination teaches the claim, wherein Sharkh teaches “The system of claim 1, wherein the first set of metric data further includes usage data, the usage data including type of operating system, size of virtual machine, type of virtual machine, type of cloud service, and/or location of data center of the virtual machine ([5.2. Classification parameters] It is critical in this type of experiments to gain an insight into the demanded resources by the request (CPU and memory, for example) as well the time bounds if any. However a more coherent profile for the VM can be constructed by collecting parameters like: User ID, User location, User VMs, Type of contract (rental term), VM reserved resources, VM start time/reserved time, Redundancy model, Redundancy activity frequency, Component type, Request types and frequency, Communication/data exchange request Dependencies, and response time required.)”.
The rationale applied to claim 1 is applied here.
Claim 6, the combination teaches the claim, wherein Visconti teaches “the system of claim 1, wherein the initiation of actions to reduce resource consumption of the virtual machine comprises shutting down the virtual machine ([0053] By turning off groups of virtual machines using the power manager 102 as described herein, the cloud provider 110 may be provided with an ability, for example, to turn off some or all virtual machines executing on a particular physical machine.)”.
Claim 7, the combination teaches the claim, wherein Visconti teaches “the system of claim 1, wherein the at least one processor is further configured to: generate a cost savings estimate for the reduction of the resource consumption ([0071] Cloud orchestration 416 may then be used to switch groups of virtual machines on and off, in accordance with the recommendation power schedules, as shown in block 418. For example, with reference back to FIG. 1, suitable cloud provider interfaces 121 may be utilized by the power handler 140 of the power manager 102 to instruct and execute collected power cycling of the described groups of virtual machines, in accordance with the predicted power schedules.)”.
Claim 8, “obtaining a plurality of time series forecasting models trained to predict a future idle time of a virtual machine, wherein each of the time series forecasting models are trained on metric data of the virtual machine over a training period, the metric data including CPU usage, network I/O usage, and disk I/O usage obtained at equally-spaced time intervals; selecting one of the plurality of time series forecasting models; receiving metric data during a production run of the virtual machine during a first time period; applying the selected time series forecasting model to the received metric data to forecast the idle time of the virtual machine within a time period immediately following the first time period; and initiating measures to shut down the virtual machine during the idle time” is similar to claim 1 and therefore rejected with the same references and citations.
Claim 10, “The method of claim 8, wherein the metric data includes usage data, the usage data including a type of operating system, size of virtual machine, type of virtual machine, type of cloud service, and/or location of data center of the virtual machine” is similar to claim 2 and therefore rejected with the same references and citations.
Claim 11, the combination teaches the claim, wherein Visconti teaches “the method of claim 8, wherein initiating measures to shut down the virtual machine include requesting permission from a user of the virtual machine to shutdown the virtual machine ([0070] Further in FIG. 4, a user 410, such as an administrator at the cloud consumer consuming the various virtual machines of cloud providers 302, 306 may apply tag base rules 412 to determine whether to provide approval of the generated power schedule and group formation predictions. As illustrated, the approval 414 may also include rule-based approval, not requiring human intervention.)”.
Claim 13, the combination teaches the claim, wherein Visconti teaches “The method of claim 8, wherein the future idle time is based on CPU usage forecasted to be below an idle threshold for the virtual machine, wherein the idle threshold is based on historical metric data and the usage data of the virtual machine ([0075] For example, a given virtual machine may be defined to be idle within an hour, if, for that hour, the virtual machine has a CPU utilization that is lower than a specified threshold, a network byte rate that is lower than a certain rate threshold, and/or a disc transfer rate that is lower than a predefined threshold. Thus, for example, a virtual machine may be considered idle during an hour in which the CPU utilization of the virtual machine is less than, e.g., 5%, a network byte rate used by the virtual machine is lower than, e.g., 200 kilobits per second, and/or a disk transfer rate of the virtual machine is lower than, e.g., 100 kilobytes per second. Of course, other suitable ranges and parameter thresholds may be used. [0053] By turning off groups of virtual machines using the power manager 102 as described herein, the cloud provider 110 may be provided with an ability, for example, to turn off some or all virtual machines executing on a particular physical machine.)”.
Claim 14, the combination teaches the claim, wherein Visconti teaches “the method of claim 8, further comprising restarting the virtual machine after the idle time ([0045] Thus, in general, and as described in more detail below, presence of the same or similar deployment metadata across one or more groups of virtual machines may be indicative of, or correlated with, the same or similar on/off power schedule. In other scenarios, it may occur that groups of virtual machines defined by the group manager 138 do not have exactly overlapping power intervals. Nonetheless, it may occur that efficiencies obtained from grouping virtual machines for power scheduling may outweigh overall reductions in reliability of adherence of individual virtual machines to intervals predicted for those virtual machines.)”.
Claim 15, the combination teaches the claim, wherein Visconti teaches “the method of claim 8, wherein prior to initiating measures to shut down the virtual machine during the idle time, informing a user of the virtual machine of the forecasted idle time ([0070] Further in FIG. 4, a user 410, such as an administrator at the cloud consumer consuming the various virtual machines of cloud providers 302, 306 may apply tag base rules 412 to determine whether to provide approval of the generated power schedule and group formation predictions. As illustrated, the approval 414 may also include rule-based approval, not requiring human intervention.)”.
Claim 16, “a device, comprising: at least one processor and a memory; wherein the memory includes instructions that when executed on the at least one processor performs actions that: trains a plurality of time series forecasting models on a first set of metric data from a virtual machine to predict a future idle time of the virtual machine, the first set of metric data including a time series of equally-spaced data points representing CPU usage, network usage, and disk I/O usage of the virtual machine during a training period; selects one of the plurality of time series forecasting models; and during a production run of the virtual machine: collects a second set of metric data during a first time period; forecasts an idle time of the virtual machine using the select one of the plurality of time series forecasting models at a second time period, the second time period immediately following the first time period; and automatically shuts down the virtual machine at the idle time of the second time period.” is similar to claim 1 and therefore rejected with the same references and citations.
Claim 17, “the device of claim 16, wherein the memory includes further instructions that when executed on the at least one processor performs additional actions that: restarts the virtual machine after the idle time” is similar to claim 14 and therefore rejected with the same references and citations.
Claim 19, “The device of claim 16, wherein the metric data includes usage data, the usage data including a type of operating system, size of virtual machine, type of virtual machine, type of cloud service, and/or location of data center of the virtual machine” is similar to claim 2 and therefore rejected with the same references and citations.
Claim 20, “the device of claim 16, wherein automatically shutting down the virtual machine is performed upon concurrence of a user of the virtual machine” is similar to claim 11 and therefore rejected with the same references and citations.
Claims 3 and 9 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Visconti in view of Gopalan in view of Sharkh in further view of Yang (Pub. No. US 2020/0125419).
Claim 3, the combination may not explicitly teach the limitations.
Yang teaches “the system of claim 1, wherein the plurality of time series forecasting models includes at least three of autoregressive integrated moving average (ARIMA), error, ([Recurrent Neural Network(RNN)] is a class of NN that can obtain strong outcomes on sequence modeling tasks whose current states are related to the previous ones. RNN can use their internal memory to process arbitrary sequences of inputs. Basically, each unit, representing a stage in the sequence, has its own input, output and memory cell. $h_t = \sigma(W x_t + U h_{t-1} + b)$ (6) Eq. 6 shows how status is delivered between memory cell h from stage t − 1 to t. The current status is formed by aggregating all weighted input $W x_t$, weighted previous status $U h_{t-1}$ and bias b. To increase the non-linear characteristic, the activate function σ is also utilized. This can allow RNNs to break through the obstacle from the fixed input, thus is able to process arbitrary sequences of inputs and predict the output accordingly. GRU(Gated Recurrent Unit)[46], LSTM(Long Short-Term)[47] are proposed to further solve the vanishing gradient problem. The topological service composition and resource allocation can be naturally solved by such LSTM models where the structure connections between units are very close to the workflow-based orchestration.) trend ([Neural Network(NN)[38]] is inspired by the biological neural networks within brains. The connections between neural cells determine the knowledges. As shown in Eq.3, x is the output of last layer cell and $y_j^T$ represents the weight of j th connection. The b is bias and σ depicts the activate function which enables NN to learn the non-linear relationship. The aggregation of such information is then passed to the next cell.), seasonality (ETS), trigometric box-cox transformation (TBATS), or a decomposable time series forecasting model ([Decision Tree] … For example, assuming the time period can be divided into equal length discrete slots $T = \{t_0, t_1, \ldots, t_N\}$, the base model $F_0(x)$ is initially trained by the pre-set profiles. Afterwards, at the time slot $t_n$ ($0 < n \le N$), the specific CART decision tree $h_n(x)$ will be trained by the profile sampling at $t_{n-1}$. A stronger learner $F_n(x) = F_0(x) + h_n(x)$ can be consequently generated to forecast the matching between task and machine in $t_n$.)”.
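For illustration only, the recurrence of Eq. 6 quoted from Yang, in which the current state is σ applied to the weighted input plus the weighted previous state plus the bias, can be sketched as a single update step. The function names and the choice of the logistic sigmoid for σ are assumptions; Yang does not specify a particular activation function in the quoted passage.

```python
import math
from typing import List

Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    # Assumed activation; the quoted passage only calls it sigma.
    return 1.0 / (1.0 + math.exp(-z))


def rnn_step(W: Matrix, U: Matrix, b: List[float],
             x_t: List[float], h_prev: List[float]) -> List[float]:
    """Compute h_t = sigma(W x_t + U h_{t-1} + b) for one time step."""
    h_t = []
    for i in range(len(b)):
        z = b[i]
        z += sum(W[i][j] * x_t[j] for j in range(len(x_t)))        # W x_t
        z += sum(U[i][k] * h_prev[k] for k in range(len(h_prev)))  # U h_{t-1}
        h_t.append(sigmoid(z))
    return h_t


def run_sequence(W: Matrix, U: Matrix, b: List[float],
                 inputs: List[List[float]]) -> List[float]:
    # Carry the state across a sequence of arbitrary length, starting from zeros.
    h = [0.0] * len(b)
    for x_t in inputs:
        h = rnn_step(W, U, b, x_t, h)
    return h
```

Because the state h is carried from step to step, the same weights process a sequence of any length, which is the property the quoted passage attributes to RNNs.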
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to apply the teachings of Yang to the teachings of Visconti, Gopalan, and Sharkh in order to provide a system that includes the recited forecasting models as additional models to those taught by Sharkh. The motivation for applying the teaching of Yang to that of Visconti, Gopalan, and Sharkh is to provide a system that improves the forecast of Visconti over time. Visconti, Gopalan, Sharkh, and Yang are analogous art directed towards forecasting VM resources. Together, Visconti, Gopalan, Sharkh, and Yang teach every limitation of the claimed invention. Since the teachings were analogous art known at the filing time of the invention, one of ordinary skill could have applied the teachings of Yang to the teachings of Visconti, Gopalan, and Sharkh by known methods and gained expected results.
Claim 9, “the method of claim 8, wherein the plurality of time series forecasting model includes at least two of autoregressive integrated moving average (ARIMA), error, trend, seasonality (ETS), trigometric box-cox transformation (TBATS), or a decomposable time series forecasting model” is similar to claim 3 and therefore rejected with the same references and citations.
Claims 4, 5, 12, and 18 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Visconti in view of Gopalan in view of Sharkh in further view of Zhang (Pub. No. US 2020/0125419).
Claim 4, the combination may not explicitly teach the limitations.
Zhang teaches “the system of claim 1, wherein the plurality of time series forecasting models includes at least one of ARIMA, ETS, TBATS, or a decomposable time series forecasting mode ([0044] In an example implementation, the session monitoring service 140 could operate in two phases. First, the session monitoring service 140 could perform a training phase, where session history 134 is used to generate a prediction model. The prediction model could be generated using various statistical models or learning models like ( ARIMA) (TES (i.e. ETS)) (GBDT) or a random forest. Then, the session monitoring service 140 could receive current data regarding user sessions 209 from the reporting agent 213. The session monitoring service 140 could then use the session history 134 and the current user sessions 209 as the input sample to the prediction model, to predict the future user sessions 209 hosted by the virtual machine 200.)”.
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to apply the teachings of Zhang to the teachings of Visconti, Gopalan, and Sharkh in order to provide a system that includes the ARIMA and TES models as additional models to those taught by Sharkh. The motivation for applying the teaching of Zhang to that of Visconti, Gopalan, and Sharkh is to provide a system that improves the forecast of Visconti over time. Visconti, Gopalan, Sharkh, and Zhang are analogous art directed towards forecasting VM resources. Together, Visconti, Gopalan, Sharkh, and Zhang teach every limitation of the claimed invention. Since the teachings were analogous art known at the filing time of the invention, one of ordinary skill could have applied the teachings of Zhang to the teachings of Visconti, Gopalan, and Sharkh by known methods and gained expected results.
Claim 5, “The system of claim 1, wherein the plurality of time series forecasting models includes two or more of ARIMA, ETS, TBATS, or a decomposable time series forecasting model” is similar to claim 4 and therefore rejected with the same references and citations.
Claim 12, “the method of claim 8, wherein the time series forecasting model is at least one of ARIMA, TBATS, ETS, or a decomposable time series mode” is similar to claim 4 and therefore rejected with the same references and citations.
Claim 18, “the device of claim 16, wherein the at least one plurality of time series forecasting models includes two or more of a decomposable time series model, autoregressive integrated moving average (ARIMA), error, trend, seasonality (ETS), or trigometric box-cox transformation (TBATS)” is similar to claim 4 and therefore rejected with the same references and citations.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WYNUEL S AQUINO whose telephone number is (571)272-7478. The examiner can normally be reached 9AM-5PM EST M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lewis Bullock can be reached on 571-272-3759. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/WYNUEL S AQUINO/Primary Examiner, Art Unit 2199