DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 02/24/2021 has been entered.
 
This action is in response to the arguments filed on 02/24/2021.  Claims 1-20 have been considered and are pending.

Claim Rejections - 35 USC § 103
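The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
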
Claims 1-5, 6, 8-11, and 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over ARIYOSHI et al. (US 2015/0112900 A1, hereinafter referred to as ARIYOSHI), in view of Fang et al. (US 6,928,398 B1, hereinafter referred to as Fang), and further in view of Heching et al. (US 7,054,828 B2, hereinafter referred to as Heching), and Jamal et al. (US 2012/0330881 A1, hereinafter referred to as Jamal), and Sorjamaa et al. (“Methodology for long-term prediction of time series,” hereinafter referred to as Sorjamaa).

As to claim 1, ARIYOSHI teaches one or more computer storage media storing computer-useable instructions that, when used by one or more computing devices (see paragraphs [0166]-[0168], computer-readable recording medium, [0056], a time-series data storage device 2, and a time-series data prediction device 3), cause the one or more computing devices to perform operations to facilitate generation of prediction models, the method comprising: 
 identifying a predetermined plurality of parameter value sets, each parameter value set having parameter values that represent corresponding parameters inputting as a combination into  a time series prediction model to define a function of the corresponding time series prediction model  (see paragraphs [0023]-[0031],  clusters time-series data of an observation value of a predetermined observation target into clusters that are a plurality of similar groups…even if there is a change in the observation value variation pattern, the time-series data prediction device can accurately calculate the predicted value of the time-series data…predicts time-series data using the given time-series data and the prediction model generated for each of the clusters by the prediction model generation unit; [0050]-[0051]…the previous prediction model is a slightly previous prediction model in many cases, prediction can be performed with high accuracy when there is a large power change recently. When the pattern of power use has changed, for example, when summer vacation begins, the use of the average use prediction model makes it easy to follow the change; [0098]…generates an approximation model, which is approximated from a plurality of functions (basis functions) and weighting coefficients of the functions; [0113]);
implementing each identified parameter value set into the corresponding time series prediction model to generate a prediction value in accordance with a set of observed time series data (see paragraphs [0045]-[0050]; [0073]-[0074]; [0098]…prediction model generation section 43 generates a prediction model from the approximation model to predict each observation value of the time-series data of the prediction use period);
using the prediction values to select a parameter value set, from among the parameter value sets, that results in a least amount of prediction error (see paragraphs [0050]…(step S40), [0089] and [0141], selects a prediction model with the smallest error (step S490)); and 
utilizing the selected parameter value set to generate a time series prediction model, wherein the time series prediction model is subsequently used to predict values expected to occur at some future point in time (see paragraphs [0047]-[0048], [0055] and [0162]-[0164], generates a prediction model in each category (cluster), and obtains a predicted value of the future energy demand from the change in the past energy demand using the generated prediction model). 
But ARIYOSHI fails to explicitly teach wherein the parameters comprise an indication of order of differences and each parameter value within each parameter value set is selected by:
selecting a value from one of a plurality of strata, for a corresponding parameter, based on probabilities associated with each strata of the plurality of strata for the corresponding parameter.
However, Fang teaches wherein the parameters comprise an indication of order of differences (see col. 3, lines 29-51…differencing orders) and, in combination, Heching teaches that each parameter value within each parameter value set is selected by:
selecting a value from one of a plurality of strata, for a corresponding parameter, based on probabilities associated with each strata of the plurality of strata for the corresponding parameter (see col. 2, lines 25-67…selecting a subset of members from a target population. (Block 1). This step may be performed by implementing probability sampling techniques, which are based on the assumption that every member in the population has some known, positive probability of being selected as a member of the subset….In probability sampling, every member of the population has a positive probability of being selected as a member of the sample...; col. 3, lines 4-40…Stratified sampling: each member of the population is assigned to a stratum. Simple random sampling is used to select within each stratum...By using probability sampling, one can compute the probability that a given member of the population is included in the sample (which may be referred to as the "inclusion probability" for that member of the population)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI and Fang by adding a subset selection of members from a target population using probability sampling, as taught by Heching above.  The modification would have been obvious because one of ordinary skill would be motivated to provide an accurate indication of past behavior of a target population that also establishes an accurate basis from which to determine the future likely beliefs and behavior of the target population, as suggested by Heching (col. 1, lines 45-49 and 53-58).
But ARIYOSHI, Fang and Heching fail to explicitly teach wherein each strata of the plurality of strata for the corresponding parameter includes a subportion of a plurality of potential parameter values and the strata is associated with a corresponding probability, used to select the value, determined based on a distribution of previously selected parameter values for previous prediction models.
However Jamal teaches wherein each strata of the plurality of strata for the corresponding parameter includes a subportion of a plurality of potential parameter values and the strata is associated with a corresponding probability (see paragraphs [0009]-[0010]…data may be transformed according to action number into stratified data including strata, with each of the strata representing actions for one or more action numbers…; [0025]-[0027]…Hazard functions may be estimated from the strata, indicated at 36. A "hazard function," as used herein, is a probability measure that a customer will have a next action at a given time after a previous action, conditional on the occurrence of the previous action…A likelihood (a probability) of a next action may be calculated from a hazard function for a stratum, indicated at 38. The likelihood may be calculated with a computer and may provide a likelihood of next action at one or more different time points from the latest action taken by individual customers whose latest action has an action number for which the stratum represents actions).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Fang and Heching to associate a probability with each of the strata, as taught by Jamal above.  The modification would have been obvious because one of ordinary skill would be motivated to provide additional insights into which attributes are the key drivers of customer experience and which of the firm's processes and systems need to be improved to ensure an enhanced customer experience, as suggested by Jamal (see paragraph [0013]).
But ARIYOSHI, Fang, Heching and Jamal fail to explicitly teach:
select the value, determined based on a distribution of previously selected parameter values for previous prediction models.
However Sorjamaa teaches:
select the value, determined based on a distribution of previously selected parameter values for previous prediction models (see page 2862, 2. Time series prediction …prediction of future values based on the previous values and the current value of the time series (see Eq. (1)); 2.1. Recursive prediction strategy… uses the predicted values as known data to predict the next ones…).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Fang, Heching and Jamal so that the probability used to select the value is determined based on a distribution of previously selected parameter values for previous prediction models, as taught by Sorjamaa above.  The modification would have been obvious because one of ordinary skill would be motivated to use input selection as an essential pre-processing stage to guarantee high accuracy, efficiency and scalability in time series modeling, as suggested by Sorjamaa (see page 2862, right column, 3.1. Input selection strategies).
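
For illustration only, the limitations of claim 1 discussed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and does not represent the method of the applicant or of any cited reference: the toy forecaster (the mean of the last w differenced values), the parameter pair (d, w), the additive smoothing term, and all function names are hypothetical.

```python
import random


def forecast_next(series, d, w):
    """Toy one-step forecast (an assumption, not taken from any reference):
    difference the series d times, average the last w differenced values
    (w >= 1), then undo the differencing."""
    levels = [list(series)]
    for _ in range(d):
        prev = levels[-1]
        levels.append([b - a for a, b in zip(prev, prev[1:])])
    window = levels[-1][-w:]
    pred = sum(window) / len(window)
    # Integrate back: add the last value of each less-differenced level.
    for level in reversed(levels[:-1]):
        pred += level[-1]
    return pred


def strata_probabilities(strata, previous_values, smoothing=1.0):
    """Weight each stratum by how often previously selected values fell in it."""
    counts = [smoothing + sum(v in s for v in previous_values) for s in strata]
    total = sum(counts)
    return [c / total for c in counts]


def sample_from_strata(strata, probs, rng):
    """Pick a stratum according to its probability, then a value inside it."""
    stratum = rng.choices(strata, weights=probs, k=1)[0]
    return rng.choice(stratum)


def mean_abs_error(series, params, holdout=5):
    """Average one-step error of a parameter set over the last `holdout` points."""
    d, w = params
    targets = range(len(series) - holdout, len(series))
    errors = [abs(forecast_next(series[:t], d, w) - series[t]) for t in targets]
    return sum(errors) / len(errors)


def select_parameter_set(series, param_sets, holdout=5):
    """Return the parameter set with the least prediction error."""
    return min(param_sets, key=lambda p: mean_abs_error(series, p, holdout))
```

In this sketch, each candidate (d, w) pair would be drawn with `sample_from_strata`, once per parameter, and the drawn pairs would then be passed to `select_parameter_set` together with the observed series.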

As to claim 2, ARIYOSHI teaches wherein each parameter value within each parameter value set is selected in accordance with a given range of parameter values for each parameter, the given range including the plurality of potential parameter values for the corresponding parameter selected based on recent time series data of the set of observed time series data (see paragraphs [0009]-[0011] and [0026]-[0029]…acquire observation values that continue at predetermined time intervals, as a prediction data, from time-series data of an observation value of a predetermined observation target and acquires a training data; wherein Examiner interprets the time intervals as “a given range” and “acquire observation values that continue at predetermined time intervals” to teach the limitation; [0047]…time-series data prediction device sets the latest time-series data... and xn of the prediction use period of 3 days, wherein the latest time-series data are interpreted as recent time series data).

As to claim 3, ARIYOSHI teaches wherein the generation of the time series prediction model is initiated based on a request for one or more prediction values (see paragraphs [0009] and [0073], the prediction model learned by the prediction model generation unit 34 is configured by an approximation model to calculate the predicted value of each observation value of the time-series data of the prediction use period from the time-series data of the prediction use period).

As to claim 4,  ARIYOSHI, Heching, and Jamal fail to explicitly teach wherein the parameters within the time series prediction model comprise a number of differencing terms, a number of previous values of a metric to be predicted, and a number of lagged white noise terms.
However Fang teaches wherein the parameters within the time series prediction model comprise a number of differencing terms, a number of previous values of a metric to be predicted, and a number of lagged white noise terms (see col. 3, lines 1-45, ARIMA, differencing orders, past values (univariate modeling) or a combination of past values viewed in conjunction with other time series (multivariate modeling), white noise series).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Heching, and Jamal to add modeling and prediction based upon past values (univariate modeling) or a combination of past values viewed in conjunction with other time series (multivariate modeling), through increasingly complex ARIMA statistical modeling techniques, as taught by Fang above.  The modification would have been obvious because one of ordinary skill would be motivated to create a better model, which can be used to improve a prior model, and to perform sensitivity analyses on the created models, as suggested by Fang (see col. 2, lines 64-67 and col. 3, lines 1-11).
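
For reference, and as general background rather than a quotation from Fang, the three parameters recited in claim 4 correspond to the conventional orders of an ARIMA(p, d, q) model. The symbols p, d, q, \phi_i, \theta_j, L, X_t, and \varepsilon_t below are standard textbook notation assumed here and are not taken from the cited passage:

```latex
\left(1 - \sum_{i=1}^{p} \phi_i L^{i}\right) (1 - L)^{d} X_t
  = \left(1 + \sum_{j=1}^{q} \theta_j L^{j}\right) \varepsilon_t
```

Under this notation, d is the number of differencing terms, p is the number of previous values of the metric to be predicted, q is the number of lagged white noise terms, L is the lag operator (L X_t = X_{t-1}), and \varepsilon_t is the white noise series.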

As to claim 5, ARIYOSHI, Heching, and Jamal fail to explicitly teach wherein the time series prediction model comprises an autoregressive integrated moving average (ARIMA) model.
However, Fang teaches wherein the time series prediction model comprises an autoregressive integrated moving average (ARIMA) model (see abstract, col. 3, lines 1-45, ARIMA).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Heching, and Jamal to add an autoregressive integrated moving average (ARIMA) model, as taught by Fang above.  The modification would have been obvious because one of ordinary skill would be motivated to create a better model, which can be used to improve a prior model, and to perform sensitivity analyses on the created models, as suggested by Fang (see col. 2, lines 64-67 and col. 3, lines 1-11).

As to claim 6, ARIYOSHI teaches the method further comprising
selecting a portion of the plurality of potential parameter values based on a predetermined number of recent observed time series data values of a set of observed time series data (see paragraphs [0046]-[0047]…time-series data prediction device sets the latest time-series data... and xn of the prediction use period of 3 days, wherein the latest time-series data are interpreted as recent time series data).
But ARIYOSHI fails to explicitly teach:
for each parameter, dividing the portion of the plurality of potential parameter values into the plurality of strata, wherein each strata of the plurality of strata includes the subportion of the portion of the plurality of potential parameter values.
However, Heching teaches:
for each parameter, dividing the portion of the plurality of potential parameter values into the plurality of strata, wherein each strata of the plurality of strata includes the subportion of the portion of the plurality of potential parameter values (see col. 3, lines 4-13...Stratified sampling: each member of the population is assigned to a stratum. Simple random sampling is used to select within each stratum... col. 6, lines 24-32…Stratified sampling classifies the population elements into sub-populations, or strata, and samples separately from each stratum); and
assigning the corresponding probability weight to each strata (see col. 2, lines 55-67…Simple random sampling: all members of population have equal probability of being selected. (In this case, if the size of the population is N and the sample size is n, then a member of the population has probability n/N of being selected as element of the sample); col. 5, lines 12-67 to col. 6, lines 1-17…p(s) is the probability that sample s is selected from the set of all possible samples in S).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of ARIYOSHI by adding stratified sampling that classifies the population elements into sub-populations, or strata, and samples separately from each stratum, as taught by Heching above.  The modification would have been obvious because one of ordinary skill would be motivated to provide an accurate indication of past behavior of a target population that also establishes an accurate basis from which to determine the future likely beliefs and behavior of the target population, as suggested by Heching (col. 1, lines 45-49 and 53-58).
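
The sampling concepts quoted from Heching above (assignment of population members to strata, simple random sampling within each stratum, and the n/N inclusion probability) can be illustrated with a short Python sketch. The sketch is illustrative only and rests on assumptions: the contiguous, near-equal split of sorted values and the function names are hypothetical and are not drawn from Heching or from the claims.

```python
import random
from fractions import Fraction


def make_strata(values, k):
    """Split sorted candidate values into k contiguous, near-equal strata
    (assumes 1 <= k <= number of values)."""
    ordered = sorted(values)
    size, extra = divmod(len(ordered), k)
    strata, start = [], 0
    for i in range(k):
        end = start + size + (1 if i < extra else 0)
        strata.append(ordered[start:end])
        start = end
    return strata


def inclusion_probability(n, N):
    """Simple random sampling: each of N members is selected with probability n/N."""
    return Fraction(n, N)


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample, without replacement, from each stratum."""
    return [rng.sample(stratum, n_per_stratum) for stratum in strata]
```

For example, `make_strata(range(7), 3)` yields three strata of sizes 3, 2, and 2, and a member of the first stratum would have an inclusion probability of 1/3 when one value is drawn from that stratum.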

As to claim 8, ARIYOSHI, Heching, and Jamal fail to explicitly teach wherein each of the time series prediction model comprises an autoregressive integrated moving average (ARIMA) model. 
However Fang teaches wherein each of the time series prediction model comprises an autoregressive integrated moving average (ARIMA) model (see abstract, col. 3, lines 1-45, ARIMA). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Heching, and Jamal to add an autoregressive integrated moving average (ARIMA) model, as taught by Fang above.  The modification would have been obvious because one of ordinary skill would be motivated to create a better model, which can be used to improve a prior model, and to perform sensitivity analyses on the created models, as suggested by Fang (see col. 2, lines 64-67 and col. 3, lines 1-11).

As to claim 9, ARIYOSHI, Heching, and Jamal fail to explicitly teach wherein the prediction model comprises a moving average model or an autoregressive model.
However Fang teaches wherein the prediction model comprises a moving average model or an autoregressive model (see abstract, col. 3, lines 1-45, ARIMA, moving-average, autoregressive).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Heching, and Jamal to add an autoregressive integrated moving average (ARIMA) model, as taught by Fang above.  The modification would have been obvious because one of ordinary skill would be motivated to create a better model, which can be used to improve a prior model, and to perform sensitivity analyses on the created models, as suggested by Fang (see col. 2, lines 64-67 and col. 3, lines 1-11).

As to claim 10, ARIYOSHI teaches a method to facilitate generation of prediction models, the method comprising: 
identifying a predetermined number of parameter value sets, each parameter value set having parameter values that represent corresponding parameters for inputting as a combination into a time series prediction model to define a function of the corresponding time series prediction model, wherein the parameters comprise an order of differences or a number of previous values  (see paragraphs [0023]-[0031],  clusters time-series data of an observation value of a predetermined observation target into clusters that are a plurality of similar groups….even if there is a change in the observation value variation pattern, the time-series data prediction device can accurately calculate the predicted value of the time-series data…predicts time-series data using the given time-series data and the prediction model generated for each of the clusters by the prediction model generation unit; [0050]-[0051]…the previous prediction model is a slightly previous prediction model in many cases, prediction can be performed with high accuracy when there is a large power change recently. When the pattern of power use has changed, for example, when summer vacation begins, the use of the average use prediction model makes it easy to follow the change; [0098]…generates an approximation model, which is approximated from a plurality of functions (basis functions) and weighting coefficients of the functions; [0113]);
implementing each identified parameter value set into the corresponding  time series prediction model to generate a prediction value in accordance with a set of observed time series data (see paragraphs [0045]-[0050], time-series data; [0126], wherein Examiner interprets the prediction model learning process as a second computing process); 
determining, by a third computing process, a parameter value set, from among the parameter value sets, resulting in a least amount of prediction error (see paragraphs [0089] and [0141], where Examiner interprets the evaluation unit process 35 as a third computing process and selects a prediction model with the smallest error (step S490));  and 
utilizing, by a fourth computing process, the parameter value set to generate a prediction model, wherein the prediction model is subsequently used to predict values expected to occur at some future point in time (see paragraphs [0027], wherein Examiner interprets the prediction model generation process as a fourth computing process, [0047]-[0048], [0055] and [0162]-[0164], generates a prediction model in each category (cluster), and obtains a predicted value of the future energy demand from the change in the past energy demand using the generated prediction model), wherein the first, second, third, and fourth computing processes are performed by one or more processors ([0166], CPU).
But ARIYOSHI fails to explicitly teach each parameter value within each parameter value set is selected by: 
selecting a value from one of a plurality of strata, for a corresponding parameter, based on probabilities associated with each strata of the plurality of strata.
However Heching teaches each parameter value within each parameter value set is selected by:
selecting a value from one of a plurality of strata, for a corresponding parameter, based on probabilities associated with each strata of the plurality of strata for the corresponding parameter (see col. 2, lines 25-67…selecting a subset of members from a target population. (Block 1). This step may be performed by implementing probability sampling techniques, which are based on the assumption that every member in the population has some known, positive probability of being selected as a member of the subset….In probability sampling, every member of the population has a positive probability of being selected as a member of the sample...).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of ARIYOSHI by adding a subset selection of members from a target population using probability sampling, as taught by Heching above.  The modification would have been obvious because one of ordinary skill would be motivated to provide an accurate indication of past behavior of a target population that also establishes an accurate basis from which to determine the future likely beliefs and behavior of the target population, as suggested by Heching (col. 1, lines 45-49 and 53-58).
But ARIYOSHI and Heching fail to explicitly teach wherein each strata of the plurality of strata for the corresponding parameter includes a subportion of a plurality of potential parameter values for the corresponding parameter and the strata is associated with a corresponding probability, used to select the value, determined based on a distribution of previously selected parameter values for previous prediction models.
However Jamal teaches wherein each strata of the plurality of strata for the corresponding parameter includes a subportion of a plurality of potential parameter values for the corresponding parameter and the strata is associated with a corresponding probability, used to select the value, determined based on a distribution of previously selected parameter values for previous prediction models (see paragraphs [0009]-[0010]…data may be transformed according to action number into stratified data including strata, with each of the strata representing actions for one or more action numbers…; [0025]-[0027]…A likelihood (a probability) of a next action may be calculated from a hazard function for a stratum, indicated at 38. The likelihood may be calculated with a computer and may provide a likelihood of next action at one or more different time points from the latest action taken by individual customers whose latest action has an action number for which the stratum represents actions).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI and Heching to associate a probability with each of the strata, as taught by Jamal above.  The modification would have been obvious because one of ordinary skill would be motivated to provide additional insights into which attributes are the key drivers of customer experience and which of the firm's processes and systems need to be improved to ensure an enhanced customer experience, as suggested by Jamal (see paragraph [0013]).

As to claim 11, ARIYOSHI teaches wherein each parameter value within each parameter value set is selected in accordance with a given range of parameter values for each parameter, the given range of parameter values including the plurality of potential parameter values of the corresponding parameter (see paragraphs [0009]-[0011], observation values that continue at predetermined time intervals, as a prediction data, from time-series data of an observation value of a predetermined observation target and acquires a training data; wherein Examiner interprets the time intervals as “a given range”), the plurality of potential parameter values selected based on a distribution of previous parameter values associated with one or more of previous prediction models (see paragraphs [0017]…time-series data used as a prediction data in the previous prediction in the training data; [0047]-[0051]…uses an average of the consecutive time-series data of the prediction use period).

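As to claim 12, ARIYOSHI, Heching, and Jamal fail to explicitly teach wherein the parameters within the prediction model comprise at least one of a number of differencing terms, a number of previous values of a metric to be predicted, and a number of lagged white noise terms.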
However Fang teaches wherein the parameters within the prediction model comprise at least one of a number of differencing terms, a number of previous values of a metric to be predicted, and a number of lagged white noise terms (see col. 3, lines 1-45, ARIMA, differencing orders, past values (univariate modeling) or a combination of past values viewed in conjunction with other time series (multivariate modeling), white noise series). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Heching, and Jamal to add modeling and prediction based upon past values (univariate modeling) or a combination of past values viewed in conjunction with other time series (multivariate modeling), through increasingly complex ARIMA statistical modeling techniques, as taught by Fang above.  The modification would have been obvious because one of ordinary skill would be motivated to create a better model, which can be used to improve a prior model, and to perform sensitivity analyses on the created models, as suggested by Fang (col. 2, lines 64-67 and col. 3, lines 1-11).

As to claim 13, ARIYOSHI, Heching, and Jamal fail to explicitly teach wherein the parameters within the prediction model comprise a number of differencing terms, a number of previous values of a metric to be predicted, and a number of lagged white noise terms.  
However Fang teaches wherein the parameters within the prediction model comprise a number of differencing terms, a number of previous values of a metric to be predicted, and a number of lagged white noise terms (see col. 3, lines 1-45, ARIMA, differencing orders, past values (univariate modeling) or a combination of past values viewed in conjunction with other time series (multivariate modeling), white noise series). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Heching, and Jamal to add modeling and prediction based upon past values (univariate modeling) or a combination of past values viewed in conjunction with other time series (multivariate modeling), through increasingly complex ARIMA statistical modeling techniques, as taught by Fang above.  The modification would have been obvious because one of ordinary skill would be motivated to create a better model, which can be used to improve a prior model, and to perform sensitivity analyses on the created models, as suggested by Fang (col. 2, lines 64-67 and col. 3, lines 1-11).

As to claim 14, ARIYOSHI, Heching, and Jamal fail to explicitly teach wherein the prediction model comprises an autoregressive integrated moving average (ARIMA) model.
However, Fang teaches wherein the prediction model comprises an autoregressive integrated moving average (ARIMA) model (see abstract, col. 3, lines 1-45, ARIMA). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Heching, and Jamal to add an autoregressive integrated moving average (ARIMA) model, as taught by Fang above.  The modification would have been obvious because one of ordinary skill would be motivated to create a better model, which can be used to improve a prior model, and to perform sensitivity analyses on the created models, as suggested by Fang (see col. 2, lines 64-67 and col. 3, lines 1-11).

Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over ARIYOSHI et al. (US 2015/0112900 A1, hereinafter referred to as ARIYOSHI), in view of Fang et al. (US 6,928,398 B1, hereinafter referred to as Fang), and further in view of Heching et al. (US 7,054,828 B2, hereinafter referred to as Heching), and Jamal et al. (US 2012/0330881 A1, hereinafter referred to as Jamal), and G. Peter Zhang (“Time series forecasting using a hybrid ARIMA and neural network model” hereinafter referred to as Zhang).

   As to claim 7, ARIYOSHI teaches wherein determining the parameter value set resulting in the least amount of prediction error (see paragraphs [0074], (time-series data xn…) and prediction model with the smallest error; [0141]).
But ARIYOSHI, Fang, Heching, and Jamal fail to explicitly teach comparing the prediction value with a corresponding observed time series data. 
However Zhang teaches comparing the prediction value with a corresponding observed time series data (see pages 168-169…actual value and the forecast value for the 67 points out-of-sample is given in Fig. 4; pages 171-172, the point-to-point comparison between actual and predicted values is given in Fig. 6).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Fang, Heching, and Jamal to add time series forecasting using a hybrid ARIMA model with point-to-point comparison between actual and predicted values, as taught by Zhang above.  The modification would have been obvious because one of ordinary skill would be motivated to have a combined model that can be an effective way to improve forecasting accuracy achieved by either of the models used separately, as suggested by Zhang (see abstract).

As to claim 15, ARIYOSHI teaches wherein determining the parameter value set resulting in the least amount of prediction error (see paragraphs [0074], (time-series data xn…) and prediction model with the smallest error; [0141]).
But ARIYOSHI, Fang, Heching, and Jamal fail to explicitly teach comparing the prediction value with a corresponding observed time series data. 
However Zhang teaches comparing the prediction value with a corresponding observed time series data (see pages 168-169…actual value and the forecast value for the 67 points out-of-sample is given in Fig. 4; pages 171-172, the point-to-point comparison between actual and predicted values is given in Fig. 6).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Fang, Heching, and Jamal to add time series forecasting using a hybrid ARIMA with point-to-point comparison between actual and predicted values as taught by Zhang above.  The modification would have been obvious because one of ordinary skill would be motivated to have a combined model that can be an effective way to improve forecasting accuracy achieved by either of the models used separately (see abstract) as suggested by Zhang.
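By way of illustration only, the comparison recited in claims 7 and 15 (a prediction value compared with the corresponding observed time series data, and the parameter value set with the least prediction error retained) may be sketched as follows. The sketch is not taken from any cited reference. The forecaster `forecast_next` is a hypothetical stand-in rather than an ARIMA implementation, and the parameter names `p` and `d`, the one-value hold-out, and the relative-error formula are assumptions chosen for brevity.

```python
def forecast_next(train, p, d):
    """Hypothetical stand-in forecaster (NOT ARIMA): one-step-ahead forecast.

    d == 0: average of the last p observed values.
    d == 1: last observed value plus the average of the last p first differences.
    """
    if d == 0:
        window = train[-p:]
        return sum(window) / len(window)
    diffs = [b - a for a, b in zip(train, train[1:])]
    window = diffs[-p:]
    return train[-1] + sum(window) / len(window)


def relative_error(predicted, observed):
    """Relative prediction error; assumes the observed value is non-zero."""
    return abs(predicted - observed) / abs(observed)


def select_best(series, parameter_sets):
    """Forecast from the first portion, compare with the held-out observation,
    and return (error, parameter_set) for the least relative error."""
    train, observed = series[:-1], series[-1]
    scored = [
        (relative_error(forecast_next(train, ps["p"], ps["d"]), observed), ps)
        for ps in parameter_sets
    ]
    return min(scored, key=lambda item: item[0])


series = [10, 12, 14, 16, 18, 20]
candidates = [{"p": 2, "d": 0}, {"p": 2, "d": 1}]
best_error, best_set = select_best(series, candidates)
```

For the linear series above, the differenced candidate reproduces the held-out value exactly, so it is the one retained.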

Claims 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over ARIYOSHI et al. (US 2015/0112900 A1, hereinafter referred to as ARIYOSHI), in view of Elbiaze et al. (“A new structure-preserving method of sampling for predicting self-similar traffic,” hereinafter referred to as Elbiaze), and further in view of Heching et al. (US 7,054,828 B2, hereinafter referred to as Heching), and G. Peter Zhang (“Time series forecasting using a hybrid ARIMA and neural network model” hereinafter referred to as Zhang), and Jamal et al. (US 2012/0330881 A1, hereinafter referred to as Jamal), further in view of Sorjamaa.

As to claim 16, ARIYOSHI teaches a system comprising: 
one or more processors (see paragraph [0161], CPU); and 
one or more computer storage media storing computer-useable instructions that, when used by the one or more processors (see paragraphs [0166]-[0168], computer-readable recording medium and CPU), cause the one or more processors to: 
select the parameter value set associated with a least relative prediction error for generating a time series prediction model (see paragraphs [0074], (time-series data xn…) and prediction model with the smallest error);
use the time series prediction model to predict values expected to occur at a later time (see paragraphs [0047]-[0048], [0055] and [0162]-[0164], generates a prediction model in each category (cluster), and obtains a predicted value of the future energy demand from the change in the past energy demand using the generated prediction model).  
Elbiaze teaches:
identify a predetermined number of parameter value sets, each parameter value set having parameter values that represent corresponding parameters for inputting as a combination to define a function in an autoregressive integrated moving average (ARIMA) prediction model, wherein each parameter value indicates one of a number of differencing terms or an order of differencing terms to use in the corresponding function of the ARIMA prediction model (see page 265, right column, second paragraph… sampling accuracy…sampled traffic must reflect faithfully the characteristics of the original Internet traffic; page 267, left column, section 2.1 Self-similarity; page 269, section 2.3.2 The autoregressive integrated moving average model (ARIMA), time series…potential p and q values; page 272, section 4 Experimental results, stratified sampling, Figure 5 shows the average of the obtained sampled data compared to real traffic average; Figure 6 shows also that the variance of the signal is almost preserved for the systematic and the stratified sampling techniques; page 275, Table 1).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of ARIYOSHI to add a combination of an autoregressive integrated moving average (ARIMA) model and stratified sampling for time series forecasting in ARIYOSHI’s system as taught by Elbiaze above.  The modification would have been obvious because one of ordinary skill would be motivated by the teaching that a prediction using aggregated data improves the performance compared to the stratified and systematic sampling, as suggested by Elbiaze (see page 10 of 13, left column, section 4.3 Exploiting the sampling techniques for traffic prediction).
But ARIYOSHI and Elbiaze fail to explicitly teach:
each parameter value within each parameter value set is selected by: 
selecting a value from one of a plurality of strata, for a corresponding parameter, based on probabilities associated with each strata of the plurality of strata for the corresponding parameter, wherein each strata of the plurality of strata for the corresponding parameter includes a subportion of a plurality of potential parameter values for the corresponding parameter and the strata is associated with a corresponding probability, 
implement each identified parameter value set into the autoregressive integrated moving average model and use a first portion of an observed time series data set to generate a predicted value associated with each parameter value set; 
compare the predicted value associated with each parameter value set to a second portion of the observed time series data set that corresponds with the predicted value to generate a relative prediction error for each parameter value set; and
implement the selected parameter value set into the autoregressive integrated moving average model to generate the time series prediction model.
However Heching teaches each parameter value within each parameter value set is selected by:
selecting a value from one of a plurality of strata, for a corresponding parameter, based on probabilities associated with each strata of the plurality of strata for the corresponding parameter (see col. 2, lines 25-67…selecting a subset of members from a target population. (Block 1). This step may be performed by implementing probability sampling techniques, which are based on the assumption that every member in the population has some known, positive probability of being selected as a member of the subset….In probability sampling, every member of the population has a positive probability of being selected as a member of the sample...; col. 3, lines 4-40…Stratified sampling: each member of the population is assigned to a stratum. Simple random sampling is used to select within each stratum...By using probability sampling, one can compute the probability that a given member of the population is included in the sample (which may be referred to as the "inclusion probability" for that member of the population)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of ARIYOSHI by adding a selection of a subset of members from a target population using probability sampling as taught by Heching above.  The modification would have been obvious because one of ordinary skill would be motivated to provide an accurate indication of past behavior of a target population that also establishes an accurate basis from which to determine the future likely beliefs and behavior of a target population, as suggested by Heching (col. 1, lines 45-49 and 53-58).
But ARIYOSHI, Elbiaze, and Heching fail to explicitly teach:
implement each identified parameter value set into the corresponding ARIMA prediction model and use a first portion of the observed time series data set to generate a predicted value associated with each parameter value set; 
compare the predicted value associated with each parameter value set to a second portion of the observed time series data set that corresponds with the predicted value to generate a relative prediction error for each parameter value set; and
implement the selected parameter value set into the ARIMA prediction model to generate the time series prediction model.
However, Zhang teaches:
implement each identified parameter value set into the corresponding ARIMA prediction model and use a first portion of the observed time series data set to generate a predicted value associated with each parameter value set (see pages 171-172, the point-to-point comparison between actual and predicted values is given in Fig. 6, wherein Examiner interprets the actual data as the observed data); 
compare the predicted value associated with each parameter value set to a second portion of the observed time series data set that corresponds with the predicted value to generate a relative prediction error for each parameter value set (see pages 168-169…actual value and the forecast value for the 67 points out-of-sample is given in Fig. 4; pages 171-172, the point-to-point comparison between actual and predicted values is given in Fig. 6; page 162, 2.1. The ARIMA model…the actual value and random error at time period t); and
implement the selected parameter value set into the ARIMA prediction model to generate the time series prediction model (see page 172, Figure 6, where Examiner interprets the time series predictions for exchange rate to include the limitation).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Elbiaze, and Heching to add time series forecasting using a hybrid ARIMA with point-to-point comparison between actual and predicted values as taught by Zhang above.  The modification would have been obvious because one of ordinary skill would be motivated to have a combined model that can be an effective way to improve forecasting accuracy achieved by either of the models used separately (see abstract) as suggested by Zhang.
But ARIYOSHI, Elbiaze, Heching, and Zhang fail to explicitly teach wherein each strata of the plurality of strata for the corresponding parameter includes a subportion of a plurality of potential parameter values for the corresponding parameter and the strata is associated with a corresponding probability, used to select the value, determined based on a distribution of previously selected parameter values for previous prediction models.
However, Jamal teaches wherein each strata of the plurality of strata for the corresponding parameter includes a subportion of a plurality of potential parameter values for the corresponding parameter and the strata is associated with a corresponding probability, used to select the value, determined based on a distribution of previously selected parameter values for previous prediction models (see paragraphs [0009]…conditional proportional hazard model for repeated events may be used to compute a probability measure of a next action for an individual customer, conditional on the customer's previous action (i.e., last action); [0010]…data may be transformed according to action number into stratified data including strata, with each of the strata representing actions for one or more action numbers…; [0025]-[0027]…A likelihood (a probability) of a next action may be calculated from a hazard function for a stratum, indicated at 38. The likelihood may be calculated with a computer and may provide a likelihood of next action at one or more different time points from the latest action taken by individual customers whose latest action has an action number for which the stratum represents actions).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Elbiaze, Heching, and Zhang to associate a probability with each strata as taught by Jamal above.  The modification would have been obvious because one of ordinary skill would be motivated to provide additional insights into which attributes are the key drivers of customer experience and which of the firm's processes and systems need to be improved to ensure an enhanced customer experience, as suggested by Jamal (see paragraph [0013]).
But ARIYOSHI, Elbiaze, Heching, Jamal and Zhang fail to explicitly teach:
select the value, determined based on a distribution of previously selected parameter values for previous prediction models.
However Sorjamaa teaches:
select the value, determined based on a distribution of previously selected parameter values for previous prediction models (see page 2862, 2. Time series prediction …prediction of future values based on the previous values and the current value of the time series (see Eq. (1)); 2.1. Recursive prediction strategy… uses the predicted values as known data to predict the next ones…).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of ARIYOSHI, Elbiaze, Heching, Jamal and Zhang to associate a probability with each strata as taught by Sorjamaa above.  The modification would have been obvious because one of ordinary skill would be motivated to use input selection as an essential pre-processing stage to guarantee high accuracy, efficiency and scalability in time series modeling, as suggested by Sorjamaa (see page 2862, right column, 3.1. Input selection strategies).
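By way of illustration only, the selection step recited in claim 16 (selecting a value from one of a plurality of strata based on a probability associated with each strata, the probability being determined from a distribution of previously selected parameter values) may be sketched as follows. The sketch is not drawn from any cited reference. The function names, the add-one smoothing, and the uniform draw within the chosen stratum are assumptions made for the example.

```python
import random
from collections import Counter


def stratum_probabilities(strata, previous_values):
    """Weight each stratum by how often earlier models selected a value in it."""
    counts = Counter()
    for value in previous_values:
        for index, stratum in enumerate(strata):
            if value in stratum:
                counts[index] += 1
                break
    # Assumption: add-one smoothing keeps every stratum selectable.
    weights = [counts[index] + 1 for index in range(len(strata))]
    total = sum(weights)
    return [weight / total for weight in weights]


def sample_parameter(strata, probabilities, rng):
    """Pick a stratum by its probability, then a value uniformly within it."""
    stratum = rng.choices(strata, weights=probabilities, k=1)[0]
    return rng.choice(stratum)


def sample_parameter_sets(strata_by_param, history_by_param, n_sets, seed=0):
    """Build a predetermined number of parameter value sets."""
    rng = random.Random(seed)
    probabilities = {
        name: stratum_probabilities(strata, history_by_param.get(name, []))
        for name, strata in strata_by_param.items()
    }
    return [
        {
            name: sample_parameter(strata, probabilities[name], rng)
            for name, strata in strata_by_param.items()
        }
        for _ in range(n_sets)
    ]


# Hypothetical strata for two parameters (p: autoregressive order, d: differencing order).
strata_by_param = {"p": [[0, 1], [2, 3]], "d": [[0], [1, 2]]}
history_by_param = {"p": [0, 0, 1], "d": [1]}
parameter_sets = sample_parameter_sets(strata_by_param, history_by_param, n_sets=5)
```

Each resulting parameter value set could then be supplied to a prediction model and scored against held-out observations.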

As to claim 17, ARIYOSHI teaches wherein the time series prediction model uses at least a portion of the observed time series data set to predict values (see paragraphs [0010]-[0012]…prediction model, which is generated based on the time-series data observed in the past, using the actually observed time-series data). 

As to claim 18, Elbiaze teaches wherein the time series prediction model comprises the ARIMA prediction model or autoregressive moving average model (ARMA) (see page 269, section 2.3.2 The autoregressive integrated moving average model (ARIMA), right column, ARMA model). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of ARIYOSHI to add a combination of an autoregressive integrated moving average (ARIMA) model and stratified sampling for time series forecasting in ARIYOSHI’s system as taught by Elbiaze above.  The modification would have been obvious because one of ordinary skill would be motivated by the teaching that a prediction using aggregated data improves the performance compared to the stratified and systematic sampling, as suggested by Elbiaze (see page 10 of 13, left column, section 4.3 Exploiting the sampling techniques for traffic prediction).

As to claim 19, Elbiaze teaches wherein the time series prediction model comprises a moving average model (see page 269, section 2.3.2 The autoregressive integrated moving average model (ARIMA), left column, Moving Average (MA)). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of ARIYOSHI to add a combination of an autoregressive integrated moving average (ARIMA) model and stratified sampling for time series forecasting in ARIYOSHI’s system as taught by Elbiaze above.  The modification would have been obvious because one of ordinary skill would be motivated by the teaching that a prediction using aggregated data improves the performance compared to the stratified and systematic sampling, as suggested by Elbiaze (see page 10 of 13, left column, section 4.3 Exploiting the sampling techniques for traffic prediction).

As to claim 20, Elbiaze teaches wherein the time series prediction model comprises an autoregressive model (see page 269, section 2.3.2 The autoregressive integrated moving average model (ARIMA), left column, Autoregressive (AR)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of ARIYOSHI to add a combination of an autoregressive integrated moving average (ARIMA) model and stratified sampling for time series forecasting in ARIYOSHI’s system as taught by Elbiaze above.  The modification would have been obvious because one of ordinary skill would be motivated by the teaching that a prediction using aggregated data improves the performance compared to the stratified and systematic sampling, as suggested by Elbiaze (see page 10 of 13, left column, section 4.3 Exploiting the sampling techniques for traffic prediction).
Response to Arguments
The Applicant’s arguments have been fully considered, but are not persuasive. 
Rejections based on 35 U.S.C. § 103
Rejection of Claims 1-3, 6, and 10-11
Arguments 
1. The cited references fail to teach or suggest "identifying a predetermined plurality of parameter value sets, each parameter value set having parameter values that represent corresponding parameters for inputting as a combination into a time series prediction model to define a function of the corresponding time series prediction model."

Argument 
Ariyoshi fails to teach or suggest identifying a predetermined number of parameter value sets having parameter values that represent corresponding parameters for inputting as a combination into a time series prediction model to define a function of the corresponding time series prediction model, as recited in claim 1.
Examiner response:
Examiner respectfully disagrees.  Ariyoshi teaches the limitation as explained in the rejection above. Ariyoshi further teaches, [0047]-[0049], the prediction model is a calculation expression for calculating the predicted value of time-series data of the prediction target period subsequent to the time series data of the prediction use period by inputting the time-series data of the prediction use period as an input parameter. 
Heching teaches the limitation as explained in the rejection above. Heching further teaches, col. 2, lines 1-5…selecting members from the population is preferably performed using probability sampling techniques…col.6, lines 25-32, Stratified sampling classifies the population elements into sub-populations, or strata, and samples separately from each stratum. A stratification scheme defines the set of one or more characteristics based upon which the population is stratified. For example, suppose that one wishes to sample 30 students from a particular school. One can then stratify the students according to which grade they are in, and then sample from within each stratum.
Jamal (US 2012/0330881 A1) teaches each strata of the plurality of strata including a subportion of the plurality of potential parameter values and the strata associated with a corresponding probability determined based on a distribution of previously selected parameter values for previous prediction models (see paragraphs [0010]…data may be transformed according to action number into stratified data including strata, with each of the strata representing actions for one or more action numbers…; [0025]-[0027]… Hazard functions may be estimated from the strata, indicated at 36. A "hazard function," as used herein, is a probability measure that a customer will have a next action at a given time after a previous action; conditional on the occurrence of the previous action…A likelihood (a probability) of a next action may be calculated from a hazard function for a stratum, indicated at 38. The likelihood may be calculated with a computer and may provide a likelihood of next action at one or more different time points from the latest action taken by individual customers whose latest action has an action number for which the stratum represents actions).

Therefore, the cited references Ariyoshi, Heching, Jamal (US 2012/0330881 A1) and Sorjamaa teach the limitation.

2. The cited references fail to teach or suggest identifying a predetermined number of parameter value sets, each parameter value set having parameter values that represent corresponding parameters [that] comprise an order of differences.

Argument 
Applicant, however, cannot find any discussion in Heching regarding parameters being an order of differences, much less identifying parameter value sets having parameter values that represent an order of differences, as recited in claim 1.

Examiner response:
Examiner respectfully disagrees.  Fang teaches wherein the parameters comprise an indication of order of differences values and each parameter value within each parameter value set (see col. 3, lines 29-51…differencing orders). 

3. The cited references fail to teach or suggest each parameter value within each parameter value set is selected by: selecting a value from one of a plurality of strata, for a corresponding parameter, based on probabilities associated with each strata of the plurality of strata for the corresponding parameter, wherein each strata of the plurality of strata for the corresponding parameter includes a subportion of a plurality of potential parameter values for the corresponding parameter and the strata is associated with a corresponding probability, used to select the value, determined based on a distribution of previously selected parameter values for previous prediction models.

Argument
The cited references do not teach or suggest parameter values as described in claim 1 and, as such, cannot teach or suggest parameter values selected by selecting a value from one of a plurality of strata, for a corresponding parameter, based on probabilities associated with each strata of the plurality of strata for the corresponding parameter, wherein each strata of the plurality of strata for the corresponding parameter includes a subportion of a plurality of potential parameter values for the corresponding parameter and the strata is associated with a corresponding probability, used to select the value, determined based on a distribution of previously selected parameter values for previous prediction models, as in claim 1.

Examiner response:
Examiner respectfully disagrees.  Heching teaches the limitation as explained in the rejection above: 
selecting a value from one of a plurality of strata, for a corresponding parameter, based on probabilities associated with each strata of the plurality of strata for the corresponding parameter (see col. 2, lines 25-67…selecting a subset of members from a target population…. every member in the population has some known, positive probability of being selected as a member of the subset….In probability sampling, every member of the population has a positive probability of being selected as a member of the sample...; col. 3, lines 4-40…Stratified sampling: each member of the population is assigned to a stratum. Simple random sampling is used to select within each stratum... By using probability sampling, one can compute the probability that a given member of the population is included in the sample (which may be referred to as the "inclusion probability" for that member of the population); col. 6, lines 26-27, Stratified sampling classifies the population elements into sub-populations, or strata, and samples separately from each stratum).
Jamal teaches wherein each strata of the plurality of strata for the corresponding parameter includes a subportion of a plurality of potential parameter values for the corresponding parameter and the strata is associated with a corresponding probability, used to select the value, determined based on a distribution of previously selected parameter values for previous prediction models (see paragraphs [0009]…conditional proportional hazard model for repeated events may be used to compute a probability measure of a next action for an individual customer, conditional on the customer's previous action (i.e., last action); [0010]…data may be transformed according to action number into stratified data including strata, with each of the strata representing actions for one or more action numbers…; [0025]-[0027]…A likelihood (a probability) of a next action may be calculated from a hazard function for a stratum, indicated at 38. The likelihood may be calculated with a computer and may provide a likelihood of next action at one or more different time points from the latest action taken by individual customers whose latest action has an action number for which the stratum represents actions).

4. The cited references fail to teach or suggest "implementing each identified parameter value set into the corresponding time series prediction model to generate a prediction value in accordance with the set of observed time series data."

Argument
Ariyoshi fails to teach or suggest implementing parameter value sets into corresponding time series prediction models, especially where the parameter value sets include parameter values selected by selecting a value for each parameter value set from one of a plurality of strata, for the corresponding parameter, based on probabilities associated with the plurality of strata, each strata of the plurality of strata associated with a corresponding probability.

Examiner response:
Examiner respectfully disagrees. Ariyoshi teaches the limitation as explained in the rejection above. Ariyoshi further teaches, [0045] ... a time-series data prediction algorithm performed by a time-series data prediction device ..., [0046] ... a time-series data prediction device reads a prediction use period and a prediction target period from the setting file stored in advance ... a case will be described in which time-series data is data indicating the observation value of the energy demand of consecutive I –minute intervals in a day, a reading target period is 365 days, a prediction use period is 3 days (72 hours), and a prediction target period is 2 days (48 hours); [0089]... After the observation value of the date and time corresponding to the prediction result output from the prediction unit 36 is obtained, the evaluation unit 35 may evaluate the prediction model and rewrite various flags of the storage unit 31 afterward.
As such, Examiner respectfully submits that the cited references do teach all the limitations of claim 1. Examiner respectfully submits that claim 1 and the claims which depend therefrom, claims 2-3 and 6, are not patentable over the cited references.
For similar reasons, Examiner respectfully submits that independent claim 10, and the claim which depends therefrom, claim 11, are likewise not patentable over the cited references.

5. Lack of Motivation to Combine
It is submitted that there is a lack of motivation to combine the Jamal reference with other references. In particular, Jamal describes that hazard functions "may be estimated from the strata." Para. [0025]. There would be no motivation to combine estimating a hazard function (probability measure) from a strata, as in Jamal, for use in parameter value selection from the strata. Such an estimated hazard function is estimated for a strata and is not used to select a parameter value from the strata.

Examiner’s response:
Examiner respectfully provides, in the rejection set forth in this Office action, an explanation of how these references can be combined and how that combination fits the stated motivation.

In response to applicant's argument that there is a lack of motivation to combine the Jamal reference with other references, Examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). 

In this case, Jamal teaches computing a probability measure of a next action for an individual customer, conditional on the customer's previous action (i.e., last action) ([0009]), and that customer data may be stratified, according to action number and using a stratification routine, into strata, with each of the strata representing actions for one or more action numbers ([00120]).
Accordingly, combining Ariyoshi, Fang, Heching and Jamal would have been obvious to one of ordinary skill in the art in order to teach the limitations of claim 1.

Argument 
Independent Claim 10 and Dependent claims 2, 3, and 11
Applicant respectfully submits that claim 1 and the claims which depend therefrom, claims 2-3 and 6, are patentable over the cited references. For similar reasons, Applicant respectfully submits that independent claim 10, and the claim which depends therefrom, claim 11, are likewise patentable over the cited references.

Examiner response:
Examiner respectfully submits that claim 1 and the claims which depend therefrom, claims 2-3 and 6, are not patentable over the cited references. For similar reasons, Examiner respectfully submits that independent claim 10, and the claim which depends therefrom, claim 11, are not patentable over the cited references.  No further arguments were made for dependent claims 2-3, 6, and 11.

Argument
Claim 6
It is respectfully submitted that the cited references fail to teach or suggest "selecting a portion of the plurality of potential parameter values based on a predetermined number of recent observed time series data values of a set of observed time series data; for each parameter, dividing the portion of plurality of potential parameter values into the plurality of strata, wherein each strata of the plurality of strata includes the subportion of the portion of the plurality of potential parameter values," features of claim 6.

Examiner response:
Examiner respectfully disagrees.  Ariyoshi teaches the limitation as explained in the rejection above. Ariyoshi further teaches, [0068]-[0069], acquisition unit 32 sets the latest time-series data of the prediction use period, among the read time-series data, as prediction data, and sets the remaining time-series data as learning data….(from the newest data of the learning data) of 5 days as a sum of the prediction use period of 3 days and the prediction target period, wherein the latest time-series data are interpreted as recent time series data.

Heching teaches the limitation as explained in the rejection above. Heching further teaches, col. 6, lines 26-27, Stratified sampling classifies the population elements into sub-populations, or strata, and samples separately from each stratum. Heching also teaches, col. 3, using probability sampling, one can compute the probability that a given member of the population is included in the sample (which may be referred to as the "inclusion probability" for that member of the population).
As such, Ariyoshi and Heching teach the limitations of the claim.

Rejection of Claims 4-5, 8-9, and 12-14
No further arguments were made for the following claims:
Claims 4-5 and 8-9 are not patentable over the cited references by virtue of their dependency on unpatentable claim 1.
Claims 12-14 are not patentable over the cited references by virtue of their dependency on unpatentable claim 10.

Rejection of Claims 7 and 15
No further arguments were made for the following claims:
Claim 7 is not patentable over the cited references by virtue of its dependency on unpatentable claim 1.
Claim 15 is not patentable over the cited references by virtue of its dependency on unpatentable claim 1.

Rejection of Claims 16-20
Argument 1
Initially, it is respectfully submitted that Elbiaze fails to teach or suggest identifying a predetermined number of parameter value sets, each parameter value set having parameter values that represent corresponding parameters to input as a combination to define a function, as recited in claim 1.

Examiner response:
Examiner respectfully disagrees.  Elbiaze teaches the limitation as explained in the rejection above. Elbiaze further teaches, at page 269, right column, … “differencing operation …”: differencing repeatedly until the resulting series can plausibly be modeled as a realization of a stationary process.
Heching teaches the limitation as explained in the rejection above. Heching further teaches, at col. 6, lines 26-27, that stratified sampling classifies the population elements into sub-populations, or strata, and samples separately from each stratum. Heching also teaches, at col. 3, that, using probability sampling, one can compute the probability that a given member of the population is included in the sample (which may be referred to as the "inclusion probability" for that member of the population).
Jamal teaches each strata of the plurality of strata associated with a corresponding probability, as shown in the Office action.

As such, Ariyoshi, Heching and Jamal teach the limitations of the claim.

Argument 2
Zhang fails to teach or suggest implementing parameter value sets into corresponding ARIMA prediction models, especially, where the parameter value sets include parameter values selected by selecting a value for each parameter values set from one of a plurality of strata, for the corresponding parameter, based on probabilities associated with the plurality of strata.

Examiner response:
Examiner respectfully disagrees.  Zhang teaches the limitation as explained in the rejection above. Zhang further teaches, at page 162, the actual value and random error at time period t, wherein Examiner interprets actual values as observed data values.
Heching teaches the limitation as explained in the rejection above. Heching further teaches, at col. 6, lines 26-27, that stratified sampling classifies the population elements into sub-populations, or strata, and samples separately from each stratum. Heching also teaches, at col. 3, that, using probability sampling, one can compute the probability that a given member of the population is included in the sample (which may be referred to as the "inclusion probability" for that member of the population).
Jamal teaches each strata of the plurality of strata associated with a corresponding probability, as shown in the Office action.
As such, Ariyoshi, Heching and Jamal teach the limitations of the claim.

Claims 17-20
No further arguments were made for the following claims.  Therefore, claims 17-20 are not patentable over the cited references by virtue of their dependency on unpatentable claim 16.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABABACAR SECK, whose telephone number is (571) 270-7146.  The examiner can normally be reached Monday through Friday, between 8:00 a.m. and 5:00 p.m. EST, by telephone at (571) 270-7146, by facsimile transmission at (571) 270-8146, or by email at ababacar.seck@uspto.gov.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ABABACAR SECK/Examiner, Art Unit 2122      

 
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122