DETAILED ACTION
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	This communication is in response to the Applicant’s submission filed 23 April 2019, where:
Claims 1-19 are pending.
Claims 1-19 are rejected.
Claim Objections
3.	Claims 1, 7-11, and 14 are objected to because of the following informalities:  
Claim 1, line 11, recites “point forecast features,” which should read --the one or more point forecast features--.
Claim 7, line 2, recites “the trained first stage point forecasting model,” which should read --the trained point forecast model--.
Claim 8, line 1, recites “each stage,” which should read --each of the stages--.
Claim 9, line 3, recites “QRF, QGBR, and QRNN,” which is objected to because an acronym or abbreviation should be spelled out at its first mention, with the acronym identified in parentheses, and then abbreviated throughout the remainder of the claim set.
Claim 10, lines 1-2, recites “GBR” and “QRNN,” which is objected to because an acronym or abbreviation should be spelled out at its first mention, with the acronym identified in parentheses, and then abbreviated throughout the remainder of the claim set.
Claim 11, lines 1-2, recites “QGBR” and “QRNN,” which is objected to because an acronym or abbreviation should be spelled out at its first mention, with the acronym identified in parentheses, and then abbreviated throughout the remainder of the claim set.
Claim 14, line 1, recites “wherein predetermined features,” which should read --wherein the predetermined features--.
Appropriate correction is required.
Claim Rejections - 35 U.S.C. § 112
6.	The following is a quotation of 35 U.S.C. § 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
7.	Claims 1-19 are rejected under 35 U.S.C. § 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Claim 1, lines 9-10, recites “the probabilistic forecasting model,” which lacks antecedent basis in the claim.
Claim 2, lines 1-2, recites “the testing period,” which lacks antecedent basis in the claim.
Claim 2, lines 1-2, recites “the forecast model,” which is indefinite because it is unclear whether the phrase refers to the “point forecast model,” the “probabilistic load forecasting model,” or a combination thereof, of claim 1.
Claim 3, lines 1-2, recites “the forecasting model,” which is indefinite because it is unclear whether the phrase refers to the “point forecast model,” the “probabilistic load forecasting model,” or a combination thereof, of claim 1.
Claim 4, line 2, recites “the contributions,” which lacks antecedent basis in the claim.
Claim 4, line 2, recites “the forecasting results,” which lacks antecedent basis in the claim.
Claim 4, line 2, recites “the forecasting results,” which is indefinite because it is unclear whether “the forecasting results” refers to the “point forecast model,” the “probabilistic load forecasting model,” or a combination thereof, of claim 1.
Claim 5, line 2, recites “the second stage computing time,” which lacks antecedent basis in the claim.
Claim 6, lines 2-3, recites “the probabilistic forecasting engine,” which lacks antecedent basis in the claim.
Claim 7, line 3, recites “the selected features,” which lacks antecedent basis in the claim because the limitation of claim 1 is “obtain one or more point forecast features.”
Claim 7 recites both an apparatus and a process of using the apparatus. When both an apparatus and a method are claimed in the same claim it is unclear whether direct infringement arises when the apparatus is constructed or when the apparatus is used. The claim, being directed to a “system,” also recites a method of “during testing, test data is first fitted into the trained first stage point forecasting model . . . .” (claim 7, lines 1-2). Therefore the claims have an indefinite scope.
Claim 9, line 4, recites “a probabilistic load forecasting model,” which is indefinite because it is unclear whether “a probabilistic load forecasting model” is intended to be an additional “model” to the claim, or whether the limitation is intended to refer back to the “point forecast model,” the “probabilistic load forecasting model,” or a combination thereof, of claim 1.
Claim 11, line 2, recites “first and second stages,” which is indefinite because it is unclear whether “first and second stages” is intended to draw antecedence from “a first stage” and “a second stage” of claim 1, or whether additional “first and second stages” are intended by the instant claim.
Claim 13, line 2, recites “a first stage point load forecasting model,” which is indefinite because it is unclear whether “a first stage point load forecasting model” is intended to draw antecedence from “a first stage” having “a point forecast model” of claim 1, or whether an additional model to that of claim 1 is intended by the instant claim.
Claim 14, line 2, recites “a relative importance rate,” which is a relative term which renders the claim indefinite because the term simply relates to the identification of predetermined features via a feature ranked in some nebulous fashion to a rate of relative importance. The Specification does not provide guidance of how this rate is applied, but instead simply recites “GBR is applied in this stage for feature selection as it can produce the relative feature importance for all input features. With such a procedure, the most important features are identified through a list of features ranked by their relative importance rate. Then, the cumulative importance cut point is defined to determine which feature combination to adopted in second stage.” (Specification at p. 8, lines 19-24; see also Specification at p. 13, lines 7-10). Accordingly, the term "a relative importance rate" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Claim 15, line 1, recites “the point load forecast,” which lacks antecedent basis in the claim.
Claim 16, lines 2-3, recites “the input dimension,” which lacks antecedent basis in the claim.
Claim 16, line 3, recites “the second stage model,” which lacks antecedent basis in the claim.
Claim 16, line 3, recites “while retaining the most information,” which is a relative term which renders the claim indefinite because the term “most” simply relates to a reduction of input dimensions at the second stage that retains “the most information” as to some unknown criteria. Accordingly, the term "the most information" is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree (see, e.g., Specification at p. 9, lines 1-3), and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Claim 17, lines 1-2, recites “the predetermined feature selection,” which lacks antecedent basis in the claim.
Claims 2-17 depend from claim 1, and are rejected as depending from a rejected claim; further, the claims fail to cure the deficiencies of claim 1.
Claim 18, line 8, recites “the forecasting model,” which lacks antecedent basis in the claim.
Claim 18, line 8, recites “the forecasting model,” which is indefinite because it is unclear whether “the forecasting model” is intended to draw antecedence from “a point forecast model,” of the first stage, or whether an additional model to that of “a point forecast model” is intended. For purposes of examination, Examiner considers that “the forecasting model” is intended as an additional model pertaining to the second stage.
Claim 19, line 9, recites “point forecast features,” while line 7 recites “a set of point forecast features.” The claim is indefinite because it is unclear whether the “point forecast features” are intended to draw antecedence from the “a set of point forecast features,” or whether the “point forecast features” are intended to be another, or additional, set of the collection of obtained “point forecast features.”
Claim Rejections - 35 U.S.C. § 101
8.	The following is a quotation of 35 U.S.C. § 101, which reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
9.	Claim 18 is rejected under 35 U.S.C. § 101 because the claimed invention is directed to non-statutory subject matter. The claim does not fall within at least one of the four categories of patent eligible subject matter because the claim is directed to software, and software expressed as code or a set of instructions detached from any medium is an idea without physical embodiment. See Microsoft Corp. v. AT&T Corp., 550 U.S. 437, 449, 82 USPQ2d 1400, 1407 (2007); MPEP § 2106.06.
Claim 18 recites “[a] software to forecast loads in an energy grid, comprising: computer readable code to provide a two-stage probabilistic load forecasting with integrated point forecast as a probabilistic load forecasting (PLF) . . . .” 
As such, claim 18 is a product claim to a software program because it does not also contain at least one structural limitation (such as a "means plus function" limitation) and has no physical or tangible form. Thus, claim 18 does not fall within any statutory category.
Claim Rejections - 35 U.S.C. § 102
10.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. § 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
11.	Claims 1-3, 6-8, 12-15, 18, and 19 are rejected under 35 U.S.C. § 102(a)(1) as being anticipated by Liu et al., "Probabilistic Load Forecasting via Quantile Regression Averaging on Sister Forecasts," IEEE Transactions on Smart Grid (2017) [hereinafter Liu].
Regarding claim 1, Liu teaches [a] system to forecast electrical loads in an energy grid (Liu, Abstract, teaches probabilistic load forecasting, which provides additional information on the variability and uncertainty of future load values, is becoming of great importance to power systems planning and operations (that is, a system)), comprising:
a processor to receive load information from the energy grid (Liu, right column of p. 732, “A. GEFCom2014 Data and Sister Load Forecasts,” first paragraph, teaches [t]he probabilistic load forecasting track of GEFCom2014 released seven years of hourly load history (2005–2011) (that is, load information from the energy grid) and 11 years of hourly weather history from 25 weather stations (2001–2011) (that is, to receive load information from the energy grid));
a two-stage probabilistic load forecasting unit with integrated point forecast as a probabilistic load forecasting (PLF) (Liu, left column of p. 731, “II. Methodology,” first paragraph, teaches proposed methodology can be dissected into two steps (that is, “two steps” are a two-stage probabilistic load forecasting unit), generating a set of sister load forecasts (that is, the “sister load forecasts” are point forecast) and applying QRA on the sister forecasts), including:
a first stage where predetermined features (Liu, right column of p. 731, “A. Sister Models and Sister Forecasts,” second paragraph, teaches that [w]hen developing a regression model for load forecasting, a key step is variable selection. (that is, “variable selection” pertains to predetermined features)) are utilized to train a point forecast model (Liu, left column of p. 732, “A. Sister Models and Sister Forecasts,” second paragraph, teaches tuning the length of the training dataset . . . and the partition of the training and validation datasets for model selection (that is, predetermined features utilized to train a point forecast model)) and obtain one or more point forecast features (Liu, left column of p. 731, “I. Introduction,” second paragraph, teaches [t]he direct input (that is, obtain one or more point forecast features) to QRA can be generated based on point forecasts from individual models (that is, the “direct input” is based on point forecasts from individual models)); and
a second stage where the probabilistic forecasting model is trained (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches the individual point forecasts and the corresponding observation (here: electric load, yt) are put in a standard quantile regression setting (that is, where the probabilistic forecasting model is trained), being treated as independent variables and the dependent variable, respectively; see also Liu, right column at p. 734, “D. Model Selection,” first paragraph, teaches [t]he two naïve benchmarks do not require a model selection process, because their underlying models are predefined (that is, a first stage). . . . On the other hand, we have to determine several control parameters for the nine advanced benchmarks and the [Quantile Regression Averaging] models (that is, a second stage)), taking into consideration point forecast features (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches to apply quantile regression to point forecasts of a number of individual forecasting models (that is, “to apply” is taking into consideration point forecast features)).
Examiner notes that the term "processor" recited in Applicant's claims is interpreted to be a well-known hardware structure. 
Examiner notes that the Applicant’s preamble does not afford patentable weight to the Applicant’s claims because the claim preamble is not “necessary to give life, meaning, and vitality” to the claim. Moreover, because the Applicant’s preamble merely states the purpose or intended use of the invention rather than any distinct definition of any of the claimed invention’s limitations, the preamble is not considered a limitation and is of no significance to claim construction.
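For illustration only, the two-stage arrangement discussed with respect to claim 1 can be sketched as follows. The sketch is not the method of Liu (which applies quantile regression averaging to sister forecasts) and is not the Applicant's implementation. As assumptions, it substitutes an ordinary least-squares line for the first stage point forecast model and empirical residual quantiles for the second stage. All function names, variable names, and data values are hypothetical.

```python
from statistics import mean, quantiles


def fit_point_model(xs, ys):
    """Stage 1 (assumed stand-in): least-squares line y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def point_forecast(model, x):
    """Point forecast produced by the stage 1 model for input x."""
    a, b = model
    return a + b * x


def fit_probabilistic_model(point_forecasts, ys, n=10):
    """Stage 2 (assumed stand-in): residual cut points around the point forecast.

    Returns n - 1 cut points of the residuals (deciles when n = 10).
    """
    residuals = [y - p for p, y in zip(point_forecasts, ys)]
    return quantiles(residuals, n=n, method="inclusive")


def probabilistic_forecast(residual_cuts, p):
    """Shift each residual cut point by the point forecast p."""
    return [p + r for r in residual_cuts]


# Hypothetical data: temperature as the predetermined feature, load as the target.
temps = [10.0, 15.0, 20.0, 25.0, 30.0]
loads = [52.0, 61.0, 68.0, 82.0, 87.0]

model = fit_point_model(temps, loads)
train_points = [point_forecast(model, t) for t in temps]
residual_cuts = fit_probabilistic_model(train_points, loads)
deciles = probabilistic_forecast(residual_cuts, point_forecast(model, 20.0))
```

In this sketch the output of the first stage (the point forecast) is the only input to the second stage, which returns nine quantile levels for a single forecast instant.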
Regarding claim 2, Liu teaches all of the limitations of claim 1, as described in detail above. 
Liu teaches -
wherein during the testing period of the forecast model (Liu, Fig. 1, caption, teaches [t]he first three years are used for the calibration of the sister (i.e., individual) models only and the latter three for validation and testing of the [Prediction Interval] implied by the sister models and the QRA technique (that is, the “prediction interval” is during the testing period of the forecast model)), final probabilistic load forecast results are leveraged to obtain both point forecasting and probabilistic forecasting (Liu, left column of p. 733, “IV. Case Study,” last paragraph, teaches [f]or all eight sister models, data for year 2009 was used as the validation dataset, which allowed selection of the average-lag (or d-lag) pair (that is, during the testing period of the forecast model)
[Examiner note: the Specification does not provide guidance as to the meaning of the use of the term “leveraged.” The plain meaning of term “leveraged” is simply “[u]se (something) to maximum advantage,”1 which is not inconsistent with the specification]).
Regarding claim 3, Liu teaches all of the limitations of claim 1, as described in detail above. 
Liu teaches -
wherein the forecasting model is trained with selected feature subsets (Liu, right column of p. 731, “A. Sister Models and Sister Forecasts,” second paragraph, teaches [a]pplying the same variable selection process to the same dataset, we should get the same subset of variables. On the other hand, different variable selection processes may lead to different subsets of variables being selected (that is, the “subsets of variables” are selected feature subsets)).
Regarding claim 6, Liu teaches all of the limitations of claim 1, as described above in detail. 
Liu teaches -
wherein the predetermined features and produced point forecast are provided to the probabilistic forecasting engine (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches to apply quantile regression to point forecasts of a number of individual forecasting models. More precisely, the individual point forecasts (that is, produced point forecast) and the corresponding observation (that is, the predetermined features) (here: electric load, yt) are put in a standard quantile regression setting) to train the model in the second stage (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches the individual point forecasts (that is, produced point forecast) and the corresponding observation (that is, the predetermined features) (here: electric load, yt) are put in a standard quantile regression setting (that is, the “standard quantile regression setting” is to train the model in the second stage)
[Examiner notes that the phrase “to train the model in the second stage” is an “intended use,” and not afforded patentable weight]).
Regarding claim 7, Liu teaches all of the limitations of claim 1, as described above in detail. 
Liu teaches -
wherein during testing, test data is first fitted into the trained first stage point forecasting model (Liu, left column of p. 732, “A. Sister Models and Sister Forecasts,” last paragraph, teaches that [b]y tuning (that is, test data first fitted) the length of the training dataset (here: two or three years for parameter estimation) and the partition of the training and validation datasets for model selection (here: using the same four calibration schemes as in [26] that either treat all hourly values as one time series or as 24 independent series), we can obtain (that is, “obtain” is output of the trained first stage point forecasting model) different “average-lag” (or d-lag) pairs, leading to different sister models); then the output and the selected features from the first stage are used by the trained second stage forecasting model (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches to apply quantile regression to point forecasts of a number of individual forecasting models. More precisely, the individual point forecasts (that is, produced point forecast) and the corresponding observation (that is, the selected features) (here: electric load, yt) are put in a standard quantile regression setting (that is, the trained second stage forecasting model)) to generate predictions (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches [t]his QRA method yields an interval forecast of the predicted process (that is, to generate predictions)).
Regarding claim 8, Liu teaches all of the limitations of claim 1, as described above in detail.
Liu teaches -
wherein each stage comprises a learning machine to be trained (Liu, right column of p. 734, “D. Model Selection,” first paragraph, teaches, with reference to the first stage, [t]he two naïve benchmarks do not require a model selection process, because their underlying models are predefined (that is, by “predefined,” is a trained learning machine, and thus, the first stage comprises a learning machine to be trained); Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches [t]he quantile regression problem (that is, a learning machine) can be written as follows:
Q_y(q | X_t) = X_t β_q
where Q_y(q | ·) is the conditional qth quantile of the electric load distribution (y_t), X_t are the regressors (explanatory variables) (that is, the “regressors” are a learning machine to be trained of the second stage), and β_q is a vector of parameters for quantile q).
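For illustration only, the linear quantile model quoted above from Liu can be evaluated as a dot product of the regressors and the parameter vector, and a candidate parameter vector can be scored with the pinball (quantile) loss that quantile regression conventionally minimizes. The sketch below is a minimal example under those assumptions and does not reproduce Liu's estimation procedure. The function names, regressor values, and parameter values are hypothetical.

```python
def quantile_prediction(x_row, beta_q):
    """Evaluate the linear quantile model for one time step.

    x_row holds the regressors (here an intercept term followed by
    individual point forecasts); beta_q holds the parameters for quantile q.
    """
    return sum(x * b for x, b in zip(x_row, beta_q))


def pinball_loss(y_true, y_pred, q):
    """Average pinball loss of predictions y_pred at quantile level q."""
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        diff = y - y_hat
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Hypothetical regressors: intercept plus two sister point forecasts.
x_row = [1.0, 100.0, 104.0]
# Hypothetical parameters for one quantile level.
beta_q = [2.0, 0.5, 0.5]
prediction = quantile_prediction(x_row, beta_q)

# For a high quantile (q = 0.9), under-predicting costs more than over-predicting.
loss_under = pinball_loss([10.0], [8.0], 0.9)
loss_over = pinball_loss([10.0], [12.0], 0.9)
```

The asymmetry between the two loss values is what steers the fitted parameters toward the requested quantile rather than toward the mean.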
Regarding claim 12, Liu teaches all of the limitations of claim 1, as described above in detail. 
Liu teaches -
 wherein the predetermined features include historical load data, time, and weather-predetermined features (Liu, right column of p. 732, “A. GEFCom2014 Data and Sister Load Forecasts,” first paragraph, teaches probabilistic load forecasting track of GEFCom2014 released seven years of hourly (that is, time) load history (2005–2011) (that is, historical load data) and 11 years of hourly weather (that is, weather) history from 25 weather stations (2001–2011) (that is, the predetermined features include historical load data, time, and weather-predetermined features)).
Regarding claim 13, Liu teaches all of the limitations of claim 1, as described above in detail. 
Liu teaches -
wherein the predetermined features are used in a first stage point load forecasting model (Liu, right column of p. 732, “A. GEFCom2014 Data and Sister Load Forecasts,” first paragraph, teaches probabilistic load forecasting track of GEFCom2014 released seven years of hourly (that is, time) load history (2005–2011) (that is, historical load data) and 11 years of hourly weather (that is, weather) history from 25 weather stations (2001–2011) (that is, the predetermined features); Liu, left column of p. 733, “A. GEFCom2014 Data and Sister Load Forecasts,” last partial paragraph, teaches summer week are shown in Fig. 2. The first four sister models (denoted as Ind1–Ind4) (that is, a first stage point load forecasting model) were created based on two years (2007–2008) of training data).
Regarding claim 14, Liu teaches all of the limitations of claim 1, as described above in detail. 
Liu teaches -
wherein predetermined features are identified through a list of features ranked by a relative importance rate (Liu, right column of p. 731, “A. Sister Models and Sister Forecasts,” last paragraph, teaches [t]he recency effect refers to the fact that the current hour load is affected by the weather conditions in the preceding hours (that is, the “current hour load” is a predetermined feature that is identified through a list of features ranked by a relative importance rate). In the load forecasting literature the term was coined by Hong et al. [26], who adopted it from psychology, where it means that when asked to recall a list of items in any order, people tend to begin recall with the end of the list, recalling those items best (that is, as with “last-in-first-out,” the features are ranked by a relative importance rate)), and a cumulative importance cut point is defined to determine a feature combination for the second stage (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches [t]his QRA method yields an interval forecast of the predicted process, but does not use the PI of the individual methods (that is, from the first stage to the second stage, where the “prediction interval” at the first stage is a cumulative importance cut point defined to determine a feature combination for the second stage)).
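For illustration only, one possible reading of a feature list ranked by relative importance with a cumulative importance cut point, as recited in claim 14, is sketched below. This reading is an assumption. The sketch takes already-computed importance scores as input and does not show how a model such as GBR would produce them. The function name, feature names, scores, and cut point value are hypothetical.

```python
def select_by_cumulative_importance(importances, cut_point):
    """Return the top-ranked features whose cumulative share reaches cut_point.

    importances maps a feature name to a non-negative importance score;
    cut_point is a fraction between 0 and 1 of the total importance.
    """
    total = sum(importances.values())
    # Rank features from most to least important.
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected = []
    cumulative = 0.0
    for name, score in ranked:
        selected.append(name)
        cumulative += score / total
        if cumulative >= cut_point:
            break
    return selected


# Hypothetical importance scores for four candidate features.
scores = {
    "temperature": 0.5,
    "hour_of_day": 0.25,
    "lagged_load": 0.125,
    "humidity": 0.125,
}
chosen = select_by_cumulative_importance(scores, cut_point=0.75)
```

With these hypothetical scores, the two highest-ranked features together account for 75 percent of the total importance, so only those two are carried forward.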
Regarding claim 15, Liu teaches all of the limitations of claim 1, as described above in detail. 
Liu teaches -
wherein the point load forecast given by the first stage is used as an additional input feature for the second stage (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches to apply quantile regression to point forecasts of a number of individual forecasting models. More precisely, the individual point forecasts (that is, the point load forecast given by the first stage is used as an additional input feature) and the corresponding observation (here: electric load, yt) are put in a standard quantile regression setting (that is, for the second stage)).
Regarding claim 18, Liu teaches [a] software to forecast loads in an energy grid, comprising:
computer readable code to provide a two-stage probabilistic load forecasting with integrated point forecast as a probabilistic load forecasting (PLF) (Liu, left column of p. 731, “II. Methodology,” first paragraph, teaches proposed methodology can be dissected into two steps (that is, “two steps” are a two-stage probabilistic load forecasting unit), generating a set of sister load forecasts (that is, the “sister load forecasts” are point forecast) and applying QRA on the sister forecasts), including:
a first stage where predetermined features (Liu, right column of p. 731, “A. Sister Models and Sister Forecasts,” second paragraph, teaches that [w]hen developing a regression model for load forecasting, a key step is variable selection. (that is, “variable selection” pertains to predetermined features)) are utilized to train a point forecast model (Liu, left column of p. 732, “A. Sister Models and Sister Forecasts,” second paragraph, teaches tuning the length of the training dataset . . . and the partition of the training and validation datasets for model selection (that is, predetermined features utilized to train a point forecast model)) and obtain the feature importance (Liu, left column of p. 731, “I. Introduction,” second paragraph, teaches [t]he direct input (that is, obtain one or more point forecast features) to QRA can be generated based on point forecasts from individual models (that is, the “direct input” is based on point forecasts from individual models)); and
a second stage where the forecasting model is trained (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches the individual point forecasts and the corresponding observation (here: electric load, yt) are put in a standard quantile regression setting (that is, where the probabilistic forecasting model is trained), being treated as independent variables and the dependent variable, respectively; see also Liu, right column at p. 734, “D. Model Selection,” first paragraph, teaches [t]he two naïve benchmarks do not require a model selection process, because their underlying models are predefined (that is, a first stage). . . . On the other hand, we have to determine several control parameters for the nine advanced benchmarks and the [Quantile Regression Averaging] models (that is, a second stage)), taking into consideration point forecast features (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches to apply quantile regression to point forecasts of a number of individual forecasting models (that is, “to apply” is taking into consideration point forecast features)).
Examiner notes that the terms “software” and “computer readable code” recited in Applicant's claims are interpreted to be a well-known hardware structure.
Examiner notes that the Applicant’s preamble does not afford patentable weight to the Applicant’s claims because the claim preamble is not “necessary to give life, meaning, and vitality” to the claim. Moreover, because the Applicant’s preamble merely states the purpose or intended use of the invention rather than any distinct definition of any of the claimed invention’s limitations, the preamble is not considered a limitation and is of no significance to claim construction.
Regarding claim 19, Liu teaches [a] method to forecast loads in an energy grid (Liu, Abstract), comprising:
providing a two-stage probabilistic load forecasting unit with integrated point forecast as a probabilistic forecasting feature into probabilistic load forecasting (PLF) (Liu, left column of p. 731, “II. Methodology,” first paragraph, teaches proposed methodology can be dissected into two steps (that is, “two steps” are a two-stage probabilistic load forecasting unit), generating a set of sister load forecasts (that is, the “sister load forecasts” are point forecast) and applying QRA on the sister forecasts), including:
training a point forecast model at a first stage where predetermined features are utilized (Liu, left column of p. 732, “A. Sister Models and Sister Forecasts,” second paragraph, teaches tuning the length of the training dataset . . . and the partition of the training and validation datasets for model selection (that is, training a point forecast model at a first stage where predetermined features are utilized)) and obtaining a set of point forecast features (Liu, left column of p. 731, “I. Introduction,” second paragraph, teaches [t]he direct input (that is, obtaining a set of point forecast features) to QRA can be generated based on point forecasts from individual models (that is, the “direct input” is based on point forecasts from individual models)); and
training a forecasting model at a second stage, and taking into consideration point forecast features (Liu, left column of p. 732, “B. Quantile Regression Averaging,” second paragraph, teaches the individual point forecasts and the corresponding observation (here: electric load, yt) are put in a standard quantile regression setting (that is, where the probabilistic forecasting model is trained), being treated as independent variables and the dependent variable, respectively; see also Liu, right column at p. 734, “D. Model Selection,” first paragraph, teaches [t]he two naïve benchmarks do not require a model selection process, because their underlying models are predefined (that is, a first stage). . . . On the other hand, we have to determine several control parameters for the nine advanced benchmarks and the [Quantile Regression Averaging] models (that is, a second stage)).
Examiner notes that the Applicant’s preamble does not afford patentable weight to the Applicant’s claims because the claim preamble is not “necessary to give life, meaning, and vitality” to the claim. Moreover, because the Applicant’s preamble merely states the purpose or intended use of the invention rather than any distinct definition of any of the claimed invention’s limitations, the preamble is not considered a limitation and is of no significance to claim construction.
Claim Rejections - 35 U.S.C. § 103
12.	The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
13.	The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. § 103 are summarized as follows:
1. 	Determining the scope and contents of the prior art.
2. 	Ascertaining the differences between the prior art and the claims at issue.
3. 	Resolving the level of ordinary skill in the pertinent art.
4. 	Considering objective evidence present in the application indicating obviousness or nonobviousness.
14.	This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. § 102(b)(2)(C) for any potential 35 U.S.C. § 102(a)(2) prior art against the later invention.
15.	Claims 4, 5, 9-11, 16 and 17 are rejected under 35 U.S.C. § 103 as being unpatentable over Liu et al., "Probabilistic Load Forecasting via Quantile Regression Averaging on Sister Forecasts," IEEE Transactions on Smart Grid (2017) [hereinafter Liu] in view of Zhang et al., “Constructing Probabilistic Load Forecast From Multiple Point Forecasts: A Bootstrap Based Approach,” IEEE (2018) [hereinafter Zhang].
Regarding claim 4, Liu teaches all of the limitations of claim 1, as described above in detail.
Liu teaches -
wherein the predetermined features are ranked according to the contributions to the forecasting results (Liu, right column of p. 731, “A. Sister Models and Sister Forecasts,” second paragraph, teaches “recency effect models,” where recency effect refers to the fact that the current hour load is affected by the weather conditions in the preceding (that is, ranked) hours (that is, through “recency,” the predetermined features are ranked according to the contributions to the forecasting results)), . . . .
Though Liu teaches regression models to yield the sister point forecasts, Liu does not explicitly teach -
. . . which are the outputs from tree-based regression methods, such as gradient boosting regression (GBR).
But Zhang teaches -
. . . which are the outputs from tree-based regression methods, such as gradient boosting regression (GBR) (Zhang, right column of p. 185, “C. Bootstrap on train data set,” last partial paragraph, teaches M point forecast algorithms such as random forests and [gradient boosting regression tree (GBRT)] available to build forecast models (that is, outputs from tree-based regression methods, such as gradient boosting regression)).
Liu and Zhang are from the same or similar field of endeavor. Liu teaches multi-stage load forecasting. Zhang teaches multi-stage ensemble load forecasting. Thus, a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention would modify the multi-stage load forecasting of Liu with the gradient boosting of Zhang.
The motivation to do so is that forms of gradient boosting and quantile regression are widely used in power load forecasts (Zhang, Abstract; Zhang, right column of p. 184, “I. Introduction,” first partial paragraph and full paragraph).
Regarding claim 5, Liu teaches all of the limitations of claim 1, as described in detail above. 
Though Liu teaches regression models to yield the sister point forecasts, Liu does not explicitly teach -
wherein the predetermined features reduce the second stage computing time by extracting information to ensure solution quality.
But Zhang teaches -
wherein the predetermined features reduce the second stage computing time by extracting information to ensure solution quality (Zhang, Fig. 1, teaches dual-stage probabilistic load forecasting:

[Zhang, Fig. 1: dual-stage probabilistic load forecasting]

Zhang, left column of p. 185, “A. Motivation,” first paragraph, teaches bootstrap method samples the dataset with a replacement for M×B times, construct M×B sample sets differ in detail. After training these sample sets through M algorithms, we get M × B models. Since B may not be large due to the computing power (that is, “computing power” correlates to computing time), we shall bootstrap these models again for B’ times; Zhang, right column of p. 185, “B. Bootstrap,” second paragraph, teaches that [t]he bootstrap method helps a lot to get different samples of the training sets by sampling with replacement (that is, “bootstrapping” is extracting information to ensure solution quality)).
Liu and Zhang are from the same or similar field of endeavor. Liu teaches multi-stage load forecasting. Zhang teaches multi-stage ensemble load forecasting. Thus, a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention would modify the multi-stage load forecasting of Liu with the data bootstrapping of Zhang.
The motivation to do so is to reduce bias while constructing intervals for parameters or estimators (Zhang, right column of p. 185, “B. Bootstrap,” first paragraph).
Regarding claim 9, Liu teaches all of the limitations of claim 1, as described above in detail. 
Though Liu teaches point load forecasting models and probabilistic load forecasting models, Liu does not explicitly teach -
wherein one of random forests, gradient boosting regression (GBR) and deep neural networks (DNN) is used for the point forecasting model.
But Zhang teaches -
wherein one of random forests, gradient boosting regression (GBR) and deep neural networks (DNN) is used for the point forecasting model (Zhang, right column of p. 185, “C. Bootstrap on train data set,” second paragraph, teaches there are M point forecast algorithms (that is, the first stage having the point forecasting model) such as random forests and [gradient boosting regression tree (GBRT)] available to build forecast models (that is, random forest)), and one of QRF, QGBR, and QRNN is used for a probabilistic load forecasting model (Zhang, right column of p. 187, “B. Comparison,” first paragraph, teaches algorithms to construct probabilistic distributions of a forecast. Here, we use two quantile regression methods as follows:
• Quantile Random Forest (Q-RF): Q-RF uses [Random Forest] to regress the quantiles of the probabilistic forecast (that is, the second stage having probabilistic load forecasting model).
• Quantile GBRT (Q-GBRT): Q-GBRT uses gradient boosting regression tree algorithm to regress the quantile of the probabilistic forecast).
Liu and Zhang are from the same or similar field of endeavor. Liu teaches multi-stage load forecasting. Zhang teaches multi-stage ensemble load forecasting. Thus, a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention would modify the multi-stage load forecasting of Liu with the gradient boosting of Zhang.
The motivation to do so is that forms of gradient boosting and quantile regression are widely used in power load forecasts (Zhang, Abstract; Zhang, right column of p. 184, “I. Introduction,” first partial paragraph and full paragraph).
Regarding claim 10, Liu teaches all of the limitations of claim 1, as described above in detail. 
Though Liu teaches point load forecasting models and probabilistic load forecasting models, Liu does not explicitly teach -
wherein GBR is selected for the first stage and QRNN is used for the second stage.
But Zhang teaches -
wherein GBR is selected for the first stage (Zhang, right column of p. 185, “C. Bootstrap on train data set,” second paragraph, teaches there are M point forecast algorithms (that is, the first stage having the point forecasting model) such as random forests and [gradient boosting regression tree (GBRT)] available to build forecast models (that is, GBRT, corresponding to GBR, is selected for the first stage)), and QRNN is used for the second stage (Zhang, right column of p. 184, “I. Introduction,” first full paragraph, teaches to construct distributions through quantile regressions (that is, QR) is a method widely used in power load forecasts. For example, He et al. introduced quantile regressions methods applied to machine learning algorithms such as SVM, neural networks (that is, quantile regression neural network (QRNN) is used for the second stage)).
Liu and Zhang are from the same or similar field of endeavor. Liu teaches multi-stage load forecasting. Zhang teaches multi-stage ensemble load forecasting. Thus, a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention would modify the multi-stage load forecasting of Liu with the gradient boosting of Zhang.
The motivation to do so is that forms of gradient boosting and quantile regression are widely used in power load forecasts (Zhang, Abstract; Zhang, right column of p. 184, “I. Introduction,” first partial paragraph and full paragraph).
Regarding claim 11, Liu teaches all of the limitations of claim 1, as described above in detail. 
Though Liu teaches a “direct model approach” based on observed load and weather data (Liu, right column of p. 732, “A. GEFCom2014 Data and Sister Load Forecasts,” first paragraph), Liu does not explicitly teach -
wherein a direct QGBR model and direct QRNN are trained over training first and second stages to generate probabilistic load forecasting for testing.
But Zhang teaches -
wherein a direct QGBR model and direct QRNN are trained over training first and second stages to generate probabilistic load forecasting for testing (Zhang, right column of p. 185, “C. Bootstrap on train data set,” second paragraph, teaches there are M point forecast algorithms (that is, the first stage having the point forecasting model) such as random forests and [gradient boosting regression tree (GBRT)] available to build forecast models (that is, GBRT, corresponding to GBR, is selected for the first stage); Zhang, right column of p. 184, “I. Introduction,” first full paragraph, teaches to construct distributions through quantile regressions (that is, QR) is a method widely used in power load forecasts. For example, He et al. introduced quantile regressions methods applied to machine learning algorithms such as SVM, neural networks (that is, a direct quantile regression neural network (QRNN) is used for the second stage, along with quantile gradient boosting regression (QGBR)); Liu, left column of p. 732, “A. Sister Models and Sister Forecasts,” last paragraph, teaches tuning the length of the training dataset (here: two or three years for parameter estimation) and the partition of the training and validation datasets for model selection (that is, a direct QGBR model and direct QRNN are trained over training first and second stages to generate probabilistic load forecasting for testing)
[Examiner notes: the term “direct” pertains to “observed” data; also, a training data set applied at the first stage produces training data for the second stage, and accordingly, the direct QGBR model and the direct QRNN are trained over training first and second stages]).
Liu and Zhang are from the same or similar field of endeavor. Liu teaches multi-stage load forecasting. Zhang teaches multi-stage ensemble load forecasting. Thus, a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention would modify the multi-stage load forecasting of Liu with the gradient boosting of Zhang.
The motivation to do so is that forms of gradient boosting and quantile regression are widely used in power load forecasts (Zhang, Abstract; Zhang, right column of p. 184, “I. Introduction,” first partial paragraph and full paragraph).
Regarding claim 16, Liu teaches all of the limitations of claim 1, as described above in detail. 
Though Liu teaches “tuning” the dataset by intervals (PI), Liu does not explicitly teach -
wherein the predetermined features comprise a set of feature combinations is constructed that reduces the input dimension for the second stage model while retaining the most information.
But Zhang teaches -
wherein the predetermined features comprise a set of feature combinations is constructed that reduces the input dimension for the second stage model while retaining the most information (Zhang, Fig. 2, teaches bootstrapping, which reduces prediction bias:

[Zhang, Fig. 2: bootstrapping]

Zhang, right column of p. 185, “B. Bootstrap,” second paragraph, teaches [t]he core spirit of bootstrap is the process of randomly sampling with replacement. For example, [l]et X = (X1, . . . , Xn) be a sample of the certain stochastic event. Since n is finite (that is, reduces the input dimension), there must be some information lost when sampled from the whole sample space. For example, in practice, we are supposed to choose the data two or three years before to train our models for electric load forecast, but actually, the two-or-three-year episode is incapable to cover all kinds of stochastic patterns in the real world power system. The bootstrap method helps a lot to get different samples of the train sets by sampling with replacement (that is, a set of feature combinations is constructed that reduces the input dimension for the second stage model while retaining the most information)).
Liu and Zhang are from the same or similar field of endeavor. Liu teaches multi-stage load forecasting. Zhang teaches multi-stage ensemble load forecasting. Thus, a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention would modify the multi-stage load forecasting of Liu with the bootstrap sampling of Zhang.
The motivation to do so is that forms of gradient boosting and quantile regression are widely used in power load forecasts (Zhang, Abstract; Zhang, right column of p. 184, “I. Introduction,” first partial paragraph and full paragraph).
Regarding claim 17, Liu teaches all of the limitations of claim 1, as described in detail above. 
Liu teaches -
wherein the predetermined feature selection applies lasso regression, ridge regression or forward selection (Liu, right column of p. 734, “D. Model Selection,” first paragraph, teaches “calibration windows,” in which [f]or QRA, we need to determine: 1) the number of sister forecasts used for QRA; and 2) the best calibration window length. In this paper, we consider four calibration windows of different lengths (24 h times 91, 122, 183, or 365 days) within a rolling scheme, i.e., each day the calibration window is moved forward (that is, forward selection) by one day (or 24 h) and the parameters of the models are re-estimated every day (that is, the predetermined feature selection applies . . . forward selection)), and . . . .
Though Liu teaches “regression,” Liu does not explicitly teach gradient boosting regression (GBR). But Zhang teaches GBR (Zhang, Abstract, teaches common machine learning methods, random forest (RF) and gradient boosting regression tree (GBRT) (that is, gradient boosting regression)).
Liu and Zhang are from the same or similar field of endeavor. Liu teaches multi-stage load forecasting. Zhang teaches multi-stage ensemble load forecasting. Thus, a person having ordinary skill in the art as of the effective filing date of the Applicant’s invention would modify the multi-stage load forecasting of Liu with the gradient boosting regression of Zhang.
The motivation to do so is that forms of gradient boosting and quantile regression are widely used in power load forecasts (Zhang, Abstract; Zhang, right column of p. 184, “I. Introduction,” first partial paragraph and full paragraph).
Conclusion
16.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
(Wang et al., "Probabilistic Short-Term Load Forecasting," Energy Intelligence Laboratory (2015)) teaches probabilistic short-term load forecasting including two-stage bootstrap sampling for multi-stage load forecasting.
(Hong et al., “Probabilistic electric load forecasting: A tutorial review,” Int’l Journal of Forecasting (2016)) teaches probabilistic electric load forecasting that underlines the need to invest in additional research such as reproducible case studies, probabilistic load forecast evaluation and valuation and a consideration of emerging technologies.
(US Published Application US20170061315 to Leonard et al.) teaches generating summary statistics for data predictions based on the aggregation of data from past time intervals in relation to power or energy grids.
17.	Any inquiry concerning this communication or earlier communications from the Examiner should be directed to KEVIN L. SMITH whose telephone number is (571) 272-5964. Normally, the Examiner is available on Monday-Thursday 0730-1730. 
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the Examiner by telephone are unsuccessful, the Examiner’s supervisor, KAKALI CHAKI, can be reached at 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/K.L.S./
Examiner, Art Unit 2122

/BRIAN M SMITH/
Primary Examiner, Art Unit 2122

        1 https://www.lexico.com/en/definition/leverage