Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 08 September 2022 has been entered.
Response to Arguments
Regarding the 35 USC 103 rejection, Examiner has fully considered Applicant’s arguments and assertions.
Applicant’s arguments with respect to the previous prior art combination of record have been considered but are moot because the new grounds of rejection do not rely on Ura or He for any teaching or matter specifically challenged in the argument. The claims are rejected under new grounds of rejection. The present claims are rejected as being unpatentable over the combination of Beddo, Zhang, and Brzezicki. See the detailed rejection below.
Therefore, the present claims are rejected under 35 USC 103.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-4, 6-12, and 14-22 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claims 1, 9, and 17 are rejected under 35 USC 112(b) because the bounds of the claim are rendered unclear due to the combination of the following claim limitations: randomly sampling with replacement the pooled training dataset to form a plurality of different training sets and a plurality of different validation sets that correspond to the training sets, wherein each combination of a training set and a validation set forms all of the plurality of data points and for each different sampled pooled training dataset the data points that are not part of the validation set are part of the training set, each different training set and corresponding validation set is formed from the same pooled training dataset. 
The bounds of the claims are rendered unclear because it is unclear what manner of random sampling is being performed by the claim. In particular, the broadest reasonable interpretation of the random sampling includes random sampling being performed on a data point level, a data set level, or a combination of training set and validation set level. Random sampling performed at a granular level, whether on a data point level or per data set level, would not yield training and corresponding validation data sets where each data point that is not a part of the validation set is part of the training set, in which each training and corresponding validation data set combination does not include overlapping data points. Upon viewing the specification, in at least [0039-0040], the specification does not clarify the manner in which the random sampling is being performed. The performance of random sampling with replacement at a data point level would render the metes and bounds of the claim unclear because this type of sampling would not yield non-overlapping combinations of training and validation sets, such that each data point that is not a part of the validation set is part of the training set. For example, if the random sampling with replacement is performed at a data point level (i.e. each data point that is randomly sampled is returned to the pooled dataset), then a single data point may appear in both the training and corresponding validation set, a data point may never be sampled in either data set (i.e. other data points are sampled multiple times), or a data point may appear multiple times within a single training or validation set. Similarly, if the data points are randomly sampled with replacement for each set, whether training or validation, then a group of data points may appear in both the training and corresponding validation set, a group of data points may never be sampled in either data set (i.e. other groups of data are sampled multiple times), or a group of data points may appear multiple times within a single training or validation data set. If the random sampling is performed at the training and validation set combination level, then the metes and bounds of the claim would be clear because the random “split” of the data into either the training or validation set would yield a result in which each combination of a training set and a validation set forms all of the plurality of data points and, for each different sampled pooled training dataset, the data points that are not part of the validation set are part of the training set. Therefore, the bounds of the claim are rendered unclear because the broadest reasonable interpretation of the claimed “random sampling” includes random sampling processes that do not create a combination of a training set and a validation set formed of non-overlapping data points.
For the sake of compact prosecution, Examiner is interpreting the random sampling with replacement as producing a plurality of training and validation sets, wherein each training and validation set comprises all of the pooled dataset points, and where each combination of training and validation set comprises non-overlapping data points that are randomly sampled. 
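As an illustrative sketch only (drawn from neither the application nor the cited art), the difference between the sampling granularities discussed above can be shown in Python. The function names `sample_points_with_replacement` and `random_partition`, the pool of ten integer data points, and the 8/2 set sizes are hypothetical choices made for illustration.

```python
import random


def sample_points_with_replacement(pool, n_train, n_val, rng):
    """Draw each set point by point, returning every drawn point to the pool."""
    train = [rng.choice(pool) for _ in range(n_train)]
    val = [rng.choice(pool) for _ in range(n_val)]
    return train, val


def random_partition(pool, n_val, rng):
    """Randomly split the pool once: points not in the validation set are in the training set."""
    shuffled = list(pool)
    rng.shuffle(shuffled)
    return shuffled[n_val:], shuffled[:n_val]


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
pool = list(range(10))  # hypothetical pooled dataset of ten data points

# Point-level sampling with replacement: duplicates, overlap, and omitted points are all possible.
train, val = sample_points_with_replacement(pool, 8, 2, rng)
print("point-level:", sorted(train), sorted(val))

# Split-level sampling: each repetition gives a different, complementary pair of sets.
splits = [random_partition(pool, 2, rng) for _ in range(3)]
for train, val in splits:
    print("partition:  ", sorted(train), sorted(val))
```

Under the split-level approach every training set and its validation set together contain each data point exactly once, which is the reading adopted in the interpretation above. The point-level approach carries no such guarantee.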
Dependent claims are rejected due to dependency on rejected base claims 1, 9, and 17.
Therefore, claims 1-4, 6-12, and 14-22 are rejected under 35 U.S.C. 112(b).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-4, 9-12, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Beddo et al. (US 20140108094 A1) in view of Zhang (US 11182691 B1) in view of Brzezicki et al. (US 20130024173 A1).

Regarding claim 1, Beddo teaches a method of forecasting sales of a retail item (Figs. 2 and 4), the method comprising:
receiving historical sales data of a class of a retail item that comprises a plurality of stock keeping units (paragraph [0064] teaches collecting historical product information associated with the sale of products of a specific category (i.e. class); see also: [0002, 0024, 0063, 0069]), 
the historical sales data comprising past sales and promotions of the retail item across a plurality of past time periods (paragraph [0065] teaches collecting product information including sales data measured in any time period, such as weeks, days, months, etc., wherein paragraph [0067] the product information includes promotions associated with the product; see also: [0024, 0063, 0069]), 
the historical sales data corresponding to an amount of sales of each stock keeping unit at each store during each of the past time periods (paragraph [0064] teaches the product information relates to the sale of products of a category; paragraph [0063] teaches the sales product information includes the sales of the product by an individual store, retailer as a whole, or geographic area; paragraph [0065] teaches the product sales data is the number of units sold during a time period, such as hours, weeks, days, months, and years); 
aggregating the historical sales to a higher aggregation level than the historical sales data to form a pooled training dataset having a plurality of data points (paragraph [0074] teaches constructing a data matrix of product information used by the neural network, wherein paragraph [0114] the product information is sales data of weekly sales over a year at all of the retailer’s distribution outlets in a specific geographic retail area (i.e. a higher aggregation level)), 
the higher level comprising an amount of sales of each subclass that corresponds to the stock keeping units at each store during each of the past time periods (paragraph [0074] teaches constructing a data matrix of product information used by the neural network, wherein paragraph [0114] the product information is sales data of weekly sales over a year at all of the retailer’s distribution outlets in a specific geographic retail area of 24-packs of consumer product Z (i.e. subclass)); 
each data point representing a subclass/store combination ([0063-0064] teach the product information is associated with the sale of a product by a retailer, such as a grocery store or retailer, and the product category, wherein [0065] teaches the product information can be any amount or type of information associated with the sale of the product, such as number of units sold by a certain company, wherein paragraph [0114] teaches the product information is sales data of weekly sales over a year at all of the retailer’s distribution outlets in a specific geographic retail area of 24-packs of consumer product Z (i.e. subclass); see also: [0074]); 
training multiple models ([0040-0041] teach creating numerous neural network models, wherein [0042-0043] teach training the neural networks using a training data set, as well as [0080] teach the dynamic system can train and re-train the neural network connections; see also: [0031-0036]),
… calculate an error (paragraph [0045] teaches evaluating each neural network for a given validation set of data and producing a validation score, wherein paragraphs [0045-0053] teach the minimum description length is calculated for each validation set using the residual sum of squares; see also: [0046]; Examiner’s Note: Minimum description length is an error minimization technique used in machine learning applications that is calculated using the residual sum of squares, otherwise known as the sum of squared errors, which measures the error of the model compared to the validation data set. Furthermore, see the detailed combination below.), 
calculating model weights for each trained model (paragraph [0045] teaches assigning a weight to each neural network model; see also: [0007]), 
outputting a forecasted demand function comprising a model combination of each trained model and corresponding model weight (paragraph [0053] teaches combining multiple neural network models using their weights in order to combine them into a single model for producing forecasts, wherein paragraphs [0090-0091] teach the forecast is a sales forecast (i.e. forecasting the demand); see also: [0007, 0036, 0045]; Examiner’s Note: Examiner is interpreting the weighted combination of neural network models as being a “function,” or an expression of multiple weighted neural networks used to forecast demand.); 
and generating a forecast of future sales based on the forecasted demand function (paragraphs [0090-0091] teach generating a sales forecast, wherein paragraph [0053] teaches combining multiple neural network models using their weights in order to combine them into a single model for producing forecasts; see also: [0007, 0036, 0045]).
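The aggregation limitation mapped above can be sketched as follows. This is an illustration under assumed data only: the record layout `(sku, subclass, store, week, units)`, the sample values, and the function name `aggregate_to_subclass_store` are hypothetical and come from neither the claims nor Beddo.

```python
from collections import defaultdict


def aggregate_to_subclass_store(records):
    """Sum SKU-level unit sales up to one total per (subclass, store, week)."""
    totals = defaultdict(int)
    for sku, subclass, store, week, units in records:
        totals[(subclass, store, week)] += units
    return dict(totals)


# Hypothetical SKU/store/week sales records.
records = [
    ("sku-1", "cola-24pk", "store-A", 1, 10),
    ("sku-2", "cola-24pk", "store-A", 1, 5),
    ("sku-1", "cola-24pk", "store-B", 1, 7),
    ("sku-3", "chips", "store-A", 1, 4),
]

pooled = aggregate_to_subclass_store(records)
for key, units in sorted(pooled.items()):
    print(key, units)
```

In this sketch each key of `pooled` plays the role of one subclass/store data point for a given time period, with the individual stock keeping units summed away.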
However, Beddo does not explicitly teach randomly sampling with replacement the pooled training dataset to form a plurality of different training sets and a plurality of different validation sets that correspond to the training sets, wherein each combination of a training set and a validation set forms all of the plurality of data points and for each different sampled pooled training dataset the data points that are not part of the validation set are part of the training set, each different training set and corresponding validation set is formed from the same pooled training dataset; each model trained using a unique training set of the plurality of different training sets, and using each corresponding different validation set to validate each trained model; wherein each of the training and validating of each of the multiple models uses all of the plurality of data points of the pooled training dataset; wherein the error is a root-mean-square error (RMSE) and the calculated model weights are based in the RMSE.
	From the same or similar field of endeavor, Zhang teaches randomly sampling with replacement the pooled training dataset to form a plurality of different training sets and a plurality of different validation sets that correspond to the training sets (Fig. 27 and Col 46 line 49 to Col 47 line 8 teach splitting a data set of labeled observation records five different ways to produce 5 respective training sets and 5 corresponding test sets, wherein each training set comprises 80% of the training data and the test sets comprise the remaining 20% of the data, such as “2740A” and “2740E” of Fig. 27, wherein Col 47 lines 23-40 teach the training and test data sets may be obtained using random selection, as well as in Fig. 28 and Col 47 lines 41-63 teach performing consistent chunk-level splits of input data sets using a random number based split algorithm, wherein the data set chunks are divided into training sets and corresponding test sets, and wherein Col 24 lines 34-55 teach the machine learning prediction process can be applied to several domains including financial analysis; see also: Col 56 lines 34-58, Col 59 lines 3-17; Examiner’s Note: See the 35 USC 112(b) rejection above.), 
wherein each combination of a training set and a validation set forms all of the plurality of data points and for each different sampled pooled training dataset the data points that are not part of the validation set are part of the training set (Fig. 27 and Col 46 line 49 to Col 47 line 8 teach splitting a data set of labeled observation records five different ways to produce 5 respective training sets and 5 corresponding test sets, wherein Fig. 28 and Col 47 line 64 to Col 48 line 41 teach a split ratio of 80-20 is used to place chunks of the data record into the training set or test set, such that the observation records of a given chunk are not distributed to both the training set and test set, and wherein the split operation is deemed consistent when each object of the source data set is placed into exactly one split result set, either the training set or corresponding test set; see also: Col 24 lines 34-55, Col 47 lines 41-63, Col 56 lines 34-58, Col 59 lines 3-17), 
each different training set and corresponding validation set is formed from the same pooled training dataset (Fig. 27 and Col 46 line 49 to Col 47 line 8 teach splitting a data set of labeled observation records five different ways to produce 5 respective training sets and 5 corresponding test sets, wherein each training set comprises 80% of the training data and the test sets comprise the remaining 20% of the data, such as “2740A” and “2740E” of Fig. 27, as well as in Fig. 28 and Col 47 lines 41-63 teach performing consistent chunk-level splits of input data sets using a random number based split algorithm, wherein the data set chunks are divided into training sets and corresponding test set; see also: Col 24 lines 34-55, Col 56 lines 34-58, Col 59 lines 3-17);
each model trained using a unique training set of the plurality of different training sets (Col 46 line 16 to Col 47 line 8 teach training a model on a number of distinct training and test sets extracted from the same underlying data, wherein the model may be trained on distinct training and test data sets, wherein the model is trained on a training dataset comprising 80% of the data, and tested using the remaining 20% of the data, wherein Col 47 lines 23-40 teach the training and test datasets are obtained using random selection, and wherein Fig. 28 and Col 48 lines 59-60 teach the models may be trained and evaluated one time with a single split training evaluation; see also: Col 60 lines 5-30), 
and using each corresponding different validation set to validate each trained model (Col 46 line 16 to Col 47 line 8 teach training a model on a number of distinct training and test sets extracted from the same underlying data, wherein the model may be trained on distinct training and test data sets, wherein the model is trained on a training dataset comprising 80% of the data, and tested using the remaining 20% of the data, wherein Col 47 lines 23-40 teach the training and test datasets are obtained using random selection, and wherein Fig. 28 and Col 48 lines 59-60 teach the models may be trained and evaluated one time with a single split training evaluation; see also: Col 60 lines 5-30), 
wherein each of the training and validating of each of the multiple models uses all of the plurality of data points of the pooled training dataset (Col 46 line 16 to Col 47 line 8 teach training a model on a number of distinct training and test sets extracted from the same underlying data, wherein the model may be trained on distinct training and test data sets, wherein the model is trained on a training dataset comprising 80% of the data, and tested using the remaining 20% of the data, and wherein Col 47 lines 23-40 teach the training and test datasets are obtained using random selection, as well as in Fig. 28 and Col 47 lines 41-63 teach performing consistent chunk-level splits of input data sets using a random number based split algorithm, wherein the data set chunks are divided into training sets and corresponding test sets, and wherein Col 48 lines 59-60 teach the models may be trained and evaluated one time with a single split training evaluation; see also: Col 60 lines 5-30).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Beddo to incorporate the teachings of Zhang to include randomly sampling with replacement the pooled training dataset to form a plurality of different training sets and a plurality of different validation sets that correspond to the training sets, wherein each combination of a training set and a validation set forms all of the plurality of data points and for each different sampled pooled training dataset the data points that are not part of the validation set are part of the training set, each different training set and corresponding validation set is formed from the same pooled training dataset; each model trained using a unique training set of the plurality of different training sets, and using each corresponding different validation set to validate each trained model; wherein each of the training and validating of each of the multiple models uses all of the plurality of data points of the pooled training dataset. One would have been motivated to do so in order to improve prediction quality and accuracy by using consistent 80-20 splits of record data, thus avoiding inconsistent splits with incomplete data, which may result in poorer prediction quality and accuracy (Zhang, Col 48 lines 27-41). 
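A minimal sketch of the kind of repeated 80-20 splitting described for Zhang follows. It is not Zhang's implementation: the function name `five_way_splits`, the use of Python's `random` module, and the choice of rotating the test fold through a single shuffled order are assumptions made for illustration. The sketch only shows that each split places every record in exactly one of the training set or the corresponding test set.

```python
import random


def five_way_splits(records, seed=0):
    """Return five (train, test) pairs; each test set is a different 20% of the records."""
    order = list(records)
    random.Random(seed).shuffle(order)  # random assignment of records to folds
    fold = len(order) // 5
    splits = []
    for k in range(5):
        test = order[k * fold:(k + 1) * fold]
        train = order[:k * fold] + order[(k + 1) * fold:]
        splits.append((train, test))
    return splits


records = list(range(20))  # hypothetical labeled observation records
for train, test in five_way_splits(records):
    # Every record lands in exactly one of the two sets for this split.
    assert sorted(train + test) == records
    print(len(train), len(test), sorted(test))
```

Each of the five models would then be trained on its own `train` list and evaluated on the matching `test` list, so that every model sees the full data set across training and evaluation.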
However, the combination of Beddo and Zhang does not explicitly teach wherein the error is a root-mean-square error (RMSE) and the calculated model weights are based in the RMSE.
From the same or similar field of endeavor, Brzezicki teaches wherein the error is a root-mean-square error (RMSE) and the calculated model weights are based in the RMSE ([0027-0028] teach combining multiple models in order to generate a weighted forecast that estimates product sales, wherein [0034-0035] teach generating a combined forecast with three weighted models, wherein the models are weighted using root mean square error weights; see also: [0030-0033, 0084]). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Beddo and Zhang to incorporate the teachings of Brzezicki to include wherein the error is a root-mean-square error (RMSE) and the calculated model weights are based in the RMSE. One would be motivated to do so in order to provide even stronger forecast results by using a combination of models instead of a single model (Brzezicki, [0003]). By incorporating the teachings of Brzezicki, one would be able to generate an optimal combined forecast that best utilizes the available individual forecasts by identifying a weighted combination of models (Brzezicki, [0028]).
Regarding claims 9 and 17, the claims recite limitations already addressed by the rejection of claim 1. Regarding claim 9, Beddo teaches a computer-readable medium having instructions stored thereon that (paragraph [0012] teaches a non-transitory computer readable medium provided for forecasting sales), when executed by a processor, cause the processor to forecast sales of a retail item (paragraph [0162] teaches the non-transitory computer readable medium has computer executable code). Regarding claim 17, Beddo teaches a retail sales forecasting system comprising (Figs. 2-5): a processor coupled to a storage device that implements promotions effect module comprising (paragraph [0094] teaches a forecasting apparatus including a processor coupled to a memory with a software application (i.e. module)). Therefore, the rejection to claim 1 as being unpatentable over Beddo in view of Zhang in view of Brzezicki applies to claims 9 and 17. 

Regarding claims 2, 10, and 18, the combination of Beddo, Zhang, and Brzezicki teach all the limitations of claims 1, 9, and 17 above.
	Beddo further teaches the training multiple models comprises using a machine learning algorithm for the training (paragraph [0040] teaches creating multiple neural network models (i.e. machine learning), wherein paragraph [0034] the model actively trains and re-trains).

	Regarding claims 3, 11, and 19, the combination of Beddo, Zhang, and Brzezicki teach all the limitations of claims 2, 10, and 18 above.
	Beddo further teaches the machine learning algorithm comprises one of Artificial Neural Networks (paragraph [0024] teaches generating a neural network).

	Regarding claims 4, 12, and 20 the combination of Beddo, Zhang, and Brzezicki teach all the limitations of claims 1, 9, and 17 above.
Beddo further teaches the historical data comprises data for multiple retail stores and multiple stock keeping units that belong to a subclass over multiple time periods (paragraph [0063] teaches the product information includes the sale of a product by an entity with multiple locations (i.e. multiple retail stores), wherein paragraph [0064] the product information relates to the sales of products (i.e. multiple stock keeping units), wherein paragraph [0065] the product information contains sales data for the number of units sold during any time period, such as days, weeks, or months (i.e. multiple time periods), wherein paragraph [0070] teaches the product information relates to the sales of two different products from the same competitive selection set);
wherein the aggregating comprises a subclass level (paragraph [0070] teaches the product information may include multiple products existing within the same competitive selection set that are associated with different brands (i.e. subclass level)).

Claim(s) 6-8, 14-16, and 21-22 are rejected under 35 U.S.C. 103 as being unpatentable over Beddo et al. (US 20140108094 A1) in view of Zhang (US 11182691 B1) in view of Brzezicki et al. (US 20130024173 A1) and further in view of Caldeira et al. (Predicting the yield curve using forecast combinations; Caldeira J.F., Moura G.V., Santos A.A.P. (2016); Computational Statistics and Data Analysis, 100, pp. 79-98.) in view of Kraftsow et al. (US 20130346385 A1).

Regarding claims 6, 14, and 21, the combination of Beddo, Zhang, and Brzezicki teach all the limitations of claims 1, 9, and 17 above.
However, Beddo does not explicitly teach for each model of each training set i, the calculating model weights w(i) comprises: $w(i) = \frac{1}{1 + \mathrm{RMSE}(i)}$.
	From the same or similar field of endeavor, Caldeira teaches for each model of each training set i (page 86, section 5. “Thick modeling approach with RMSE-weights (FC-RMSE)” teaches computing the root mean square error of all selected models),
the calculating model weights w(i) comprises: $w(i) = \frac{1}{\mathrm{RMSE}(i)}$ (page 86, section 5. “Thick modeling approach with RMSE-weights (FC-RMSE)” discloses an equation, wherein the number teaches calculating the model weight for one model as: [equation image: media_image1.png]).
While Caldeira does not explicitly evaluate item demand forecasting, Caldeira presents a solution to a problem reasonably pertinent to the claimed invention. For example, as explained above, Beddo addresses calculating weights for each forecasting model; however, Beddo does not explicitly teach the claimed manner of calculating weights for each forecasting model. Caldeira describes an approach to improving the usefulness of weighted forecasting. In Beddo, one is inquiring about an optimal sales forecast. Analogously, in Caldeira, one is inquiring about an optimal interest rate forecast. It would have been obvious to one of ordinary skill in the art at the time of Applicant’s invention to modify the combination of Beddo, Zhang, and Brzezicki to incorporate the teachings of Caldeira to include, for each model of each training set i, the calculating model weights w(i) comprises: $w(i) = \frac{1}{\mathrm{RMSE}(i)}$. This improvement is suggested since Caldeira analogously provides this weighting scheme in order to alleviate model uncertainty (Caldeira, Page 80, first paragraph).
Although the combination of Beddo, Zhang, Brzezicki, and Caldeira teach the calculating model weights w(i) comprises: $w(i) = \frac{1}{\mathrm{RMSE}(i)}$, the combination does not explicitly teach the calculating model weights w(i) comprises: $w(i) = \frac{1}{1 + \mathrm{RMSE}(i)}$. (Emphasis added by the Examiner, specifically the “1+” in the denominator of the equation.)
From the same or similar field of endeavor, Kraftsow discloses adding a small constant to the denominator of a weighting equation (paragraphs [0039-0040] teach summing the reciprocal of an equation, wherein a constant factor of “cis” (such as 1) is added to the denominator). 
It would have been obvious to one of ordinary skill in the art to modify the equation of Caldeira to incorporate the teachings of Kraftsow by adding a small constant to the denominator of the weighting equation. This known technique is being applied to a known art ready for improvement. The improvement is provided by Kraftsow because the technique of Kraftsow limits the maximum score of the weights (Kraftsow, [0039]). Additionally, the art of Kirshenbaum (US 20110119209 A1) suggests a similar motivation, wherein a small constant is added to the denominator of a weighting formula to ensure the uncertainty value can never be zero (Kirshenbaum, [0063-0064]). A person having ordinary skill in the art would recognize the benefit of adding this constant factor to the denominator, which would prevent the weight of any single model from approaching infinity as the error value approaches zero.
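For illustration only, the bounding effect described above can be sketched in Python. This sketch is not drawn from Kraftsow, Kirshenbaum, Caldeira, or the claims; the function names are hypothetical, and the constant 1 is assumed as the value added to the denominator.

```python
def inverse_rmse_weight(rmse: float) -> float:
    """Weight of the form w(i) = 1 / RMSE(i).

    Grows without bound as RMSE(i) approaches zero and raises
    ZeroDivisionError when RMSE(i) is exactly zero.
    """
    return 1.0 / rmse


def bounded_inverse_rmse_weight(rmse: float) -> float:
    """Weight of the form w(i) = 1 / (1 + RMSE(i)).

    For any RMSE(i) >= 0 the denominator is at least 1, so the
    weight never exceeds 1.
    """
    return 1.0 / (1.0 + rmse)


if __name__ == "__main__":
    # Hypothetical error values chosen only to show the trend near zero.
    for rmse in (2.0, 0.5, 0.001, 0.0):
        try:
            unbounded = inverse_rmse_weight(rmse)
        except ZeroDivisionError:
            unbounded = float("inf")
        print(rmse, unbounded, bounded_inverse_rmse_weight(rmse))
```

Under these assumptions, a model with an error of zero receives a weight of exactly 1 rather than an undefined or unbounded weight.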

Regarding claims 7, 15, and 22, the combination of Beddo, Zhang, Brzezicki, Caldeira, and Kraftsow teaches all the limitations of claims 6, 14, and 21 above.
However, Beddo does not explicitly teach determining a sum S of the model weights w(i) comprising S=sum(w(i)); and normalizing a weight w'(i) for each w(i) comprising $w'(i) = \frac{w(i)}{S}$.
	From the same or similar field of endeavor, Caldeira further teaches: 
determining a sum S of the model weights w(i) comprising S=sum(w(i)) (page 86, section 5. “Thick modeling approach with RMSE-weights (FC-RMSE)” discloses calculating the sum of each model weight in the denominator of the equation: 
[Equation image: media_image2.png]
); 
and normalizing a weight w'(i) for each w(i) comprising $w'(i) = \frac{w(i)}{S}$ (page 86, section 5. “Thick modeling approach with RMSE-weights (FC-RMSE)” discloses calculating the normalized weight for each model using the following equation:
[Equation image: media_image3.png]
).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Beddo, Zhang, Brzezicki, Caldeira, and Kraftsow to incorporate the further teachings of Caldeira to include determining a sum S of the model weights w(i) comprising S=sum(w(i)); and normalizing a weight w'(i) for each w(i) comprising $w'(i) = \frac{w(i)}{S}$. One would be motivated to do so in order to alleviate model uncertainty (Caldeira, Page 80, first paragraph).
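For illustration only, the claimed normalization step can be sketched in Python. The function name and the guard against a non-positive sum are assumptions added for this sketch and do not appear in the claims or in Caldeira.

```python
from typing import List


def normalize_weights(weights: List[float]) -> List[float]:
    """Return w'(i) = w(i) / S for each weight, where S = sum(w(i))."""
    total = sum(weights)  # S = sum(w(i))
    if total <= 0:
        # Assumption for this sketch: weights are non-negative and not all zero.
        raise ValueError("sum of weights must be positive")
    return [w / total for w in weights]
```

After this step the normalized weights sum to 1, so each w'(i) can be read as the share of the combined forecast attributed to model M(i).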

	Regarding claims 8 and 16, the combination of Beddo, Zhang, Brzezicki, Caldeira, and Kraftsow teaches all the limitations of claims 7 and 15 above.
	However, Beddo does not explicitly teach the generating the forecast of future sales y using each model M(i) comprises: y = sum(f(M(i), x)*W(i)), wherein f comprises the function to forecast for each model, and x corresponds to each data point.
	From the same or similar field of endeavor, Caldeira further teaches: the generating the forecast of future sales y using each model M(i) comprises: y = sum(f(M(i), x)*W(i)) (page 85 section 2.8 “Combined forecasts” teaches: 
[Equation image: media_image4.png]
; Examiner’s Note: The “y” variable is the equivalent to the claimed y variable; The $\sum$ variable is the equivalent of the “sum” variable. The “w” variable is the equivalent of the claimed w variable. The second “y” is equivalent to f(M(i),x), wherein the “tau” is equivalent to the claimed x variable),
 wherein f comprises the function to forecast for each model (page 85 section 2.8 “Combined forecasts” teaches the “y” is the forecast of the mth model. Examiner’s Note: The second “y” is equivalent to f(M(i),x), wherein the “tau” is equivalent to the claimed x variable).
and x corresponds to each data point (page 85 section 2.8 “Combined forecasts” teaches the “tau” is each maturity value (i.e. a data point)). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Beddo, Zhang, Brzezicki, Caldeira, and Kraftsow to incorporate the further teachings of Caldeira to include the generating the forecast of future sales y using each model M(i) comprises: y = sum(f(M(i), x)*W(i)), wherein f comprises the function to forecast for each model, and x corresponds to each data point. One would be motivated to do so in order to alleviate model uncertainty (Caldeira, Page 80, first paragraph).
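For illustration only, the claimed combination y = sum(f(M(i), x)*W(i)) can be sketched in Python. In this sketch each model M(i) is represented by a hypothetical callable that plays the role of f(M(i), x), and W(i) is assumed to be the normalized weight of claims 7 and 15; neither representation is taken from the claims or from Caldeira.

```python
from typing import Callable, Sequence


def combined_forecast(
    models: Sequence[Callable[[float], float]],
    weights: Sequence[float],
    x: float,
) -> float:
    """Return y = sum(f(M(i), x) * W(i)) over all models i."""
    if len(models) != len(weights):
        raise ValueError("each model needs exactly one weight")
    # Each callable stands in for f(M(i), x): the forecast of model i at data point x.
    return sum(w * model(x) for model, w in zip(models, weights))
```

As a usage example under these assumptions, two hypothetical models `lambda x: 2 * x` and `lambda x: x + 10` with weights 0.25 and 0.75 would each be evaluated at the same data point x and their forecasts blended in proportion to the weights.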

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Lahouar et al. (Day-ahead load forecast using random forest and expert input selection, 2015) discloses performing bootstrapping to produce training and validation sets, wherein the system utilizes random sampling with replacement, wherein the values of (Xi, Yi) may appear many times
Runkana et al. (US 20180330300 A1) discloses that the predictive performance of the models is evaluated using the test dataset and is expressed in terms of root mean square error
Grindstaff et al. (US 20180285691 A1) discloses splitting known data into a training dataset and validation dataset, wherein 80% of the data is used for training and 20% of the data is used for validation
Alstad et al. (US 20160342751 A1) discloses randomly splitting the data points into a training set and validation set with 80% in the training set and 20% in the validation set, wherein the cross validation data is used to calculate a root mean squared error of the classifier
Garge (US 10832264 B1) discloses that the historical promotion data is split into a training set and test set by randomly sampling from the original data either with or without replacement
Eder (US 20160196587 A1) discloses randomly partitioning data into a training set and test set, wherein the system may utilize bootstrapping where the different training data sets are created by re-sampling with replacement from the original training set so data records may occur more than once

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sara G Brown whose telephone number is (469)295-9145. The examiner can normally be reached M-Th 8:00 am- 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Brian Epstein can be reached on (571) 270-5389. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SARA GRACE BROWN/Examiner, Art Unit 3683