DETAILED ACTION
This action is in response to the Applicant Response filed 20 October 2022 for application 16/375,627 filed 04 April 2019.
Claims 1-18, 20 are currently amended.
Claim 21 is new.
Claim 19 is cancelled.
Claims 1-18, 20-21 are pending.
Claims 1-18, 20-21 are rejected.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments regarding the 35 U.S.C. 101 rejections of claims 1-10, 14-18, 20 have been fully considered and, in light of the amendments to the claims, are persuasive. The 35 U.S.C. 101 rejections of claims 1-10, 14-18, 20 have been withdrawn.

Applicant’s arguments with respect to claims 1-18, 20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Objections
Claims 1-10, 15-18 are objected to because of the following informalities:
Claim 1, line 15: “the estimator ensemble” should read “the trained estimator ensemble”.
Claim 15, lines 2-3: “the means for generating the recommendation comprising” should read “the neural network comprising”.
Claim 17, lines 2-3: “the means for generating the recommendation comprising” should read “the neural network comprising”.
Claims 2-10, 16, 18 are objected to due to their dependence, either directly or indirectly, on claims 1, 15, 17.
Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 6-11, 14, 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Ghai et al. (Multi-Level Ensemble Learning Based Recommender System, hereinafter referred to as “Ghai”) in view of Géron, Aurélien (Hands-On Machine Learning with Scikit-Learn and TensorFlow Chapter 7 Excerpt, hereinafter referred to as “Geron”) and further in view of Lu et al. (US 2020/0231466 A1 – Intelligent Systems and Methods for Process and Asset Health Diagnosis, Anomaly Detection and Control in Wastewater Treatment Plants or Drinking Water Plants, hereinafter referred to as “Lu”).

Regarding claim 1 (Currently Amended), Ghai teaches in a digital medium environment to enhance a digital experience for a user (Ghai, section 1 – teaches providing recommendations [enhancing digital experience] in various environments, such as Amazon, Netflix, Facebook [digital medium environment]), a method implemented by at least one computing device (Ghai, section 7 – teaches training and testing models using R and its set of packages [using statistical programming packages requires computing devices]), the method comprising:  
receiving a request for a recommendation to enhance the digital experience for the user, the request including an indication of past user interactions of the user with the digital experience (Ghai, section 1 – teaches platforms such as Amazon, Netflix and Facebook providing recommendations; Ghai - teaches making personalized recommendations [Delivering personalized recommendations means a request, even if automated, is made for the recommendation]); 
generating, using the estimator ensemble (Ghai, Fig. 1, section 5 – teaches an ensemble of a plurality of models) and based on the indication of past user interactions (Ghai, section 6 – teaches the dataset of past movie recommendations by users), multiple estimation values (Ghai, Fig. 1, section 5 – teaches an ensemble of a plurality of models which takes in data [past user interactions] and generates multiple values, at least one value from each model of the ensemble); 
generating, using the neural network and based on the multiple estimation values, the recommendation to enhance the digital experience for the user (Ghai, Fig. 1, section 5 – teaches inputting the results from the ensemble models to a neural network to generate the movie recommendation [recommendation to enhance digital experience]; see also Ghai, Table 3); 
enhancing the digital experience based on the recommendation to generate an enhanced digital experience (Ghai, section 1 – teaches platforms such as Amazon, Netflix and Facebook [digital experience] providing recommendations [enhancement] generated using recommender systems); and 
displaying the enhanced digital experience (Ghai, section 1 – teaches platforms such as Amazon, Netflix and Facebook providing recommendations [It is obvious that to provide the recommendation the noted platforms would have to display the recommendation to the user]).
While Ghai teaches an estimator ensemble whose estimation values are used as inputs into a neural network to generate a recommendation, Ghai does not explicitly teach generating a trained estimator ensemble by training, using a first training data set, each of a singular value decomposition estimator, a neighborhood or clustering estimator, a factorization estimator, a time-aware estimator, a variational autoencoder estimator, and a gradient boosting estimator included in an estimator ensemble to generate an estimation value; training, using a second training data set and multiple estimation values generated by the trained estimator ensemble, a neural network to generate recommendations to enhance the digital experience for the user.
Geron teaches
generating a trained estimator ensemble by training, using a first training data set, … an estimator ensemble to generate an estimation value (Geron, pp. 22-24, Stacking section – teaches using a first training set (subset 1) to train a plurality of models [estimator ensemble] to generate estimation values; see also Geron, Fig. 7-13 (below)); 

[Figure: media_image1.png (greyscale), reproducing the Geron figures cited in this rejection]


training, using a second training data set and multiple estimation values generated by the trained estimator ensemble, a neural network to generate recommendations to enhance the digital experience for the user (Geron, pp. 22-24, Stacking section – teaches using a second training set (subset 2), different from the first training set, to generate predictions from the previously trained layer of models [estimator ensemble] to train a blender machine learning model to make predictions; see also Geron Fig. 7-14 (above) [In light of the Ghai reference above, the blender machine learning model can be interpreted as a neural network. Further, in light of the Ghai reference above, the predictions can be interpreted as recommendations to enhance the digital experience for the user.]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai with the teachings of Geron in order to combine multiple methods to generate more comprehensive and reliable results by ensuring that predictions are “clean” since the estimator ensemble does not see the second training set during its training in the field of ensemble learning for recommender systems (Geron, p. 1 – “Moreover, ... you will often use Ensemble methods near the end of a project, once you have already built a few good predictors, to combine them into an even better predictor. In fact, the winning solutions in Machine Learning competitions often involve several Ensemble methods ...”; Geron, p. 23 – “Next, the first layer predictors are used to make predictions on the second (held-out) set ... This ensures that the predictions are 'clean,' since the predictors never saw these instances during training. Now for each instance in the hold-out set there are ... predicted values. We can create a new training set using these predicted values as input features (which makes this new training set ...), and keeping the target values. The blender is trained on this new training set, so it learns to predict the target value given the first layer’s predictions.”).
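For illustration only, the hold-out stacking procedure quoted from Geron above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Ghai, Geron, or Lu: the toy ratings, the two mean-based first-layer predictors, and the single linear unit standing in for the blender neural network are all hypothetical, as are the names fit_mean_by, train_blender, and predict.

```python
import random

# Toy (user, item, rating) triples; all values are made up for illustration.
random.seed(0)
ratings = [(u, i, float((u + 2 * i) % 5 + 1)) for u in range(20) for i in range(15)]
random.shuffle(ratings)

# Hold-out split: subset 1 trains the first layer, subset 2 trains the blender.
half = len(ratings) // 2
subset1, subset2 = ratings[:half], ratings[half:]


def fit_mean_by(key_index, data):
    """Return a predictor that outputs the mean rating per key (user or item)."""
    sums, counts = {}, {}
    for row in data:
        k = row[key_index]
        sums[k] = sums.get(k, 0.0) + row[2]
        counts[k] = counts.get(k, 0) + 1
    overall = sum(r[2] for r in data) / len(data)
    means = {k: sums[k] / counts[k] for k in sums}
    return lambda row: means.get(row[key_index], overall)


# First layer: two simple stand-in estimators trained only on subset 1
# (assumption: these replace the claimed estimator types for brevity).
first_layer = [fit_mean_by(0, subset1), fit_mean_by(1, subset1)]

# "Clean" predictions: the first layer never saw subset 2 during training.
blend_X = [[est(row) for est in first_layer] for row in subset2]
blend_y = [row[2] for row in subset2]


def train_blender(X, y, epochs=500, lr=0.01):
    """Fit a single linear unit by gradient descent as a stand-in blender."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for features, target in zip(X, y):
            err = sum(wi * xi for wi, xi in zip(w, features)) + b - target
            w = [wi - lr * err * xi for wi, xi in zip(w, features)]
            b -= lr * err
    return w, b


weights, bias = train_blender(blend_X, blend_y)


def predict(row):
    """Stacked prediction: first-layer outputs fed through the blender."""
    feats = [est(row) for est in first_layer]
    return sum(wi * xi for wi, xi in zip(weights, feats)) + bias
```

In this sketch the blender is trained only on predictions made for subset 2, so none of its training inputs come from instances the first layer saw during its own training.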
While Ghai in view of Geron teaches an estimator ensemble whose estimation values are used as inputs into a neural network to generate a recommendation, Ghai in view of Geron does not explicitly teach that the estimator ensemble comprises a singular value decomposition estimator, a neighborhood or clustering estimator, a factorization estimator, a time-aware estimator, a variational autoencoder estimator, and a gradient boosting estimator.
Lu teaches generating a trained estimator ensemble (Lu, ¶0045 – teaches recommendations based on combination of models) …, each of a singular value decomposition estimator (Lu, ¶0020 – teaches an SVD model), a neighborhood or clustering estimator (Lu, ¶0020 – teaches clustering model), a factorization estimator (Lu, ¶0020 – teaches matrix factorization model), a time-aware estimator (Lu, ¶0019 – teaches RNN, LSTM, GRU models), a variational autoencoder estimator (Lu, ¶0020 – teaches VAE model), and a gradient boosting estimator (Lu, ¶0019 – teaches gradient boosting model) included in an estimator ensemble to generate an estimation value (Lu, ¶¶0004, 0045 – teaches recommendations based on combination of models, each with its own output).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai with the teachings of Lu in order to combine multiple methods to generate more comprehensive and reliable results in the field of ensemble learning for recommender systems (Lu, Abstract – “Disclosed herein are intelligent methods or systems for process and asset health diagnosis and anomaly detection in wastewater treatment plants or drinking water plants. The system includes the entire diagnosis methodology to determine the plant health status including process and asset health. The results can be pushed out to a user interface as notifications or to a control system for actions taken in accordance with the results. Data for diagnosis can be obtained from one or more of influent sensors, assets sensors, process sensors, effluent sensors, lab tests, plant dynamic or static simulated model, any other models to simulate or predict the plant process or asset, and the like. Compared with traditional human experience or simple threshold method, the systems and methods described herein combine a series of advanced methods or algorithms to get more comprehensive and reliable diagnosis results. The systems and methods described herein provide an intelligent water plant diagnosis service or product to end user for better monitoring and control and management of daily operations. The algorithms or models can be, but are not limited to supervised learning, unsupervised learning, risk recognition, anomaly detection, statistical analytics, cross validation, and the like. All the algorithms or models could be continuously upgraded as data loads.”).

Regarding claim 6 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 1 as noted above. Ghai further teaches the past user interactions including values provided by the user for different items included in the digital experience (Ghai, section 6 – teaches MovieLens dataset [past user interactions] which includes user ratings for movies).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron and Lu for the same reasons as disclosed in claim 1 above.

Regarding claim 7 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 6 as noted above. Ghai further teaches the past user interactions further including a time feature that indicates, for a particular item, a time that the particular item was first available to the user (Ghai, section 6 – teaches MovieLens dataset [past user interactions] which includes user ratings for movies and movie information including release date [first available]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron and Lu for the same reasons as disclosed in claim 6 above.

Regarding claim 8 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 6 as noted above. Ghai further teaches the past user interactions further including a time feature that indicates, for a particular item, a time that the user provided the value for the particular item (Ghai, section 6 – teaches MovieLens dataset [past user interactions] which includes user ratings for movies including timestamp for the rating).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron and Lu for the same reasons as disclosed in claim 6 above.

Regarding claim 9 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 6 as noted above. Ghai further teaches the past user interactions further including a time feature that indicates a time that the user first provided a value for any of the different items (Ghai, section 6 – teaches MovieLens dataset [past user interactions] which includes user ratings for movies, including timestamps for ratings of all movies rated by user).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron and Lu for the same reasons as disclosed in claim 6 above.

Regarding claim 10 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 6 as noted above. Ghai further teaches the past user interactions further including a time feature that indicates, for a particular item, a timespan between a time that the particular item was first available to the user and a time that the user provided the value for the particular item (Ghai, section 6 – teaches MovieLens dataset [past user interactions] which includes user ratings for movies, including the timestamp of each rating and the movie release date [Having the rating timestamp and the movie release date, the timespan can easily be calculated]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron and Lu for the same reasons as disclosed in claim 6 above.
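For illustration only, the time features discussed for claims 8-10 can be derived from a rating timestamp and a release date as sketched below. This is a minimal sketch under stated assumptions: the item names, the tuple layout, the use of Unix-second timestamps, and the helper names rating_time, first_rating_time, and timespan_days are hypothetical and are not taken from Ghai.

```python
from datetime import datetime, timezone

# Hypothetical data: item names, field layout, and values are illustrative only.
release_date = {
    "movie_a": datetime(1995, 1, 1, tzinfo=timezone.utc),
    "movie_b": datetime(1996, 6, 1, tzinfo=timezone.utc),
}

DAY = 86400  # seconds per day
base_a = int(release_date["movie_a"].timestamp())
base_b = int(release_date["movie_b"].timestamp())

# (user, item, rating value, rating timestamp in Unix seconds)
ratings = [
    (1, "movie_a", 4, base_a + 1000 * DAY),
    (1, "movie_b", 3, base_b + 10 * DAY),
    (2, "movie_a", 5, base_a + 30 * DAY),
]


def rating_time(row):
    """Time the user provided the value for the item."""
    return datetime.fromtimestamp(row[3], tz=timezone.utc)


def first_rating_time(user):
    """Earliest time the user provided a value for any item."""
    return min(rating_time(r) for r in ratings if r[0] == user)


def timespan_days(row):
    """Days between the item's release date and the user's rating."""
    return (rating_time(row) - release_date[row[1]]).days
```

Here the timespan is a plain subtraction of two known times, which is the calculation the bracketed note in the claim 10 mapping relies on.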

Regarding claim 11 (Currently Amended), Ghai teaches in a digital medium environment to enhance a digital experience for a user (Ghai, section 1 – teaches providing recommendations [enhancing digital experience] in various environments, such as Amazon, Netflix, Facebook [digital medium environment]), a method implemented by at least one computing device (Ghai, section 7 – teaches training and testing models using R and its set of packages [using statistical programming packages requires computing devices]), the method comprising: 
obtaining a first training data set that includes, for each of multiple users, values associated with the user for particular items (Ghai, section 6 – teaches MovieLens dataset with 100,000 entries of movie rating for 1682 movies by 943 users to create training dataset [See Geron reference below, the dataset is split into a first and second subset; see also Geron, Figs. 7-13, 7-14 (above)]); 
obtaining a second training data set that includes, for each of the multiple users, values associated with the user for particular items (Ghai, section 6 – teaches MovieLens dataset with 100,000 entries of movie rating for 1682 movies by 943 users to create training dataset [See Geron reference below, the dataset is split into a first and second subset; see also Geron, Figs. 7-13, 7-14 (above)]); 
enhancing, using the recommendation, the digital experience for the user (Ghai, section 1 – teaches platforms such as Amazon, Netflix and Facebook [digital experience] providing recommendations [enhancement] generated using recommender systems).
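For illustration only, one way a single ratings dataset could be divided into a first and a second training data set that each include values for each of the multiple users is sketched below. This is a minimal sketch under stated assumptions: the per-user grouping, the roughly even split, and the name split_per_user are hypothetical and are not described in Ghai or Geron.

```python
import random
from collections import defaultdict


def split_per_user(ratings, seed=0):
    """Split (user, item, value) rows so each user with >= 2 rows is in both subsets."""
    by_user = defaultdict(list)
    for row in ratings:
        by_user[row[0]].append(row)

    rng = random.Random(seed)
    first, second = [], []
    for rows in by_user.values():
        rows = rows[:]              # copy so the caller's data is not reordered
        rng.shuffle(rows)
        cut = max(1, len(rows) // 2)
        first.extend(rows[:cut])    # first training data set
        second.extend(rows[cut:])   # second training data set
    return first, second
```

A user with only one row lands in the first subset alone under this sketch, so a dataset with such users would need a different policy to satisfy the "each of the multiple users" condition.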
While Ghai teaches an estimator ensemble whose estimation values are used as inputs into a neural network to generate a recommendation, Ghai does not explicitly teach training, in a first stage using the first training data set, each of a singular value decomposition estimator, a neighborhood or clustering estimator, a factorization estimator, a time-aware estimator, a variational autoencoder estimator, and a gradient boosting estimator in an estimator ensemble to generate an estimation value; generating, using the estimator ensemble previously trained in the first stage and based on the second training data set, multiple estimation values; training, in a second stage using the multiple estimation values, a neural network to generate a recommendation to enhance the digital experience for the user.
Geron teaches
training, in a first stage using the first training data set, … an estimator ensemble to generate an estimation value (Geron, pp. 22-24, Stacking section – teaches using a first training set (subset 1) to train a plurality of models [estimator ensemble] to generate estimation values; see also Geron, Fig. 7-13 (above)); 
generating, using the estimator ensemble previously trained in the first stage and based on the second training data set, multiple estimation values (Geron, pp. 22-24, Stacking section – teaches using a second training set (subset 2), different from the first training set, to generate predictions from the previously trained layer of models [estimator ensemble] to train a blender machine learning model to make predictions; see also Geron Fig. 7-14 (above)); 
training, in a second stage using the multiple estimation values, a neural network to generate a recommendation to enhance the digital experience for the user (Geron, pp. 22-24, Stacking section – teaches using a second training set (subset 2), different from the first training set, to generate predictions from the previously trained layer of models [estimator ensemble] to train a blender machine learning model to make predictions; see also Geron Fig. 7-14 (above) [In light of the Ghai reference above, the blender machine learning model can be interpreted as a neural network. Further, in light of the Ghai reference above, the predictions can be interpreted as recommendations to enhance the digital experience for the user.]). 
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai with the teachings of Geron in order to combine multiple methods to generate more comprehensive and reliable results by ensuring that predictions are “clean” since the estimator ensemble does not see the second training set during its training in the field of ensemble learning for recommender systems (Geron, p. 1 – “Moreover, ... you will often use Ensemble methods near the end of a project, once you have already built a few good predictors, to combine them into an even better predictor. In fact, the winning solutions in Machine Learning competitions often involve several Ensemble methods ...”; Geron, p. 23 – “Next, the first layer predictors are used to make predictions on the second (held-out) set ... This ensures that the predictions are 'clean,' since the predictors never saw these instances during training. Now for each instance in the hold-out set there are ... predicted values. We can create a new training set using these predicted values as input features (which makes this new training set ...), and keeping the target values. The blender is trained on this new training set, so it learns to predict the target value given the first layer’s predictions.”).
While Ghai in view of Geron teaches an estimator ensemble whose estimation values are used as inputs into a neural network to generate a recommendation, Ghai in view of Geron does not explicitly teach that the estimator ensemble comprises a singular value decomposition estimator, a neighborhood or clustering estimator, a factorization estimator, a time-aware estimator, a variational autoencoder estimator, and a gradient boosting estimator.
Lu teaches each of a singular value decomposition estimator (Lu, ¶0020 – teaches an SVD model), a neighborhood or clustering estimator (Lu, ¶0020 – teaches clustering model), a factorization estimator (Lu, ¶0020 – teaches matrix factorization model), a time-aware estimator (Lu, ¶0019 – teaches RNN, LSTM, GRU models), a variational autoencoder estimator (Lu, ¶0020 – teaches VAE model), and a gradient boosting estimator (Lu, ¶0019 – teaches gradient boosting model) in an estimator ensemble to generate an estimation value (Lu, ¶¶0004, 0045 – teaches recommendations based on combination of models, each with its own output).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai with the teachings of Lu in order to combine multiple methods to generate more comprehensive and reliable results in the field of ensemble learning for recommender systems (Lu, Abstract – “Disclosed herein are intelligent methods or systems for process and asset health diagnosis and anomaly detection in wastewater treatment plants or drinking water plants. The system includes the entire diagnosis methodology to determine the plant health status including process and asset health. The results can be pushed out to a user interface as notifications or to a control system for actions taken in accordance with the results. Data for diagnosis can be obtained from one or more of influent sensors, assets sensors, process sensors, effluent sensors, lab tests, plant dynamic or static simulated model, any other models to simulate or predict the plant process or asset, and the like. Compared with traditional human experience or simple threshold method, the systems and methods described herein combine a series of advanced methods or algorithms to get more comprehensive and reliable diagnosis results. The systems and methods described herein provide an intelligent water plant diagnosis service or product to end user for better monitoring and control and management of daily operations. The algorithms or models can be, but are not limited to supervised learning, unsupervised learning, risk recognition, anomaly detection, statistical analytics, cross validation, and the like. All the algorithms or models could be continuously upgraded as data loads.”).

Regarding claim 14 (Currently Amended), Ghai teaches a system (Ghai, section 7 – teaches training and testing models using R and its set of packages [using statistical programming packages requires computing devices]) comprising: 
means for …, using … multiple estimation values generated by the trained estimator ensemble, a neural network to generate recommendations to enhance a digital experience for a user (Ghai, Fig. 1, section 5 – teaches inputting the results from the ensemble models to a neural network to generate the movie recommendation [recommendation to enhance digital experience]; see also Ghai, Table 3); and 
a display device to display (Ghai, section 1 – teaches platforms such as Amazon, Netflix and Facebook providing recommendations [It is obvious that to provide the recommendation the noted platforms would have to display the recommendation to the user]), based on a recommendation to enhance the digital experience for the user (Ghai, section 1 – teaches platforms such as Amazon, Netflix and Facebook [digital experience] providing recommendations [enhancement] generated using recommender systems) generated by the neural network, an enhanced digital experience (Ghai, section 1 – teaches platforms such as Amazon, Netflix and Facebook providing recommendations [It is obvious that to provide the recommendation the noted platforms would have to display the recommendation to the user]).
While Ghai teaches an estimator ensemble whose estimation values are used as inputs into a neural network to generate a recommendation, Ghai does not explicitly teach means for generating a trained estimator ensemble by training, using a first training data set, each of a singular value decomposition estimator, a neighborhood or clustering estimator, a factorization estimator, a time-aware estimator, a variational autoencoder estimator, and a gradient boosting estimator included in an estimator ensemble to generate an estimation value; means for training, using a second training data set and multiple estimation values generated by the trained estimator ensemble, a neural network to generate recommendations to enhance a digital experience for a user.
Geron teaches
means for generating a trained estimator ensemble by training, using a first training data set, … an estimator ensemble to generate an estimation value (Geron, pp. 22-24, Stacking section – teaches using a first training set (subset 1) to train a plurality of models [estimator ensemble] to generate estimation values; see also Geron, Fig. 7-13 (above)); 
means for training, using a second training data set and multiple estimation values generated by the trained estimator ensemble, a neural network to generate recommendations to enhance a digital experience for a user (Geron, pp. 22-24, Stacking section – teaches using a second training set (subset 2), different from the first training set, to generate predictions from the previously trained layer of models [estimator ensemble] to train a blender machine learning model to make predictions; see also Geron Fig. 7-14 (above) [In light of the Ghai reference above, the blender machine learning model can be interpreted as a neural network. Further, in light of the Ghai reference above, the predictions can be interpreted as recommendations to enhance the digital experience for the user.]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai with the teachings of Geron in order to combine multiple methods to generate more comprehensive and reliable results by ensuring that predictions are “clean” since the estimator ensemble does not see the second training set during its training in the field of ensemble learning for recommender systems (Geron, p. 1 – “Moreover, ... you will often use Ensemble methods near the end of a project, once you have already built a few good predictors, to combine them into an even better predictor. In fact, the winning solutions in Machine Learning competitions often involve several Ensemble methods ...”; Geron, p. 23 – “Next, the first layer predictors are used to make predictions on the second (held-out) set ... This ensures that the predictions are 'clean,' since the predictors never saw these instances during training. Now for each instance in the hold-out set there are ... predicted values. We can create a new training set using these predicted values as input features (which makes this new training set ...), and keeping the target values. The blender is trained on this new training set, so it learns to predict the target value given the first layer’s predictions.”).
While Ghai in view of Geron teaches an estimator ensemble whose estimation values are used as inputs into a neural network to generate a recommendation, Ghai in view of Geron does not explicitly teach that the estimator ensemble comprises a singular value decomposition estimator, a neighborhood or clustering estimator, a factorization estimator, a time-aware estimator, a variational autoencoder estimator, and a gradient boosting estimator.
Lu teaches means for generating a trained estimator ensemble (Lu, ¶0045 – teaches recommendations based on combination of models) …, each of a singular value decomposition estimator (Lu, ¶0020 – teaches an SVD model), a neighborhood or clustering estimator (Lu, ¶0020 – teaches clustering model), a factorization estimator (Lu, ¶0020 – teaches matrix factorization model), a time-aware estimator (Lu, ¶0019 – teaches RNN, LSTM, GRU models), a variational autoencoder estimator (Lu, ¶0020 – teaches VAE model), and a gradient boosting estimator (Lu, ¶0019 – teaches gradient boosting model) included in an estimator ensemble to generate an estimation value (Lu, ¶¶0004, 0045 – teaches recommendations based on combination of models, each with its own output). 
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai with the teachings of Lu in order to combine multiple methods to generate more comprehensive and reliable results in the field of ensemble learning for recommender systems (Lu, Abstract – “Disclosed herein are intelligent methods or systems for process and asset health diagnosis and anomaly detection in wastewater treatment plants or drinking water plants. The system includes the entire diagnosis methodology to determine the plant health status including process and asset health. The results can be pushed out to a user interface as notifications or to a control system for actions taken in accordance with the results. Data for diagnosis can be obtained from one or more of influent sensors, assets sensors, process sensors, effluent sensors, lab tests, plant dynamic or static simulated model, any other models to simulate or predict the plant process or asset, and the like. Compared with traditional human experience or simple threshold method, the systems and methods described herein combine a series of advanced methods or algorithms to get more comprehensive and reliable diagnosis results. The systems and methods described herein provide an intelligent water plant diagnosis service or product to end user for better monitoring and control and management of daily operations. The algorithms or models can be, but are not limited to supervised learning, unsupervised learning, risk recognition, anomaly detection, statistical analytics, cross validation, and the like. All the algorithms or models could be continuously upgraded as data loads.”).

Regarding claim 20 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 11 as noted above. Ghai further teaches the particular items including movies (Ghai, section 6 – teaches MovieLens dataset [past user interactions] which includes user ratings for movies).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron and Lu for the same reasons as disclosed in claim 11 above.

Regarding claim 21 (New), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 11 as noted above. Geron further teaches wherein the first training data set and the second training data set are two different training data sets (Geron, pp. 22-24, Stacking section – teaches using a first training set (subset 1) to train the first layer of models and using a second training set (subset 2), different from the first training set, to make predictions from the trained first layer which are used to train the second layer blender model; see also Geron, Figs. 7-13, 7-14 (above)).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron and Lu for the same reasons as disclosed in claim 11 above.
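For illustration, the hold-out stacking arrangement relied upon from Geron (first-layer predictors trained on subset 1, blender trained on their predictions for subset 2) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from any cited reference: the two first-layer estimators, the synthetic data, and all function names are hypothetical, and the blender is a simple least-squares combiner standing in for the neural network.

```python
# Hypothetical sketch of hold-out stacking; names and data are illustrative only.

def fit_mean(xs, ys):
    """First-layer estimator 1 (assumed): always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """First-layer estimator 2 (assumed): one-variable least-squares line."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

def fit_blender(rows, ys):
    """Blender (assumed linear, not a neural network): solves the 2x2
    normal equations over the two first-layer predictions."""
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * y for r, y in zip(rows, ys))
    b2 = sum(r[1] * y for r, y in zip(rows, ys))
    det = a11 * a22 - a12 * a12
    w1 = (b1 * a22 - a12 * b2) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return lambda r: w1 * r[0] + w2 * r[1]

# Synthetic data, split into two different training sets.
data = [(x, 2.0 * x + 1.0) for x in range(20)]
subset_1, subset_2 = data[:10], data[10:]
xs1, ys1 = zip(*subset_1)
xs2, ys2 = zip(*subset_2)

# Step 1: train the first layer on subset 1 only.
first_layer = [fit_mean(xs1, ys1), fit_line(xs1, ys1)]

# Step 2: predict on the held-out subset 2; the first layer never saw it.
blend_rows = [[est(x) for est in first_layer] for x in xs2]

# Step 3: train the blender on those predictions, keeping the target values.
blender = fit_blender(blend_rows, ys2)

def stacked_predict(x):
    """Full pipeline: first-layer predictions fed into the blender."""
    return blender([est(x) for est in first_layer])
```

Because subset 2 is withheld from the first layer, the blender's input features are predictions on unseen instances, which is the sense in which Geron describes them as "clean."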

Claims 2-3, 12, 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Ghai in view of Geron, further in view of Lu, further in view of Asmita et al. (Review on the Architecture, Algorithm and Fusion Strategies in Ensemble Learning, hereinafter referred to as “Asmita”) and further in view of Deng et al. (DeepCF: A Unified Framework of Representation Learning and Matching Function Learning in Recommender System, hereinafter referred to as “Deng”).

Regarding claim 2 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 1 as noted above. Ghai further teaches the recommendation being one of multiple potential values (Ghai, section 6 – teaches movie ratings from 1 to 5 [multiple values]; Ghai, section 7 – teaches predicting the rating of the recommendation). 
While Ghai in view of Geron and further in view of Lu teaches an ensemble followed by a neural network, Ghai in view of Geron and further in view of Lu does not explicitly teach the neural network comprising a 3-layer neural network followed by a mapping and normalization layer, the mapping and normalization layer outputting the recommendation as a set of probability distributions on the multiple values.
Asmita teaches the neural network comprising a 3-layer neural network (Asmita, Fig. 4 – teaches an ensemble of classifiers whose outputs are inputs into a 3-layer neural network, including an input layer, a hidden layer and an output layer; see also Asmita, section 7.5 – teaches recommender systems) followed by a mapping and normalization layer (Asmita, Fig. 4 – teaches an ensemble followed by a 3-layer neural network followed by a classifier ensemble [mapping and normalization layer]), the mapping and normalization layer outputting the recommendation (Asmita, Fig. 4 – teaches the classifier ensemble [mapping and normalization layer] generating the output [recommendation]; see also Asmita, section 7.5 – teaches recommender systems).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai in view of Geron and further in view of Lu with the teachings of Asmita in order to compare various model architectures in order to generate an ensemble model which combines multiple results to generate an improved result in the field of ensemble learning for recommender systems (Asmita, Abstract – “Ensemble Learning is an approach in machine learning to find a predictive model taking into considerations the opinions of various experts. Groups of people can often make better decisions than individuals especially when group members come in with their own biases. This document presents a review on the possible architectures that can be used to build an ensemble model, the techniques in which the opinions of the experts could be combined to give a general improved model and the algorithms for implementing the Ensemble Learning. Comparison of architectures is done on the basis of diversity, classification accuracy and memory consumption. This can be helpful in choosing the options depending on the requirement. In the last part an analysis of ensemble learning algorithms on the basis on Bias and Variance is included.”).
While Ghai in view of Geron, further in view of Lu and further in view of Asmita teaches a mapping and normalization layer that outputs the recommendation, Ghai in view of Geron, further in view of Lu and further in view of Asmita does not explicitly teach the output recommendations as a set of probability distributions on the multiple values.
Deng teaches the mapping and normalization layer outputting the recommendation (Deng, p. 65, Fusion and Learning section – teaches the ensemble outputs as inputs to a fully connected layer [neural network] and mapping the values to an output) as a set of probability distributions (Deng, p. 63, Learning the Model section – teaches outputting probability values for the outputs) on the multiple values (Deng, p. 66, Experiments section – teaches generating recommendations where outputs have multiple values, e.g., movie ratings).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai in view of Geron, further in view of Lu and further in view of Asmita with the teachings of Deng in order to use ensembles to combine the strengths of multiple models while overcoming the flaws of the individual models in the field of ensemble learning for recommender systems (Deng, Abstract – “In general, recommendation can be viewed as a matching problem, i.e., match proper items for proper users. However, due to the huge semantic gap between users and items, it’s almost impossible to directly match users and items in their initial representation spaces. To solve this problem, many methods have been studied, which can be generally categorized into two types, i.e., representation learning-based CF methods and matching function learning-based CF methods. Representation learning-based CF methods try to map users and items into a common representation space. In this case, the higher similarity between a user and an item in that space implies they match better. Matching function learning-based CF methods try to directly learn the complex matching function that maps user-item pairs to matching scores. Although both methods are well developed, they suffer from two fundamental flaws, i.e., the limited expressiveness of dot product and the weakness in capturing low-rank relations respectively. To this end, we propose a general framework named DeepCF, short for Deep Collaborative Filtering, to combine the strengths of the two types of methods and overcome such flaws. Extensive experiments on four publicly available datasets demonstrate the effectiveness of the proposed DeepCF framework.”).

Regarding claim 3 (Currently Amended), Ghai in view of Geron, further in view of Lu, further in view of Asmita and further in view of Deng teaches all of the limitations of the method of claim 2 as noted above. Deng further teaches the training the neural network including minimizing cross-entropy loss between the recommendations and one-hot representations of ground truths (Deng, pp. 63-64, Learning the Model section - teaches minimizing cross-entropy loss between prediction and ground truth; Deng, p. 65, Learning section - teaches minimizing cross-entropy function; Deng, pp. 66-67, Experiments section – teaches ratings, e.g., movie ratings, as one-hot ground truth values).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron, Lu, Asmita and Deng in order to minimize cross-entropy loss, which yields an objective function suitable for learning from implicit feedback (Deng, p. 64, Learning the Model section).
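For illustration, the claimed arrangement of a normalization layer that outputs a probability distribution over the multiple rating values, trained against a one-hot ground truth with cross-entropy loss, can be sketched as follows. This is a minimal sketch only: the use of softmax as the normalization, the five rating classes (1 to 5, as in the MovieLens ratings discussed above), and all function names are assumptions and are not drawn from Deng or the application.

```python
import math

def softmax(scores):
    """Assumed normalization: maps raw scores to probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(rating, num_classes=5):
    """One-hot representation of a ground-truth rating in 1..num_classes."""
    return [1.0 if i == rating - 1 else 0.0 for i in range(num_classes)]

def cross_entropy(probs, target, eps=1e-12):
    """Cross-entropy between a predicted distribution and a one-hot target."""
    return -sum(t * math.log(p + eps) for p, t in zip(probs, target))

# Hypothetical raw outputs for ratings 1..5; the ground-truth rating is 4.
probs = softmax([0.1, 0.2, 0.5, 2.0, 0.3])
loss_correct = cross_entropy(probs, one_hot(4))
loss_wrong = cross_entropy(probs, one_hot(1))
```

The loss is smaller when the predicted distribution places more probability on the ground-truth rating, so minimizing it during training pushes the output distribution toward the one-hot target.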

Regarding claim 12 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 11 as noted above. Ghai further teaches the recommendation being one of multiple potential values (Ghai, section 6 – teaches movie ratings from 1 to 5 [multiple values]; Ghai, section 7 – teaches predicting the rating of the recommendation).
While Ghai in view of Geron and further in view of Lu teaches an ensemble followed by a neural network, Ghai in view of Geron and further in view of Lu does not explicitly teach the neural network comprising a 3-layer neural network followed by a mapping and normalization layer, the mapping and normalization layer outputting the recommendation as a set of probability distributions on the multiple values. Further, Ghai in view of Geron and further in view of Lu does not explicitly teach the training the neural network comprising training the neural network to minimize cross-entropy loss between the recommendation and a one-hot representation of a ground truth.
Asmita teaches the neural network comprising a 3-layer neural network (Asmita, Fig. 4 – teaches an ensemble of classifiers whose outputs are inputs into a 3-layer neural network, including an input layer, a hidden layer and an output layer; see also Asmita, section 7.5 – teaches recommender systems) followed by a mapping and normalization layer (Asmita, Fig. 4 – teaches an ensemble followed by a 3-layer neural network followed by a classifier ensemble [mapping and normalization layer]), the mapping and normalization layer outputting the recommendation (Asmita, Fig. 4 – teaches the classifier ensemble [mapping and normalization layer] generating the output [recommendation]; see also Asmita, section 7.5 – teaches recommender systems).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai in view of Geron and further in view of Lu with the teachings of Asmita in order to compare various model architectures in order to generate an ensemble model which combines multiple results to generate an improved result in the field of ensemble learning for recommender systems (Asmita, Abstract – “Ensemble Learning is an approach in machine learning to find a predictive model taking into considerations the opinions of various experts. Groups of people can often make better decisions than individuals especially when group members come in with their own biases. This document presents a review on the possible architectures that can be used to build an ensemble model, the techniques in which the opinions of the experts could be combined to give a general improved model and the algorithms for implementing the Ensemble Learning. Comparison of architectures is done on the basis of diversity, classification accuracy and memory consumption. This can be helpful in choosing the options depending on the requirement. In the last part an analysis of ensemble learning algorithms on the basis on Bias and Variance is included.”).
While Ghai in view of Geron, further in view of Lu and further in view of Asmita teaches a mapping and normalization layer that outputs the recommendation, Ghai in view of Geron, further in view of Lu and further in view of Asmita does not explicitly teach the output recommendations as a set of probability distributions on the multiple values. Further, Ghai in view of Geron, further in view of Lu and further in view of Asmita does not explicitly teach the training the neural network comprising training the neural network to minimize cross-entropy loss between the recommendation and a one-hot representation of a ground truth.
Deng teaches the mapping and normalization layer outputting the recommendation (Deng, p. 65, Fusion and Learning section – teaches the ensemble outputs as inputs to a fully connected layer [neural network] and mapping the values to an output) as a set of probability distributions (Deng, p. 63, Learning the Model section – teaches outputting probability values for the outputs) on the multiple values (Deng, p. 66, Experiments section – teaches generating recommendations where outputs have multiple values, e.g., movie ratings), and the training the neural network comprising training the neural network to minimize cross-entropy loss between the recommendation and a one-hot representation of a ground truth (Deng, pp. 63-64, Learning the Model section – teaches minimizing cross-entropy loss between prediction and ground truth; Deng, p. 65, Learning section – teaches minimizing cross-entropy function; Deng, pp. 66-67, Experiments section – teaches ratings, e.g., movie ratings, as one-hot ground truth values).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai in view of Geron, further in view of Lu and further in view of Asmita with the teachings of Deng in order to use ensembles to combine the strengths of multiple models while overcoming the flaws of the individual models in the field of ensemble learning for recommender systems (Deng, Abstract – “In general, recommendation can be viewed as a matching problem, i.e., match proper items for proper users. However, due to the huge semantic gap between users and items, it’s almost impossible to directly match users and items in their initial representation spaces. To solve this problem, many methods have been studied, which can be generally categorized into two types, i.e., representation learning-based CF methods and matching function learning-based CF methods. Representation learning-based CF methods try to map users and items into a common representation space. In this case, the higher similarity between a user and an item in that space implies they match better. Matching function learning-based CF methods try to directly learn the complex matching function that maps user-item pairs to matching scores. Although both methods are well developed, they suffer from two fundamental flaws, i.e., the limited expressiveness of dot product and the weakness in capturing low-rank relations respectively. To this end, we propose a general framework named DeepCF, short for Deep Collaborative Filtering, to combine the strengths of the two types of methods and overcome such flaws. Extensive experiments on four publicly available datasets demonstrate the effectiveness of the proposed DeepCF framework.”).

Regarding claim 15 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the system of claim 14 as noted above. Ghai further teaches the recommendation being one of multiple potential values (Ghai, section 6 – teaches movie ratings from 1 to 5 [multiple values]; Ghai, section 7 – teaches predicting the rating of the recommendation), the means for generating the recommendation comprising a … neural network (Ghai, section 7 – teaches predicting the rating for the movie recommendation using a neural network).
While Ghai in view of Geron and further in view of Lu teaches an ensemble followed by a neural network, Ghai in view of Geron and further in view of Lu does not explicitly teach the neural network comprising a 3-layer neural network followed by a mapping and normalization layer, the mapping and normalization layer outputting the recommendation as a set of probability distributions on the multiple values.
Asmita teaches the means for generating the recommendation comprising a 3-layer neural network (Asmita, Fig. 4 – teaches an ensemble of classifiers whose outputs are inputs into a 3-layer neural network, including an input layer, a hidden layer and an output layer; see also Asmita, section 7.5 – teaches recommender systems) followed by a mapping and normalization layer (Asmita, Fig. 4 – teaches an ensemble followed by a 3-layer neural network followed by a classifier ensemble [mapping and normalization layer]), the mapping and normalization layer outputting the recommendation (Asmita, Fig. 4 – teaches the classifier ensemble [mapping and normalization layer] generating the output [recommendation]; see also Asmita, section 7.5 – teaches recommender systems).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai in view of Geron and further in view of Lu with the teachings of Asmita in order to compare various model architectures in order to generate an ensemble model which combines multiple results to generate an improved result in the field of ensemble learning for recommender systems (Asmita, Abstract – “Ensemble Learning is an approach in machine learning to find a predictive model taking into considerations the opinions of various experts. Groups of people can often make better decisions than individuals especially when group members come in with their own biases. This document presents a review on the possible architectures that can be used to build an ensemble model, the techniques in which the opinions of the experts could be combined to give a general improved model and the algorithms for implementing the Ensemble Learning. Comparison of architectures is done on the basis of diversity, classification accuracy and memory consumption. This can be helpful in choosing the options depending on the requirement. In the last part an analysis of ensemble learning algorithms on the basis on Bias and Variance is included.”).
While Ghai in view of Geron, further in view of Lu and further in view of Asmita teaches a mapping and normalization layer that outputs the recommendation, Ghai in view of Geron, further in view of Lu and further in view of Asmita does not explicitly teach the output recommendations as a set of probability distributions on the multiple values.
Deng teaches the mapping and normalization layer outputting the recommendation (Deng, p. 65, Fusion and Learning section – teaches the ensemble outputs as inputs to a fully connected layer [neural network] and mapping the values to an output) as a set of probability distributions (Deng, p. 63, Learning the Model section – teaches outputting probability values for the outputs) on the multiple values (Deng, p. 66, Experiments section – teaches generating recommendations where outputs have multiple values, e.g., movie ratings).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai in view of Geron, further in view of Lu and further in view of Asmita with the teachings of Deng in order to use ensembles to combine the strengths of multiple models while overcoming the flaws of the individual models in the field of ensemble learning for recommender systems (Deng, Abstract – “In general, recommendation can be viewed as a matching problem, i.e., match proper items for proper users. However, due to the huge semantic gap between users and items, it’s almost impossible to directly match users and items in their initial representation spaces. To solve this problem, many methods have been studied, which can be generally categorized into two types, i.e., representation learning-based CF methods and matching function learning-based CF methods. Representation learning-based CF methods try to map users and items into a common representation space. In this case, the higher similarity between a user and an item in that space implies they match better. Matching function learning-based CF methods try to directly learn the complex matching function that maps user-item pairs to matching scores. Although both methods are well developed, they suffer from two fundamental flaws, i.e., the limited expressiveness of dot product and the weakness in capturing low-rank relations respectively. To this end, we propose a general framework named DeepCF, short for Deep Collaborative Filtering, to combine the strengths of the two types of methods and overcome such flaws. Extensive experiments on four publicly available datasets demonstrate the effectiveness of the proposed DeepCF framework.”).

Regarding claim 16 (Currently Amended), Ghai in view of Geron, further in view of Lu, further in view of Asmita and further in view of Deng teaches all of the limitations of the system of claim 15 as noted above. Deng further teaches the means for training the neural network including minimizing cross-entropy loss between the recommendations and one-hot representations of ground truths (Deng, pp. 63-64, Learning the Model section – teaches minimizing cross-entropy loss between prediction and ground truth; Deng, p. 65, Learning section – teaches minimizing cross-entropy function; Deng, pp. 66-67, Experiments section – teaches ratings, e.g., movie ratings, as one-hot ground truth values).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron, Lu, Asmita and Deng in order to minimize cross-entropy loss, which yields an objective function suitable for learning from implicit feedback (Deng, p. 64, Learning the Model section).

Claims 4-5, 13, 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Ghai in view of Geron, further in view of Lu and further in view of Asmita et al. (Review on the Architecture, Algorithm and Fusion Strategies in Ensemble Learning, hereinafter referred to as “Asmita”).

Regarding claim 4 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 1 as noted above. Ghai further teaches the recommendation being one of multiple potential values (Ghai, section 6 – teaches movie ratings from 1 to 5 [multiple values]; Ghai, section 7 – teaches predicting the rating of the recommendation), the neural network … outputting the recommendation as a single value output (Ghai, section 7 – teaches predicting the rating for the movie recommendation).
While Ghai in view of Geron and further in view of Lu teaches an ensemble passing outputs to a neural network in order to generate recommendations, Ghai in view of Geron and further in view of Lu does not explicitly teach that the neural network has 3 layers.
Asmita teaches the neural network comprising a 3-layer neural network (Asmita, Fig. 4 – teaches an ensemble of classifiers whose outputs are inputs into a 3-layer neural network, including an input layer, a hidden layer and an output layer; see also Asmita, section 7.5 – teaches recommender systems).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai in view of Geron and further in view of Lu with the teachings of Asmita in order to compare various model architectures in order to generate an ensemble model which combines multiple results to generate an improved result in the field of ensemble learning for recommender systems (Asmita, Abstract – “Ensemble Learning is an approach in machine learning to find a predictive model taking into considerations the opinions of various experts. Groups of people can often make better decisions than individuals especially when group members come in with their own biases. This document presents a review on the possible architectures that can be used to build an ensemble model, the techniques in which the opinions of the experts could be combined to give a general improved model and the algorithms for implementing the Ensemble Learning. Comparison of architectures is done on the basis of diversity, classification accuracy and memory consumption. This can be helpful in choosing the options depending on the requirement. In the last part an analysis of ensemble learning algorithms on the basis on Bias and Variance is included.”).

Regarding claim 5 (Currently Amended), Ghai in view of Geron, further in view of Lu and further in view of Asmita teaches all of the limitations of the method of claim 4 as noted above. Ghai further teaches the training the neural network including minimizing root mean square errors (Ghai, section 7 – teaches minimizing RMSE; see also, Ghai, Tables 1, 3) between the recommendations and ground truth values (Ghai, section 6 – teaches movie ratings from 1 to 5 [multiple values]; Ghai, section 7 – teaches predicting the rating of the recommendation; [Predicting a single value for the rating of a recommendation means comparing the recommendation to the ground truth rating given in the dataset]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron, Lu and Asmita for the same reasons as disclosed in claim 4 above.
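For illustration, the root mean square error between single-value rating predictions and ground-truth ratings, as referenced in the mapping above, can be computed as follows. This is a minimal sketch; the function name and the sample ratings are hypothetical and are not taken from Ghai.

```python
import math

def rmse(predictions, ground_truths):
    """Root mean square error between predicted and ground-truth ratings."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")
    squared_errors = [(p - g) ** 2 for p, g in zip(predictions, ground_truths)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical single-value outputs compared with ratings on a 1-to-5 scale.
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4.0, 4.0, 1.0, 5.0]
error = rmse(predicted, actual)
```

A training procedure that minimizes this quantity drives each single-value output toward the rating recorded in the dataset.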

Regarding claim 13 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the method of claim 11 as noted above. Ghai further teaches the recommendation being one of multiple potential values (Ghai, section 6 – teaches movie ratings from 1 to 5 [multiple values]; Ghai, section 7 – teaches predicting the rating of the recommendation), the neural network … outputting the recommendation as a single value output (Ghai, section 7 – teaches predicting the rating for the movie recommendation), the training the neural network comprising training the neural network to minimize root mean square error (Ghai, section 7 – teaches minimizing RMSE; see also, Ghai, Tables 1, 3) between the recommendation and a ground truth value (Ghai, section 6 – teaches movie ratings from 1 to 5 [multiple values]; Ghai, section 7 – teaches predicting the rating of the recommendation; [Predicting a single value for the rating of a recommendation means comparing the recommendation to the ground truth rating given in the dataset]).
While Ghai in view of Geron and further in view of Lu teaches an ensemble passing outputs to a neural network in order to generate recommendations, Ghai in view of Geron and further in view of Lu does not explicitly teach that the neural network has 3 layers.
Asmita teaches the neural network comprising a 3-layer neural network (Asmita, Fig. 4 – teaches an ensemble of classifiers whose outputs are inputs into a 3-layer neural network, including an input layer, a hidden layer and an output layer; see also Asmita, section 7.5 – teaches recommender systems).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai in view of Geron and further in view of Lu with the teachings of Asmita in order to compare various model architectures in order to generate an ensemble model which combines multiple results to generate an improved result in the field of ensemble learning for recommender systems (Asmita, Abstract – “Ensemble Learning is an approach in machine learning to find a predictive model taking into considerations the opinions of various experts. Groups of people can often make better decisions than individuals especially when group members come in with their own biases. This document presents a review on the possible architectures that can be used to build an ensemble model, the techniques in which the opinions of the experts could be combined to give a general improved model and the algorithms for implementing the Ensemble Learning. Comparison of architectures is done on the basis of diversity, classification accuracy and memory consumption. This can be helpful in choosing the options depending on the requirement. In the last part an analysis of ensemble learning algorithms on the basis on Bias and Variance is included.”).

Regarding claim 17 (Currently Amended), Ghai in view of Geron and further in view of Lu teaches all of the limitations of the system of claim 14 as noted above. Ghai further teaches the recommendation being one of multiple potential values (Ghai, section 6 – teaches movie ratings from 1 to 5 [multiple values]; Ghai, section 7 – teaches predicting the rating of the recommendation), the means for generating the recommendation … outputting the recommendation as a single value output (Ghai, section 7 – teaches predicting the rating for the movie recommendation).
While Ghai in view of Geron and further in view of Lu teaches an ensemble passing outputs to a means for generating the recommendation comprising a neural network, Ghai in view of Geron and further in view of Lu does not explicitly teach that the neural network has 3 layers.
Asmita teaches the means for generating the recommendation comprising a 3-layer neural network (Asmita, Fig. 4 – teaches an ensemble of classifiers whose outputs are inputs into a 3-layer neural network, including an input layer, a hidden layer and an output layer; see also Asmita, section 7.5 – teaches recommender systems).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Ghai in view of Geron and further in view of Lu with the teachings of Asmita in order to compare various model architectures in order to generate an ensemble model which combines multiple results to generate an improved result in the field of ensemble learning for recommender systems (Asmita, Abstract – “Ensemble Learning is an approach in machine learning to find a predictive model taking into considerations the opinions of various experts. Groups of people can often make better decisions than individuals especially when group members come in with their own biases. This document presents a review on the possible architectures that can be used to build an ensemble model, the techniques in which the opinions of the experts could be combined to give a general improved model and the algorithms for implementing the Ensemble Learning. Comparison of architectures is done on the basis of diversity, classification accuracy and memory consumption. This can be helpful in choosing the options depending on the requirement. In the last part an analysis of ensemble learning algorithms on the basis on Bias and Variance is included.”).

Regarding claim 18 (Currently Amended), Ghai in view of Geron, further in view of Lu and further in view of Asmita teaches all of the limitations of the system of claim 17 as noted above. Ghai further teaches the means for training the neural network including minimizing root mean square errors (Ghai, section 7 – teaches minimizing RMSE; see also Ghai, Tables 1, 3) between the recommendations and ground truth values (Ghai, section 6 – teaches movie ratings from 1 to 5 [multiple values]; Ghai, section 7 – teaches predicting the rating of the recommendation; [Predicting a single value for the rating of a recommendation means comparing the recommendation to the ground truth rating given in the dataset]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Ghai, Geron, Lu and Asmita for the same reasons as set forth for claim 17 above.
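For reference, the root mean square error discussed for claim 18 compares each predicted rating to its ground truth rating and can be computed as sketched below. This is a minimal illustration only: the function name and the sample ratings are hypothetical and are not taken from Ghai, Tables 1 or 3.

```python
import math


def rmse(predictions, ground_truth):
    """Root mean square error between predicted and ground truth ratings."""
    if not predictions or len(predictions) != len(ground_truth):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, ground_truth)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical ratings on a 1-to-5 scale; each prediction is off by one.
print(rmse([3.0, 5.0], [4.0, 4.0]))  # prints 1.0
```

A training procedure that minimizes this quantity adjusts the model so that its single-value outputs move closer to the ratings recorded in the dataset; an RMSE of zero would mean every prediction matches its ground truth value.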

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
Yang et al. (Neural Network Ensemble: Combining Multiple Models for Enhanced Performance Using a Multistage Approach) teaches a neural ensemble with a plurality of first layer neural networks which feed into a second layer neural network.
Wozniak et al. (A Survey of Multiple Classifier Systems as Hybrid Systems) teaches multilayer ensemble networks.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARSHALL WERNER whose telephone number is (469) 295-9143. The examiner can normally be reached on Monday – Thursday 7:30 AM – 4:30 PM ET.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MARSHALL L WERNER/
Examiner, Art Unit 2125

/KAMRAN AFSHAR/
Supervisory Patent Examiner, Art Unit 2125