DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/5/2020 has been entered.
 
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Applicant remarks:
The Office Action relies on hindsight in its determination of obviousness. MPEP § 2145(X)(A) and MPEP § 2142. The Office Action relies on the Applicant's own disclosure to read features into the cited references that are not actually disclosed or suggested in the cited references Radhakrishnan and Hong. Therefore, the claimed invention is not obvious in view of the cited references.

Examiner Response:
Applicant's arguments filed 10/05/2020 with regard to hindsight reasoning have been fully considered but they are not persuasive.
In response to applicant's argument that the examiner's conclusion of obviousness is based upon improper hindsight reasoning, it must be recognized that any judgment on obviousness is in a sense necessarily a reconstruction based upon hindsight reasoning.  But so long as it takes into account only knowledge which was within the level of ordinary skill at the time the claimed invention was made, and does not include knowledge gleaned only from the applicant's disclosure, such a reconstruction is proper.  See In re McLaughlin, 443 F.2d 1392, 170 USPQ 209 (CCPA 1971).
	
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
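A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.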

Claims 1-17, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Radhakrishnan (U.S. 2013/0246301) in view of Hong (U.S. Patent No. 6546379) and Huang (U.S. Patent No. 6356213).

Regarding claim 1, Radhakrishnan teaches a machine learning system, comprising: 
one or more processors (Radhakrishnan: Paragraph [0025] “the use of instructions that are executable by one or more processors”); and

a trip data store comprising a plurality of historical trip records (Radhakrishnan: Paragraph [0100] “instances of customer pickups and drop-offs may be recorded, including recording the pickup locations (610) and drop-offs (620) the information may be stored for use as historical data.” A plurality of historical trip records is taught as instances of customer pickups and drop-offs that may be stored for use as historical data.), each trip record characterizing attributes of an observed trip associated with a rider and a driver (Radhakrishnan: Paragraph [0063] “The rating information of each participant may be recorded as part of that user's profile information, and thus stored in the profile store 350. Thus, each feedback results in a respective driver rating update 387 (provided by the customer) or customer rating update 391 (provided from the driver)” Rider is taught as the customer. Driver is taught as the driver. Trip is taught as the trip the driver provided to the customer in which the feedback was given.), each trip record comprising: a plurality of observed exogenous variables associated with the rider (Radhakrishnan: Paragraph [0028] “rating information that identifies a reputation or class of user of the customer” Observed exogenous variables associated with the rider are taught as rating information that identifies a reputation of a customer [first entity].), the driver (Radhakrishnan: Paragraph [0068] “the [previous] rating information may be used as a parameter in selecting one driver to be paired to a customer” Observed exogenous variables associated with the driver are taught as the rating information used to help select the driver.) and the trip, wherein the value of each exogenous variable is known prior to the start of the trip (Radhakrishnan: Paragraph [0051] “may require use of geographic information resource (GIR) 326 that identifies, for example, proximity by distance or time of individual drivers to the requesting customer. The geographic information may also be used to identify the geographic location of individual parties based on their communicated GPS information. The geographic information resource may include maps or codes that enable locating parties from their GPS coordinates, as well as information needed for calculating time/distance separating the two parties.” Observed exogenous variables associated with the trip observed prior to the start of the trip are taught as locating the parties by their GPS to calculate the time/distance of the trip.); 

a plurality of observed endogenous variables associated with the rider, the driver and the trip after the start of the trip (Radhakrishnan: Paragraph [0051] “position information, such as pickup and drop-off locations and route information or distance traveled, and (ii) stop/wait time” Plurality of observed endogenous variables associated with the rider, the driver and the trip after the start of the trip are taught as position information, such as pickup and drop-off locations and route information or distance traveled, and (ii) stop/wait time.), wherein the value of each endogenous variable is known after the start of the trip (Radhakrishnan: Paragraph [0051] “position information, such as pickup and drop-off locations and route information or distance traveled, and (ii) stop/wait time” Plurality of observed endogenous variables associated with the rider, the driver and the trip after the start of the trip are taught as position information, such as pickup and drop-off locations and route information or distance traveled, and (ii) stop/wait time.); 

a plurality of elicited endogenous variables associated with the rider (Radhakrishnan: Paragraph [0066] “the transport service may prompt the customer to answer a series of yes or no questions as a means of evaluating the performance of a particular driver.” A plurality of elicited endogenous variables associated with the rider is taught as customers being prompted to answer a series of yes or no questions as a means of evaluating the driver for the quality of the trip provided.), the driver and the trip, wherein the value of each elicited endogenous variable is elicited from the driver or the rider during or after the trip (Radhakrishnan: Paragraph [0066] “the transport service may prompt the customer to answer a series of yes or no questions as a means of evaluating the performance of a particular driver.” A plurality of elicited endogenous variables associated with the rider is taught as customers being prompted to answer a series of yes or no questions as a means of evaluating the driver for the quality of the trip provided.); 

an observed dependent variable for the driver provided by the rider for the trip after the trip is completed (Radhakrishnan: Paragraph [0066] “a rating interface 384, 388 may be provided for each of the customer and the driver. The rating interface 384 of the customer enables the customer to record feedback 385 about a driver, or more generally, about the transport party (e.g. driver or the taxi or limousine company that provided the transport). Likewise, the rating interface 388 enables the driver to record feedback about the customer 389. The rating information of each participant may be recorded as part of that user's profile information, and thus stored in the profile store” An observed dependent variable for the driver provided by the rider for the trip after the trip is completed is taught as the rating interface of the customer enables the customer to record feedback about a driver.); 
one or more memory resources storing instructions that, when executed by the one or more processors, cause the one or more processors (Radhakrishnan: Paragraph [0025] “devices that utilize processors, memory, and instructions stored on computer-readable mediums. Additionally, embodiments may be implemented in the form of computer-programs, or a computer usable carrier medium capable of carrying such a program.”) variable… exogenous…associated with the trip (Radhakrishnan: Paragraph [0051] “may require use of geographic information resource (GIR) 326 that identifies, for example, proximity by distance or time of individual drivers to the requesting customer. The geographic information may also be used to identify the geographic location of individual parties based on their communicated GPS information. The geographic information resource may include maps or codes that enable locating parties from their GPS coordinates, as well as information needed for calculating time/distance separating the two parties.” Observed exogenous variables associated with the trip observed prior to the start of the trip are taught as locating the parties by their GPS to calculate the time/distance of the trip.)



executing a second predictive model to predict a second component of the predicted output variable …, the second predictive model receiving as inputs: (1) the first residual value, and (2) … and the plurality of…variables associated with the trip; and determining a second residual value as a difference between the predicted output variable and the sum of the first and second components;

executing a third predictive model to predict a third component of the predicted output variable…, the third predictive model receiving as inputs: (1) the second residual value, and (2) the plurality of…variables associated with the trip. 

Hong further teaches to perform steps comprising: executing a first predictive model (Hong: Col 8, Lines 9-19. “cascade boosting of a decision tree is shown. Recall that segments of the population in a decision tree are always mutually exclusive. An initial predictive model is built in block 701 that applies to one or more subordinate models for a set of initial training data points. The accuracy performance of the current model, initially the initial predictive model, is observed in block 702. The observed performance of each subordinate model on its corresponding segment (of either the training data or separate validation data) is used to estimate future accuracy performance.” Executing a first predictive model is taught as cascade boosting of a decision tree that is shown by an initial predictive model.) to predict a first expected component of a predicted output variable (Hong: Col 7, Lines 52-54. “The estimates in item (c) may be derived from performance on either training data” The predicted first component is taught as the estimate based on training data.) …, the first predictive model receiving as inputs: the observed dependent variable, and … the observed…variables (Hong: Col 1, Lines 17-24. “Predictive modeling refers to generating a model from a given set of data points (also called "examples" or "records"), where each point is comprised of fields (also called "attributes" or "features" or "variables"), some of which are designated as target fields (also called "dependent variables") whose values are to be predicted from the values of the others (also called "independent variables")” Hong teaches generating the predictive model based on inputted variables. Observed dependent variable is taught as the target field also called the dependent variable) …; …

executing a second predictive model to predict a second component (Hong: Col 7, Lines 52-54. “The estimates in item (c) may be derived from performance on either training data” The predicted second component is taught as the estimate based on training data.) of the predicted output variable (Hong: Col 8, Lines 9-19. “cascade boosting of a decision tree is shown. Recall that segments of the population in a decision tree are always mutually exclusive. An initial predictive model is built in block 701 that applies to one or more subordinate models for a set of initial training data points. The accuracy performance of the current model, initially the initial predictive model, is observed in block 702. The observed performance of each subordinate model on its corresponding segment (of either the training data or separate validation data) is used to estimate future accuracy performance.” A second predictive model configured to predict a second component of the predicted output variable is taught as cascade boosting of one or more subordinate models for a set of initial training data points [variables].) …, …, and (2) … and the plurality of…variables (Hong: Col 1, Lines 17-24. “Predictive modeling refers to generating a model from a given set of data points (also called "examples" or "records"), where each point is comprised of fields (also called "attributes" or "features" or "variables"), some of which are designated as target fields (also called "dependent variables") whose values are to be predicted from the values of the others (also called "independent variables")” Hong teaches generating the predictive model based on target fields which are a given set of data points[received as inputs].) …;…

executing a third predictive model to predict a third component (Hong: Col 7, Lines 52-54. “The estimates in item (c) may be derived from performance on either training data” The predicted third component is taught as the estimate based on training data.) of the predicted output variable (Hong: Col 8, Lines 9-19. “cascade boosting of a decision tree is shown. Recall that segments of the population in a decision tree are always mutually exclusive. An initial predictive model is built in block 701 that applies to one or more subordinate models for a set of initial training data points. The accuracy performance of the current model, initially the initial predictive model, is observed in block 702. The observed performance of each subordinate model on its corresponding segment (of either the training data or separate validation data) is used to estimate future accuracy performance.” A third predictive model configured to predict a third component of a predicted output variable is taught as cascade boosting of one or more subordinate models for a set of initial training data points [variables].) …, ,… and (2) the plurality of…variables (Hong: Col 1, Lines 17-24. “Predictive modeling refers to generating a model from a given set of data points (also called "examples" or "records"), where each point is comprised of fields (also called "attributes" or "features" or "variables"), some of which are designated as target fields (also called "dependent variables") whose values are to be predicted from the values of the others (also called "independent variables")” Variables of the third predictive model are taught as generating the predictive model based on a set of given data points [variables].)... 

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the user feedback variables of Radhakrishnan with the cascade boosting of predictive models of Hong in order to allow using cascade boosting models to avoid restricting attention to decision trees, thereby using statistically sound estimates to determine which leaves perform well and cover the most training points (Hong: Col 7, Lines 40-45. “Cascade boosting uses statistically sound estimates (see V. N. Vapnik, “Statistical Learning Theory”, John Wiley and Sons: 1998) to determine which leaves are likely to perform well, while the PART method just retains the leaf covering the most training points.”).

Radhakrishnan in view of Hong do not explicitly disclose determining a first residual value as a difference between the predicted output variable and the first expected component;… 
Huang further teaches (Huang: Col 5, Lines 40-44. “With the cascaded application of the prediction methodology to the resultant process of the preferred embodiment, three predictors, P1, P2 and P3, are provided.”) determining a first residual value as a difference between the predicted output variable and the first expected component (Huang: Abstract “…a residual as the difference between the signal and its predicted value …The residual value to be used for encoding the input signal increment is determined as the difference between the signal value and the selected predictor value.” Determining a first residual value as a difference between the predicted output variable and the first expected component is taught as the residual value to be used for encoding the input signal increment being determined as the difference between the signal value and the selected predictor value [i.e. a difference between the predicted output variable and the first expected component].);… the second predictive model receiving as inputs: (1) the first residual value (Huang: Col. 6, Lines 43-50. “That predicted value of Residual 1, R, , is rounded to the nearest integer, IR, ), at step 233. The second predictor, P2, is developed from the predicted value of Residual 1 as the integer value of the sum of S, and R, i.e., P=S+R,'). A second residual value, R, (or Residual 2), is computed, at step 236, as the difference between the value of Residual 1 (from step 221) and the predicted value of Residual 1.” The second predictive model receiving as inputs: (1) the first residual value is taught as Residual 1 being input to the second predictor.)… and determining a second residual value as a difference between the predicted output variable and the sum of the first and second components (Huang: Col 6, Lines 45-50. “The second predictor, P2, is developed from the predicted value of Residual 1 as the integer value of the sum of S, and R, i.e., P=S+R,'). A second residual value, R, (or Residual 2), is computed, at step 236, as the difference between the value of Residual 1 (from step 221) and the predicted value of Residual 1.” Determining a second residual value as a difference between the predicted output variable and the sum of the first and second components is taught as a second residual value, R, (or Residual 2), being computed, at step 236, as the difference between the value of Residual 1 (from step 221) and the predicted value of Residual 1.);… the third predictive model receiving as inputs: (1) the second residual value (Huang: Col. 6, Lines 55-61. “The third predictor, P3, is developed from the predicted value of Residual 2 as the integer value of the sum of S, R, and R,’ i.e., P3=S+ R+R,’). A third residual value, R. (or Residual 3), may be computed, at step 251, as the difference between the value of Residual 2 (from step 236) and the predicted value of Residual 2” The third predictive model receiving as inputs: (1) the second residual value is taught as the third predictor that receives the second residual value.),

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Radhakrishnan and Hong with the residual values and predictors of Huang in order to allow a single prediction value to be selected as a statistical representation of the multiple predictor values, thereby achieving a good trade-off between improved coding efficiency and computational complexity (Huang: Col 5, Lines 30-45. “It is believed by the inventors that the number used for the preferred embodiment represents a good trade-off between coding-efficiency improvement and computational complexity. …single predictor for coding of the input signal sample value, one needs to determine a predictor which is statistically representative of the three computed predictors, P1, P2 and P3.”).
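
For illustration only, and not as a characterization of any cited reference, the residual cascade recited in claim 1 (each later model receiving the residual left by the earlier components) may be sketched as follows. The function names `fit_line` and `residual_cascade`, the use of one-variable least-squares fits, and the single input list per stage are assumptions made for this sketch; they are not taken from the claims, Radhakrishnan, Hong, or Huang.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of ys ~ a + b * xs; returns a prediction function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    # A constant input carries no slope information, so fall back to b = 0.
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def residual_cascade(target, stage_inputs):
    """Fit one model per stage to the residual left by the earlier stages.

    target: observed output values, one per record.
    stage_inputs: one input list per stage (hypothetical stand-ins for the
        exogenous, observed endogenous, and elicited endogenous variables).
    Returns (components, residuals), each a list with one entry per stage.
    """
    residual = list(target)
    running_sum = [0.0] * len(target)
    components, residuals = [], []
    for xs in stage_inputs:
        model = fit_line(xs, residual)   # stage input: current residual + variables
        component = [model(x) for x in xs]
        running_sum = [s + c for s, c in zip(running_sum, component)]
        # Residual = output minus the sum of all components predicted so far.
        residual = [t - s for t, s in zip(target, running_sum)]
        components.append(component)
        residuals.append(residual)
    return components, residuals
```

In this sketch the second-stage residual is the difference between the output and the sum of the first and second components, which is the relationship the claim language recites; the choice of model at each stage is interchangeable.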

Claim 6 is a method claim corresponding to system claim 1 and is rejected for the same reasons as given in the rejection of that claim. Similarly, claim 11 is a computer program product claim corresponding to system claim 1 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 2, Radhakrishnan in view of Hong and Huang teaches the system of claim 1, Radhakrishnan further teaches wherein the instructions further cause the one or more processors (Radhakrishnan: Paragraph [0025] “the use of instructions that are executable by one or more processors”) to perform steps comprising: 

determining for each driver (Radhakrishnan: Paragraph [0063] “The rating information of each participant may be recorded as part of that user's profile information, and thus stored in the profile store 350. Thus, each feedback results in a respective driver rating update 387 (provided by the customer) or customer rating update 391 (provided from the driver)” Driver is taught as the driver.) …, and storing the driver … in a driver record (Radhakrishnan: Paragraph [0100] “instances of customer pickups and drop-offs may be recorded, including recording the pickup locations (610) and drop-offs (620) the information may be stored for use as historical data.” A driver record is taught as instances of customer pickups and drop-offs may be stored for use as historical data [in a driver record].); 

determining for each driver…driver…driver for a plurality of historical trips and…(Radhakrishnan: Paragraph [0100] “instances of customer pickups and drop-offs may be recorded, including recording the pickup locations (610) and drop-offs (620) the information may be stored for use as historical data.” A driver record is taught as instances of customer pickups and drop-offs may be stored for use as historical data [in a driver record].);

and determining for each driver… the drivers … driver for a plurality of historical trips (Radhakrishnan: Paragraph [0100] “instances of customer pickups and drop-offs may be recorded, including recording the pickup locations (610) and drop-offs (620) the information may be stored for use as historical data.” A driver record is taught as instances of customer pickups and drop-offs may be stored for use as historical data[in a driver record].)…

Hong further teaches a contribution factor to the driver's historical predicted variable based on the aggregated differences between the historical predicted variables for the driver for a plurality of historical events by the driver and the first expected component of the predicted output variable predicted by the first predictive model for each of the historical trips,… (Hong: Col 15, Lines 35-41. “the current model applying at least one subordinate model to a plurality of data points in possibly intersecting segments and arbitrating among predictions of applicable subordinate models whenever a point falls within two or more segments, and being built from the subpopulation of training points;” A contribution factor to the driver's historical predicted variable based on the aggregated differences between the historical predicted variables for the driver for a plurality of historical trips by the driver and the first expected component of the predicted output variable predicted by the first predictive model is taught as a plurality of data points in possibly intersecting segments and arbitrating among predictions [contribution factors] of applicable subordinate models whenever a point falls within two or more segments, and being built from the subpopulation of training points [the arbitrary predictions are calculated from the subpopulation of training points or variables].);

a second component of the…historical predicted variable based on the aggregated difference between the historical predicted variables for the…the second component of the predicted output variable predicted by the second predictive model for each of the historical trips (Hong: Col 15, Lines 35-41. “the current model applying at least one subordinate model to a plurality of data points in possibly intersecting segments and arbitrating among predictions of applicable subordinate models whenever a point falls within two or more segments, and being built from the subpopulation of training points;” A second component of the historical predicted variable based on the aggregated difference between the historical predicted variables for the second component of the predicted output variable predicted by the second predictive model is taught as a plurality of data points in possibly intersecting segments and arbitrating among predictions [contribution factors] of applicable subordinate models whenever a point falls within two or more segments, and being built from the subpopulation of training points.[the arbitrary predictions are calculated from the sub population of training points or variables]); 

a third component of the… historical predicted variable based on the aggregated difference between the historical predicted variables for the… and the third component of the predicted output variable predicted by the third predictive model for each of the historical trips (Hong: Col 15, Lines 35-41. “the current model applying at least one subordinate model to a plurality of data points in possibly intersecting segments and arbitrating among predictions of applicable subordinate models whenever a point falls within two or more segments, and being built from the subpopulation of training points;” A third component of the historical predicted variable based on the aggregated difference between the historical predicted variables for the third component of the predicted output variable predicted by the third predictive model is taught as a plurality of data points in possibly intersecting segments and arbitrating among predictions [contribution factors] of applicable subordinate models whenever a point falls within two or more segments, and being built from the subpopulation of training points.[the arbitrary predictions are calculated from the sub population of training points or variables]). 

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the user feedback variables of Radhakrishnan with the cascade boosting of predictive models of Hong in order to allow using cascade boosting models to avoid restricting attention to decision trees, thereby using statistically sound estimates to determine which leaves perform well and cover the most training points (Hong: Col 7, Lines 40-45. “Cascade boosting uses statistically sound estimates (see V. N. Vapnik, “Statistical Learning Theory”, John Wiley and Sons: 1998) to determine which leaves are likely to perform well, while the PART method just retains the leaf covering the most training points.”).

Claim 7 is a method claim corresponding to system claim 2 and is rejected for the same reasons as given in the rejection of that claim. Similarly, claim 12 is a computer program product claim corresponding to system claim 2 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 3, Radhakrishnan in view of Hong and Huang teaches the system of claim 1, Radhakrishnan further teaches wherein: a trip record includes a plurality of experimental endogenous variables associated with the rider, the driver and the trip (Radhakrishnan: Paragraph [0051] “position information, such as pickup and drop-off locations and route information or distance traveled, and (ii) stop/wait time” Plurality of observed endogenous variables associated with the rider, the driver and the trip observed after the start of the trip are taught as position information, such as pickup and drop-off locations and route information or distance traveled, and (ii) stop/wait time.); the instructions further cause the one or more processors to perform steps comprising (Radhakrishnan: Paragraph [0025] “the use of instructions that are executable by one or more processors”): 
… driver (Radhakrishnan: Paragraph [0063] “The rating information of each participant may be recorded as part of that user's profile information, and thus stored in the profile store 350. Thus, each feedback results in a respective driver rating update 387 (provided by the customer) or customer rating update 391 (provided from the driver)” Driver is taught as the driver.)… endogenous variable (Radhakrishnan: Paragraph [0051] “position information, such as pickup and drop-off locations and route information or distance traveled, and (ii) stop/wait time” Plurality of observed endogenous variables are taught as position information, such as pickup and drop-off locations and route information or distance traveled, and (ii) stop/wait time.)…

… Driver … Drivers … Driver (Radhakrishnan: Paragraph [0063] “The rating information of each participant may be recorded as part of that user's profile information, and thus stored in the profile store 350. Thus, each feedback results in a respective driver rating update 387 (provided by the customer) or customer rating update 391 (provided from the driver)” Driver is taught as the driver.)…
Radhakrishnan does not explicitly disclose determining a third residual value as a difference between the observed dependent rating and a sum of the first, second, and third predicted components; executing a fourth predictive model to predict a fourth, experimental component of the predicted output variable rating for the arbitrary …, the fourth predictive model receiving as inputs: (1) the third residual value, and (2) the plurality of experimental endogenous variables associated with the independent variables of the fourth predictive model; and determining for each … a fourth component of the … historical predicted variable based on the aggregated differences between the historical predicted variables for the… for a plurality of historical trips and the fourth component of the predicted variable predicted by the fourth predictive model for each of the historical trips.
(Hong: Col 1, Lines 17-24. “Predictive modeling refers to generating a model from a given set of data points (also called "examples" or "records"), where each point is comprised of fields (also called "attributes" or "features" or "variables"), some of which are designated as target fields (also called "dependent variables") whose values are to be predicted from the values of the others (also called "independent variables")” Hong teaches observed dependent rating as the target fields or dependent variables.) …a fourth predictive model configured to predict a fourth, experimental component (Hong: Col 7, Lines 52-54. “The estimates in item (c) may be derived from performance on either training data” The predicted fourth experimental component is taught as the estimate based on training data.) of the predicted output variable rating (Hong: Col 8, Lines 9-19. “cascade boosting of a decision tree is shown. Recall that segments of the population in a decision tree are always mutually exclusive. An initial predictive model is built in block 701 that applies to one or more subordinate models for a set of initial training data points. The accuracy performance of the current model, initially the initial predictive model, is observed in block 702. The observed performance of each subordinate model on its corresponding segment (of either the training data or separate validation data) is used to estimate future accuracy performance.” A fourth predictive model configured to predict a second component of the predicted output variable is taught as cascade boosting of one or more subordinate models for a set of initial training data points [variables].) for the arbitrary…, the fourth predictive model receiving as inputs (Hong: Col 1, Lines 17-24. “Predictive modeling refers to generating a model from a given set of data points (also called "examples" or "records"), where each point is comprised of fields (also called "attributes" or "features" or "variables"), some of which are designated as target fields (also called "dependent variables") whose values are to be predicted from the values of the others (also called "independent variables")” Hong teaches generating the predictive model based on target fields which are a given set of data points [received as inputs].):…, and the plurality of experimental… variables associated with the independent variables of the fourth predictive model (Hong: Col 1, Lines 17-24. “Predictive modeling refers to generating a model from a given set of data points (also called "examples" or "records"), where each point is comprised of fields (also called "attributes" or "features" or "variables"), some of which are designated as target fields (also called "dependent variables") whose values are to be predicted from the values of the others (also called "independent variables")” Hong teaches generating the predictive model based on a set of given data points (values are to be predicted from the values of the others)[independent variables].);

determining a fourth component of to the driver's historical predicted variable based on the aggregated differences between the historical predicted variables for… for a plurality of historical trips and fourth component of the predicted variable predicted by the fourth predictive model for each of the historical trips (Hong: Col 15, Lines 35-41. “the current model applying at least one subordinate model to a plurality of data points in possibly intersecting segments and arbitrating among predictions of applicable subordinate models whenever a point falls within two or more segments, and being built from the subpopulation of training points;” A fourth model is taught as the current model. A fourth component of the historical predicted variable based on the aggregated difference between the historical predicted variables for the fourth component of the predicted output variable predicted by the fourth predictive model is taught as a plurality of data points in possibly intersecting segments and arbitrating among predictions [contribution factors] of applicable subordinate models whenever a point falls within two or more segments, and being built from the subpopulation of training points.[the arbitrary predictions are calculated from the sub population of training points or variables between the other components]). 

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the user feedback variables of Radhakrishnan with the cascade boosting of predictive models of Hong in order to allow using cascade boosting models to avoid restricting attention to decision trees, thereby using statistically sound estimates to determine which leaves perform well and cover the most training points (Hong: Col 7, Lines 40-45. “Cascade boosting uses statistically sound estimates (see V. N. Vapnik, “Statistical Learning Theory”, John Wiley and Sons: 1998) to determine which leaves are likely to perform well, while the PART method just retains the leaf covering the most training points.”).

Huang further teaches determining a third residual value as a difference between the … and a sum of the first, second, and third predicted components;… the third residual value (Huang: Col 6, Lines 55-63. “The third predictor, Ps, is developed from the predicted value of Residual 2 as the integer value of the sum of S, R, and R,’ i.e., Ps=S+ R+R,’). A third residual value, R. (or Residual 3), may be computed, at step 251, as the difference between the value of Residual 2 (from step 236) and the predicted value of Residual 2 (although Residual 3 is not needed if no further predictors are to be developed).” Determining a third residual value as a difference between the … is taught as a third residual value, R. (or Residual 3), that may be computed, at step 251, as the difference between the value of Residual 2 (from step 236) and the predicted value of Residual 2. A sum of the first, second, and third predicted components is taught as the sum of S, R, and R,’ i.e., Ps=S+ R+R,.)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Radhakrishnan and Hong with the residual values and predictors of Huang in order to allow a single prediction value to be selected as a statistical representation of the multiple predictor values, thereby achieving a good trade-off between improved coding efficiency and computational complexity (Huang: Col 5, Lines 30-45. “It is believed by the inventors that the number used for the preferred embodiment represents a good trade-off between coding-efficiency improvement and computational complexity. …single predictor for coding of the input signal sample value, one needs to determine a predictor which is statistically representative of the three computed predictors, P, P and P.”).
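For illustration only, the residual arithmetic recited above can be sketched as follows. The sketch assumes that each later predicted component is a prediction of the preceding residual, so that the third residual can be written either as the observed value minus the sum of the first three predicted components, or as the second residual minus the predicted value of the second residual. The function name `residual_chain` and the sample numbers are hypothetical and are not taken from the application or from Huang.

```python
from typing import List, Sequence


def residual_chain(observed: float, predicted_components: Sequence[float]) -> List[float]:
    """Return residuals r_1..r_k, where r_k = observed - sum of the first k components."""
    residuals: List[float] = []
    running_sum = 0.0
    for component in predicted_components:
        running_sum += component
        residuals.append(observed - running_sum)
    return residuals


# Hypothetical values: one observed rating and three predicted components.
observed_rating = 4.5
components = [4.0, 0.25, 0.125]
r1, r2, r3 = residual_chain(observed_rating, components)

# Stage-wise form: the third residual equals the second residual
# minus the third predicted component (the prediction of the second residual).
assert r3 == r2 - components[2]
```

Under this assumption the two formulations of the third residual are algebraically the same quantity.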

Claim 3 is a system claim corresponding to method claim 8 and is rejected for the same reasons as given in the rejection of that claim. Similarly, claim 13 is a computer program product claim corresponding to system claim 3 and is rejected for the same reasons as given in the rejection of that claim.
Regarding claim 4, Radhakrishnan in view of Hong and Huang teaches the system of claim 1, Radhakrishnan further teaches wherein the instructions further cause the one or more processors (Radhakrishnan: Paragraph [0109-110] “rating user interfaces that are displayed to a user to enable the user to provide feedback or share information about a service… the rating user interface has been provided to a user in response to determining that a transportation service has been completed for the user” One or more messages to a driver during a trip which improve the likelihood of them getting a higher performance rating at the completion of the trip is taught as rating user interfaces that are displayed to a user to enable the user to provide feedback or share information about a service after the service is completed.);
storing a set of driver screening criteria indicating threshold … for a driver to obtain a target average performance rating by a target number of rated trips provided by the driver (Radhakrishnan: Paragraph [0121-0122] “…storage in conjunction with the selected features of the set of selectable category features when the confirmation button 950 is selected…a portion or a region of the rating user interface can be moved down or extended in order to display the set of selectable category features 930. For example, initially, only the overall user rating 910 may be displayed to the user. In response to the user selection of the quantitative user rating, a panel may be visually moved down to show the set of selectable category features 930….visually moved down to show the set of selectable category features” A set of driver screening criteria indicating threshold probability levels for a second entity to obtain a target average performance rating by a target number of rated events is taught as a portion or a region of the rating user interface can be moved down or extended in order to display the set of selectable category features, only the overall user rating may be displayed to the user, and visually moved down to show the set of selectable category features.), 
wherein the threshold … are determined with respect to a set of drivers having values of the performance contribution factor below a predetermined level, and wherein the driver screening criteria include a minimum … threshold value and a maximum … threshold value (Radhakrishnan: Paragraph [120] “the set of selectable category features 930 may be displayed in response to determining that the quantitative user rating 915 that is provided by the user is below a certain threshold, e.g., three or less stars as in the embodiment of FIG. 9B. The determination of the quantitative user rating 915 relative to a threshold may be performed” Threshold probability levels are determined with respect to a set of drivers having values of the performance contribution factor below a predetermined level is taught as a selected set of user ratings that is provided by the user is below a certain threshold. The minimum threshold is taught as below a certain threshold. The maximum threshold value is taught as relative to a threshold. The driver screening criteria is taught as the category features); 
determining for a driver, based on number of trips made by the driver and the driver's average performance rating, … of the driver obtaining the target average performance rating by the target number of trips (Radhakrishnan: Paragraph [0122] “a portion or a region of the rating user interface can be moved down or extended in order to display the set of selectable category features 930. For example, initially, only the overall user rating 910 may be displayed to the user. In response to the user selection of the quantitative user rating, a panel may be visually moved down to show the set of selectable category features 930….visually moved down to show the set of selectable category features” The driver's average performance rating is taught as the overall user rating, which can be displayed based on selectable category features.);
comparing the determined … for the driver (Radhakrishnan: Paragraph [0063] “The rating information of each participant may be recorded as part of that user's profile information, and thus stored in the profile store 350. Thus, each feedback results in a respective driver rating update 387 (provided by the customer) or customer rating update 391 (provided from the driver)” The driver is taught as the driver.) with the driver screening criteria (Radhakrishnan: Paragraph [120] “the set of selectable category features 930 may be displayed in response to determining that the quantitative user rating” The driver screening criteria is taught as the category features); 
and responsive to the determined …for the driver being between the minimum and maximum threshold …values of the driver screening criteria (Radhakrishnan: Paragraph [120] “the set of selectable category features 930 may be displayed in response to determining that the quantitative user rating 915 that is provided by the user is below a certain threshold, e.g., three or less stars as in the embodiment of FIG. 9B. The determination of the quantitative user rating 915 relative to a threshold may be performed” Threshold probability levels are determined with respect to a set of drivers having values of the performance contribution factor below a predetermined level is taught as a selected set of user ratings that is provided by the user is below a certain threshold. The minimum threshold is taught as below a certain threshold. The maximum threshold value is taught as relative to a threshold. The drivers screening criteria is taught as the category features), storing an indication for the messaging module to automatically transmit at least one message to a rider during a subsequent trip made by the driver (Radhakrishnan: Paragraph [0109-110] “rating user interfaces that are displayed to a user to enable the user to provide feedback or share information about a service… the rating user interface has been provided to a user in response to determining that a transportation service has been completed for the user” One or more messages to a driver during a trip which improve the likelihood of them getting a higher performance rating at the completion of the trip is taught as rating user interfaces that are displayed to a user to enable the user to provide feedback or share information about a service after the service is completed.). 
Radhakrishnan does not explicitly disclose probability levels.

Hong further teaches probability levels (Hong: Col 7, Lines 52-54. “The estimates in item (c) may be derived from performance on either training data” The probability levels are taught as the estimate based on training data.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the user feedback variables of Radhakrishnan with the cascade boosting of predictive models of Hong in order to allow using cascade boosting models to avoid restricting attention to decision trees, thereby using statistically sound estimates to determine which leaves perform well and cover the most training points (Hong: Col 7, Lines 40-45. “Cascade boosting uses statistically sound estimates (see V. N. Vapnik, “Statistical Learning Theory”, John Wiley and Sons: 1998) to determine which leaves are likely to perform well, while the PART method just retains the leaf covering the most training points.”).
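For illustration only, the comparison step recited in claim 4 (a determined value for the driver falling between the minimum and maximum threshold values of the screening criteria) can be sketched as a simple range check. The names `ScreeningCriteria` and `should_flag_for_messaging`, the sample thresholds, and the choice of inclusive bounds are assumptions made for this sketch; how the value itself is determined is not shown because it is not specified in the passages quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCriteria:
    """Hypothetical container for the minimum and maximum threshold values."""
    min_threshold: float
    max_threshold: float


def should_flag_for_messaging(probability: float, criteria: ScreeningCriteria) -> bool:
    """Return True when the value lies between the two thresholds (inclusive, by assumption)."""
    return criteria.min_threshold <= probability <= criteria.max_threshold


# Hypothetical thresholds and driver values.
criteria = ScreeningCriteria(min_threshold=0.2, max_threshold=0.8)
flag_mid = should_flag_for_messaging(0.5, criteria)   # inside the band
flag_high = should_flag_for_messaging(0.9, criteria)  # above the maximum
```

A driver whose value falls inside the band would have an indication stored for later messaging; drivers outside the band would not.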

Claim 4 is a system claim corresponding to method claim 9 and is rejected for the same reasons as given in the rejection of that claim. Similarly, claim 14 is a computer program product claim corresponding to system claim 4 and is rejected for the same reasons as given in the rejection of that claim.
Regarding claim 5, Radhakrishnan in view of Hong and Huang teaches the system of claim 1, Hong further teaches wherein each of the predictive models is a gradient boosted decision tree classifier (Hong: Col 8, Lines 9-14. “a flow diagram for cascade boosting of a decision tree is shown. Recall that segments of the population in a decision tree are always mutually exclusive. An initial predictive model is built in block 701 that applies to one or more subordinate models for a set of initial training data points” Each of the predictive models being a gradient boosted decision tree is taught as cascade boosting of a decision tree applied in a predictive model.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the user feedback variables of Radhakrishnan with the cascade boosting of predictive models of Hong in order to allow using cascade boosting models to avoid restricting attention to decision trees, thereby using statistically sound estimates to determine which leaves perform well and cover the most training points (Hong: Col 7, Lines 40-45. “Cascade boosting uses statistically sound estimates (see V. N. Vapnik, “Statistical Learning Theory”, John Wiley and Sons: 1998) to determine which leaves are likely to perform well, while the PART method just retains the leaf covering the most training points.”).

Claim 5 is a system claim corresponding to method claim 10 and is rejected for the same reasons as given in the rejection of that claim. Similarly, claim 15 is a computer program product claim corresponding to system claim 5 and is rejected for the same reasons as given in the rejection of that claim.

Regarding claim 16, Radhakrishnan in view of Hong and Huang teaches the system of claim 1, Radhakrishnan further teaches wherein the instructions further cause the one or more processors  (Radhakrishnan: Paragraph [0025] “the use of instructions that are executable by one or more processors”) to perform steps comprising: …variable for the arbitrary driver (Radhakrishnan: Paragraph [0056] “Other parameters may also be used, including parameters on predicted or actual market demand, vehicle availability, time of day, rating of driver, type of vehicle, and quality of service” Determining based on the variable for the arbitrary driver is taught as the parameters used to predict rating of driver. The examiner notes that the variable for an arbitrary driver is taught by Radhakrishnan.), whether the driver receives a subsequent trip request (Radhakrishnan: Paragraph [0067] “Rating information can influence, for example, the ability of a customer to pick a driver when the pool of drivers is limited… On the driver side, the rating information associated with a driver may reflect the driver's courtesy, driving manner, etc. The transport service 300 may prioritize (or emphasize) selection of drivers with good ratings. For example, the invite 314 may first be sent to proximate drivers with highest ratings, then proximate drivers with middle tier ratings.” Whether the driver receives a subsequent trip request is taught by the rating associated with the driver which determines whether a driver will be prioritized.).
 
Hong further teaches ...determining based on the predicted output (Hong: Col 7, Lines 52-54. “The estimates in item (c) may be derived from performance on either training data” Determining based on the predicted output is taught as the estimate based on training data which is applied to the system of Radhakrishnan in order to be used for making a determination.)… 

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the user feedback variables of Radhakrishnan with the cascade boosting of predictive models of Hong in order to allow using cascade boosting models to avoid restricting attention to decision trees, thereby using statistically sound estimates to determine which leaves perform well and cover the most training points (Hong: Col 7, Lines 40-45. “Cascade boosting uses statistically sound estimates (see V. N. Vapnik, “Statistical Learning Theory”, John Wiley and Sons: 1998) to determine which leaves are likely to perform well, while the PART method just retains the leaf covering the most training points.”).
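For illustration only, the rating-based prioritization described in the quoted Radhakrishnan passage (invites sent first to the highest-rated proximate drivers, then to lower tiers) can be sketched as an ordering of drivers by rating. The function name `invite_order`, the driver identifiers, and the sample ratings are hypothetical; the sketch does not model proximity or tier boundaries.

```python
from typing import Dict, List


def invite_order(driver_ratings: Dict[str, float]) -> List[str]:
    """Return driver ids ordered from highest to lowest rating.

    Ties keep their original insertion order because Python's sort is stable.
    """
    return sorted(driver_ratings, key=lambda driver_id: driver_ratings[driver_id], reverse=True)


# Hypothetical ratings for three nearby drivers.
ratings = {"driver_a": 4.2, "driver_b": 4.9, "driver_c": 3.8}
order = invite_order(ratings)
```

Under this ordering, whether a given driver receives a subsequent trip request depends on how far down the list the invite proceeds before it is accepted.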

	Claim 19 is similarly rejected; refer to claim 16 for further analysis.

Regarding claim 17, Radhakrishnan in view of Hong and Huang teaches the system of claim 1,  wherein the instructions further cause the one or more processors (Radhakrishnan: Paragraph [0025] “the use of instructions that are executable by one or more processors”) to perform steps comprising: 
…variable for the arbitrary driver, a message for the arbitrary driver comprising information for increasing the likelihood of the driver receiving a higher performance rating score (Radhakrishnan: Paragraph [0115] “The user may then be further enabled to communicate a message using the network or communication function. This message may describe the positive aspects of a transportation service… The pre-generated message can include content describing or praising the transportation service.” A message for the arbitrary driver comprising information for increasing the likelihood of the driver receiving a higher performance rating score is taught as the user can communicate a message using a network to describe the positive aspects of a transportation service or praising the service[i.e. increasing the likelihood of the driver receiving a higher performance rating score].); 
and sending the message to the arbitrary driver before, during or after a subsequent trip (Radhakrishnan: Paragraph [0055] “the customer can enter rating information for the driver, which then can be interpreted as signifying the completion of the transport.” Sending the message to the arbitrary driver before, during or after a subsequent trip is taught by the method of providing the rating information for the driver which can signify the completion of the transport [i.e. after a subsequent trip]. Paragraph [0062] “the participants of the transport service to provide feedback about one another based on their respective experiences” The message includes the provided feedback between the customer and the driver.).

Hong further teaches ...selecting based on the predicted output (Hong: Col 7, Lines 52-54. “The estimates in item (c) may be derived from performance on either training data” Determining based on the predicted output is taught as the estimate based on training data which is applied to the system of Radhakrishnan in order to be used for making a selection.)… 

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the user feedback variables of Radhakrishnan with the cascade boosting of predictive models of Hong in order to allow using cascade boosting models to avoid restricting attention to decision trees, thereby using statistically sound estimates to determine which leaves perform well and cover the most training points (Hong: Col 7, Lines 40-45. “Cascade boosting uses statistically sound estimates (see V. N. Vapnik, “Statistical Learning Theory”, John Wiley and Sons: 1998) to determine which leaves are likely to perform well, while the PART method just retains the leaf covering the most training points.”).

Claim 20 is similarly rejected; refer to claim 17 for further analysis.

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Radhakrishnan (U.S. 2013/0246301) in view of Hong (U.S. Patent No. 6546379), Huang (U.S. Patent No. 6356213) and Hunt (U.S. 2013/0164715).

Regarding claim 18, (New) Radhakrishnan in view of Hong and Huang teaches the system of claim 1, Radhakrishnan further teaches wherein the instructions further cause the one or more processors (Radhakrishnan: Paragraph [0025] “the use of instructions that are executable by one or more processors”) to perform steps comprising: …for the arbitrary driver (Radhakrishnan: Paragraph [0066] “evaluating the performance of a particular driver”) indicating that the arbitrary driver (Radhakrishnan: Paragraph [0066] “evaluating the performance of a particular driver”)…

Hong further teaches responsive to the predicted output variable (Hong: Col 7, Lines 52-54. “The estimates in item (c) may be derived from performance on either training data” Determining based on the predicted output is taught as the estimate based on training data which is applied to the system of Radhakrishnan in order to be used for making a selection.)…
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the user feedback variables of Radhakrishnan with the cascade boosting of predictive models of Hong in order to allow using cascade boosting models to avoid restricting attention to decision trees, thereby using statistically sound estimates to determine which leaves perform well and cover the most training points (Hong: Col 7, Lines 40-45. “Cascade boosting uses statistically sound estimates (see V. N. Vapnik, “Statistical Learning Theory”, John Wiley and Sons: 1998) to determine which leaves are likely to perform well, while the PART method just retains the leaf covering the most training points.”).

Radhakrishnan in view of Hong and Huang does not explicitly disclose …is unlikely to meet performance rating goals, determining that the arbitrary driver needs additional training to improve performance.

Hunt further teaches …is unlikely to meet performance rating goals, determining that the arbitrary driver needs additional training to improve performance (Hunt: Paragraph [0019] “a plurality of vehicle metrics related to driver performance, a single numerical score or numerical ranking can be used to provide feedback to individual drivers. Such a numerical ranking can be used as a management tool to improve driver performance. For example, drivers with relatively poor numerical ranking scores can receive counseling or warnings, which should lead to an improvement in their performance, while drivers receiving relatively better scores receive praise or bonuses that encourage them and others to maintain or improve good performance in operating a vehicle.” Unlikely to meet performance rating goals, determining that the arbitrary driver needs additional training to improve performance is taught as using the driver performance score to identify drivers with relatively poor numerical scores [i.e. unlikely to meet performance goals] and provide them counseling or warnings [i.e. additional training] to improve their performance.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Radhakrishnan, Hong and Huang with the driver performance improvement system of Hunt in order to allow providing counseling to drivers with poor driver performance, thereby leading to an improvement in their performance (Hunt: Paragraph [0019] “a plurality of vehicle metrics related to driver performance, a single numerical score or numerical ranking can be used to provide feedback to individual drivers. Such a numerical ranking can be used as a management tool to improve driver performance. For example, drivers with relatively poor numerical ranking scores can receive counseling or warnings, which should lead to an improvement in their performance, while drivers receiving relatively better scores receive praise or bonuses that encourage them and others to maintain or improve good performance in operating a vehicle.”).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AHSIF A. SHEIKH whose telephone number is (571)272-2607.  The examiner can normally be reached on Mon-Fri 7:30-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/A.A.S./Examiner, Art Unit 2123
/MICHAEL J HUNTLEY/Primary Examiner, Art Unit 2116