DETAILED ACTION
This office action is in response to application with case number 16/599,783 (filed on 10/11/2019), in which claims 1-20 are presented for examination. 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Acknowledgment is made that applicant claims no priority for this application, submitted on 10/11/2019.
	
Information Disclosure Statement
The information disclosure statements (IDS(s)) submitted on 10/11/2019 & 02/04/2021 have been received and considered.

Examiner Notes
Examiner cites particular paragraphs or columns and lines in the references as applied to Applicant’s claims for the convenience of the Applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the Applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner. The prompt development of a clear issue requires that the replies of the Applicant meet the objections to and rejections of the claims. Applicant should also specifically point out the support for any amendments made to the disclosure (see MPEP §2163.06). Applicant is reminded that the Examiner is entitled to give the Broadest Reasonable Interpretation (BRI) to the language of the claims. Furthermore, the Examiner is not limited to Applicant’s definition which is not specifically set forth in the claims (see MPEP §2111.01).

Claim Rejections - 35 USC § 103
	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.

Claims 1-2, 4-5, 9-11, 13-14, and 18-20 are rejected under 35 U.S.C. § 103 as being unpatentable over NPL Pub. DOI 10.1109/TITS.2014.2320757, in IEEE Transactions on Intelligent Transportation Systems, Vol. 15, No. 6, December 2014, to Li et al. (hereinafter “Li”) in view of U.S. Patent No. 11,164,093 B1 to Zappella (hereinafter “Zappella”).





As per Claim 1, Li discloses a train control system using machine learning for controlling the ramp rate at which a train accelerates after braking (Li, in Fig. 3 [reproduced here for convenience] & P. 2563 Col. 2 - P. 2566 Col. 2, discloses judging operation modes (i.e., coasting or accelerating) of subway train control with an expert system and Reinforcement Learning (RL) based on Markov Decision Process [i.e., using Machine Learning]), the train control system comprising:

[Figure: Li’s Fig. 3]

a data acquisition hub communicatively connected to one or more of sensors and databases associated with one or more locomotives or other components of a train and configured to acquire real-time and historical operational and structural data for use as training data from one or more of systems and components of the train (Li, in Fig. 3 & P. 2564 Col. 2, discloses obtaining the online [i.e., real-time] data and the offline [i.e., historical] data. Li, in Fig. 5 [reproduced here for convenience] & P. 2567 Col. 1, further discloses that the Input Module is used to input the offline data on the ITO platform, including the speed limits, the gradient of the line, the planned trip time, and the train dynamic model parameters);

[Figure: Li’s Fig. 5]

a virtual system modeling engine configured to simulate in-train forces and train operational characteristics using physics-based equations, kinematic or dynamic modeling of behavior of the train or components of the train during operation when the train is accelerating after braking, and inputs derived from stored historical contextual data comprising one or more of a number of locomotives in the train, age or amount of usage of one or more locomotives of the train or other components of the train, weight distribution of the train, length of the train, speed of the train, control configurations for one or more locomotives or consists of the train, power notch settings of one or more locomotives of the train, braking implemented in the train, positive train control characteristics implemented in the train, grade, temperature, or other characteristics of train tracks on which the train is operating, and engine operational parameters that affect performance of one or more locomotive engines for the train (Li, in P. 2562 Col. 1, discloses a computationally inexpensive tracking control method with a single-coordinate dynamic model that reflects in-train forces. Li, in P. 2565 Col. 2, further discloses the system function (F) of the train dynamic model (14). Li, in Fig. 5 [reproduced here for convenience], P. 2566 Col. 2, & P. 2567 Col(s). 1-2, also discloses a Train Model that simulates the actual train acceleration/braking system, and the input of the offline [i.e., historical] data on the ITO platform, including the speed limits, the gradient of the line, the planned trip time, and the train dynamic model parameters);
a virtual system model database configured to store one or more virtual system models simulated by the virtual system modeling engine, wherein each of the one or more virtual system models includes a mapping between different combinations of the stored historical contextual data and corresponding simulated in-train forces and train operational characteristics that occur when the train is accelerating after braking (Li, in Fig. 3 & P. 2563 Col. 2 – P. 2564 Col. 1, discloses developing an expert system [i.e., virtual system model database], which summarizes the expert rules based on analyzing the data from the YLBS and literature, and derives IF–THEN rules by treating the position, the speed, and the running time as inputs and the accelerating/braking rates as outputs. Li, in P. 2565 Col. 1, also discloses how the output of the expert system stores expert knowledge rules [i.e., model database], e.g., if the speed of the train is lower than vi, the train needs to accelerate [i.e., mapping between different combinations of the stored historical contextual data and corresponding simulated in-train forces and train operational characteristics that occur when the train is accelerating after braking]);
a machine learning engine (Li, in Fig. 3, Fig. 5 & P. 2563-2566, discloses Machine Learning techniques, i.e., Reinforcement Learning) configured to:
calculate relative weights to assign to each of different types of the stored historical contextual data of each of the one or more virtual system models and assigning the relative weights to the stored historical contextual data (Li, in Fig. 3, Fig. 5 & P. 2565 Col. 2, discloses “Reward Estimation” of “Reinforcement Learning” (RL) that is learning how to map situations to actions to optimize a numerical reward signal, and corrects output in the long-term reward [i.e., weight] for a decision. Li, in Algorithm 3.2 [reproduced here for convenience] & P. 2566 Col. 1-2, further discloses the reward function Ui in (16));

[Equation: Li’s Equation (16)]


[Figure: Li’s Algorithm 3.2]

train a learning system using the weighted stored historical contextual data and the training data to determine a probability of each of the one or more virtual system models providing an accurate representation of actual in-train forces and train operational characteristics that occur during acceleration of the train after braking using a learning function including at least one learning parameter (Li, in Fig. 3 “Reinforcement Learning” & P. 2565 Col(s). 1-2, discloses that the output of the Expert System has the parameter Δu_max, which is the variation of acceleration in time interval Δt [i.e., at least one learning parameter], and that Reinforcement Learning (RL) is learning how to map situations to actions to optimize a numerical reward signal, and corrects output in the long-term reward [i.e., weight] for a decision. Li, in Fig. 5 “Reinforcement Learning”, Algorithm 3.2 & P. 2566 Col. 1, further discloses the probability of selecting a certain action), wherein training the learning system includes:
providing the weighted stored historical contextual data as an input to the learning function (Li, in Fig. 3, discloses the Expert System as input to the Reinforcement Learning. Li, in Fig. 5 & P. 2565 Col. 2, further discloses that Reinforcement Learning (RL) corrects output in the long-term reward [i.e., weight] for a decision, and that, in the train control process, actions affect not only the immediate reward but also the rewards [i.e., weights] of the following states), the learning function being configured to use the at least one learning parameter to generate an output based on the input (Li, in P. 2565 Col. 1, discloses that the output of the expert system has the parameter Δu_max [i.e., at least one learning parameter]. Li, in Algorithm 3.2 & P. 2566 Col. 1-2, further discloses initializing the parameters in step #1 and evaluating reward Ui at state Xi using the reward function (16));
causing the learning function to generate the output based on the input (Li, in Fig. 3, Fig. 5 & P. 2565 Col. 2, discloses using input-output. Li, in Algorithm 3.2 & P. 2566 Col. 1-2, further discloses the acceleration calculated by the expert system u0 [i.e., input] at step #2 and the Output ui [i.e., output] at step #8); 
comparing the output to the training data, wherein the training data includes data produced by sensors having captured actual information on in-train forces and train operational characteristics during acceleration of the train after braking (Li, in P. 2565 Col. 1, discloses after designing the expert rules and the heuristic inference method, the expert system for ITO is accomplished, and with the online data [i.e., data produced by sensors]  and the speed limit information, the expert inference method can make an appropriate decision to accelerate or coast. Li, in Algorithm 3.2 & P. 2566 Col. 2, further discloses comparing the Output ui with the expert rules in Section III-A at step #7);
comparing the determined probabilities (Li, in Fig. 3, Fig. 5 & P. 2566 Col. 1, discloses that the probability of selecting a certain action utilizes an ε-greedy probability that has an optimal [implies comparing the determined probabilities] estimated action value [i.e., predetermined threshold probability level]. Li, in Algorithm 3.2 & P. 2566 Col. 2, further discloses evaluating reward Ui at state Xi based on (16), then obtaining the optimal Δu_var [implies comparing the determined probabilities], and adjusting the output ui at steps #4-7); and
initiating adjustments to one or more of the calculated relative weights assigned to each of the different types of the stored historical contextual data of each of the one or more virtual system models to improve the determined probabilities  (Li, in Fig. 3, Fig. 5, Algorithm 3.2 & P. 2566 Col. 2, discloses to update the value function according to (18) at step 8); and
an energy management system associated with one or more locomotives of the train and configured to adjust one or more of throttle requests, dynamic braking requests, and pneumatic braking requests for the one or more locomotives of the train based at least in part on one of the virtual system models one or more of the simulated in-train forces and train operational characteristics falling within a predetermined acceptable range of values (Li, in P. 2561 Col. 1, discloses a discrete control model and confirms the fundamental optimality of the accelerate–coast–brake strategy for energy-efficient train operation. Li, in P. 2564 Col. 1, further discloses designing a data-driven inference method to judge the time when the train coasts or accelerates to ensure punctuality. Li, in P. 2564 Col. 1, further discloses that the inference method for judging operation modes (coasting or accelerating) is summarized as follows: If vi ≤ v̂, the train should accelerate, and the output of the expert system is described by (8). Li, in Fig. 3, Fig. 5 & P. 2566 Col. 1, further discloses that the probability of selecting a certain action involves choosing an action that has an optimal estimated action value, but with probability ε (i.e., greedy probability of RL). Li, in P. 2565 Col. 2, also discloses that the basic solutions of ITOR are restricted by the expert rules into a certain range).

[Equation: Li’s Equation (8)]
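
For illustration only, the ε-greedy action selection referenced in the citations to Li at P. 2566 Col. 1 can be sketched generically as follows. This is a minimal sketch of the general technique and is not a reproduction of Li’s Algorithm 3.2; the function name `epsilon_greedy`, the action-value list, and the numerical values are assumptions introduced here.

```python
import random


def epsilon_greedy(action_values, epsilon, rng=random):
    """Return the index of the chosen action.

    With probability epsilon a uniformly random action is explored;
    otherwise the action with the highest estimated value is exploited.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    return max(range(len(action_values)), key=lambda i: action_values[i])


# Hypothetical estimated values for three candidate acceleration adjustments.
values = [0.2, 0.9, 0.5]
greedy_choice = epsilon_greedy(values, epsilon=0.0)  # epsilon = 0 always exploits
```

With `epsilon=0.0` the sketch always returns the index of the highest estimated value; larger values of `epsilon` trade some of that exploitation for exploration.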

Li does not disclose the comparing of the determined probabilities of each of the virtual system models to a predetermined threshold probability level, and selecting the virtual system models with the highest probability. 
Zappella teaches, in Col. 3 ln 59 - Col. 4 ln 15; Col. 4 ln 35-48; Col. 4 ln 64 - Col. 5 ln 13 & Col. 14 ln 56 - Col. 15 ln 19, that it was old and well known at the time of filing in the art of Artificial Intelligence control systems, comparing the determined probabilities of each of the virtual system models to a predetermined threshold probability level, and selecting the model with the highest probability (Zappella, in Col. 3 ln 59 - Col. 4 ln 15; Col. 4 ln 35-48 & Col. 4 ln 64 - Col. 5 ln 13, discloses switching between models based on confidence sets. Zappella’s system implements a sequential decision system that automatically switches models based on their model parameter confidence sets. The said system receives results or feedback from the selected action and updates the model parameters of the active model according to a sample of past actions and results, so that the active model is changing at the same time as it is being used to generate sequential decisions. The system also updates a confidence set of the model parameters, which may contain the optimal parameters for the model with a probability above a threshold probability. The respective confidence sets of the model parameters of the active model and the recent model are periodically compared, and when the comparison indicates that the two models are sufficiently different, the decision system may cause the active model to be replaced with a replacement model. Zappella further discloses that the model replacement is triggered when the overlap of the two confidence sets falls below a quantitative threshold. Zappella’s model combines the learnings of multiple past active models, so that the new model will perform well under currently observed conditions with high probability. Zappella, in Fig. 2 [reproduced here for convenience] & Col. 14 ln 56 - Col. 15 ln 19, also discloses that the system includes a simulation system 250, which continues the training of multiple selection models using targeted training data).
[Figure: Zappella’s Fig. 2]
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to modify Li in view of Zappella, as both inventions are directed to the same field of endeavor – Artificial Intelligence control systems, and the combination would automatically switch models based on their model parameter confidence sets (see at least Zappella’s Col(s) 1-2).
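
For illustration only, the limitation at issue in the combination (comparing the determined probability of each model to a predetermined threshold probability level and selecting the model with the highest probability) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Zappella’s confidence-set comparison; the function name `select_model`, the model names, and the probability and threshold values are hypothetical.

```python
def select_model(model_probabilities, threshold):
    """Return the name of the highest-probability model that meets the threshold.

    model_probabilities maps a (hypothetical) model name to the probability
    that the model accurately represents observed behavior. Returns None
    when no model reaches the threshold.
    """
    eligible = {name: p for name, p in model_probabilities.items() if p >= threshold}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


# Hypothetical probabilities for three stored models.
probabilities = {"model_a": 0.62, "model_b": 0.91, "model_c": 0.78}
selected = select_model(probabilities, threshold=0.75)
```

In this sketch the threshold acts as a filter and the maximum acts as the selection rule, so a model is used only when it both clears the threshold and outranks every other eligible model.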

As per Claim 2, Li as modified by Zappella teaches the train control system of claim 1, accordingly, the rejection of claim 1 above is incorporated.
Li further discloses wherein the machine learning engine includes at least one of a neural network, a support vector machine, a Markov decision process engine, a decision tree based algorithm, or a Bayesian based estimator (Li, in Fig. 3 [reproduced here for convenience] & P. 2565 Col. 1, discloses subway train control with an expert system and Reinforcement Learning (RL) based on Markov Decision Process).

As per Claim 4, Li as modified by Zappella teaches the train control system of claim 1, accordingly, the rejection of claim 1 above is incorporated.
Li further discloses wherein the virtual system modeling engine is configured to simulate the in-train forces and train operational characteristics during a period of time when the train includes at least one locomotive with an associated energy management system that is transitioning from a braking control to an acceleration control (Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col(s). 1-2, discloses simulating electric multiple units (EMU), including the interactive impacts among the vehicles. The model is summarized in (22). The EMU is composed of six vehicles (three locomotives, i.e., the first, third, and last vehicles, and three carriages, which are expressed as L-C-L-C-C-L). The detailed parameters of the train model are shown in Table I).

[Equation: Li’s Equation (22)]

As per Claim 5, Li as modified by Zappella teaches the train control system of claim 4, accordingly, the rejection of claim 4 above is incorporated.
Li further discloses wherein the virtual system modeling engine is configured to simulate the in-train forces and train operational characteristics during a period of time when the train includes at least one locomotive with an associated energy management system that is transitioning and ramping up from heavy dynamic braking at the bottom of a hill to full throttle on the way back up an adjacent hill in a direction of travel of the train along the train tracks (Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col. 1, discloses simulating electric multiple units (EMU), including the interactive impacts among the vehicles. The model is summarized in (22), where λi is a distribution constant determining the accelerating/braking effort of the ith vehicle, and Fg is the resistance caused by the gradient [implies hill]).

As per Claim 9, Li as modified by Zappella teaches the train control system of claim 1, accordingly, the rejection of claim 1 above is incorporated.
Li further discloses wherein the machine learning engine is configurable by a user in order to adjust the relative weights that are assigned to each of different types of the stored historical contextual data of each of the one or more virtual system models, and wherein one or more of the predetermined acceptable ranges of values for simulated in-train forces and train operational characteristics are configurable by the user (Li, in Fig. 3 & P. 2564, discloses an inference method, which is motivated by manual driving. Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col. 1, also discloses analysis of the manual driving data in the YLBS).

As per Claim 10, Li discloses a method of using machine learning for controlling the ramp rate at which a train accelerates after braking (Li, in Fig. 3 [reproduced here for convenience] & P. 2563 Col. 2 - P. 2566 Col. 2, discloses an inference method for judging operation modes (i.e., coasting or accelerating) of subway train control with an expert system and Reinforcement Learning (RL) based on Markov Decision Process [i.e., using Machine Learning]), the method comprising:
receiving real-time and historical operational and structural data for use as training data from one or more of systems and components of the train at a data acquisition hub communicatively connected to one or more of sensors and databases associated with one or more locomotives or other components of a train (Li, in Fig. 3 & P. 2564 Col. 2, discloses obtaining the online [i.e., real-time] data and the offline [i.e., historical] data. Li, in Fig. 5 [reproduced here for convenience] & P. 2567 Col. 1, further discloses that the Input Module is used to input the offline data on the ITO platform, including the speed limits, the gradient of the line, the planned trip time, and the train dynamic model parameters);
simulating, using a virtual system modeling engine, in-train forces and train operational characteristics using physics-based equations, kinematic or dynamic modeling of behavior of the train or components of the train during operation when the train is accelerating after braking, and inputs derived from stored historical contextual data comprising one or more of a number of locomotives in the train, age or amount of usage of one or more locomotives of the train or other components of the train, weight distribution of the train, length of the train, speed of the train, control configurations for one or more locomotives or consists of the train, power notch settings of one or more locomotives of the train, braking implemented in the train, positive train control characteristics implemented in the train, grade, temperature, or other characteristics of train tracks on which the train is operating, and engine operational parameters that affect performance of one or more locomotive engines for the train (Li, in P. 2562 Col. 1, discloses a computationally inexpensive tracking control method with a single-coordinate dynamic model that reflects in-train forces. Li, in P. 2565 Col. 2, further discloses the system function (F) of the train dynamic model (14). Li, in Fig. 5 [reproduced here for convenience], P. 2566 Col. 2, & P. 2567 Col(s). 1-2, also discloses a Train Model that simulates the actual train acceleration/braking system, and the input of the offline [i.e., historical] data on the ITO platform, including the speed limits, the gradient of the line, the planned trip time, and the train dynamic model parameters);
storing one or more virtual system models simulated by the virtual system modeling engine in a virtual system model database, wherein each of the one or more virtual system models includes a mapping between different combinations of the stored historical contextual data and corresponding simulated in-train forces and train operational characteristics that occur when the train is accelerating after braking (Li, in Fig. 3 & P. 2563 Col. 2 – P. 2564 Col. 1, discloses developing an expert system [i.e., virtual system model database], which summarizes the expert rules based on analyzing the data from the YLBS and literature, and derives IF–THEN rules by treating the position, the speed, and the running time as inputs and the accelerating/braking rates as outputs. Li, in P. 2565 Col. 1, also discloses how the output of the expert system stores expert knowledge rules [i.e., model database], e.g., if the speed of the train is lower than vi, the train needs to accelerate [i.e., mapping between different combinations of the stored historical contextual data and corresponding simulated in-train forces and train operational characteristics that occur when the train is accelerating after braking]);
calculating, using a machine learning engine, relative weights to assign to each of different types of the stored historical contextual data of each of the one or more virtual system models and assigning the relative weights to the stored historical contextual data (Li, in Fig. 3, Fig. 5 & P. 2565 Col. 2, discloses “Reward Estimation” of “Reinforcement Learning” (RL) that is learning how to map situations to actions to optimize a numerical reward signal, and corrects output in the long-term reward [i.e., weight] for a decision. Li, in Algorithm 3.2 [reproduced here for convenience] & P. 2566 Col. 1-2, further discloses the reward function Ui in (16));
training a learning system with the machine learning engine using the weighted stored historical contextual data and the training data to determine a probability of each of the one or more virtual system models providing an accurate representation of actual in-train forces and train operational characteristics that occur during acceleration of the train after braking, and using a learning function including at least one learning parameter (Li, in Fig. 3 “Reinforcement Learning” & P. 2565 Col(s). 1-2, discloses that the output of the Expert System has the parameter Δu_max, which is the variation of acceleration in time interval Δt [i.e., at least one learning parameter], and that Reinforcement Learning (RL) is learning how to map situations to actions to optimize a numerical reward signal, and corrects output in the long-term reward [i.e., weight] for a decision. Li, in Fig. 5 “Reinforcement Learning”, Algorithm 3.2 & P. 2566 Col. 1, further discloses the probability of selecting a certain action), wherein training the learning system includes:
providing the weighted stored historical contextual data as an input to the learning function (Li, in Fig. 3, discloses the Expert System as input to the Reinforcement Learning. Li, in Fig. 5 & P. 2565 Col. 2, further discloses that Reinforcement Learning (RL) corrects output in the long-term reward [i.e., weight] for a decision, and that, in the train control process, actions affect not only the immediate reward but also the rewards [i.e., weights] of the following states), the learning function being configured to use the at least one learning parameter to generate an output based on the input (Li, in P. 2565 Col. 1, discloses that the output of the expert system has the parameter Δu_max [i.e., at least one learning parameter]. Li, in Algorithm 3.2 & P. 2566 Col. 1-2, further discloses initializing the parameters in step #1 and evaluating reward Ui at state Xi using the reward function (16));
causing the learning function to generate the output based on the input (Li, in Fig. 3, Fig. 5 & P. 2565 Col. 2, discloses using input-output. Li, in Algorithm 3.2 & P. 2566 Col. 1-2, further discloses the acceleration calculated by the expert system u0 [i.e., input] at step #2 and the Output ui [i.e., output] at step #8);
comparing the output to the training data, wherein the training data includes data produced by sensors having captured actual information on in-train forces and train operational characteristics during acceleration of the train after braking (Li, in P. 2565 Col. 1, discloses after designing the expert rules and the heuristic inference method, the expert system for ITO is accomplished, and with the online data [i.e., data produced by sensors]  and the speed limit information, the expert inference method can make an appropriate decision to accelerate or coast. Li, in Algorithm 3.2 & P. 2566 Col. 2, further discloses comparing the Output ui with the expert rules in Section III-A at step #7);
comparing the determined probabilities (Li, in Fig. 3, Fig. 5 & P. 2566 Col. 1, discloses that the probability of selecting a certain action utilizes an ε-greedy probability that has an optimal [implies comparing the determined probabilities] estimated action value [i.e., predetermined threshold probability level]. Li, in Algorithm 3.2 & P. 2566 Col. 2, further discloses evaluating reward Ui at state Xi based on (16), then obtaining the optimal Δu_var [implies comparing the determined probabilities], and adjusting the output ui at steps #4-7); and
initiating adjustments to one or more of the calculated relative weights assigned to each of the different types of the stored historical contextual data  (Li, in Fig. 3, Fig. 5, Algorithm 3.2 & P. 2566 Col. 2, discloses to update the value function according to (18) at step 8); and
adjusting one or more of throttle requests, dynamic braking requests, and pneumatic braking requests for the one or more locomotives of the train using an energy management system associated with the one or more locomotives of the train based at least in part on one of the virtual system models with (Li, in P. 2561 Col. 1, discloses a discrete control model and confirms the fundamental optimality of the accelerate–coast–brake strategy for energy-efficient train operation. Li, in P. 2564 Col. 1, further discloses designing a data-driven inference method to judge the time when the train coasts or accelerates to ensure punctuality. Li, in P. 2564 Col. 1, further discloses that the inference method for judging operation modes (coasting or accelerating) is summarized as follows: If vi ≤ v̂, the train should accelerate, and the output of the expert system is described by (8). Li, in Fig. 3, Fig. 5 & P. 2566 Col. 1, further discloses that the probability of selecting a certain action involves choosing an action that has an optimal estimated action value, but with probability ε (i.e., greedy probability of RL). Li, in P. 2565 Col. 2, also discloses that the basic solutions of ITOR are restricted by the expert rules into a certain range).
Li does not disclose the comparing of the determined probabilities of each of the virtual system models to a predetermined threshold probability level, and selecting the virtual system models with the highest probability.
Zappella teaches, in Col. 3 ln 59 - Col. 4 ln 15; Col. 4 ln 35-48; Col. 4 ln 64 - Col. 5 ln 13 & Col. 14 ln 56 - Col. 15 ln 19, that it was old and well known at the time of filing in the art of Artificial Intelligence control systems, comparing the determined probabilities of each of the virtual system models to a predetermined threshold probability level, and selecting the model with the highest probability (Zappella, in Col. 3 ln 59 - Col. 4 ln 15; Col. 4 ln 35-48 & Col. 4 ln 64 - Col. 5 ln 13, discloses switching between models based on confidence sets. Zappella’s system implements a sequential decision system that automatically switches models based on their model parameter confidence sets. The said system receives results or feedback from the selected action and updates the model parameters of the active model according to a sample of past actions and results, so that the active model is changing at the same time as it is being used to generate sequential decisions. The system also updates a confidence set of the model parameters, which may contain the optimal parameters for the model with a probability above a threshold probability. The respective confidence sets of the model parameters of the active model and the recent model are periodically compared, and when the comparison indicates that the two models are sufficiently different, the decision system may cause the active model to be replaced with a replacement model. Zappella further discloses that the model replacement is triggered when the overlap of the two confidence sets falls below a quantitative threshold. Zappella’s model combines the learnings of multiple past active models, so that the new model will perform well under currently observed conditions with high probability. Zappella, in Fig. 2 [reproduced here for convenience] & Col. 14 ln 56 - Col. 15 ln 19, also discloses that the system includes a simulation system 250, which continues the training of multiple selection models using targeted training data).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to modify Li in view of Zappella, as both inventions are directed to the same field of endeavor – Artificial Intelligence control systems and the combination would automatically switch models based on their model parameter confidence sets (see at least Zappella’s Col(s) 1-2).
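For illustration only, the confidence-set comparison attributed to Zappella above can be sketched as follows. This is a minimal sketch under stated assumptions, not Zappella's implementation: the confidence set is simplified to a one-dimensional interval, and the names `ConfidenceInterval`, `overlap_fraction`, `should_replace`, and the `min_overlap` value are hypothetical and do not appear in either reference.

```python
from dataclasses import dataclass


@dataclass
class ConfidenceInterval:
    """Hypothetical 1-D stand-in for a model parameter confidence set."""
    low: float
    high: float


def overlap_fraction(a: ConfidenceInterval, b: ConfidenceInterval) -> float:
    """Length of the intersection divided by the length of the shorter interval."""
    intersection = max(0.0, min(a.high, b.high) - max(a.low, b.low))
    shorter = min(a.high - a.low, b.high - b.low)
    return intersection / shorter if shorter > 0 else 0.0


def should_replace(active: ConfidenceInterval,
                   recent: ConfidenceInterval,
                   min_overlap: float = 0.5) -> bool:
    """Trigger replacement when the overlap falls below the (assumed) threshold."""
    return overlap_fraction(active, recent) < min_overlap


if __name__ == "__main__":
    active = ConfidenceInterval(0.0, 1.0)
    drifted = ConfidenceInterval(0.9, 2.0)   # barely overlaps the active set
    similar = ConfidenceInterval(0.2, 1.2)   # mostly overlaps the active set
    print(should_replace(active, drifted))
    print(should_replace(active, similar))
```

The overlap measure and the threshold value are illustrative choices; the reference as cited states only that replacement is triggered when the overlap of the two confidence sets falls below a quantitative threshold.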


As per Claim 11, Li as modified by Zappella teaches the method of claim 10, accordingly, the rejection of claim 10 above is incorporated.
Li further discloses wherein the machine learning engine includes at least one of a neural network, a support vector machine, a Markov decision process engine, a decision tree based algorithm, or a Bayesian based estimator (Li, in Fig. 3 [reproduced here for convenience] & P. 2565 Col. 1, discloses subway train control with an expert system and Reinforcement Learning (RL) based on Markov Decision Process).


As per Claim 13, Li as modified by Zappella teaches the method of claim 10, accordingly, the rejection of claim 10 above is incorporated.
Li further discloses wherein the virtual system modeling engine is configured to simulate the in-train forces and train operational characteristics during a period of time when the train includes at least one locomotive with an associated energy management system that is transitioning from a braking control to an acceleration control (Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col(s). 1-2, discloses simulating electric multiple units (EMU) and the interactive impacts among the vehicles. The model is summarized in (22). The EMU is composed of six vehicles (three locomotives, i.e., the first, third, and last vehicles, and three carriages, which are expressed as L-C-L-C-C-L). The detailed parameters of the train model are shown in Table I).

As per Claim 14, Li as modified by Zappella teaches the method of claim 13, accordingly, the rejection of claim 13 above is incorporated.
Li further discloses wherein the virtual system modeling engine is configured to simulate the in-train forces and train operational characteristics during a period of time when the train includes at least one locomotive with an associated energy management system that is transitioning and ramping up from heavy dynamic braking at the bottom of a hill to full throttle on the way back up an adjacent hill in a direction of travel of the train along the train tracks (Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col. 1, discloses simulating electric multiple units (EMU) and the interactive impacts among the vehicles. The model is summarized in (22), where λi is a distribution constant determining the accelerating/braking effort of the ith vehicle, and Fg is the resistance caused by the gradient).

As per Claim 18, Li as modified by Zappella teaches the method of claim 10, accordingly, the rejection of claim 10 above is incorporated.
Li further discloses wherein the machine learning engine is configurable by a user in order to adjust the relative weights that are assigned to each of different types of the stored historical contextual data of each of the one or more virtual system models, and wherein the predetermined acceptable ranges of values for simulated in-train forces and train operational characteristics are configurable by the user (Li, in Fig. 3 & P. 2564, discloses an inference method, which is motivated by manual driving. Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col. 1, also discloses analysis of the manual driving data in the YLBS).

As per claim 19, Li discloses a locomotive control system (Li, in Fig. 3 [reproduced here for convenience] & P. 2563 Col. 2 - P. 2566 Col. 2, discloses a subway train control with an expert system and Reinforcement Learning (RL) based on Markov Decision Process), comprising:
a learning system (Li, in Fig. 3, Fig. 5 & P. 2563-2566, discloses Machine Learning techniques, i.e., Reinforcement Learning) configured to:
receive real-time and historical operational and structural data for use as training data from one or more systems or components of the train at a data acquisition hub communicatively connected to one or more of sensors and databases associated with one or more locomotives or other components of a train (Li, in Fig. 3 & P. 2564 Col. 2, discloses obtaining the online [i.e., real-time] data and the offline [i.e., historical] data. Li, in Fig. 5 [reproduced here for convenience] & P. 2567 Col. 1, further discloses that the Input Module is used to input the offline data on the ITO platform, including the speed limits, the gradient of the line, the planned trip time, and the train dynamic model parameters);
simulate, using a virtual system modeling engine, in-train forces and train operational characteristics using physics-based equations, kinematic or dynamic modeling of behavior of the train or components of the train during operation when the train is accelerating after braking, and inputs derived from stored historical contextual data comprising one or more of a number of locomotives in the train, age or amount of usage of one or more locomotives of the train or other components of the train, weight distribution of the train, length of the train, speed of the train, control configurations for one or more locomotives or consists of the train, power notch settings of one or more locomotives of the train, braking implemented in the train, positive train control characteristics implemented in the train, grade, temperature, or other characteristics of train tracks on which the train is operating, and engine operational parameters that affect performance of one or more locomotive engines for the train (Li, in P. 2562 Col. 1, discloses a computationally inexpensive tracking control method with a single-coordinate dynamic model that reflects in-train forces. Li, in P. 2565 Col. 2, further discloses the system function (F) of the train dynamic model (14). Li, in Fig. 5 [reproduced here for convenience], P. 2566 Col. 2, & P. 2567 Col(s). 1-2, also discloses a Train Model that simulates the actual train acceleration/braking system, and inputting the offline [i.e., historical] data on the ITO platform, including the speed limits, the gradient of the line, the planned trip time, and the train dynamic model parameters);
store one or more virtual system models simulated by the virtual system modeling engine in a virtual system model database, wherein each of the one or more virtual system models includes a mapping between different combinations of the stored historical contextual data and corresponding simulated in-train forces and train operational characteristics that occur when the train is accelerating after braking (Li, in Fig. 3 & P. 2563 Col. 2 – P. 2564 Col. 1, discloses developing an expert system [i.e., virtual system model database], which summarizes the expert rules based on analyzing the data from the YLBS and literature, and derives IF–THEN rules by treating the position, the speed, and the running time as inputs and the accelerating/braking rates as outputs. Li, in P. 2565 Col. 1, also discloses how the output of the expert system stores Expert knowledge rules [i.e., model database], e.g., if the speed of the train is lower than vi, the train needs to accelerate [i.e., mapping between different combinations of the stored historical contextual data and corresponding simulated in-train forces and train operational characteristics that occur when the train is accelerating after braking]);
calculate, using a machine learning engine, relative weights to assign to each of different types of the stored historical contextual data of each of the one or more virtual system models and assigning the relative weights to the stored historical contextual data (Li, in Fig. 3, Fig. 5 & P. 2565 Col. 2, discloses “Reward Estimation” of “Reinforcement Learning” (RL) that is learning how to map situations to actions to optimize a numerical reward signal, and corrects output in the long-term reward [i.e., weight] for a decision. Li, in Algorithm 3.2 [reproduced here for convenience] & P. 2566 Col. 1-2, further discloses the reward function Ui in (16));
train a learning system with the machine learning engine using the weighted stored historical contextual data and the training data to determine a probability of each of the one or more virtual system models providing an accurate representation of actual in-train forces and train operational characteristics that occur during acceleration of the train after braking, and using a learning function including at least one learning parameter (Li, in Fig. 3 “Reinforcement Learning” & P. 2565 Col(s). 1-2, discloses that the output of the Expert System has the parameter Δu_max, which is the variation of acceleration in time interval Δt [i.e., at least one learning parameter], and that Reinforcement Learning (RL) learns how to map situations to actions to optimize a numerical reward signal and corrects output in the long-term reward [i.e., weight] for a decision. Li, Fig. 5 “Reinforcement Learning”, Algorithm 3.2 & P. 2566 Col. 1, further discloses the probability of selecting a certain action), wherein training the learning system includes:
providing the weighted stored historical contextual data as an input to the learning function (Li, in Fig. 3, discloses the Expert System as input to the Reinforcement Learning. Li, in Fig. 5 & P. 2565 Col. 2, further discloses that Reinforcement Learning (RL) corrects output in the long-term reward [i.e., weight] for a decision, and that in the train control process, actions affect not only the immediate reward but also the rewards [i.e., weights] of the following states), the learning function being configured to use the at least one learning parameter to generate an output based on the input (Li, in P. 2565 Col. 1, discloses that the output of the expert system has the parameter Δu_max [i.e., at least one learning parameter]. Li, in Algorithm 3.2 & P. 2566 Col. 1-2, further discloses initializing the parameters in step #1 and evaluating reward Ui at state Xi using the reward function (16));
causing the learning function to generate the output based on the input (Li, in Fig. 3, Fig. 5 & P. 2565 Col. 2, discloses using input-output. Li, in Algorithm 3.2 & P. 2566 Col. 1-2, further discloses the acceleration calculated by the expert system u0 [i.e., input] at step #2 and the Output ui [i.e., output] at step #8);
comparing the output to the training data, wherein the training data includes data produced by sensors having captured actual information on in-train forces and train operational characteristics during acceleration of the train after braking (Li, in P. 2565 Col. 1, discloses that, after designing the expert rules and the heuristic inference method, the expert system for ITO is accomplished, and with the online data [i.e., data produced by sensors] and the speed limit information, the expert inference method can make an appropriate decision to accelerate or coast. Li, in Algorithm 3.2 & P. 2566 Col. 2, further discloses comparing the Output ui with the expert rules in Section III-A at step #7);
comparing the determined probabilities (Li, in Fig. 3, Fig. 5 & P. 2566 Col. 1, discloses that the probability of selecting a certain action utilizes an ε-greedy probability that has an optimal [implies comparing the determined probabilities] estimated action value [i.e., predetermined threshold probability level]. Li, in Algorithm 3.2 & P. 2566 Col. 2, further discloses evaluating reward Ui at state Xi based on (16), then obtaining the optimal Δu_var [implies comparing the determined probabilities], and adjusting the output ui at steps #4-7); and
initiating adjustments to one or more of the calculated relative weights assigned to each of the different types of the stored historical contextual data (Li, in Fig. 3, Fig. 5, Algorithm 3.2 & P. 2566 Col. 2, discloses updating the value function according to (18) at step #8); and
adjust one or more of throttle requests, dynamic braking requests, and pneumatic braking requests for the one or more locomotives of the train using an energy management system associated with the one or more locomotives of the train based at least in part on one of the virtual system models with (Li, in P. 2561 Col. 1, discloses a discrete control model and confirms the fundamental optimality of the accelerate–coast–brake strategy for energy-efficient train operation. Li, in P. 2564 Col. 1, further discloses designing a data-driven inference method to judge the time when the train coasts or accelerates to ensure punctuality. Li, in P. 2564 Col. 1, further discloses that the inference method for judging operation modes (coasting or accelerating) is summarized as follows: If vi ≤ ˆv, the train should accelerate, and the output of the expert system is described by (8). Li, in Fig. 3, Fig. 5 & P. 2566 Col. 1, further discloses that, in selecting a certain action, the method chooses an action that has an optimal estimated action value, but with probability ε (i.e., greedy probability of RL). Li, in P. 2565 Col. 2, also discloses that the basic solutions of ITOR are restricted by the expert rules into a certain range).
Li does not disclose comparing the determined probabilities of each of the virtual system models to a predetermined threshold probability level, and selecting the virtual system models with the highest probability.
Zappella teaches, in Col. 3 ln 59 - Col. 4 ln 15; Col. 4 ln 35-48; Col. 4 ln 64 - Col. 5 ln 13 & Col. 14 ln 56 - Col. 15 ln 19, that it was old and well known at the time of filing in the art of Artificial Intelligence control systems, comparing the determined probabilities of each of the virtual system models to a predetermined threshold probability level, and selecting the model with the highest probability (Zappella, in Col. 3 ln 59 - Col. 4 ln 15; Col. 4 ln 35-48 & Col. 4 ln 64 - Col. 5 ln 13, discloses switching between models based on confidence sets. Zappella’s system implements a sequential decision system that automatically switches models based on their model parameter confidence sets. The system receives results or feedback from the selected action and updates the model parameters of the active model according to a sample of past actions and results, so that the active model changes at the same time as it is being used to generate sequential decisions. The system also updates a confidence set of the model parameters, which may contain the optimal parameters for the model with a probability above a threshold probability. The respective confidence sets of the model parameters of the active model and the recent model are periodically compared, and when the comparison indicates that the two models are sufficiently different, the decision system may cause the active model to be replaced with a replacement model. Zappella further discloses that the model replacement is triggered when the overlap of the two confidence sets falls below a quantitative threshold. Zappella’s model combines the learnings of multiple past active models, so that the new model will perform well under currently observed conditions with high probability. Zappella, in Fig. 2 [reproduced here for convenience] & Col. 14 ln 56 - Col. 15 ln 19, also discloses that the system includes a simulation system 250, which continues the training of multiple selection models using targeted training data).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to modify Li in view of Zappella, as both inventions are directed to the same field of endeavor – Artificial Intelligence control systems and the combination would automatically switch models based on their model parameter confidence sets (see at least Zappella’s Col(s) 1-2).
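For illustration only, the action selection with probability ε that is attributed to Li in the rejection of claim 19 can be sketched as a generic textbook ε-greedy rule. This is a sketch under stated assumptions, not a reproduction of Li's Algorithm 3.2: the function name `epsilon_greedy`, the list of estimated action values, and the example numbers are all hypothetical.

```python
import random


def epsilon_greedy(action_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-valued one.

    action_values: estimated value of each candidate action (hypothetical data).
    epsilon: exploration probability in [0, 1].
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    if rng.random() < epsilon:
        # Explore: choose any action uniformly at random.
        return rng.randrange(len(action_values))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(action_values)), key=lambda i: action_values[i])


if __name__ == "__main__":
    rng = random.Random(0)
    values = [0.1, 0.9, 0.3]  # assumed estimates for three candidate actions
    picks = [epsilon_greedy(values, 0.1, rng) for _ in range(1000)]
    # With a small epsilon, the best-valued action (index 1) should dominate.
    print(picks.count(1) / len(picks))
```

Selecting the maximum of the estimated values is a comparison among candidates; whether that comparison meets the claimed threshold comparison is the question addressed by the Zappella combination above.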

As per Claim 20, Li as modified by Zappella teaches the locomotive control system of claim 19, accordingly, the rejection of claim 19 above is incorporated.
Li further discloses wherein the training data includes configuration and operational data associated with the inputs derived from stored historical contextual data, the training data being generated by one or more systems or components of the train while the train is being operated by an experienced train operator (Li, in Fig. 3 & P. 2563 Col. 2 – P. 2564 Col. 1, discloses developing an expert system [i.e., virtual system model database], which summarizes the expert rules based on analyzing the data from the YLBS and literature). 
Li is silent on wherein the output generated by the learning function represents a goal or objective that the machine learning engine is configured to cause the learning system to match by modifying the at least one learning parameter until the difference between the output and the training data is less than a predetermined threshold difference.
Zappella teaches, in Col. 3 ln 59 - Col. 4 ln 15; Col. 4 ln 35-48; Col. 4 ln 64 - Col. 5 ln 13 & Col. 14 ln 56 - Col. 15 ln 19, that it was old and well known at the time of filing in the art of Artificial Intelligence control systems, wherein the output generated by the learning function represents a goal or objective that the machine learning engine is configured to cause the learning system to match by modifying the at least one learning parameter until the difference between the output and the training data is less than a predetermined threshold difference (Zappella, in Col. 3 ln 59 - Col. 4 ln 15; Col. 4 ln 35-48 & Col. 4 ln 64 - Col. 5 ln 13, discloses switching between models based on confidence sets. Zappella’s system implements a sequential decision system that automatically switches models based on their model parameter confidence sets. The system receives results or feedback from the selected action and updates the model parameters of the active model according to a sample of past actions and results, so that the active model changes at the same time as it is being used to generate sequential decisions. The system also updates a confidence set of the model parameters, which may contain the optimal parameters for the model with a probability above a threshold probability [i.e., goal or objective of the machine learning engine]. The respective confidence sets of the model parameters of the active model and the recent model are periodically compared, and when the comparison indicates that the two models are sufficiently different, the decision system may cause the active model to be replaced with a replacement model. Zappella further discloses that the model replacement is triggered when the overlap of the two confidence sets falls below a quantitative threshold [i.e., less than a predetermined threshold difference]. Zappella’s model combines the learnings of multiple past active models, so that the new model will perform well under currently observed conditions with high probability. Zappella, in Fig. 2 [reproduced here for convenience] & Col. 14 ln 56 - Col. 15 ln 19, also discloses that the system includes a simulation system 250, which continues the training of multiple selection models using targeted training data).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to modify Li in view of Zappella, as both inventions are directed to the same field of endeavor – Artificial Intelligence control systems and the combination would automatically switch models based on their model parameter confidence sets (see at least Zappella’s Col(s) 1-2).
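For illustration only, the claim 20 limitation of modifying a learning parameter until the difference between the output and the training data is less than a predetermined threshold difference can be sketched as a simple iterative fit. This sketch follows the claim wording rather than any mechanism disclosed in Li or Zappella; the one-parameter linear learning function, the function name `fit_until_close`, the learning rate, and the threshold value are all assumptions.

```python
def fit_until_close(x, target, param=0.0, lr=0.1, threshold=1e-3, max_steps=10_000):
    """Adjust a single learning parameter until |output - target| < threshold.

    The learning function here is the assumed toy model output = param * x.
    Returns the final parameter and the number of update steps taken.
    """
    for step in range(max_steps):
        output = param * x
        difference = output - target
        if abs(difference) < threshold:
            return param, step
        # Gradient step on the squared difference with respect to param.
        param -= lr * difference * x
    return param, max_steps


if __name__ == "__main__":
    # Hypothetical training pair: input 2.0 should produce output 6.0.
    param, steps = fit_until_close(2.0, 6.0)
    print(param, steps)
```

The stopping test is the only part that corresponds to the claim language; the update rule is one of many ways a learning parameter could be modified.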

Claims 3, 6-8, 12 & 15-17 are rejected under 35 USC §103 as being unpatentable over Li (DoI 10.1109/TITS.2014.2320757) in view of Zappella (US 11,164,093 B1), and further in view of PG Pub. No. US 2022/0067850 A1 to Bhasme et al. (hereinafter “Bhasme”).

As per Claim 3, Li as modified by Zappella teaches the train control system of claim 2, accordingly, the rejection of claim 2 above is incorporated.
Li further discloses wherein the machine learning engine (Li, in Fig. 3, Fig. 5 & P. 2565 Col(s). 1-2, discloses that the output of the Expert System has the parameter Δu_max, which is the variation of acceleration in time interval Δt [i.e., at least one learning parameter], that the parameter Δu is adjusted via Reinforcement Learning (RL), and that RL learns how to map situations to actions to optimize a numerical reward signal and corrects output in the long-term reward for a decision).
Li does not disclose that the machine learning engine includes a neural network configured to train the neural network, includes a plurality of first outputs from the neural network generated based on the inputs, and the at least one learning parameter includes a characteristic of the neural network.
Zappella teaches, in Col. 3, ln 10-33 & Col. 16 ln 53-65, that it was old and well known at the time of filing in the art of Artificial Intelligence control systems, includes a neural network configured to train the neural network, includes a characteristic of the neural network (Zappella, in Col. 3, ln 10-33 & Col. 16 ln 53-65, discloses that computer models are stored as more complex data structures that specify relationships between the different parameters, such as neural networks, and that a wide variety of machine learning algorithms may be supported natively by the MLS libraries, including neural network algorithms).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to modify Li in view of Zappella, as both inventions are directed to the same field of endeavor – Artificial Intelligence control systems and the combination would automatically switch models based on their model parameter confidence sets (see at least Zappella’s Col(s) 1-2).
The combination of Li & Zappella is silent on a plurality of first outputs from the neural network generated based on the inputs.
Bhasme teaches, in Fig. 3 & ¶79, that it was old and well known at the time of filing in the art of Artificial Intelligence control systems, includes a plurality of first outputs from the neural network generated based on the inputs (Bhasme, in Fig. 2A [reproduced here for convenience] & ¶79, discloses an illustrative machine learning model 200 that is an energy management strategy that maps one or more inputs to one or more control outputs).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to modify Li & Zappella in view of Bhasme, as all inventions are directed to the same field of endeavor – Artificial Intelligence control systems and the combination would provide for selecting a model from the plurality of candidate models that has a highest reward with respect to the context (see at least Bhasme’s ¶¶7-9).



Bhasme’s Fig. 2A


As per Claim 6, Li as modified by Zappella teaches the train control system of claim 1, accordingly, the rejection of claim 1 above is incorporated.
Li further discloses wherein the real-time and historical operational and structural data acquired by the data acquisition hub for use as training data includes one or more of structural stresses on one or more knuckles interconnecting one or more of locomotives and non-powered rail cars of the train, and measured vibrations (Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col. 1, discloses simulating electric multiple units (EMU) and the interactive impacts among the vehicles. The model is summarized in (22), where Fd represents the interactive impacts [implies measured vibrations] among the vehicles).
The combination of Li & Zappella is silent on vibrations of engine components caused by harmonic nodes encountered while ramping up power output of one or more of the locomotive engines of the train.
Bhasme teaches, in ¶203, that it was old and well known at the time of filing in the art of Artificial Intelligence control systems, vibrations of engine components caused by harmonic nodes encountered while ramping up power output of one or more of the locomotive engines of the train (Bhasme, in ¶203, discloses that techniques are provided for selecting a model from a model category so as to increase one or more rewards. The reward function r(C, m) is provided to increase drive cycle efficiency (kWh/km), energy recovered during regenerative braking instances, etc., and/or decrease thermal losses, cyclical oscillation of high energy device [i.e., vibrations of engine components], second derivative of high energy device power demand, etc.).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to modify Li & Zappella in view of Bhasme, as all inventions are directed to the same field of endeavor – Artificial Intelligence control systems and the combination would provide for selecting a model from the plurality of candidate models that has a highest reward with respect to the context (see at least Bhasme’s ¶¶7-9).

As per Claim 7, Li as modified by Zappella & Bhasme teaches the train control system of claim 6, accordingly, the rejection of claim 6 above is incorporated.
Li further discloses wherein the energy management system is configured to adjust one or more of throttle requests, dynamic braking requests, and pneumatic braking requests for the one or more associated locomotives of the train using a microprocessor based locomotive control system, a cab electronics system, and an electronic pneumatic brake system mounted within a cab of each of the one or more locomotives (Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col. 1, discloses simulating electric multiple units (EMU) [i.e., microprocessor] and the interactive impacts among the vehicles).

As per Claim 8, Li as modified by Zappella & Bhasme teaches the train control system of claim 7, accordingly, the rejection of claim 7 above is incorporated.
Li further discloses wherein the energy management system is configured to adjust one or more of throttle requests, dynamic braking requests, and pneumatic braking requests for the one or more associated locomotives of the train while transitioning and ramping up from heavy dynamic braking at the bottom of a hill to full throttle on the way back up an adjacent hill in a direction of travel of the train along the train tracks, and while increasing a ramp rate and maintaining the structural stresses on one or more knuckles and the vibrations of engine components within predetermined acceptable ranges of values (Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col. 1, discloses simulating electric multiple units (EMU) and the interactive impacts among the vehicles. The model is summarized in (22), where λi is a distribution constant determining the accelerating/braking effort of the ith vehicle, and Fg is the resistance caused by the gradient [implies hill]).

As per Claim 12, Li as modified by Zappella teaches the method of claim 11, accordingly, the rejection of claim 11 above is incorporated.
Li further discloses wherein the machine learning engine includes a neural network, and the machine learning engine is configured to train (Li, in Fig. 3, Fig. 5 & P. 2565 Col(s). 1-2, discloses that the output of the Expert System has the parameter Δu_max, which is the variation of acceleration in time interval Δt [i.e., at least one learning parameter], that the parameter Δu is adjusted via Reinforcement Learning (RL), and that RL learns how to map situations to actions to optimize a numerical reward signal and corrects output in the long-term reward for a decision).
Li does not disclose that the machine learning engine includes a neural network configured to train the neural network, includes a plurality of first outputs from the neural network generated based on the inputs, and the at least one learning parameter includes a characteristic of the neural network.
Zappella teaches, in Col. 3, ln 10-33 & Col. 16 ln 53-65, that it was old and well known at the time of filing in the art of Artificial Intelligence control systems, includes a neural network configured to train the neural network, includes a characteristic of the neural network (Zappella, in Col. 3, ln 10-33 & Col. 16 ln 53-65, discloses that computer models are stored as more complex data structures that specify relationships between the different parameters, such as neural networks, and that a wide variety of machine learning algorithms may be supported natively by the MLS libraries, including neural network algorithms).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to modify Li in view of Zappella, as both inventions are directed to the same field of endeavor – Artificial Intelligence control systems and the combination would automatically switch models based on their model parameter confidence sets (see at least Zappella’s Col(s) 1-2).
The combination of Li & Zappella is silent on a plurality of first outputs from the neural network generated based on the inputs.
Bhasme teaches, in Fig. 3 & ¶79, that it was old and well known at the time of filing in the art of Artificial Intelligence control systems, includes a plurality of first outputs from the neural network generated based on the inputs (Bhasme, in Fig. 2A [reproduced here for convenience] & ¶79, discloses an illustrative machine learning model 200 that is an energy management strategy that maps one or more inputs to one or more control outputs).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to modify Li & Zappella in view of Bhasme, as all inventions are directed to the same field of endeavor – Artificial Intelligence control systems and the combination would provide for selecting a model from the plurality of candidate models that has a highest reward with respect to the context (see at least Bhasme’s ¶¶7-9).

As per Claim 15, Li as modified by Zappella teaches the method of claim 10, accordingly, the rejection of claim 10 above is incorporated.
Li further discloses wherein the real-time and historical operational and structural data acquired by the data acquisition hub for use as training data includes one or more of structural stresses on one or more knuckles interconnecting one or more of locomotives and nonpowered rail cars of the train, and measured vibrations (Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col. 1, discloses simulating electric multiple units (EMU) and the interactive impacts among the vehicles. The model is summarized in (22), where Fd represents the interactive impacts [implies measured vibrations] among the vehicles).
The combination of Li & Zappella is silent on vibrations of engine components caused by harmonic nodes encountered while ramping up power output of one or more of the locomotive engines of the train.
Bhasme teaches, in ¶203, that it was old and well known at the time of filing in the art of Artificial Intelligence control systems, vibrations of engine components caused by harmonic nodes encountered while ramping up power output of one or more of the locomotive engines of the train (Bhasme, in ¶203, discloses that techniques are provided for selecting a model from a model category so as to increase one or more rewards. The reward function r(C, m) is provided to increase drive cycle efficiency (kWh/km), energy recovered during regenerative braking instances, etc., and/or decrease thermal losses, cyclical oscillation of high energy device [i.e., vibrations of engine components], second derivative of high energy device power demand, etc.).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention to modify Li & Zappella in view of Bhasme, as all inventions are directed to the same field of endeavor – Artificial Intelligence control systems and the combination would provide for selecting a model from the plurality of candidate models that has a highest reward with respect to the context (see at least Bhasme’s ¶¶7-9).

As per Claim 16, Li as modified by Zappella teaches the method of claim 15; accordingly, the rejection of claim 15 above is incorporated.
Li further discloses wherein the energy management system is configured to adjust one or more of throttle requests, dynamic braking requests, and pneumatic braking requests for the one or more associated locomotives of the train using a microprocessor based locomotive control system, a cab electronics system, and an electronic pneumatic brake system mounted within a cab of each of the one or more locomotives (Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col. 1, discloses simulating the interactive impacts among the vehicles of electric multiple units (EMU) [i.e., microprocessor]).

As per Claim 17, Li as modified by Zappella teaches the method of claim 16; accordingly, the rejection of claim 16 above is incorporated.
Li further discloses wherein the energy management system is configured to adjust one or more of throttle requests, dynamic braking requests, and pneumatic braking requests for the one or more associated locomotives of the train while transitioning and ramping up from heavy dynamic braking at the bottom of a hill to full throttle on the way back up an adjacent hill in a direction of travel of the train along the train tracks, and while increasing a ramp rate and maintaining the structural stresses on one or more knuckles and the vibrations of engine components within predetermined acceptable ranges of values (Li, in Fig. 3 [reproduced here for convenience] & P. 2567 Col. 1, discloses simulating the interactive impacts among the vehicles of electric multiple units (EMU). The model is summarized in (22), where λi is a distribution constant determining the accelerating/braking effort of the ith vehicle, and Fg is the resistance caused by the gradient [implies hill]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Kandemir et al. (PG Pub. US 2020/0326718 A1, also published as EP 3722894 A1) discloses a training system that (1) accesses model data defining a Bayesian neural network and training data comprising sensor measurements and associated states of the environment and the physical system, (2) trains the Bayesian neural network based on the training data by integrating out weights of the Bayesian neural network to obtain a marginal likelihood function of the Bayesian neural network and maximizing the marginal likelihood function to tune hyperparameters of the integrated-out weights of the Bayesian neural network so as to obtain a trained Bayesian neural network, and (3) outputs trained model data representing the trained Bayesian neural network.
 
Markyvech (U.S. Patent No. 7,121,977 B2) discloses (1) calculating a throttle ramp rate offset based on an estimated weight of the vehicle when a high throttle demand upon an engine is present, and (2) adjusting a default high throttle ramp rate based on the calculated throttle ramp rate offset. Markyvech’s invention allows quicker initiation of vehicle acceleration by allowing a higher ramp rate to be applied for a portion of time, followed by application of a slower, adjusted ramp rate once clutch engagement begins, thereby reducing the chance of a difficult vehicle launch and the possibility of damage due to overly rapid acceleration of the engine.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Tarek Elarabi, whose telephone number is (313)446-4911. The examiner can normally be reached Monday through Thursday, 6:00 AM - 4:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Elaine Gort, can be reached on (571)272-6781. The fax phone number for the organization where this application or proceeding is assigned is (571)273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair.
Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or (571)272-1000.




/TAREK ELARABI/Examiner, Art Unit 3661