DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 3/16/2021 has been entered.
Claims 1, 4-6, 8, 10-12, 14, 16-18, 20, and 23 have been amended. Claims 1-25 have been examined.

Response to Arguments
Applicant’s arguments, see pp. 11-12, filed 3/16/2021, with respect to the rejection(s) of claim(s) 1, 8, 14, 20 and 23 under 35 USC § 103 have been fully considered and are persuasive.  Therefore, the rejections have been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of “Reinforcement learning design for cancer clinical trials” by Zhao et al. (“Zhao”), U.S. Patent Application Publication 2008/0126277 by Williams et al. (“Williams”), and U.S. Patent Application Publication 2018/0025274 by Beller et al. (“Beller”), respectively.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-3, 7-9, 13-15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over prior art of record U.S. Patent Application Publication 2004/0247748 by Bronkema (“Bronkema”) in view of prior art of record U.S. Patent Application Publication 2007/0072156 by Kaufman et al. (“Kaufman”), “Reinforcement learning design for cancer clinical trials” by Zhao et al. (“Zhao”) and U.S. Patent Application Publication 2008/0126277 by Williams et al. (“Williams”).

In regard to claim 1, Bronkema discloses:
1. A computer-implemented method comprising: See Bronkema, Fig. 1B, depicting a distributed computer system. Also see at least Fig. 3, broadly depicting a method.
querying, by a processor, a plurality of model data from a distributed data source based at least in part on one or more user characteristics; See Bronkema, Fig. 1B, elements 30, 30A, and 39 depicting model data. Also see ¶ 0141, e.g. “In particular, from profile/package database 39 (FIG. 8A), package designer 15 can determine, for example, the personal physical and emotional characteristics, and stress level, of the user and manifests those characteristics and stress level in, for example, but not limited to, the animated figure known as Charlie. In addition, during data analysis, when the status of the user's behavior ("problem/possible problem/no problem") is determined, package designer 15 (FIG. 8A) can prepare self-reflective information that is specialized for the current status.”
gathering, by the processor, a plurality of … data associated with a condition of a user; See Bronkema, Fig. 5A and ¶ 0105, e.g. “data collector 19.” Bronkema does not expressly disclose sensor data. However, Kaufman teaches this. See Kaufman, ¶ 0076, e.g. “the activity tracking client module 330 of the Lifestyle Coach device 205 may receive data automatically from the pedometer or accelerometer.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Bronkema’s data gathering with Kaufman’s sensor data in order to automatically collect data for tracking as suggested by Kaufman.
generating, by the processor, a policy comprising an end goal and one or more sub-goals based at least in part on the model data and the sensor data; See ¶ 0019, e.g. “Each goal has an associated set of action plans that the user agrees to work towards. By organizing goals into action plans, the system focuses the user on aspects of the problem that can seem more attainable psychologically and physically than the entire problem.” Also see Bronkema, ¶ 0087, e.g. “The illustrative embodiment of the system of the present invention can include a package designer 15 (see FIGS. 8A-B) that also can receive data and can prepare and continually update a user-specific program which can be a set of action plans, tips, etc. tailored to the user's current situation.” Also see Fig. 2A and ¶ 0089, e.g. “Package designer 15 can create … a user-specific program including customized set of tips, action plans, and animations based on the user's collected and analyzed data, and also based on the types of behavior identified according to the user's habitual behavior.” See ¶ 0186, e.g. “A user, system, or professional can determine, for example, the amount of calcium the user needs to consume over a period of time and help the user to adjust food intake, exercise, and medications, for example, to achieve that goal.” Also see Kaufman, ¶ 0092 and 0098, e.g. “determine 640 a user's initial activity level.”
Bronkema does not expressly disclose: wherein the policy is generated using machine learning that learns an action-state value function and uses backward induction to estimate the policy. However, this is taught by Zhao. See Zhao, Summary section, e.g. “A temporal-difference learning method called Q-learning is utilized which involves learning an optimal policy from a single training set of finite longitudinal patient trajectories. Approximating the Q-function with time-indexed parameters can be achieved by using support vector regression or extremely randomized trees.” Also see top of p. 7, e.g. “Once the backwards estimation process is done, we save the sequence of {Q̂0, Q̂1, …, Q̂T} for estimating optimal policies.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Bronkema’s action plans with Zhao’s Q-learning in order to utilize important breakthroughs in reinforcement learning, for constructing decision rules for sequential decision making to achieve the best clinical outcomes as suggested by Zhao (see top of p. 3), as well as finding an optimal strategy from an unknown system that only requires state knowledge to determine the best action (see top of p. 6).
iteratively adapting the policy, by the processor, based at least in part on one or more detected changes in the sensor data collected over a period of time to adjust at least one of the one or more sub-goals; and See Bronkema, Fig. 2A, items 15 and 23 depicting an iterative policy adaptation, along with ¶ 0089, e.g. “Package designer 15 … can continually update a user-specific program including customized set of tips, action plans, and animations based on the user's collected and analyzed data, and also based on the types of behavior identified according to the user's habitual behavior.” Also see Kaufman, Fig. 6(a), depicting an iterative process for adapting a policy.
the iteratively adapting comprising:
Bronkema does not expressly disclose: determining a confidence level indicating a likelihood of meeting the end goal. However, this is taught by Williams. See Williams, ¶ 0180, e.g. “As with individual baseline outcomes, predicted intervention outcomes may be a single number or may be probability distributions, indicating the likelihood of a variety of outcomes.”
comparing the confidence level to an acceptance threshold; based at least in part on determining that the confidence level does not meet the acceptance threshold, searching for a plurality of additional actions between a capability of the user and a plurality of model actions from the model data; and See Bronkema ¶ 0179, e.g. “Continuing still further to refer primarily to FIGS. 13B-D, if the matching process indicates problem range 1137 (FIG. 13C), that is, for example, a range of indicative actions 1123 (FIG. 13C) combinations that reaches a "problem" threshold, the user's actions are deemed to sufficiently reflect behavior pattern A 1104 (FIGS. 13B/C) with which indicative actions 1123 (FIG. 13C) are associated. … feedback provider 17 (FIG. 1A) can provide advance alarms and tips 1145 (FIG. 13C), server-based analysis messages 1149 (FIG. 13C) specific for one of behavior patterns 1103 (FIG. 13B).” Also see ¶ 0134, e.g. “Behavior updater 1309 can modify the user's set of identifies behaviors, if any modifications are necessary, and associated data.” Also see Fig. 6B, element 151, depicting a step for modification of a user package for program selection. Also see ¶ 0106, e.g. “If the results of the fitness tests indicate that the user's condition requires a medical professional's release (decision step 119) and if the user does not sign a disclaimer (decision step 121), the user may not be allowed to continue using the …” Bronkema does not expressly disclose the confidence level, but this is taught by Williams (see the rejection of claim 4). Note that Bronkema’s “matching” provides a broad but reasonable interpretation for comparing. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Williams’ confidence level with Bronkema’s threshold in order to monitor progress as essentially taught by Bronkema.
based at least in part on determining that the confidence level meets the acceptance threshold, selecting the policy and the one or more sub-goals for display to the user; See Bronkema, ¶ 0018-0019, e.g. “In addition, Charlie can alert the user to important information, such as how the user's daily activity impacts the action plans related to the problem. … If the system determines that there is a system-recognizable problem, the system and/or a professional designs a user-specific package for the user that provides the user with a list of, for example, actions plans as well as tips, comments, etc., that the user can access and use.” Also see ¶ 0087, e.g. “The illustrative embodiment of the system of the present invention can include a package designer 15 (see FIGS. 8A-B) that also can receive data and can prepare and continually update a user-specific program which can be a set of action plans, tips, etc. tailored to the user's current situation.” Also see Fig. 2A, element 17, e.g. “Feedback Provider,” as well as ¶ 0089, e.g. “Feedback provider 17 (FIG. 2A) can present the user with feedback, which can include a self-reflective device (see FIGS. 8C/D), concerning the user's progress with respect to identified types of behavior.”
providing the policy and the one or more sub-goals to the user in combination with an interpretation of the policy and the one or more sub-goals that explains why the one or more sub-goals have been selected and one or more impacts of adjustment of at least one or the one or more sub-goals. See Bronkema, ¶ 0107, e.g. “Feedback provider 17 can present the user with a healthy weight range for the user, a weight loss recommendation, and the number of calories required to maintain or achieve the desired weight.” Thereby, Bronkema provides an explanation that the calorie goal is to achieve a desired weight. Also see ¶ 0116-0119 for specific disclosure of selected goals and sub-goals. Also see at least Fig. 11B and 11C, depicting the adjustment of subgoals to be used by the feedback provider for providing an impact. Also see Bronkema, Fig. 9A, element 17 along with ¶ 0143, e.g. “Action plan presenter 1703 visually relates goals and action plans for the user's review.” Also see Bronkema ¶ 0144 and 0165 for additional treatment of … Also see Kaufman, ¶ 0093-0094, e.g. “Lifestyle Coach application software may suggest 665 certain physical activities.”
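
For illustration only, the backward induction over an action-state value function discussed in the mapping of claim 1 can be sketched as follows. This is a minimal sketch of generic finite-horizon backward induction with table lookup. It is not the method of Zhao, of Bronkema, or of the present application. All names (`backward_induction`, `toy_reward`, `toy_transition`) and the two-state example are assumptions introduced here and do not appear in any cited reference.

```python
from typing import Callable, Dict, List, Tuple

State = int
Action = int
QTable = Dict[Tuple[State, Action], float]


def backward_induction(
    states: List[State],
    actions: List[Action],
    horizon: int,
    reward: Callable[[int, State, Action], float],
    transition: Callable[[int, State, Action], Dict[State, float]],
) -> Tuple[List[QTable], List[Dict[State, Action]]]:
    """Compute one Q-table and one greedy decision rule per stage.

    Stages are processed from the last one back to the first, so the
    value of stage t+1 is already known when stage t is evaluated.
    """
    q: List[QTable] = [dict() for _ in range(horizon)]
    policy: List[Dict[State, Action]] = [dict() for _ in range(horizon)]
    # Value after the final stage is taken to be zero (an assumption).
    next_value = {s: 0.0 for s in states}
    for t in reversed(range(horizon)):
        for s in states:
            for a in actions:
                expected = sum(
                    p * next_value[s2] for s2, p in transition(t, s, a).items()
                )
                q[t][(s, a)] = reward(t, s, a) + expected
            # Greedy action at this stage; ties go to the first action listed.
            policy[t][s] = max(actions, key=lambda a: q[t][(s, a)])
        next_value = {s: q[t][(s, policy[t][s])] for s in states}
    return q, policy


# Hypothetical two-state example: state 1 is "on track", action 1 moves there.
def toy_reward(t: int, s: State, a: Action) -> float:
    return 1.0 if s == 1 else 0.0


def toy_transition(t: int, s: State, a: Action) -> Dict[State, float]:
    return {1: 1.0} if a == 1 else {s: 1.0}


q_tables, stage_policy = backward_induction(
    [0, 1], [0, 1], 2, toy_reward, toy_transition
)
```

In this toy setting the rule computed for the first stage selects action 1 from state 0, because the second-stage value of state 1 is already known when the first stage is evaluated.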

	In regard to claim 2, Bronkema discloses:
2. The computer-implemented method of claim 1, further comprising determining a plurality of behavioral patterns of the user based at least in part on the one or more user characteristics and the sensor data. See Bronkema, ¶ 0015, e.g. “behavioral patterns.”


	In regard to claim 3, Bronkema discloses:
3. The computer-implemented method of claim 2, wherein the one or more sub-goals comprise one or more personalized thresholds derived from the behavioral patterns. See Bronkema, ¶ 0186, e.g. “A user, system, or professional can determine, for example, the amount of calcium the user needs to consume over a period of time and help the user to adjust food intake, exercise, and medications, for example, to achieve that goal.”

	In regard to claim 4, Bronkema discloses:
4. The computer-implemented method of claim 3, further comprising 
using a table lookup or statistical functional approximation to determine a sequence of decision rules based at least in part on the one or more sub-goals and See Bronkema, ¶ 0090, e.g. “user tables.” Also see Fig. 10B, depicting determination of decision rules. 
Bronkema does not expressly disclose: the confidence level indicating the likelihood of meeting the end goal using the sequence of decision rules. However, this is taught by Williams. See Williams, ¶ 0180, e.g. “As with individual baseline outcomes, predicted intervention outcomes may be a single number or may be probability distributions, indicating the likelihood of a variety of outcomes.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Bronkema’s decisions with Williams’ probabilities in order to predict the likelihood of an outcome as suggested by Williams.
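
For illustration only, one way a probability distribution over predicted outcomes could be reduced to a single confidence level and compared against an acceptance threshold is sketched below. This is a minimal sketch under stated assumptions. It is not taken from Williams or from the present application. The function names, the "outcome at or above the goal" convention, and the example numbers are all hypothetical.

```python
from typing import Dict


def confidence_of_meeting_goal(outcome_probs: Dict[float, float], goal: float) -> float:
    """Return the probability mass of predicted outcomes at or above the goal.

    outcome_probs maps each predicted outcome value to its probability.
    The distribution is normalized here in case it does not sum to one.
    """
    total = sum(outcome_probs.values())
    if total <= 0:
        raise ValueError("outcome distribution has no probability mass")
    met = sum(p for outcome, p in outcome_probs.items() if outcome >= goal)
    return met / total


def meets_threshold(confidence: float, acceptance_threshold: float) -> bool:
    """Compare a confidence level to an acceptance threshold."""
    return confidence >= acceptance_threshold


# Hypothetical predicted daily step counts and their probabilities.
predicted = {8000.0: 0.25, 10000.0: 0.5, 12000.0: 0.25}
confidence = confidence_of_meeting_goal(predicted, goal=10000.0)
accepted = meets_threshold(confidence, acceptance_threshold=0.8)
```

With these assumed numbers the confidence level is 0.75, which falls short of the assumed 0.8 threshold, so a search for additional actions would follow under the claimed branch.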


	In regard to claim 5, Bronkema discloses:
5. The computer-implemented method of claim 4, further comprising:
prompting the user to perform one or more of the additional actions; See Bronkema ¶ 0179, e.g. “Continuing still further to refer primarily to FIGS. 13B-D, if the matching process indicates problem range 1137 (FIG. 13C), that is, for example, a range of indicative actions 1123 (FIG. 13C) combinations that reaches a "problem" threshold, the user's actions are deemed to sufficiently reflect behavior pattern A 1104 (FIGS. 13B/C) with which indicative actions 1123 (FIG. 13C) are associated. … feedback provider 17 (FIG. 1A) can provide advance alarms and tips 1145 (FIG. 13C), server-based analysis messages 1149 (FIG. 13C) specific for one of behavior patterns 1103 (FIG. 13B).” As noted above, Bronkema does not expressly disclose a confidence level, but this is taught by Williams (see the above rejection of claim 4). Note that Bronkema’s “matching” provides a broad but reasonable interpretation for comparing. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Williams’ confidence level with Bronkema’s threshold in order to monitor progress as essentially taught by Bronkema.
collecting a plurality of additional user data based at least in part on performance of the one or more of the additional actions by the user; and further adapting the policy based at least in part on the additional user data. See Bronkema, Fig. 2A, items 15 and 23 depicting an iterative policy adaptation, along with ¶ 0089, e.g. “Package designer 15 … can continually update a user-specific program including customized set of tips, action plans, and animations based on the user's collected and analyzed data, and also based on the types of behavior identified according to the user's habitual behavior.”

	In regard to claim 7, Bronkema discloses:
7. The computer-implemented method of claim 1, further comprising adjusting at least one sub-goal based at least in part on detecting that the user has exceeded or missed a previously determined instance of at least one of the one or more sub-goals. See ¶ 0179, e.g. “Continuing still further to refer primarily to FIGS. 13B-D, if the matching process indicates problem range 1137 (FIG. 13C), that is, for example, a range of indicative actions 1123 (FIG. 13C) combinations that reaches a "problem" threshold, the user's actions are deemed to sufficiently reflect behavior pattern A 1104 (FIGS. 13B/C) with which indicative actions 1123 (FIG. 13C) are associated. … feedback provider 17 (FIG. 1A) can provide advance alarms and tips 1145 (FIG. 13C), server-based analysis messages 1149 (FIG. 13C) specific for one of behavior patterns 1103 (FIG. 13B).”

	In regard to claim 8, Bronkema discloses:
8. A system comprising: a memory; and a processor coupled with the memory, the processor configured to: See Bronkema, at least Fig. 4A, depicting a personal device and computer, elements 61 and 63. Also see ¶ 0095, e.g. “In the illustrative embodiment, personal device 61 can be, for example, a standard PDA, such as a Palm OS device, with enough memory for applications code and data. Personal computer 63, in the illustrative embodiment, can execute an operating system and applications code that allow synchronization between personal device 61 and personal computer 63, with enough free disk space to enable synchronization, archiving, and storage of application files.”
All further limitations of claim 8 have been addressed in the above rejection of claim 1.

	In regard to claims 9-11 and 13, parent claim 8 is addressed above. All further limitations have been addressed in the above rejections of claims 2-5 and 7, respectively. 

	In regard to claim 14, Bronkema discloses:
14. A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processing circuit to cause the processing circuit to: See Bronkema, ¶ 0095, e.g. “In the illustrative embodiment, personal device 61 can be, for example, a standard PDA, such as a Palm OS device, with enough memory for applications code and data. Personal computer 63, in the illustrative embodiment, can execute an operating system and applications code that allow synchronization between personal device 61 and personal computer 63, with enough free disk space to enable synchronization, archiving, and storage of application files.”
All further limitations of claim 14 have been addressed in the above rejection of claim 1.

	In regard to claims 15-17 and 19, parent claim 14 is addressed above. All further limitations have been addressed in the above rejections of claims 2-5 and 7, respectively. 

Claims 6, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Bronkema in view of Kaufman, Zhao, and Williams as cited above, and further in view of U.S. Patent 9,764,162 to Willcut et al. (“Willcut”).

	In regard to claim 6, Bronkema discloses:
6. The computer-implemented method of claim 1, wherein iteratively adapting the policy comprises performing an updated evaluation of the sensor data collected over the period of time in combination with a plurality of previously collected instances of the sensor data … See ¶ 0093, e.g. “For example, the method of the present invention can include the step of continually accumulating personal, physical, and behavioral information from the user (method step 41). In this step, the user may provide information such as, for example, gender, age, height, weight, food intake information, mood, psychological data, location, possible thoughts and beliefs, and exercise information. These data may be collected over a continuous period such as, for example, 14 days. The method may further include the step of continuously monitoring and analyzing the user's behavior inferred from data collected in the previous step (method step 43).” Bronkema does not expressly disclose using statistical and reinforcement learning to classify the one or more detected changes. However, Willcut 
Bronkema does not expressly disclose and further wherein the policy is learned using a parametric regression model for a Q-function at each stage of a multiple stage analysis, and each stage has a different outcome goal. However, this is taught by Zhao. See Zhao, bottom of p. 6, e.g. “According to the recursive form of Q-learning in (1), we T recursively back to Q̂0 at the beginning.” Also see Zhao, section 3, bottom of p. 7, e.g. “… Q-learning is a generalization of the familiar regression model. When the dimension of the action space is small, linear regression methods can sometimes be adequate; but, more generally, quadratic regression or higher order regression is desirable for estimating the Q-function.”

	In regard to claim 12, parent claim 8 is addressed above. All further limitations have been addressed in the above rejection of claim 6.

	In regard to claim 18, parent claim 14 is addressed above. All further limitations have been addressed in the above rejection of claim 6.

Claims 20-21 and 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over Bronkema in view of Kaufman, Zhao, and U.S. Patent Application Publication 2018/0025274 by Beller et al. (“Beller”).

	In regard to claim 20, Bronkema discloses:
20. A computer-implemented method comprising: See Bronkema, Fig. 1B, depicting a distributed computer system. Also see at least Fig. 3, broadly depicting a method.
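generating, by a processor, a policy comprising an end goal and one or more sub-goals based at least in part on a plurality of model data and … data; See ¶ 0019, e.g. “Each goal has an associated set of action plans that the user agrees to work towards. By organizing goals into action plans, the system focuses the user on aspects of the problem that can seem more attainable psychologically and physically than the entire problem.”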
Bronkema does not expressly disclose sensor data. However, Kaufman teaches this. See Kaufman, ¶ 0076, e.g. “the activity tracking client module 330 of the Lifestyle Coach device 205 may receive data automatically from the pedometer or accelerometer.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Bronkema’s data gathering with Kaufman’s sensor data in order to automatically collect data for tracking as suggested by Kaufman.
wherein the policy is generated using machine learning that learns an action-state value function and uses backward induction to estimate the policy; However, this is taught by Zhao. See Zhao, Summary section, e.g. “A temporal-difference learning method called Q-learning is utilized which involves learning an optimal policy from a single training set of finite longitudinal patient trajectories. Approximating the Q-function with time-indexed parameters can be achieved by using support vector regression or extremely randomized trees.” Also see top of p. 7, e.g. “Once the backwards estimation process is done, we save the sequence of {Q̂0, Q̂1, …, Q̂T} for estimating optimal policies.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Bronkema’s action plans with Zhao’s Q-learning in order to utilize important breakthroughs in reinforcement learning, for constructing decision rules for sequential decision making to achieve the best clinical outcomes as suggested by Zhao (see top of p. 3), as well as finding an optimal strategy from an unknown system that only requires state knowledge to determine the best action (see top of p. 6).
providing the policy and the one or more sub-goals to a user; See Bronkema, Fig. 9A, element 17 along with ¶ 0143, e.g. “Action plan presenter 1703 visually relates goals and action plans for the user's review.” Also see Kaufman, ¶ 0093-0094, e.g. “Lifestyle Coach application software may suggest 665 certain physical activities.”
receiving a policy adjustment request from a dialog system … to modify one or more aspects of the policy; See Bronkema, Fig. 15E-F, depicting a dialog for policy modification. 
through a question answering interface that provides a natural language response. However, this is taught by Beller, which teaches the use of a natural language QA system which utilizes a confidence threshold to trigger data display based on a user's goals. See ¶ 0043 and 0077. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Bronkema’s dialog with Beller’s QA system in order to automatically answer questions from a user using relevant content as essentially suggested by Beller.
generating, by the processor, one or more projected policy variations of the policy based at least in part on the policy adjustment request, a condition of the user associated with the sensor data, and the model data in combination with an interpretation of the policy and the one or more sub-goals that explains why the one or more sub-goals have been selected and one or more impacts of adjustment of the one or the one or more aspects of the policy; and See Bronkema, Fig. 3, depicting generation of a policy. Also see Bronkema, ¶ 0107, e.g. “Feedback provider 17 can present the user with a healthy weight range for the user, a weight loss recommendation, and the number of calories required to maintain or achieve the desired weight.” Thereby, Bronkema provides an explanation that the calorie goal is to achieve a desired weight. Also see ¶ 0116-0119 for specific disclosure of selected goals and sub-goals. Also see at least Fig. 11B and 11C, depicting the adjustment of a policy to be used by the feedback provider for providing an impact. Also see Bronkema, Fig. 9A, element 17 along with ¶ 0143, e.g. “Action plan presenter 1703 visually relates goals and action plans for the user's review.”
capturing a plurality of different scenarios for the user in the one or more projected policy variations of the policy; and See ¶ 0197 along with Fig. 15E depicting a plurality of scenarios for selection by a user.
confirming, by the processor, a user selection of one of the one or more projected policy variations as an updated version of the policy. See Bronkema, Fig. 8B, e.g. “Save Action Plan(s).” Also see ¶ 0197, e.g. “Referring now to FIGS. 15E/F, the user can customize a workout schedule from server 65 (FIG. 4A) to augment or replace what the user has scheduled using screens illustrated in FIGS. 14F/G.”

	In regard to claim 21, Bronkema discloses:
21. The computer-implemented method of claim 20, wherein the policy adjustment request comprises an expected deviation in one or more user actions preventing at least one of the one or more sub-goals of the policy from being met. See Bronkema, ¶ 0087, e.g. “The illustrative embodiment of the system of the present invention can include a package designer 15 (see FIGS. 8A-B) that also can receive data and can prepare and continually update a user-specific program which can be a set of action plans, tips, etc. tailored to the user's current situation.” Also see Fig. 2A and ¶ 0089, e.g. “Package designer 15 can create … a user-specific program including customized set of tips, action plans, and animations based on the user's collected and analyzed data, and also based on the types of behavior identified according to the user's habitual behavior.”

	In regard to claim 23, Bronkema discloses:
23. A system comprising: a memory; and a processor coupled with the memory, the processor configured to: See Bronkema, at least Fig. 4A, depicting a personal device and computer, elements 61 and 63.Also see ¶ 0095, e.g. “In the illustrative embodiment, personal device 61 can be, for example, a standard PDA, such as a Palm OS device, with enough memory for applications code and data. Personal computer 63, in the illustrative embodiment, can execute an operating system and applications code that allow synchronization between personal device 61 and personal computer 63, with enough free disk space to enable synchronization, archiving, and storage of application files.”
All further limitations of claim 23 have been addressed in the above rejection of claim 20.

	In regard to claim 24, parent claim 23 is addressed above. All further limitations have been addressed in the above rejection of claim 21.

Claims 22 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Bronkema in view of Kaufman, Zhao, and Beller as cited above, and further in view of Williams.

	In regard to claim 22, Bronkema discloses:
22. The computer-implemented method of claim 20, wherein the one or more projected policy variations of the policy adjust at least one sub-goal to increase a confidence level of the end goal being met based at least in part on the policy adjustment request, the condition of the user associated with the sensor data, and the model data. However, this is taught by Williams. See Williams, ¶ 0180, e.g. “As with individual baseline outcomes, predicted intervention outcomes may be a single number or may be probability distributions, indicating the likelihood of a variety of outcomes.” Also see ¶ 0192, e.g. “Alternatively, the fault tolerance limits 1904 may also increase, or otherwise vary, as the patient 2104 answers each survey. The alterations in the fault tolerance limits 1904 encourage the patient 2104 to proceed along the patient behavioral path 1902 until the patient reaches the predicted intervention outcome 1912.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Bronkema’s decisions with Williams’ probabilities in order to predict the likelihood of an outcome as suggested by Williams.

	In regard to claim 25, parent claim 23 is addressed above. All further limitations have been addressed in the above rejection of claim 22.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
“Robust Hybrid Learning for Estimating Personalized Dynamic Treatment Regimens” by Liu et al. (“Liu”).
“Optimal Pricing for the Competitive and Evolutionary Cloud Market” by Xu et al. (“Xu”). See section 4.1, p. 142, bottom left column, e.g. “When we obtain the convergent table, an optimal policy can be easily found by choosing the action with the highest expected profit Q(s, a) in any state s.” Also see section 4.2, p. 142, bottom left column, e.g. “… we adopt backward induction to find an optimal pricing policy when the market is still evolving by leveraging the convergent table Q.”
2013/0226846 by Li et al. (“Li”). Li teaches a QA system providing natural language answers. See Li, ¶ 0008.
2018/0218171 by Bellalla et al. (“Bellalla”). Bellalla teaches learning using a parametric regression model. See Bellalla, ¶ 0034.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to James D Rutten whose telephone number is (571)272-3703.  The examiner can normally be reached on M-F 9:00-5:30 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on (571)272-3768.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/James D. Rutten/
Primary Examiner, Art Unit 2121