Allowability Notice
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Allowable Claims
Claims 1-20 are allowed. Specifically, independent Claims 1, 8 and 15 are allowed over the prior art. The dependent claims are also allowed by virtue of their dependency on said independent claims.

Reasons for Allowance
Regarding Claims 1, 8 and 15, the prior art search identified the following references:
a) Wang et al. (US 11,226,673) disclose that an affective strategy formulator 208 may be implemented to build a Markov decision process (MDP) model through reinforcement learning based on a collection of status data (emotion-related data, emotion state, and/or semantic data), a collection of actions (normally referring to instructions), a state conversion distribution function (the probability of the user's emotion state changing after a certain action), and a reward function (to determine the ultimate purpose of an affective interaction session, e.g., when chatting with a robot, the longer the conversation is, the higher the reward function is). In such embodiments, a well-trained model may be able to formulate an affective and interaction strategy and derive an affective command therefrom directly based on the user's various inputs. In such embodiments, user intention computing processor 206 may be configured as a recessive part within the state conversion distribution function.
b) Matsubara et al. (US 2020/0057416) disclose that each learning processing section 412 may perform the learning processing using the steepest descent method, a neural network, a DQN (deep Q-network), a Gaussian process, deep learning, or the like. Furthermore, each learning processing section 412 may perform the learning processing without using the reward values, in which case the process of step S7 does not need to be performed. If each agent 41 is realized by an individual PC, these PCs may perform the learning processing of step S9 in a stand-alone state. In this way, there is no need to connect these PCs to a network for communication or the like, and it is possible to reduce the load of processes relating to the network in each PC.
c) Khan et al. (EP 3579154) disclose a computer-implemented method comprising: using one or more synthetic user models to train a particular reinforcement learning agent, each of the one or more synthetic user models comprising a behaviour function and a response generation function, wherein a reinforcement learning agent is configured to produce an output action based on an input state to the reinforcement learning agent, a behaviour function is configured to model a response to different output actions produced by a reinforcement learning agent, and a response generation function is configured to use a behaviour function to generate a response to an output action produced by a reinforcement learning agent, and wherein the training comprises, for each of the one or more synthetic user models: the particular reinforcement learning agent producing an output action based on a current input state; the response generation function of the synthetic user model using the behaviour function of the synthetic user model to generate a response to the output action; appropriately updating the particular reinforcement learning agent based on the response; and iteratively repeating the producing an output action, generating a response and updating the particular reinforcement learning agent steps, wherein for each subsequent iteration the particular reinforcement learning agent takes as an input state the response generation function's response from the previous iteration in order to determine the output action to produce; and providing the particular trained reinforcement learning agent as an output for use by a user application.
d) Kanemaru et al. (US 2018/0356793) disclose, in Fig. 6, a flowchart explaining the operation of the machine learning device in a learning phase according to an embodiment of that reference.
The prior art of record (see PTO-892 and the IDS) fails to teach or suggest the following combination, specifically as recited in
Claims 1, 8 and 15: “communicating the first user state to a reinforcement learning agent; using the reinforcement learning agent to select a first motivational action based at least in part on the first user state; receiving a second communication from the user device; determining a second user state based at least in part on the second communication from the user device, wherein the second user state comprises an attribute conveying a tiered reinforcement learning reward categorization; generating a reward based at least in part on the tiered reinforcement learning reward categorization; communicating the reward to the reinforcement learning agent; communicating the second user state to the reinforcement learning agent; updating the reinforcement learning agent; and determining, based at least in part on whether the second user state corresponds to a goal of the user-interaction session, to wind up control of the user-interaction session.”
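For illustration only, and not as part of the claims or the examiner's findings, the sequence of steps recited above can be sketched in Python as follows. Everything named here is an assumption introduced for the sketch: the tabular Q-learning agent, the tier labels and reward values in TIER_REWARDS, the action names, the dictionary format of a device communication, and the helper functions. The claims do not specify any of these.

```python
import random
from typing import NamedTuple

# Hypothetical tiers and reward values; the claims do not specify them.
TIER_REWARDS = {"none": 0.0, "partial": 0.5, "full": 1.0}
# Hypothetical motivational actions.
ACTIONS = ["encourage", "remind", "celebrate"]


class UserState(NamedTuple):
    mood: str  # hypothetical state label
    tier: str  # attribute conveying the tiered reward categorization


class QAgent:
    """Minimal tabular Q-learning agent (an assumed choice of RL method)."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = {}
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def select_action(self, state):
        # Epsilon-greedy selection; ties resolve to the first action listed.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q.get((next_state, a), 0.0) for a in ACTIONS)
        old = self.q.get((state, action), 0.0)
        target = reward + self.gamma * best_next
        self.q[(state, action)] = old + self.alpha * (target - old)


def determine_user_state(communication):
    """Map a device communication (a dict in this sketch) to a UserState."""
    return UserState(communication["mood"], communication.get("tier", "none"))


def run_session(agent, first_comm, get_next_comm, goal_mood, max_steps=20):
    """Run the loop until the user state matches the goal or steps run out."""
    state = determine_user_state(first_comm)
    for step in range(1, max_steps + 1):
        action = agent.select_action(state)            # select motivational action
        second_comm = get_next_comm(state, action)     # second communication
        next_state = determine_user_state(second_comm)
        reward = TIER_REWARDS[next_state.tier]         # reward from the tier
        agent.update(state, action, reward, next_state)
        if next_state.mood == goal_mood:               # goal reached: end control
            return step
        state = next_state
    return max_steps
```

In this sketch, get_next_comm stands in for the user device; a caller would supply a function that returns the next communication given the current state and the selected action.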
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM whose telephone number is (571) 270-0328. The examiner can normally be reached M-F 9:00 a.m. - 5:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John Breene, can be reached at 571-272-4107. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

M.K.I
Primary Examiner
Art Unit 2864



/MOHAMMAD K ISLAM/ Primary Examiner, Art Unit 2864