DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Amendments
This action is in response to the claims filed 06/08/2022, in which claims 1, 3-6, 10, 12, 14, 16 and 18 have been amended. Currently, claims 1-20 are pending.
Reasons for Allowance
The following is an examiner’s statement of reasons for allowance: Claims 1, 10, and 16 are considered allowable since when reading the claims in light of the specification, as per MPEP § 2111.01, none of the references of record either alone or in combination fairly disclose or suggest the combination of limitations specified in the independent claims, including at least:
In each of Claims 1, 10, and 16:
	… when the environment is in the first state, concurrently performing two or more of the plurality of discrete actions within the environment that have estimated action probabilities greater than a threshold, or concurrently performing two or more of the plurality of discrete actions, each discrete action performed with a probability of a corresponding estimated action probability;
Specifically, the combination of, when the environment is in the first state, concurrently performing two or more of the plurality of discrete actions within the environment that have estimated action probabilities greater than a threshold, or concurrently performing two or more of the plurality of discrete actions, each discrete action performed with a probability of a corresponding estimated action probability, was not taught or fairly suggested in the prior art of record.
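For illustration only, the two alternatives recited in the limitation above can be sketched as follows. This is a minimal sketch and not the applicant's implementation: the function names, the example probabilities, and the threshold value of 0.5 are hypothetical, and the sketch assumes that one estimated action probability per discrete action has already been calculated for the first state.

```python
import random

def select_by_threshold(action_probs, threshold):
    """First alternative: keep every discrete action whose estimated
    action probability is greater than the threshold."""
    return [i for i, p in enumerate(action_probs) if p > threshold]

def select_by_sampling(action_probs, rng=None):
    """Second alternative: include each discrete action independently,
    with a probability equal to its estimated action probability."""
    rng = rng or random.Random()
    return [i for i, p in enumerate(action_probs) if rng.random() < p]

# Hypothetical estimated action probabilities for four discrete actions
# in the first state; they are independent and need not sum to 1.
action_probs = [0.9, 0.2, 0.7, 0.05]

# Actions selected for concurrent performance under the first alternative.
chosen = select_by_threshold(action_probs, threshold=0.5)
```

With these example values, the threshold alternative selects the actions at indices 0 and 2 for concurrent performance. The sampling alternative may select a different set on each call, and either function may return fewer than two actions, whereas the limitation concerns the case in which two or more are performed concurrently.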
The closest prior art of record is Mnih et al. (“Human-level control through deep reinforcement learning”), which describes reinforcement learning and controlling an agent in a video game environment but does not disclose independently calculating an estimated action probability of performing the discrete action. Sharma et al. (“Learning to Factor Policies and Action-Value Functions: Factored Action Space Representations for Deep Reinforcement Learning”) discloses training a single reinforcement agent to calculate an estimated action probability of performing the discrete action; however, as noted by the applicant in the remarks on pg. 8, the reference does not teach using a threshold. The other cited references, Weaver et al. (“The Optimal Reward Baseline for Gradient-Based Reinforcement Learning”) and Pandey et al. (“Reinforcement Learning by Comparing Immediate Reward”), do not disclose concurrently performing two or more discrete actions that have estimated action probabilities greater than a threshold as required by the claims.
Therefore, the independent claims taken as a whole, including the specific features of concurrently performing two or more of the plurality of discrete actions within the environment that have estimated action probabilities greater than a threshold, or concurrently performing two or more of the plurality of discrete actions, each discrete action performed with a probability of a corresponding estimated action probability, are non-obvious over the cited prior art and are considered allowable. When taken as a whole, the dependent claims are allowed at least because of the allowable features recited in independent Claims 1, 10, and 16.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion
Claims 1-20 are allowed.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL H HOANG whose telephone number is (571)272-8491. The examiner can normally be reached Mon-Fri 8:30AM-4:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/M.H.H./Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122