Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Amendments
This action is in response to amendments and/or arguments filed on 05/12/2021. Per applicant's request, claims 1, 10, and 12 have been amended. No new claims have been added. Claims 3 and 14 have been canceled. Claims 1, 4-10, 12, and 15-22 remain pending.

In response to applicant’s amendments and/or arguments filed on 05/12/2021, the 35 U.S.C. 103 rejections made for claims 1, 4-10, 12, and 15-22 in the previous office action have been withdrawn.

Reasons for Allowance
The following is an examiner's statement of reasons for allowance: Claims 1, 10, and 12 are considered to be allowable because none of the references of record, either alone or in combination, fairly discloses or suggests the combination of limitations specified in the independent claims, including at least:

Regarding Claims 1, 10, and 12
	
“…computing a gradient of a matching loss function that measures differences in the respective second policy outputs generated by the candidate agent policy neural networks, and includes one or more terms that decrease an impact of the matching loss function on the training as the respective weight assigned to the final agent policy neural network in the mixing data increases during training;”
	The closest prior art of record includes Egorov, which discloses a multi-agent reinforcement learning system in which agents are trained to learn from the behavior of other agents. However, Egorov is silent with regard to the recited limitation of decreasing the impact of the matching loss function as the weight assigned to the final agent policy neural network in the mixing data increases during training.
	In addition, Rusu discloses a reinforcement learning technique that trains agents separately and allows for policy distillation to occur. However, Rusu is also silent with regard to the recited limitation of decreasing the impact of the matching loss function as the weight assigned to the final agent policy neural network in the mixing data increases during training.
	Furthermore, Finn et al., “A Connection Between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models,” discloses a system combining a generative adversarial network with inverse reinforcement learning. However, Finn is likewise silent with regard to the recited limitation of decreasing the impact of the matching loss function as the weight assigned to the final agent policy neural network in the mixing data increases during training.
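	For illustration only, and not as a characterization of the claims or of any reference of record, the distinguishing limitation may be sketched as follows. The sketch assumes a KL-divergence matching loss between two candidate policy outputs and a hypothetical scaling term (1 - final_weight); the function names, the choice of KL divergence, and the linear scaling are assumptions and are not taken from the application or the cited references.

```python
import math


def softmax(logits):
    """Convert raw policy outputs (logits) into action probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same actions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def matching_loss(final_logits, other_logits, final_weight):
    """Difference between two candidate policy outputs, scaled down as the
    mixing weight of the final policy grows.

    The (1 - final_weight) factor is an assumed, hypothetical form of the
    "term that decreases the impact" of the loss.
    """
    p_other = softmax(other_logits)
    p_final = softmax(final_logits)
    return (1.0 - final_weight) * kl_divergence(p_other, p_final)


def matching_loss_gradient(final_logits, other_logits, final_weight):
    """Gradient of matching_loss with respect to final_logits.

    For KL(p_other || softmax(z)), the gradient with respect to z is
    softmax(z) - p_other; the assumed scaling factor multiplies it.
    """
    p_other = softmax(other_logits)
    p_final = softmax(final_logits)
    scale = 1.0 - final_weight
    return [scale * (pf - po) for pf, po in zip(p_final, p_other)]
```

	Under these assumptions, the gradient shrinks toward zero as final_weight approaches 1, so the matching loss contributes progressively less to training.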
	

Dependent claims 4-9 and 15-22 are allowed as they depend upon an allowable independent claim.

Any comments considered necessary by the applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VASYL DYKYY whose telephone number is (571)270-5019.  The examiner can normally be reached on M-F 7:30 - 4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/V.D./Examiner, Art Unit 2122   

/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122