EXAMINER'S AMENDMENT
An examiner's amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner's amendment was given in a telephone interview with Ms. Lisa Tom (Reg. No. 52,291) on April 1, 2021.

The application has been amended as follows: 

2.	Claims 1, 5, 11, 15, and 20 are amended as follows.

1.	(Currently amended) An apparatus, comprising: 
an Artificial Intelligence ("AI") engine that has multiple independent modules on one or more computing platforms, the multiple independent modules comprising an instructor module configured to cooperate with a learner module to train a plurality of AI objects, where the multiple independent modules are configured to have their instructions executed by one or more processors in the one or more computing platforms, where the AI engine has a user interface presented on a display screen for use by one or more users; 
where the instructor module is configured to apply a hierarchical-decomposition reinforcement learning technique to train each AI object of the plurality of AI objects as concept nodes composed in a hierarchical graph incorporated into an AI model, where the instructor 
where the user interface is configured to cooperate with the instructor module to send user input information for the instructor module, where the instructor module uses the user input information to automatically partition the individual sub-tasks into the concept nodes in the AI model and apply the hierarchical-decomposition reinforcement learning technique to train the plurality of AI objects; 
where the learner module is configured to cooperate with one or more data sources to obtain data for training and to conduct the training of the AI objects corresponding to the individual sub-tasks in parallel at the same time; and 
wherein, via decomposing the complex task, the AI engine uses one or more reward functions that are specific to each individual sub-task and then one or more separate reward functions focused for the end solution of the complex task, which the parallel training and the use of reward functions focused for solving each individual sub-task speed up an overall training duration for the complex task on the one or more computing platforms, and resulting AI model, compared to an end-to-end training with a single algorithm for all of the AI objects incorporated into the AI model. 
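For illustration only, and not part of the claim language: the parallel training of sub-tasks with sub-task-specific reward functions, followed by a separate reward for the end solution, might be sketched as below. Every name here (SUB_TASKS, make_reward, train_sub_task, end_solution_reward) is hypothetical, and a simple hill-climb on one scalar parameter stands in for an actual reinforcement learning algorithm.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sub-tasks of a complex task; each value is the parameter
# setting at which that sub-task is considered solved.
SUB_TASKS = {"reach": 2.0, "grasp": -1.0, "lift": 0.5}


def make_reward(target):
    """Build a reward function specific to one sub-task (maximum is 0 at the target)."""
    return lambda param: -(param - target) ** 2


def train_sub_task(name, reward, steps=200, step_size=0.05):
    """Toy learner: hill-climb a single scalar parameter on the sub-task reward."""
    param = 0.0
    for _ in range(steps):
        # Keep whichever neighbouring value earns the highest reward.
        param = max((param - step_size, param, param + step_size), key=reward)
    return name, param


def train_all_in_parallel(sub_tasks):
    """Train every sub-task at the same time, one worker per sub-task."""
    with ThreadPoolExecutor(max_workers=len(sub_tasks)) as pool:
        futures = [
            pool.submit(train_sub_task, name, make_reward(target))
            for name, target in sub_tasks.items()
        ]
        return dict(future.result() for future in futures)


def end_solution_reward(params, sub_tasks):
    """Separate reward that scores the end solution of the whole complex task."""
    return -sum((params[name] - target) ** 2 for name, target in sub_tasks.items())


trained = train_all_in_parallel(SUB_TASKS)
overall = end_solution_reward(trained, SUB_TASKS)
```

In this sketch each worker only ever sees its own reward function, and the end-solution reward is evaluated once the sub-task results are combined.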

5.	(Currently amended) The apparatus of claim 1, wherein 
the instructor module is configured to automatically partition the individual sub-tasks into the concept nodes in the AI model in a number of ways, where the ways of partitioning the individual sub-tasks into the concept nodes are selected from a group consisting of i) how to partition the individual sub-tasks is explicitly defined in scripted code from the one or more users, ii) how to partition the individual sub-tasks is based on general guidance in the scripted code from the one or more users, iii) how to partition the individual sub-tasks is based on responses from the one or more users to a presented list of questions, iv) how to partition the individual sub-tasks includes using a clustering technique, and v) any combination of these four, and then the instructor module proposes a hierarchical structure for the hierarchical graph of AI objects making up the AI model. 
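For illustration only, and not part of the claim language: two of the recited ways of partitioning, option i) (explicit definition in scripted code) and option iv) (a clustering technique), might be sketched as below, followed by a proposed hierarchical structure. The function names, the sub-task names, and the single numeric feature per sub-task are all assumptions made for this example; the greedy threshold grouping is only one simple stand-in for a clustering technique.

```python
def partition_explicit(sub_tasks, mapping):
    """Option i): the user's scripted code names the concept node for each sub-task."""
    nodes = {}
    for task in sub_tasks:
        nodes.setdefault(mapping[task], []).append(task)
    return nodes


def partition_by_clustering(features, threshold):
    """Option iv): greedy grouping of sub-tasks whose feature values lie close together."""
    clusters = []  # each cluster is a list of (task, value) pairs
    for task, value in sorted(features.items(), key=lambda kv: kv[1]):
        if clusters:
            last = clusters[-1]
            centroid = sum(v for _, v in last) / len(last)
            if abs(value - centroid) <= threshold:
                last.append((task, value))
                continue
        clusters.append([(task, value)])
    return {f"concept_{i}": [t for t, _ in c] for i, c in enumerate(clusters)}


def propose_hierarchy(nodes, root="complex_task"):
    """Propose a two-level hierarchy: one root with the concept nodes as children."""
    return {root: nodes}


tasks = ["walk", "run", "grasp", "lift"]
explicit = partition_explicit(
    tasks,
    {"walk": "locomotion", "run": "locomotion",
     "grasp": "manipulation", "lift": "manipulation"},
)
clustered = partition_by_clustering(
    {"walk": 0.1, "run": 0.3, "grasp": 2.0, "lift": 2.2}, threshold=0.5
)
hierarchy = propose_hierarchy(explicit)
```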

11.	(Currently amended) A method to apply Reinforcement Learning for an Artificial Intelligence (AI) model for subsequent deployment of that AI model, comprising: 
applying a hierarchical-decomposition reinforcement learning technique to train each AI object of a plurality of AI objects as concept nodes composed in a hierarchical graph incorporated into the AI model, 
using the hierarchical-decomposition reinforcement learning technique to hierarchically decompose a complex task into multiple smaller, individual sub-tasks making up the complex task, where each of the individual sub-tasks corresponds to its own concept node in the hierarchical graph, and initially train the AI objects corresponding to the individual sub-tasks and then train the AI model on how the individual sub-tasks interact with each other in the complex task in order to deliver an end solution to the complex task; 
automatically partition the individual sub-tasks into the concept nodes in the AI model and apply the hierarchical-decomposition reinforcement learning technique to train the plurality of AI objects; 
cooperating with one or more data sources to obtain data for training and to conduct the training of the AI objects corresponding to the individual sub-tasks in parallel at the same time; and 
using one or more reward functions that are specific to each individual sub-task and then one or more separate reward functions focused for the end solution of the complex task, where a combined parallel training and the use of reward functions focused for solving each individual sub-task speed up an overall training duration for the complex task on one or more computing platforms, and subsequent deployment of the trained AI model, compared to an end-to-end training with a single algorithm for all of the AI objects incorporated into the AI model. 
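For illustration only, and not part of the claim language: the ordering recited in the method, in which the AI objects for the individual sub-tasks are trained first and the interaction of those sub-tasks is trained afterwards, corresponds to a post-order walk of the hierarchical graph. The ConceptNode class and the node names below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptNode:
    """One concept node in the hierarchical graph (hypothetical structure)."""
    name: str
    children: list = field(default_factory=list)


def training_order(root):
    """Post-order traversal: every child concept is scheduled before its parent."""
    order = []

    def visit(node):
        for child in node.children:
            visit(child)
        order.append(node.name)

    visit(root)
    return order


def training_phases(root):
    """Split nodes into leaf sub-tasks (trained first) and composing nodes (trained after)."""
    leaves, composers = [], []

    def visit(node):
        for child in node.children:
            visit(child)
        (composers if node.children else leaves).append(node.name)

    visit(root)
    return leaves, composers


graph = ConceptNode(
    "stack_blocks",
    [ConceptNode("reach"), ConceptNode("grasp"),
     ConceptNode("move", [ConceptNode("orient")])],
)
order = training_order(graph)
leaves, composers = training_phases(graph)
```

The leaf list identifies the sub-tasks that can be trained in parallel; the composing nodes are trained once their children are available.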

15.	(Currently amended) The method of claim 11, wherein 
using information supplied from a user interface to automatically partition the individual sub-tasks into the concept nodes in the AI model comprises automatically partitioning the individual sub-tasks into the concept nodes in the AI model in a number of ways, where the ways of partitioning the individual sub-tasks into the concept nodes are selected from a group consisting of i) how to partition the individual sub-tasks is explicitly defined in scripted code from the one or more users, ii) how to partition the individual sub-tasks is based on general guidance in the scripted code from the one or more users, iii) how to partition the individual sub-tasks is based on responses from the one or more users to a presented list of questions, and iv) any 

20.	(Currently amended) A computing device, comprising: 
two or more modules configured to cooperate with each other in order to apply a hierarchical-decomposition reinforcement learning technique to train each AI object of a plurality of AI objects as concept nodes composed into a hierarchical graph incorporated into an AI model, where each individual sub-task of a decomposed complex task corresponds to its own concept node in the hierarchical graph, where each individual sub-task is initially trained on how to complete its individual sub-task and then an integrator node is trained to choose which of its children nodes to use for solving its individual sub-task at that moment, to deliver an end solution to the complex task, where during training of the individual sub-tasks of the decomposed complex task the AI engine is configured to use i) one or more reward functions that are specific to each individual sub-task and then ii) a separate one or more reward functions that are specific to the end solution of the complex task, where the AI engine is configured to conduct the training of a first individual sub-task of the decomposed complex task in parallel at the same time with the training of a second individual sub-task of the decomposed complex task; and
a user interface configured to supply user input information that is used to automatically partition the individual sub-tasks into the concept nodes in the AI model and apply the hierarchical-decomposition reinforcement learning technique to train the plurality of AI objects.
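For illustration only, and not part of the claim language: an integrator node that chooses which of its already-trained children to use at a given moment might be sketched as below. The children, the integer state, the goal, and the reward are all hypothetical, and an exhaustive comparison of the children in each state stands in for reinforcement learning of the integrator.

```python
# Hypothetical children, assumed to be already trained on their own sub-tasks.
# Each maps the current state to the next state.
CHILDREN = {
    "move_left": lambda state: state - 1,
    "move_right": lambda state: state + 1,
}
GOAL = 0  # hypothetical end solution of the complex task


def end_reward(next_state):
    """Reward specific to the end solution: closer to the goal is better."""
    return -abs(next_state - GOAL)


def train_integrator(states):
    """For each state, record the child whose action earns the best end reward."""
    return {
        s: max(CHILDREN, key=lambda name: end_reward(CHILDREN[name](s)))
        for s in states
    }


def run(policy, state, max_steps=10):
    """Let the integrator pick a child at every step until the goal is reached."""
    for _ in range(max_steps):
        if state == GOAL:
            break
        state = CHILDREN[policy[state]](state)
    return state


policy = train_integrator(range(-3, 4))
final_from_right = run(policy, 3)
final_from_left = run(policy, -3)
```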




REASON FOR ALLOWANCE

1.	The following is an examiner’s statement of reasons for allowance: The instant invention is related to a method, an apparatus, and a device for applying reinforcement learning for an Artificial Intelligence (AI) model.

2.	Prior art was found and applied in the previous actions. However, in consideration of applicant’s newly amended claims and arguments filed on 3/8/2021, there is no strong motivation or reasoning to combine the references to arrive at the claimed invention. Claims 1-20 are allowed.

3.	Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KATE H LUO whose telephone number is (571)270-5635.  The examiner can normally be reached on 8:00-5:00PM.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/KATE H LUO/
Primary Examiner, Art Unit 2488