DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Receipt is acknowledged of amendments/arguments filed on 08/15/2022.
Claims 1-20 are presented for examination.
This application claims benefit of 62/673,144 filed on 05/18/2018 and claims benefit of 62/772,637 filed on 11/29/2018.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 8-13 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Tang et al. (US 2018/0046773) in view of Czarnecki et al. (WO 2018083671 A1).
Re Claims 1 and 11: Tang et al. teaches a medical system and method for providing medical prediction, which includes: obtaining training data related to an interaction system, the interaction system interacting with a reinforcement learning agent {herein the reinforcement learning algorithm utilizes a training data set with known disease diagnoses and known symptoms to train the third prediction model}, the reinforcement learning agent being configured for selecting sequential actions, the training data comprising a medical record indicating a relationship between a diagnosed disease and diagnosed symptoms related to the diagnosed disease (¶ 12+, 84-89+); training a neural network model to maximize cumulative rewards collected by the reinforcement learning agent in response to the sequential actions (¶ 48+, 80-89+), wherein the neural network model is utilized by the reinforcement learning agent for selecting the sequential actions from a set of candidate actions, the sequential actions comprise a plurality of symptom inquiry actions and a result prediction action (¶ 84-88+); and, during training of the neural network model, providing auxiliary rewards {herein the examiner interprets the prediction model MDL3 as the auxiliary reward} of the cumulative rewards to the reinforcement learning agent according to a comparison between the symptom inquiry actions and the diagnosed symptoms (see ¶ 40+, 80+), and providing a main reward of the cumulative rewards to the reinforcement learning agent according to a comparison {herein the examiner interprets the prediction module 124 as configured to generate a result prediction and/or at least one medical department suggestion matching the possible disease, see ¶ 40+; also, the analysis engine 120 evaluates or selects the symptom inquiry and makes the result prediction, see ¶ 52+} between the result prediction action and the diagnosed disease (¶ 12+, 84-88+).
Tang et al. fails to specifically teach providing auxiliary rewards of the cumulative rewards to the reinforcement learning agent according to a comparison between the symptom inquiry actions and the diagnosed symptoms.
Czarnecki et al. teaches reinforcement learning with auxiliary tasks, which includes providing auxiliary rewards of the cumulative rewards to the reinforcement learning agent according to a comparison between the symptom inquiry actions and the diagnosed symptoms (see ¶ 7-12+, 25-27+).
In view of Czarnecki et al.’s teachings, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the teachings of Tang et al. the providing of auxiliary rewards of the cumulative rewards to the reinforcement learning agent according to a comparison between the symptom inquiry actions and the diagnosed symptoms, so as to promote an action policy for improving performance on certain tasks while keeping the optimal reinforcement learning policy unchanged.
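For illustration only, the relationship between the main reward and the auxiliary rewards recited in claims 1 and 11 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and the reward magnitudes are hypothetical and are taken from neither the claims nor the cited references, and the cumulative reward is assumed here to be a plain sum.

```python
from typing import Sequence

# Hypothetical magnitudes; the claims and references do not specify values.
MAIN_REWARD_CORRECT = 1.0
MAIN_REWARD_WRONG = -1.0


def main_reward(predicted_disease: str, diagnosed_disease: str) -> float:
    """Main reward: compare the result prediction action with the diagnosed disease."""
    if predicted_disease == diagnosed_disease:
        return MAIN_REWARD_CORRECT
    return MAIN_REWARD_WRONG


def cumulative_reward(
    auxiliary_rewards: Sequence[float],
    predicted_disease: str,
    diagnosed_disease: str,
) -> float:
    """Assumed combination: auxiliary rewards from the inquiries plus the main reward."""
    return sum(auxiliary_rewards) + main_reward(predicted_disease, diagnosed_disease)
```

Under these assumptions, an episode with auxiliary rewards of 0.5, -0.5 and 0.5 followed by a correct prediction would collect a cumulative reward of 1.5.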
Re Claims 2 and 12: Tang et al. teaches a system and method, wherein the operation of providing the auxiliary rewards comprises: comparing each one of the symptom inquiry actions with the diagnosed symptoms in the training data; and in response to one of the symptom inquiry actions matching one of the diagnosed symptoms in the training data, providing a positive auxiliary reward; and in response to the one of the symptom inquiry actions failing to match any one of the diagnosed symptoms in the training data, providing a negative auxiliary reward (¶ 12+).
Re Claims 3 and 13: Tang et al. teaches a system and method, wherein the operation of providing the auxiliary rewards comprises: determining whether a currently-selected one of the symptom inquiry actions and a previously-selected one of the symptom inquiry actions are directed to the same symptom; and in response to the currently-selected one of the symptom inquiry actions and the previously-selected one of the symptom inquiry actions being directed to the same symptom, providing the negative auxiliary reward (¶ 40-46+, 52+).
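For illustration only, the auxiliary reward rules recited in claims 2-3 and 12-13 can be sketched as follows. This is a minimal sketch, not an implementation drawn from Tang et al. or from the application: the names `auxiliary_reward` and `score_inquiries` and the magnitudes of the positive and negative rewards are hypothetical.

```python
from typing import List, Sequence, Set

# Hypothetical magnitudes; the claims only require "positive" and "negative".
POSITIVE_AUX = 1.0
NEGATIVE_AUX = -1.0


def auxiliary_reward(inquiry: str, diagnosed: Set[str], asked: Set[str]) -> float:
    """Auxiliary reward for a single symptom inquiry action."""
    if inquiry in asked:
        # Same symptom as a previously selected inquiry (claims 3 and 13).
        return NEGATIVE_AUX
    if inquiry in diagnosed:
        # Inquiry matches a diagnosed symptom in the training data (claims 2 and 12).
        return POSITIVE_AUX
    # Inquiry matches none of the diagnosed symptoms (claims 2 and 12).
    return NEGATIVE_AUX


def score_inquiries(inquiries: Sequence[str], diagnosed: Set[str]) -> List[float]:
    """Score a sequence of inquiries in order, tracking which were already asked."""
    asked: Set[str] = set()
    rewards: List[float] = []
    for inquiry in inquiries:
        rewards.append(auxiliary_reward(inquiry, diagnosed, asked))
        asked.add(inquiry)
    return rewards
```

With diagnosed symptoms of "cough" and "fever", the inquiry sequence "cough", "rash", "cough" would be scored as positive, negative, negative: the first matches, the second does not, and the third repeats an earlier inquiry.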
Re Claims 8 and 18: Tang et al. teaches a system and method, wherein the sequential actions selected by the reinforcement learning agent cause the interaction system to move from one state to another state, state data of the interaction system comprises symptom data bits and context data bits, the symptom data bits indicate a positive status, a negative status or an unconfirmed status of symptoms occurred to a patient in the medical record, the context data bits indicate related information of the patient in the medical record (¶ 50+, 98+).
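For illustration only, the state data recited in claims 8 and 18 can be sketched as follows. This is a minimal sketch under stated assumptions: the encoding of the three symptom statuses as 1, -1 and 0, the layout of the state vector, and the function names are hypothetical and are not drawn from the claims or the cited references.

```python
from typing import Dict, List, Sequence

# Assumed encoding of the three statuses named in the claims.
POSITIVE, NEGATIVE, UNCONFIRMED = 1, -1, 0


def build_state(
    symptoms: Sequence[str],
    known_status: Dict[str, int],
    context_bits: Sequence[int],
) -> List[int]:
    """Concatenate symptom data bits and context data bits into one state vector."""
    symptom_bits = [known_status.get(name, UNCONFIRMED) for name in symptoms]
    return symptom_bits + list(context_bits)


def apply_answer(
    state: Sequence[int],
    symptoms: Sequence[str],
    symptom: str,
    present: bool,
) -> List[int]:
    """Move to the next state after a symptom inquiry action is answered."""
    next_state = list(state)  # copy, so the previous state is left unchanged
    next_state[symptoms.index(symptom)] = POSITIVE if present else NEGATIVE
    return next_state
```

For example, with the symptoms "cough", "fever" and "rash", an initial report of "cough" and two context bits, the initial state would be [1, 0, 0, 1, 0]; a negative answer to an inquiry about "fever" would move the system to [1, -1, 0, 1, 0].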
Re Claims 9 and 19: Tang et al. teaches a system and method, wherein the result prediction action comprises at least one of a disease prediction action and a medical department recommendation action corresponding to the disease prediction action (¶ 53-60+, 99+).
Re Claims 10 and 20: Tang et al. teaches system and method, wherein after the neural network model is trained, the control method further comprising: collecting an initial symptom by the interaction system from a user as an initial state to the reinforcement learning agent; selecting the sequential actions according to the neural network model; and providing a disease prediction or a medical department recommendation to the user according to the result prediction action of the sequential actions (¶ 60-654, 90-106+).
Re Claims 4-5 and 14-15: The teachings of Tang et al. have been discussed above.
Tang et al. fails to specifically teach the auxiliary rewards in a sequential order are provided with gradually increasing discounts, a first auxiliary reward of the auxiliary rewards is provided at an earlier state than a second auxiliary reward of the auxiliary rewards, and the second auxiliary reward is provided with a discount factor.
Czarnecki et al. teaches reinforcement learning with auxiliary tasks, in which auxiliary rewards in a sequential order are provided with gradually increasing discounts (see ¶ 100-101+), a first auxiliary reward of the auxiliary rewards is provided at an earlier state than a second auxiliary reward of the auxiliary rewards, and the second auxiliary reward is provided with a discount factor {herein one or more respective auxiliary task rewards} (¶ 46+, 64-72+).

In view of Czarnecki et al.’s teachings, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the teachings of Tang et al. a first auxiliary reward and a second auxiliary reward, the second being provided with a discount factor, so as to produce a simulator output and promote an action policy for improving the performance on certain tasks.
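For illustration only, the discounting of sequential auxiliary rewards recited in claims 4-5 and 14-15 can be sketched as follows. This is a minimal sketch under stated assumptions: the geometric form of the discount, the value 0.5, and the function name are hypothetical and are not drawn from the claims or from Czarnecki et al.

```python
from typing import List, Sequence


def discount_auxiliary_rewards(
    raw_rewards: Sequence[float],
    discount_factor: float = 0.5,  # hypothetical value between 0 and 1
) -> List[float]:
    """Scale the reward at step t by discount_factor ** t.

    The first reward (t = 0) is undiscounted; each later reward is
    discounted more heavily than the one before it.
    """
    return [
        reward * discount_factor ** step
        for step, reward in enumerate(raw_rewards)
    ]
```

Under these assumptions, three equal auxiliary rewards of 1.0 would be provided as 1.0, 0.5 and 0.25, so that a reward earned at an earlier state counts for more than the same reward earned at a later state.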

Allowable Subject Matter
Claims 6-7 and 16-17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: the prior art of record fails to specifically teach that the neural network model comprises a common neural network portion, a first branch neural network portion and a second branch neural network portion, the first branch neural network portion and the second branch neural network portion are respectively connected to the common neural network portion, a first result state generated by the first branch neural network portion is utilized to select the symptom inquiry actions or the result prediction action, and a second result state generated by the second branch neural network portion is utilized to reconstruct a possibility distribution of symptom features. These limitations, in conjunction with the other limitations of the claimed invention, were not shown by the prior art of record.

Response to Arguments
Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Aggarwal et al. (US 2020/0341976) teaches interactive search experience using machine learning.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDWYN LABAZE whose telephone number is (571)272-2395. The examiner can normally be reached Monday through Friday 8:30AM to 5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mr. Steve Paik, can be reached at 571-272-2404. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/EDWYN LABAZE/Primary Examiner, Art Unit 2887