DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-3, 6-10, 12-17 and 19 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Arora et al. (2021/0397859).
Regarding claims 1, 14 and 19, Arora discloses a computerized apparatus having a processor and coupled memory (see fig.5, element 540/500, 548/500, fig.6B-C, elements 602/610, 604/610, paragraphs [0009-0010], [0121-0122] and descriptions), the processor being adapted to perform the steps of:
determining a plurality of subsets of features, each of which is a subset of a set of features, whereby obtaining a plurality of different subsets of the set of features (see fig.5, element 540, 544/546, fig.6B-C, elements 602, paragraphs [0009-0010], [0002-0003], [0041-0042], [0052-0061] and its description);
for each subset of features of the plurality of subsets of features, determining a policy, wherein the policy is a function defining an action based on valuation of the subset of features, wherein the policy is determined using a Markov Decision Process (MDP), whereby obtaining a plurality of policies (see abstract, fig.5, fig.6A-B, element 600, paragraphs [0002-0003], [0064], [0083], [0097] and its description);
obtaining a state, wherein the state comprises a valuation of each feature of the set of features (see abstract, fig.5, fig.6A-B, element 600, paragraphs [0002-0003], [0064], [0083], [0097] and descriptions);
applying the plurality of policies on the state, whereby obtaining a plurality of suggested actions for the state, based on different projections of the state onto different subsets of features (see abstract, fig.5, fig.6A-B, element 600, paragraphs [0002-0003], [0064], [0083], [0097] and its description);
determining, for the state, one or more actions and corresponding scores thereof based on the plurality of suggested actions (see abstract, fig.5, fig.6A-B, element 600, paragraphs [0002-0003], [0064], [0083], [0097] and its description); and
training a reinforcement learning model using the state and the one or more actions and corresponding scores thereof (see abstract, fig.5, fig.6B, element 600, paragraphs [0064], [0078], [0097], [0139] and its description).
Regarding claims 2 and 15, Arora further discloses obtaining the state comprises generating the state by generating the valuation of at least a portion of the set of features (see paragraphs [0021], [0059]).
Regarding claims 3 and 16, Arora further discloses obtaining the state and said determining for the state are performed a plurality of times for different states in a training dataset, wherein said training is performed using the training dataset (see paragraph [0061]).
Regarding claim 6, Arora further discloses the reinforcement learning model is a deep reinforcement learning model (see paragraphs [0084], [0150]).
Regarding claims 7 and 17, Arora further discloses obtaining a new state; and applying the reinforcement learning model to determine an action for the new state (see abstract, fig.5, fig.6A-B, element 600, paragraphs [0002-0003], [0064], [0083], [0097] and its description).
Regarding claim 8, Arora further discloses applying the reinforcement learning model is performed without consulting with the plurality of policies (see abstract, fig.5, fig.6A-B, element 600, paragraphs [0002-0003], [0064], [0083], [0097] and its description).
Regarding claim 9, Arora further discloses determining the plurality of subsets of features comprises randomly determining the subsets of features (see paragraphs [0099-0100]).
Regarding claim 10, Arora further discloses a unification of the plurality of subsets of features comprises all features of the set of features (see paragraphs [0099-0100]).
Regarding claim 12, Arora further discloses the reinforcement learning model is configured to provide a recommendation action for a state representing information about a user (see paragraph [0139]).
Regarding claim 13, Arora further discloses the MDP is a Constrained MDP (CMDP) (see paragraph [0097]).
Allowable Subject Matter
Claims 4-5, 11 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Examiner's Note: Examiner has cited particular columns and line numbers in the references applied to the claims above for the convenience of the applicant. Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well. The applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
When responding to this Office Action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CongVan Tran whose telephone number is (571) 272-7871. The examiner can normally be reached on Mon-Th.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Yuwen Pan, can be reached on (571) 272-7855. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

UNITED STATES PATENT AND TRADEMARK OFFICE
/CONGVAN TRAN/
Primary Examiner, Art Unit 2647