Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


EXAMINER’S AMENDMENT
	An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
	Authorization for the examiner’s amendment was given in an email to the Examiner sent by Peter Yi on February 14, 2022. The application has been amended as follows:


1. (Currently Amended) A computer-readable storage medium storing instructions which, when executed by a hardware processor, cause the hardware processor to:
receive content information regarding available contents; 
receive user information regarding a user; 
input content vectors based on the content information and a user vector based on the user information to a reinforcement learning model; 
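recommend a set of personalized contents for the user from the available contents, the set of personalized contents being output by the reinforcement learning model; 
receive a user action in response to a presentation of the set of personalized contents to the user; 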
train the reinforcement learning model by: 
calculating a reward value for the user action based on a reward function that includes a monetization term and an engagement term, the monetization term including a monetization tuning parameter that is manually set as a weight for targeting a monetization business goal, the engagement term including an engagement tuning parameter that is manually set as a weight for targeting an engagement business goal; 
and adapting the reinforcement learning model using the reward value to increase a probability of future occurrences of the user action that help achieve the monetization business goal and the engagement business goal.  
Application No.: 16/834,815 Attorney Docket No.: 407780-US-NP

2. (Canceled)  

3. (Currently Amended) The computer-readable storage medium of claim 1, wherein the instructions further cause the hardware processor to: generate the content vectors associated with the available contents based on the content information; and generate the user vector associated with the user based on the user information, wherein the set of personalized contents is selected by the reinforcement learning model based on the content vectors and the user vector.  

4. (Currently Amended) The computer-readable storage medium of claim 1, wherein the reinforcement learning model selects the set of personalized contents that maximize the reward value.  



5. (Currently Amended) A system, comprising: 
a hardware processor; 
and storage having instructions which, when executed by the hardware processor, cause the hardware processor to: 
receive game information regarding available games; 
generate game vectors associated with the available games based on the game information; 
receive user information regarding a user; 
generate a user vector associated with the user based on the user information; 
input the game vectors and the user vector to a machine learning model; 
recommend a personalized set of games for the user from the available games, the personalized set of games being output by the machine learning model; 
receive a user action associated with a selected game from the personalized set of games; 
calculate a reward value for the user action using a reward function that includes terms associated with business goals, the terms having tuning parameters that are manually set as weights for targeting the associated business goals; 
and train the machine learning model using the reward value as feedback to improve future recommendations of the available games that promote the business goals.  

6. (Currently Amended) The system of claim 5, wherein the machine learning model uses a reinforcement learning algorithm to select the personalized set of games.  

7. (Original) The system of claim 5, wherein the personalized set of games includes ranking.  

8. (Original) The system of claim 7, wherein the personalized set of games is displayed using heterogenous sizes that depend on the ranking of the personalized set of games.  


9. (Currently Amended) A method, comprising: 
receiving user information about a user; 
inputting a user vector based on the user information to a reinforcement learning model; 
recommending personalized contents for the user from available contents, the personalized contents being output by the reinforcement learning model; 
receiving an action relating to a selected content from the personalized contents; calculating a reward value for the action by using a reward function that includes terms associated with goals and tuning parameters associated with the terms, the tuning parameters being manually set as weights for targeting the goals; 
and training the reinforcement learning model using the reward value as feedback to select future personalized contents that further the goals.  

10. (Currently Amended) The method of claim 9, further comprising: 
monitoring content features associated with the available contents including the selected content; 
and automatically adjusting the reward function based on a particular monitored content feature associated with the selected content.  

11. (Currently Amended) The method of claim 10, wherein the reward function is based on an average of a particular monitored feature associated with the available contents.  

12. (Currently Amended) The method of claim 9, further comprising: 

monitoring user features associated with the user; and 
automatically adjusting the reward function based on a particular monitored user feature associated with the user.  

13. (Canceled)  

14. (Currently Amended) The method of claim 9, wherein the goals include one or more of monetization, engagement, inclusiveness, safety, or toxicity.  

15. (Canceled)  

16. (Currently Amended) The method of claim 9, wherein the tuning parameters are automatically adjusted based on time using a machine learning model. 

 
17. (Original) The method of claim 9, wherein the reward function is based on a probability that the user who performed the action will perform a subsequent action.  

18. (Original) The method of claim 17, wherein the subsequent action includes one or more of purchasing the selected content, purchasing another content, or playing the selected content.  


19. (Canceled)  


20. (Currently Amended) The method of claim 9, wherein training the reinforcement learning model includes adapting the reinforcement learning model to increase an occurrence of the action.  

21. (New) The computer-readable storage medium of claim 1, wherein the reinforcement learning model selects the set of personalized contents based at least on randomness.  



23. (New) The system of claim 7, wherein the personalized set of games is displayed using heterogenous levels of interaction that depend on the ranking of the personalized set of games.  

24. (New) The method of claim 9, wherein the reward function includes one or more of: an estimated value of the action for a particular content, a probability of the action converting to a particular goal, a utility of the particular goal for the particular content, or an average utility of the particular goal for the available contents.
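
For illustration only, the reward calculation and model adaptation recited in claim 1 might be sketched as follows. This is a minimal sketch and not the applicant's implementation: the weight values, the inputs purchase_amount and minutes_played, and the gradient-bandit style preference update are all assumptions chosen to show how manually set weights on a monetization term and an engagement term can feed back into the probability of recommending a content item.

```python
import math

# Hypothetical, manually set tuning parameters (weights) for the two business goals.
MONETIZATION_WEIGHT = 0.7
ENGAGEMENT_WEIGHT = 0.3


def reward(purchase_amount, minutes_played,
           w_monetization=MONETIZATION_WEIGHT, w_engagement=ENGAGEMENT_WEIGHT):
    """Weighted sum of a monetization term and an engagement term."""
    return w_monetization * purchase_amount + w_engagement * minutes_played


def softmax(preferences):
    """Turn per-content preference scores into selection probabilities."""
    peak = max(preferences)  # subtract the max for numerical stability
    exps = [math.exp(p - peak) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def update_preferences(preferences, chosen, reward_value, baseline=0.0, step=0.1):
    """Gradient-bandit update (an assumed stand-in for the claimed training step).

    When reward_value exceeds the baseline, the preference for the chosen
    content rises and the others fall, so its selection probability increases.
    """
    probs = softmax(preferences)
    return [
        p + step * (reward_value - baseline) * ((1.0 if i == chosen else 0.0) - probs[i])
        for i, p in enumerate(preferences)
    ]
```

Under these assumptions, raising MONETIZATION_WEIGHT relative to ENGAGEMENT_WEIGHT shifts future recommendations toward contents whose user actions produce purchases, which is the tuning role the claims assign to the manually set weights.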



Reasons for Allowance

Claims 1, 3-12, 14, 16-18, 20-24 are allowed.

The following is a statement of reasons for the indication of allowable subject matter: Claims 1, 3-12, 14, 16-18, 20-24 are eligible under 35 U.S.C. 101. The following combination of claimed limitations of the independent claims, as included in the latest claim language, integrates the recited judicial exception into a practical application: adapting the reinforcement learning model based on the reward value to increase a probability of future occurrences of the user action that help achieve the monetization business goals and the engagement business goals/train the machine learning model based on the reward value as .

Therefore, the Examiner understands the claimed subject matter to thereby be patent eligible.

Arora (20170061528) teaches receiving content information regarding available contents/receiving user information/selecting a set of personalized contents for the user from the available contents using a reinforcement learning model/recommending personalized contents for the user from available contents by using a reinforcement learning model. However, it lacks the remaining claimed features of the independent claims.
Grimes (20140242847) teaches receiving a user action in response to a presentation of the set of personalized contents/calculate a reward value for the user action based on a reward function/adapt the reinforcement learning model based on the reward value. However, it lacks the remaining claimed features of the independent claims.
	Wang (20160188725) teaches receiving game information regarding available games/receiving user information regarding a user/recommend a set of personalized games to the user from the available games using a machine learning model/receiving a user action associated with a selected game from the personalized set of games. However, it lacks the remaining claimed features of the independent claims.

4.	When taken as a whole, the claims are not rendered obvious as the available prior art does not suggest or otherwise render obvious the noted features nor does the available prior art suggest or otherwise render obvious further modification of the evidence at hand. Such modifications would require substantial reconstruction relying solely on improper hindsight bias, and thus would not be obvious.

5.	Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”



Conclusion
6.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDRU CIRNU whose telephone number is (571) 272-7775. The examiner can normally be reached on 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ilana Spar, can be reached at (571) 270-7537. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


2/15/2022
/ALEXANDRU CIRNU/
Primary Patent Examiner, Art Unit 3622